# Forecasting Model Development (Kaggle Edition)
## Multi-Agent Economic Forecasting System - Notebook 3

**Objective**: Develop and evaluate forecasting models for economic indicators using statistical methods and the Forecasting Specialist agent.

### ⚠️ Kaggle Setup:
1. **Dataset**: Ensure your updated `src` (with the fixed BEAClient/StatisticalTools) is attached.
2. **Secrets**: Ensure `BEA_API_KEY` and `GOOGLE_API_KEY` are set in Add-ons -> Secrets.

### What You'll Learn:
- ARIMA model implementation on **real (live BEA) GDP data**, reported in current dollars
- Exponential smoothing methods for time series forecasting
- Ensemble forecasting techniques for improved accuracy
- Model performance evaluation and comparison
- Using the Forecasting Specialist agent for automated forecasting
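
Exponential smoothing appears in the list above, while the ensemble in Section 4 reports ARIMA, moving-average, and naive components. As a minimal sketch of the idea (not the `StatisticalTools` implementation; the function name `simple_exp_smoothing` and the `alpha` value are illustrative assumptions), simple exponential smoothing updates a level as a weighted average of the newest observation and the previous level, then forecasts that level flat:

```python
import numpy as np

def simple_exp_smoothing(values, alpha=0.5, horizon=4):
    """Hypothetical helper: flat forecast from simple exponential smoothing."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        raise ValueError("values must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]  # initialise the level with the first observation
    for y in values[1:]:
        # New level = alpha * newest observation + (1 - alpha) * previous level
        level = alpha * y + (1.0 - alpha) * level
    return np.full(horizon, level)

forecast = simple_exp_smoothing([100.0, 110.0, 120.0], alpha=0.5, horizon=2)
print(forecast)
```

Because the forecast is flat, this variant will lag a trending series such as GDP; a trend-aware variant would be the natural next step.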

## 1. Setup and Data Preparation

In [1]:
# Install required packages
!pip install -q pandas numpy matplotlib seaborn plotly requests python-dotenv statsmodels scikit-learn google-adk

import os
import sys
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import asyncio
import warnings
warnings.filterwarnings('ignore')

# --- KAGGLE PATH FIX ---
# Auto-detect the folder containing 'src' and add it to path
found_path = None
for root, dirs, files in os.walk('/kaggle/input'):
    if 'src' in dirs:
        found_path = root
        break

if found_path:
    sys.path.append(found_path)
    print(f"✅ Added path: {found_path}")
else:
    print("⚠️ 'src' folder not found. Please check Input Dataset.")

# Google SDK imports are kept outside the fallback below so that
# Gemini and types are defined whichever import path succeeds
from google.adk.models.google_llm import Gemini
from google.genai import types

# Import custom components (with the 'src.' prefix first, then without it)
try:
    from src.tools.statistical_tools import StatisticalTools
    from src.tools.bea_client import BEAClient
except ImportError as e:
    print(f"❌ Import Error: {e}")
    print("Trying without 'src.' prefix...")
    from tools.statistical_tools import StatisticalTools
    from tools.bea_client import BEAClient
print("✅ Custom components imported successfully")

# Setup visualization
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

✅ Added path: /kaggle/input/src-kaggle-ready-notebook-03
✅ Custom components imported successfully


In [2]:
# --- ROBUST AGENT DEFINITION (Fixes Import Errors) ---
# We define the Agent manually here to prevent crashes from Google SDK version mismatches

class ForecastingSpecialistAgent:
    """
    Agent responsible for time series forecasting.
    Manually defined for Kaggle stability.
    """
    def __init__(self, model):
        self.model = model
        self.stat_tools = StatisticalTools()
        
        # Simple wrapper for tools so the notebook can inspect them
        self.tools = [
            type('Tool', (), {'name': 'forecast_gdp'}),
            type('Tool', (), {'name': 'build_arima_model'}),
            type('Tool', (), {'name': 'generate_ensemble_forecast'})
        ]
        self.agent = type('obj', (), {'tools': self.tools})

    async def forecast_gdp(self, data, horizon=4):
        try:
            df = pd.DataFrame(data) if not isinstance(data, pd.DataFrame) else data
            result = self.stat_tools.forecast_arima(df, periods=horizon)
            if result['status'] == 'success':
                next_val = result['forecasts'][0]['point_forecast']
                return {
                    'status': 'success',
                    'horizon': horizon,
                    'next_quarter_prediction': f"{next_val:,.2f}",
                    'confidence': 0.95  # hard-coded nominal interval level, not computed from the fit
                }
            return {'status': 'error', 'message': 'Forecasting failed'}
        except Exception as e:
            return {'status': 'error', 'message': str(e)}

    async def build_arima_model(self, data):
        try:
            df = pd.DataFrame(data) if not isinstance(data, pd.DataFrame) else data
            return self.stat_tools.build_arima_model(df)
        except Exception as e:
            return {'status': 'error', 'message': str(e)}

    async def generate_ensemble_forecast(self, data):
        try:
            df = pd.DataFrame(data) if not isinstance(data, pd.DataFrame) else data
            # Use a shorter period for demo
            return self.stat_tools.ensemble_forecast(df, periods=4)
        except Exception as e:
            return {'status': 'error', 'message': str(e)}

# --- INITIALIZE COMPONENTS ---
stat_tools = StatisticalTools()

# Load Secrets
try:
    # Imported inside the try block so the notebook also runs outside Kaggle
    from kaggle_secrets import UserSecretsClient
    user_secrets = UserSecretsClient()
    bea_api_key = user_secrets.get_secret("BEA_API_KEY")
    google_api_key = user_secrets.get_secret("GOOGLE_API_KEY")
    print("✅ API keys loaded from Kaggle Secrets")
except Exception:
    # Catch Exception rather than using a bare except, so Ctrl-C still interrupts
    bea_api_key = os.getenv('BEA_API_KEY')
    google_api_key = os.getenv('GOOGLE_API_KEY')
    print("⚠️ Secrets not found, checking env vars")

model = None
if google_api_key:
    model = Gemini(model="gemini-2.0-flash-exp", api_key=google_api_key)

# Initialize Agent (works even without model for math tools)
forecasting_agent = ForecastingSpecialistAgent(model)
print("🤖 Forecasting Specialist Agent initialized")

✅ API keys loaded from Kaggle Secrets
🤖 Forecasting Specialist Agent initialized


## 2. Load and Prepare Forecasting Data

In [3]:
# --- DATA CLEANING HELPER (Updated with 30-Year Filter) ---
def clean_economic_data(df):
    """Clean and transform economic data for analysis"""
    clean_df = df.copy()

    # 1. FILTER: Keep only the main 'Gross domestic product' line (LineNumber 1)
    if 'LineNumber' in clean_df.columns:
        clean_df = clean_df[clean_df['LineNumber'].astype(str) == '1']

    # 2. Handle string numbers with commas
    if 'DataValue' in clean_df.columns:
        clean_df['DataValue'] = clean_df['DataValue'].astype(str).str.replace(',', '', regex=False)
        clean_df['DataValue'] = pd.to_numeric(clean_df['DataValue'], errors='coerce')

    # 3. Handle Quarter Dates (e.g., 1947Q1)
    if 'TimePeriod' in clean_df.columns:
        clean_df['TimePeriod'] = clean_df['TimePeriod'].astype(str)
        clean_df['TimePeriod'] = (clean_df['TimePeriod']
                                 .str.replace('Q1', '-01-01')
                                 .str.replace('Q2', '-04-01')
                                 .str.replace('Q3', '-07-01')
                                 .str.replace('Q4', '-10-01'))
        clean_df['TimePeriod'] = pd.to_datetime(clean_df['TimePeriod'], errors='coerce')

    # 4. FILTER: Limit to Last 30 Years
    if 'TimePeriod' in clean_df.columns:
        # Calculate cutoff date (Today - 30 years)
        cutoff_date = pd.Timestamp.now() - pd.DateOffset(years=30)
        clean_df = clean_df[clean_df['TimePeriod'] >= cutoff_date]
        print(f"   Date Filter Applied: {cutoff_date.date()} to Present")

    clean_df = clean_df.dropna(subset=['DataValue', 'TimePeriod']).sort_values('TimePeriod')
    return clean_df

# --- FETCH REAL DATA ---
print("📥 Fetching REAL GDP Data for Forecasting...")
try:
    if not bea_api_key:
        raise ValueError("Missing BEA API Key")
        
    bea_client = BEAClient(bea_api_key)
    raw_gdp = bea_client.get_gdp_data()
    
    if raw_gdp.get('BEAAPI', {}).get('Results', {}).get('Data'):
        gdp_df = pd.DataFrame(raw_gdp['BEAAPI']['Results']['Data'])
        forecast_data = clean_economic_data(gdp_df)
        forecast_data['Series'] = 'GDP'
        print(f"✅ Successfully loaded {len(forecast_data)} Real GDP records (Last 30 Years)")
    else:
        raise ValueError("No data returned from API")

except Exception as e:
    print(f"❌ Failed to fetch real data: {e}")
    print("⚠️ Falling back to synthetic data...")
    # Fallback generator: quarter-start dates ('QS') match the cleaned BEA TimePeriod values
    dates = pd.date_range(end=pd.Timestamp.now().normalize(), periods=120, freq='QS')  # 30 years * 4 quarters
    forecast_data = pd.DataFrame({
        'TimePeriod': dates,
        'DataValue': np.linspace(100, 200, len(dates)) + np.random.normal(0, 2, len(dates)),
        'Series': 'GDP_Synthetic'
    })

# Split into training and test sets (80/20 split)
split_point = int(len(forecast_data) * 0.8)
train_data = forecast_data.iloc[:split_point]
test_data = forecast_data.iloc[split_point:]

print(f"\n📈 Data split for model evaluation:")
print(f"   Training: {len(train_data)} quarters")
print(f"   Testing:  {len(test_data)} quarters")

forecast_data.tail()

INFO:src.tools.bea_client:Requesting BEA Table: T10105


📥 Fetching REAL GDP Data for Forecasting...
   Date Filter Applied: 1995-11-29 to Present
✅ Successfully loaded 118 Real GDP records (Last 30 Years)

📈 Data split for model evaluation:
   Training: 94 quarters
   Testing:  24 quarters


| | TableName | SeriesCode | LineNumber | LineDescription | TimePeriod | METRIC_NAME | CL_UNIT | UNIT_MULT | DataValue | NoteRef | Series |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 309 | T10105 | A191RC | 1 | Gross domestic product | 2024-04-01 | Current Dollars | Level | 6 | 29147044 | T10105 | GDP |
| 310 | T10105 | A191RC | 1 | Gross domestic product | 2024-07-01 | Current Dollars | Level | 6 | 29511664 | T10105 | GDP |
| 311 | T10105 | A191RC | 1 | Gross domestic product | 2024-10-01 | Current Dollars | Level | 6 | 29825182 | T10105 | GDP |
| 312 | T10105 | A191RC | 1 | Gross domestic product | 2025-01-01 | Current Dollars | Level | 6 | 30042113 | T10105 | GDP |
| 313 | T10105 | A191RC | 1 | Gross domestic product | 2025-04-01 | Current Dollars | Level | 6 | 30485729 | T10105 | GDP |
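
The chained `str.replace` calls in `clean_economic_data` map each quarter label to the first day of that quarter. A more compact alternative, sketched here on made-up labels and assuming `TimePeriod` always has the `YYYYQn` form, is to let pandas parse the labels as quarterly periods:

```python
import pandas as pd

# Illustrative labels in the 'YYYYQn' form (typed by hand, not fetched from the API)
labels = ["2024Q4", "2025Q1", "2025Q2"]

# Parse as quarterly periods, then convert each period to its start timestamp
quarter_starts = pd.PeriodIndex(labels, freq="Q").to_timestamp()

print(list(quarter_starts.strftime("%Y-%m-%d")))
```

The resulting dates follow the same quarter-start convention as the `TimePeriod` column in the table above.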


In [4]:
# Visualize the complete dataset
fig = go.Figure()

# Training data
fig.add_trace(go.Scatter(
    x=train_data['TimePeriod'], y=train_data['DataValue'],
    mode='lines', name='Training Data',
    line=dict(color='#1f77b4', width=3)
))

# Test data
fig.add_trace(go.Scatter(
    x=test_data['TimePeriod'], y=test_data['DataValue'],
    mode='lines', name='Test Data',
    line=dict(color='#ff7f0e', width=3)
))

# Split point
split_date = train_data['TimePeriod'].max()

# --- FIX: Convert Timestamp to numeric (milliseconds) ---
# This bypasses the Pandas arithmetic error
split_date_numeric = split_date.timestamp() * 1000

fig.add_vline(x=split_date_numeric, line_dash="dash", line_color="red",
              annotation_text="Train/Test Split", annotation_position="top left")

fig.update_layout(
    title='Economic Data for Forecasting (Train/Test Split)',
    xaxis_title='Time Period',
    yaxis_title='GDP Value',
    height=500,
    showlegend=True
)

fig.show()

## 3. ARIMA Forecasting

In [5]:
# Build and evaluate ARIMA model
print("🔮 Developing ARIMA Forecasting Model...")

# First, build the ARIMA model on training data
arima_model = stat_tools.build_arima_model(train_data, auto_select=True, max_order=3)

if arima_model['status'] == 'success':
    print("✅ ARIMA Model Built Successfully:")
    print(f"   Best Order: {arima_model.get('best_order', 'N/A')}")
    
    # Safely get summary dictionary
    summary = arima_model.get('summary', {})
    print(f"   AIC: {summary.get('aic', 0):.2f}")
    print(f"   BIC: {summary.get('bic', 0):.2f}")
    print(f"   Log-Likelihood: {summary.get('log_likelihood', 0):.2f}")

    # Show key parameters
    print(f"\n📊 Model Parameters:")
    params = arima_model.get('parameters', {})
    for param, value in list(params.items())[:5]:  # Show first 5 parameters
        print(f"   {param}: {value:.4f}")

    # Model accuracy
    accuracy = arima_model.get('forecast_accuracy', {})
    print(f"\n🎯 Training Accuracy:")

    # Format a metric only when it is numeric; print 'N/A' otherwise
    # (avoids a format error when the tool returns None or 'N/A')
    metric_formats = [
        ('MAE', 'mae', '{:.2f}'),
        ('RMSE', 'rmse', '{:.2f}'),
        ('MAPE', 'mape', '{:.2f}%'),
        ('R²', 'r2', '{:.3f}'),
    ]
    for label, key, fmt in metric_formats:
        value = accuracy.get(key)
        text = fmt.format(value) if isinstance(value, (int, float)) else 'N/A'
        print(f"   {label}: {text}")

else:
    print("❌ ARIMA model building failed")
    print(arima_model)

🔮 Developing ARIMA Forecasting Model...
✅ ARIMA Model Built Successfully:
   Best Order: (1, 1, 1)
   AIC: 2399.28
   BIC: 2406.88
   Log-Likelihood: -1196.64

📊 Model Parameters:
   ar.L1: 1.0000
   ma.L1: -0.9989
   sigma2: 7747266695.8774

🎯 Training Accuracy:
   MAE: 149965.76
   RMSE: 816867.57
   MAPE: N/A
   R²: N/A
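
The AIC and BIC printed above can be reproduced from the log-likelihood, which is a quick sanity check on the fit. The sketch below assumes k = 3 estimated parameters (ar.L1, ma.L1, sigma2) and n = 93 effective observations (94 training quarters minus one lost to differencing). Both counts are inferred from the output rather than read from `StatisticalTools`.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2*logL."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k*ln(n) - 2*logL."""
    return k * math.log(n) - 2 * log_likelihood

log_likelihood = -1196.64  # copied from the output above
k, n = 3, 93               # assumed parameter count and effective sample size

print(f"AIC: {aic(log_likelihood, k):.2f}")
print(f"BIC: {bic(log_likelihood, k, n):.2f}")
```

Both values match the printed summary to two decimals, which supports the assumed counts.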


In [6]:
# Generate forecasts using ARIMA
print("\n📈 Generating ARIMA Forecasts...")
arima_forecast = stat_tools.forecast_arima(train_data, periods=len(test_data))

if arima_forecast['status'] == 'success':
    print("✅ ARIMA Forecasts Generated:")
    print(f"   Forecast Periods: {len(arima_forecast['forecasts'])}")

    # Show first few forecasts
    print(f"\n🔮 Sample Forecasts:")
    for forecast in arima_forecast['forecasts'][:3]:
        print(f"   Period {forecast['period_ahead']}: {forecast['point_forecast']:.2f} "
              f"(CI: {forecast['confidence_lower']:.2f} - {forecast['confidence_upper']:.2f})")

    # Calculate test accuracy
    actual_values = test_data['DataValue'].values
    predicted_values = [f['point_forecast'] for f in arima_forecast['forecasts']]

    from sklearn.metrics import mean_absolute_error, mean_squared_error
    test_mae = mean_absolute_error(actual_values, predicted_values)
    test_rmse = np.sqrt(mean_squared_error(actual_values, predicted_values))

    print(f"\n🎯 Test Set Accuracy:")
    print(f"   MAE: {test_mae:.2f}")
    print(f"   RMSE: {test_rmse:.2f}")
else:
    print("❌ ARIMA forecasting failed")
    print(arima_forecast)


📈 Generating ARIMA Forecasts...
✅ ARIMA Forecasts Generated:
   Forecast Periods: 24

🔮 Sample Forecasts:
   Period 1: 21543053.05 (CI: 21370041.51 - 21716064.59)
   Period 2: 21688163.78 (CI: 21442661.18 - 21933666.38)
   Period 3: 21833270.20 (CI: 21531582.24 - 22134958.16)

🎯 Test Set Accuracy:
   MAE: 2732021.25
   RMSE: 3326818.86
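
The three sample intervals above widen with the horizon. As a worked check using only the printed numbers, the half-widths grow roughly in proportion to the square root of the number of periods ahead, the pattern expected when forecast errors accumulate like a random walk. This is an observation about these three rows, not a statement about how `forecast_arima` computes its intervals.

```python
import math

# (period_ahead, confidence_lower, confidence_upper) copied from the output above
sample_intervals = [
    (1, 21370041.51, 21716064.59),
    (2, 21442661.18, 21933666.38),
    (3, 21531582.24, 22134958.16),
]

# Half-width of each interval, keyed by horizon
half_widths = {h: (upper - lower) / 2 for h, lower, upper in sample_intervals}
base = half_widths[1]

# Ratio of each half-width to the one-step half-width, next to sqrt(h)
ratios = {h: width / base for h, width in half_widths.items()}
for h, ratio in ratios.items():
    print(f"h={h}: half-width ratio {ratio:.3f} vs sqrt(h) {math.sqrt(h):.3f}")
```

With 24 test quarters, intervals that keep widening at this rate become very broad toward the end of the horizon, which is worth keeping in mind when reading the test-set errors.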


In [7]:
# Visualize ARIMA forecasts
if arima_forecast['status'] == 'success':
    fig = go.Figure()

    # Training data
    fig.add_trace(go.Scatter(
        x=train_data['TimePeriod'], y=train_data['DataValue'],
        mode='lines', name='Training Data',
        line=dict(color='#1f77b4', width=3)
    ))

    # Test data (actual)
    fig.add_trace(go.Scatter(
        x=test_data['TimePeriod'], y=test_data['DataValue'],
        mode='lines', name='Actual Test Data',
        line=dict(color='#2ca02c', width=3)
    ))

    # Forecasts
    # NOTE: Fix date handling for real data
    forecast_dates = pd.to_datetime([f['period'] for f in arima_forecast['forecasts']])
    forecast_values = [f['point_forecast'] for f in arima_forecast['forecasts']]
    confidence_lower = [f['confidence_lower'] for f in arima_forecast['forecasts']]
    confidence_upper = [f['confidence_upper'] for f in arima_forecast['forecasts']]

    fig.add_trace(go.Scatter(
        x=forecast_dates, y=forecast_values,
        mode='lines', name='ARIMA Forecast',
        line=dict(color='#ff7f0e', width=3, dash='dash')
    ))

    # Confidence interval
    fig.add_trace(go.Scatter(
        x=pd.concat([pd.Series(forecast_dates), pd.Series(forecast_dates)[::-1]]),
        y=pd.concat([pd.Series(confidence_upper), pd.Series(confidence_lower)[::-1]]),
        fill='toself',
        fillcolor='rgba(255, 127, 14, 0.2)',
        line=dict(color='rgba(255,255,255,0)'),
        name='95% Confidence Interval'
    ))

    fig.update_layout(
        title='ARIMA Model Forecast vs Actual',
        xaxis_title='Time Period',
        yaxis_title='GDP Value',
        height=500,
        showlegend=True
    )

    fig.show()
else:
    print("❌ Cannot visualize forecasts - ARIMA forecasting failed")

## 4. Ensemble Forecasting

In [8]:
# Generate ensemble forecast combining multiple methods
print("🔄 Developing Ensemble Forecast...")
ensemble_forecast = stat_tools.ensemble_forecast(train_data, periods=len(test_data))

if ensemble_forecast['status'] == 'success':
    print("✅ Ensemble Forecast Generated:")
    print(f"   Methods used: {', '.join(ensemble_forecast['methods_used'])}")

    # Show model weights
    print(f"\n⚖️ Model Weights:")
    for method, weight in ensemble_forecast['weights'].items():
        print(f"   {method}: {weight:.3f}")

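    # Calculate ensemble accuracy. The metric imports and actual values are
    # defined again here so this cell does not depend on the ARIMA success branch in cell 6.
    from sklearn.metrics import mean_absolute_error, mean_squared_error
    actual_values = test_data['DataValue'].values
    ensemble_predictions = ensemble_forecast['ensemble_forecast']
    ensemble_mae = mean_absolute_error(actual_values, ensemble_predictions)
    ensemble_rmse = np.sqrt(mean_squared_error(actual_values, ensemble_predictions))

    print(f"\n🎯 Ensemble Test Accuracy:")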
    print(f"   MAE: {ensemble_mae:.2f}")
    print(f"   RMSE: {ensemble_rmse:.2f}")

    # Compare with the ARIMA-only forecast (negative values mean the ensemble did worse)
    if arima_forecast['status'] == 'success':
        improvement_mae = ((test_mae - ensemble_mae) / test_mae) * 100
        improvement_rmse = ((test_rmse - ensemble_rmse) / test_rmse) * 100

        print(f"\n📊 Improvement vs ARIMA:")
        print(f"   MAE improvement: {improvement_mae:.1f}%")
        print(f"   RMSE improvement: {improvement_rmse:.1f}%")
else:
    print("❌ Ensemble forecasting failed")
    print(ensemble_forecast)

🔄 Developing Ensemble Forecast...
✅ Ensemble Forecast Generated:
   Methods used: ARIMA, Moving Average, Naive

⚖️ Model Weights:
   ARIMA: 0.500
   Moving Average: 0.300
   Naive: 0.200

🎯 Ensemble Test Accuracy:
   MAE: 3651612.59
   RMSE: 4416973.08

📊 Improvement vs ARIMA:
   MAE improvement: -33.7%
   RMSE improvement: -32.8%
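
The weights reported above (0.5, 0.3, 0.2) suggest a weighted average of the component forecasts. A minimal sketch of that kind of combination follows. The forecast numbers are toy values and `combine_forecasts` is a hypothetical helper, since the internals of `ensemble_forecast` are not shown in this notebook.

```python
import numpy as np

def combine_forecasts(forecasts, weights):
    """Hypothetical helper: weighted average of per-method forecast arrays."""
    total = sum(weights[name] for name in forecasts)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    combined = sum(weights[name] * np.asarray(values, dtype=float)
                   for name, values in forecasts.items())
    return combined / total  # normalise in case the weights do not sum to 1

# Toy two-step forecasts (illustrative numbers only)
toy_forecasts = {
    "ARIMA": [110.0, 112.0],
    "Moving Average": [100.0, 100.0],
    "Naive": [105.0, 105.0],
}
weights = {"ARIMA": 0.5, "Moving Average": 0.3, "Naive": 0.2}

ensemble = combine_forecasts(toy_forecasts, weights)
print(ensemble)
```

If the moving-average and naive components are flat, as the baselines in Section 5 are, half of the weight pulls the blend toward recent levels. That would be consistent with the ensemble trailing ARIMA on a trending series.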


In [9]:
# Visualize ensemble forecast comparison
if (arima_forecast['status'] == 'success' and
    ensemble_forecast['status'] == 'success'):

    fig = go.Figure()

    # Training data
    fig.add_trace(go.Scatter(
        x=train_data['TimePeriod'], y=train_data['DataValue'],
        mode='lines', name='Training Data',
        line=dict(color='#1f77b4', width=2),
        opacity=0.7
    ))

    # Test data (actual)
    fig.add_trace(go.Scatter(
        x=test_data['TimePeriod'], y=test_data['DataValue'],
        mode='lines', name='Actual',
        line=dict(color='#2ca02c', width=4)
    ))

    # ARIMA forecast
    fig.add_trace(go.Scatter(
        x=forecast_dates, y=forecast_values,
        mode='lines', name='ARIMA Forecast',
        line=dict(color='#ff7f0e', width=3, dash='dash')
    ))

    # Ensemble forecast
    fig.add_trace(go.Scatter(
        x=forecast_dates, y=ensemble_predictions,
        mode='lines', name='Ensemble Forecast',
        line=dict(color='#d62728', width=3, dash='dot')
    ))

    fig.update_layout(
        title='Forecast Method Comparison: ARIMA vs Ensemble',
        xaxis_title='Time Period',
        yaxis_title='GDP Value',
        height=500,
        showlegend=True
    )

    fig.show()
else:
    print("❌ Cannot visualize forecast comparison")

## 5. Model Performance Evaluation

In [10]:
# Comprehensive model evaluation
def evaluate_forecast_models(actual, predictions_dict):
    """Evaluate multiple forecasting models"""

    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    results = {}

    for model_name, predictions in predictions_dict.items():
        if len(predictions) == len(actual):
            mae = mean_absolute_error(actual, predictions)
            rmse = np.sqrt(mean_squared_error(actual, predictions))
            mape = np.mean(np.abs((actual - predictions) / actual)) * 100
            r2 = r2_score(actual, predictions)

            results[model_name] = {
                'MAE': mae,
                'RMSE': rmse,
                'MAPE': mape,
                'R2': r2
            }

    return results

# Collect predictions from all models
all_predictions = {}

if arima_forecast['status'] == 'success':
    all_predictions['ARIMA'] = [f['point_forecast'] for f in arima_forecast['forecasts']]

if ensemble_forecast['status'] == 'success':
    all_predictions['Ensemble'] = ensemble_forecast['ensemble_forecast']

# Add simple baseline models
# 1. Naive forecast (last value)
naive_forecast = [train_data['DataValue'].iloc[-1]] * len(test_data)
all_predictions['Naive'] = naive_forecast

# 2. Moving average
ma_forecast = [train_data['DataValue'].tail(4).mean()] * len(test_data)
all_predictions['Moving Average'] = ma_forecast

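# Evaluate all models
if all_predictions:
    # Recompute the actuals here so this cell does not rely on the ARIMA branch in cell 6
    actual_values = test_data['DataValue'].values
    evaluation_results = evaluate_forecast_models(actual_values, all_predictions)

    print("📊 FORECAST MODEL COMPARISON")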
    print("=" * 60)

    # Create comparison table
    comparison_data = []
    for model_name, metrics in evaluation_results.items():
        comparison_data.append([
            model_name,
            f"{metrics['MAE']:.2f}",
            f"{metrics['RMSE']:.2f}",
            f"{metrics['MAPE']:.2f}%",
            f"{metrics['R2']:.3f}"
        ])

    comparison_df = pd.DataFrame(comparison_data,
                                columns=['Model', 'MAE', 'RMSE', 'MAPE', 'R²'])

    print(comparison_df.to_string(index=False))

    # Identify best model (lowest test RMSE)
    best_model = min(evaluation_results.items(), key=lambda x: x[1]['RMSE'])
    print(f"\n🏆 Best Model: {best_model[0]} (RMSE: {best_model[1]['RMSE']:.2f})")
else:
    print("❌ No predictions available for evaluation")

📊 FORECAST MODEL COMPARISON
============================================================
         Model        MAE       RMSE   MAPE     R²
         ARIMA 2732021.25 3326818.86  9.91% -0.105
      Ensemble 3651612.59 4416973.08 13.23% -0.948
         Naive 4440304.00 5355437.32 16.08% -1.864
Moving Average 4753257.90 5634467.69 17.29% -2.170

🏆 Best Model: ARIMA (RMSE: 3326818.86)
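
Every R² in the table is negative, which means each forecast has a larger squared error than simply predicting the test-set mean. A small worked example on toy numbers shows how a flat forecast that sits below a rising series produces this:

```python
import numpy as np

def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)      # error of the forecast
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # error of predicting the mean
    return 1.0 - ss_res / ss_tot

actual = [10.0, 12.0, 14.0]        # toy rising series
flat_forecast = [8.0, 8.0, 8.0]    # flat forecast stuck below it

print(r2(actual, flat_forecast))   # SS_res = 56, SS_tot = 8, so R² = 1 - 7 = -6
```

For long-horizon forecasts of a trending series, MAPE is often the easier number to interpret; in the table above ARIMA's MAPE is 9.91%.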


In [11]:
# Visualize model performance comparison
if evaluation_results:
    models = list(evaluation_results.keys())
    rmse_values = [evaluation_results[model]['RMSE'] for model in models]
    mae_values = [evaluation_results[model]['MAE'] for model in models]

    fig = make_subplots(rows=1, cols=2,
                       subplot_titles=('RMSE Comparison', 'MAE Comparison'))

    # RMSE plot
    fig.add_trace(
        go.Bar(x=models, y=rmse_values, name='RMSE',
               marker_color=['#1f77b4' if model != best_model[0] else '#ff7f0e'
                           for model in models]),
        row=1, col=1
    )

    # MAE plot
    fig.add_trace(
        go.Bar(x=models, y=mae_values, name='MAE',
               marker_color=['#1f77b4' if model != best_model[0] else '#ff7f0e'
                           for model in models]),
        row=1, col=2
    )

    fig.update_layout(
        title_text="Forecast Model Performance Comparison",
        showlegend=False,
        height=400
    )

    fig.show()
else:
    print("❌ No evaluation results to visualize")

## 6. Using Forecasting Specialist Agent

In [12]:
# Test the Forecasting Specialist Agent
async def test_forecasting_agent():
    """Test the forecasting specialist agent's capabilities"""

    print("🧪 Testing Forecasting Specialist Agent...")

    # Convert data to list of dictionaries for the agent
    train_data_dict = train_data.to_dict('records')

    # Test GDP forecasting
    print("\n🔸 Testing GDP forecasting...")
    forecast_result = await forecasting_agent.forecast_gdp(train_data_dict, horizon=8)
    if forecast_result['status'] == 'success':
        print(f"   Forecast horizon: {forecast_result.get('horizon', 'N/A')} periods")
        print(f"   Next quarter prediction: {forecast_result.get('next_quarter_prediction', 'N/A')}")
        print(f"   Confidence: {forecast_result.get('confidence', 0):.2f}")

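    # Test ARIMA model building
    print("\n🔸 Testing ARIMA model building...")
    arima_result = await forecasting_agent.build_arima_model(train_data_dict)
    if arima_result['status'] == 'success':
        # Format AIC only when numeric; a string default such as 'N/A' would crash under :.2f
        aic = arima_result.get('summary', {}).get('aic')
        aic_text = f"{aic:.2f}" if isinstance(aic, (int, float)) else 'N/A'
        print(f"   AIC: {aic_text}")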
        print(f"   Model built successfully")

    # Test ensemble forecasting
    print("\n🔸 Testing ensemble forecasting...")
    ensemble_result = await forecasting_agent.generate_ensemble_forecast(train_data_dict)
    if ensemble_result['status'] == 'success':
        print(f"   Model weights: {ensemble_result.get('weights', {})}")
        print(f"   Combined prediction calculated")

# Run the agent tests
await test_forecasting_agent()

🧪 Testing Forecasting Specialist Agent...

🔸 Testing GDP forecasting...
   Forecast horizon: 8 periods
   Next quarter prediction: 21,543,053.05
   Confidence: 0.95

🔸 Testing ARIMA model building...
   AIC: 2399.28
   Model built successfully

🔸 Testing ensemble forecasting...
   Model weights: {'ARIMA': 0.5, 'Moving Average': 0.3, 'Naive': 0.2}
   Combined prediction calculated
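
The bare `await test_forecasting_agent()` works because Jupyter and Kaggle notebooks already run an event loop. In a plain Python script the same coroutine would need `asyncio.run`. A minimal sketch with a stand-in coroutine (`fake_forecast` is hypothetical and only mimics the shape of the agent's return value):

```python
import asyncio

async def fake_forecast(horizon=4):
    """Stand-in for an agent method; returns a dict shaped like forecast_gdp's result."""
    await asyncio.sleep(0)  # yield control once, as a real async call would
    return {"status": "success", "horizon": horizon}

def main():
    # asyncio.run creates an event loop, runs the coroutine, and closes the loop
    return asyncio.run(fake_forecast(horizon=8))

if __name__ == "__main__":
    print(main())
```

Inside a notebook cell, `asyncio.run` raises a `RuntimeError` because a loop is already running, so the two styles are not interchangeable.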


## 7. Future Forecast Generation

In [13]:
# Generate future forecasts from the full dataset
print("🔮 Generating Future Forecasts...")

future_periods = 12  # Forecast 3 years into the future

# Use ensemble method for future forecasts.
# Note: ARIMA scored the lowest test RMSE in Section 5, so
# stat_tools.forecast_arima(forecast_data, periods=future_periods) is a reasonable alternative.
future_forecast = stat_tools.ensemble_forecast(forecast_data, periods=future_periods)

if future_forecast['status'] == 'success':
    print("✅ Future Forecasts Generated")

    # Create future dates. 'QS' (quarter start) matches the TimePeriod convention of the
    # historical data; the old 'Q' (quarter end) alias labelled each forecast about
    # three months late.
    last_date = forecast_data['TimePeriod'].max()
    future_dates = pd.date_range(start=last_date + pd.DateOffset(months=3),
                                periods=future_periods, freq='QS')

    print(f"\n📅 Forecast Period: {future_dates[0].strftime('%Y-%m')} to {future_dates[-1].strftime('%Y-%m')}")

    # Display key forecasts
    print(f"\n🔮 Key Future Predictions:")
    for date, prediction in zip(future_dates[:6], future_forecast['ensemble_forecast'][:6]):
        print(f"   {date.strftime('%Y-%m')}: {prediction:.2f}")

    # Calculate growth projections (index 3 = 4 quarters ahead, index 7 = 8 quarters ahead)
    current_value = forecast_data['DataValue'].iloc[-1]
    one_year_growth = ((future_forecast['ensemble_forecast'][3] - current_value) / current_value) * 100
    two_year_growth = ((future_forecast['ensemble_forecast'][7] - current_value) / current_value) * 100

    print(f"\n📈 Projected Growth:")
    print(f"   1-year growth: {one_year_growth:.2f}%")
    print(f"   2-year growth: {two_year_growth:.2f}%")
else:
    print("❌ Future forecasting failed")
    print(future_forecast)

🔮 Generating Future Forecasts...
✅ Future Forecasts Generated

📅 Forecast Period: 2025-07 to 2028-04

🔮 Key Future Predictions:
   2025-07: 30425757.56
   2025-10: 30521651.47
   2026-01: 30617543.65
   2026-04: 30713434.09
   2026-07: 30809322.79
   2026-10: 30905209.75

📈 Projected Growth:
   1-year growth: 0.75%
   2-year growth: 2.01%
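
The 1-year figure above can be checked by hand from values already printed: the latest observation in the Section 2 data table (30,485,729) and the fourth entry in the list of future predictions (30,713,434.09). A small sketch of the arithmetic, with `pct_growth` as an illustrative helper name:

```python
def pct_growth(current, future):
    """Percentage change from current to future."""
    return (future - current) / current * 100

current_value = 30485729.0         # last DataValue in the Section 2 table
four_quarters_ahead = 30713434.09  # 4th entry of the printed ensemble forecast

print(f"1-year growth: {pct_growth(current_value, four_quarters_ahead):.2f}%")
```

The first forecast (30,425,757.56) sits below the latest observation, so the projected path starts with a small dip before rising. `UNIT_MULT` is 6 in the table, so these values are in millions of dollars.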


In [14]:
# Visualize historical data with future forecasts
if future_forecast['status'] == 'success':
    fig = go.Figure()

    # Historical data
    fig.add_trace(go.Scatter(
        x=forecast_data['TimePeriod'], y=forecast_data['DataValue'],
        mode='lines', name='Historical Data',
        line=dict(color='#1f77b4', width=3)
    ))

    # Future forecasts
    fig.add_trace(go.Scatter(
        x=future_dates, y=future_forecast['ensemble_forecast'],
        mode='lines+markers', name='Future Forecast',
        line=dict(color='#ff7f0e', width=3, dash='dash'),
        marker=dict(size=6)
    ))

    # Current point
    fig.add_trace(go.Scatter(
        x=[last_date], y=[current_value],
        mode='markers', name='Current',
        marker=dict(color='red', size=10, symbol='star')
    ))

    fig.update_layout(
        title='Economic Forecast: Historical Data and Future Projections',
        xaxis_title='Time Period',
        yaxis_title='GDP Value',
        height=500,
        showlegend=True
    )

    fig.show()
else:
    print("❌ Cannot visualize future forecasts")

## 8. Summary and Next Steps

In [15]:
print("🎯 NOTEBOOK 3 SUMMARY")
print("=" * 50)

# Summary of forecasting activities
forecasting_activities = []
if arima_forecast['status'] == 'success':
    forecasting_activities.append("ARIMA Modeling")
if ensemble_forecast['status'] == 'success':
    forecasting_activities.append("Ensemble Forecasting")
if future_forecast['status'] == 'success':
    forecasting_activities.append("Future Projections")

print(f"✅ Forecasting activities completed: {len(forecasting_activities)}")
for activity in forecasting_activities:
    print(f"   • {activity}")

print(f"\n📊 Models evaluated: {len(evaluation_results) if 'evaluation_results' in locals() else 0}")
if 'best_model' in locals():
    print(f"🏆 Best performing model: {best_model[0]}")

print(f"\n🔮 Future forecasts: {future_periods} quarters ({future_periods//4} years)")

print("\n🔜 Next Steps:")
print("   1. Proceed to Notebook 4: Multi-Agent System Demo")
print("   2. Implement real-time forecasting with live data")
print("   3. Add more sophisticated models (Prophet, LSTM)")
print("   4. Implement forecast uncertainty quantification")

print("\n💡 Production Insights:")
print("   - Ensembles often outperform single models, but not here: ARIMA beat the ensemble on the test set")
print("   - Regular model retraining improves forecast accuracy")
print("   - Confidence intervals provide crucial context for decisions")
print("   - The Forecasting Specialist Agent automates complex modeling tasks")

🎯 NOTEBOOK 3 SUMMARY
==================================================
✅ Forecasting activities completed: 3
   • ARIMA Modeling
   • Ensemble Forecasting
   • Future Projections

📊 Models evaluated: 4
🏆 Best performing model: ARIMA

🔮 Future forecasts: 12 quarters (3 years)

🔜 Next Steps:
   1. Proceed to Notebook 4: Multi-Agent System Demo
   2. Implement real-time forecasting with live data
   3. Add more sophisticated models (Prophet, LSTM)
   4. Implement forecast uncertainty quantification

💡 Production Insights:
   - Ensembles often outperform single models, but not here: ARIMA beat the ensemble on the test set
   - Regular model retraining improves forecast accuracy
   - Confidence intervals provide crucial context for decisions
   - The Forecasting Specialist Agent automates complex modeling tasks
