# <center><strong>Welcome to Data Gyani</strong></center>

## Different Forecasting Methods using Facebook's Prophet Model

### What are we learning?

<table border="1" cellpadding="10">
  <thead>
    <tr>
      <th>Forecasting Method</th>
      <th>How It Works</th>
      <th>Best For</th>
      <th>Shortcomings</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Simple Forecasting</td>
      <td>
        Model is trained once using historical data and predicts multiple future steps without retraining.
      </td>
      <td>
        Short-term predictions; stationary data; short forecasting horizons.
      </td>
      <td>
        Accuracy degrades as the forecast horizon grows; the model is never updated with new observations.
      </td>
    </tr>
    <tr>
      <td>Recursive Forecasting</td>
      <td>
        Model predicts one time step ahead, and the prediction is fed back to predict subsequent steps in a recursive loop.
      </td>
      <td>
        Slow-moving trends; consistent relationships between data points over time.
      </td>
      <td>
        Error accumulation due to compounding prediction errors; poor performance over long horizons.
      </td>
    </tr>
    <tr>
      <td>Direct-Recursive Hybrid Forecasting</td>
      <td>
        A mix of direct forecasting for the first few steps and recursive forecasting for remaining steps.
      </td>
      <td>
        Medium to long-term forecasting, balancing trend capture and extension of forecasts.
      </td>
      <td>
        Reduces but does not eliminate error accumulation; increased model complexity.
      </td>
    </tr>
    <tr>
      <td>Rolling Window Forecasting</td>
      <td>
        Model is trained on a fixed-size window of recent data; window shifts as new data becomes available.
      </td>
      <td>
        Non-stationary data; situations with concept drift; when recent data is more relevant.
      </td>
      <td>
        High computational cost due to frequent retraining; window size tuning is critical.
      </td>
    </tr>
  </tbody>
</table>


In [2]:
# !pip install yfinance
# !pip install prophet

In [3]:
import yfinance as yf
import pandas as pd
import numpy as np
import datetime
from prophet import Prophet
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
import plotly.graph_objs as go
from plotly.subplots import make_subplots
import warnings

warnings.filterwarnings("ignore")


In [4]:
# Define the stock symbol, start date, and end date
stock_symbol = 'AAPL'  # Apple share prices, used only as an example
start_date = '2010-01-01'
end_date = datetime.datetime.now().date()

# Fetch the stock data
stock_data = yf.download(stock_symbol, start=start_date, end=end_date)

# Display the data
stock_data.head()

[*********************100%***********************]  1 of 1 completed


Date,Open,High,Low,Close,Adj Close,Volume
2010-01-04,7.6225,7.660714,7.585,7.643214,6.454504,493729600
2010-01-05,7.664286,7.699643,7.616071,7.656429,6.465665,601904800
2010-01-06,7.656429,7.686786,7.526786,7.534643,6.362819,552160000
2010-01-07,7.5625,7.571429,7.466071,7.520714,6.351056,477131200
2010-01-08,7.510714,7.571429,7.466429,7.570714,6.393281,447610800


## Simple Prophet Model

#### 1. Process data for Prophet model

In [5]:

stock_data = stock_data.reset_index()  # Reset index to move Date to a column
prophet_data = stock_data[['Date', 'Close']].rename(columns={'Date': 'ds', 'Close': 'y'})  # Format for Prophet
prophet_data.tail()

,ds,y
3689,2024-08-30,229.0
3690,2024-09-03,222.770004
3691,2024-09-04,220.850006
3692,2024-09-05,222.380005
3693,2024-09-06,220.820007


#### 2. Split data into train and validation

In [6]:

validation_size = 30  # Validation period size - Choose as per your requirement
train_data = prophet_data[:-validation_size]  # Training data
validation_data = prophet_data[-validation_size:]  # Validation data

# Print the shape of each split
print("Shape of train_data:", train_data.shape)
print("Shape of validation_data:", validation_data.shape)

Shape of train_data: (3664, 2)
Shape of validation_data: (30, 2)


#### 3. Create Prophet model and train on training set

In [7]:
model = Prophet()
model.fit(train_data)

23:29:54 - cmdstanpy - INFO - Chain [1] start processing
23:29:55 - cmdstanpy - INFO - Chain [1] done processing


<prophet.forecaster.Prophet at 0x1e981ca73d0>

#### 4. Check model fit on training data

In [8]:
# Predict on the training data to check the model fit
train_forecast = model.predict(train_data[['ds']])

# Evaluation metrics
rmse = round(np.sqrt(mean_squared_error(train_data['y'], train_forecast['yhat'])),2)
mae = round(mean_absolute_error(train_data['y'], train_forecast['yhat']),2)
# Note: despite its name, `mape` holds accuracy, i.e. (1 - MAPE) as a percentage
mape = round((1- np.mean(np.abs((train_data['y'] - train_forecast['yhat']) / train_data['y'])))* 100,2)
# Mean Directional Accuracy (MDA): share of days where the actual and fitted moves have the same sign
mda = round((np.mean((np.sign(train_data['y'].diff()) == np.sign(train_forecast['yhat'].diff())).astype(int)) * 100),2)

# Print evaluation metrics
print(f"RMSE of Training: {rmse}")
print(f"MAE of Training: {mae}")
print(f"Accuracy(1-MAPE) of Training: {mape:.2f}%")
print(f"MDA of Training: {mda:.2f}%")

# Create a Plotly figure to visualize the model fit on training data
fig = make_subplots()

# Add actual data trace (training data)
fig.add_trace(go.Scatter(
    x=train_data['ds'], 
    y=train_data['y'], 
    mode='lines', 
    name='Actual (Training)',
    line=dict(color='red')
))

# Add forecasted data trace (training fitment)
fig.add_trace(go.Scatter(
    x=train_forecast['ds'], 
    y=train_forecast['yhat'], 
    mode='lines', 
    name='Forecast (Training Fit)',
    line=dict(color='blue', dash='dash')
))

# Update layout
fig.update_layout(
    title="Prophet Model Fit on Training Data",
    xaxis_title="Date",
    yaxis_title="Close Price",
    legend=dict(x=0.01, y=0.99),
    hovermode="x unified"
)

# Show the plot
fig.show()


RMSE of Training: 7.64
MAE of Training: 4.72
Accuracy(1-MAPE) of Training: 91.04%
MDA of Training: 51.97%
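
The four metrics printed above can also be written as one small helper. This is a minimal sketch in plain Python; the name `forecast_metrics` is hypothetical and not part of the notebook. One deliberate difference, stated as an assumption: the first observation has no previous value, so it is skipped when computing MDA, whereas the NumPy expression above compares two `NaN` values for that row and counts it as a miss.

```python
import math

def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, accuracy (1 - MAPE, in %) and MDA (in %)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    accuracy = (1 - mape) * 100

    def sign(x):
        return (x > 0) - (x < 0)

    # Compare the direction of each day-over-day move, starting at the second point.
    hits = [
        sign(actual[i] - actual[i - 1]) == sign(predicted[i] - predicted[i - 1])
        for i in range(1, n)
    ]
    mda = 100 * sum(hits) / len(hits)
    return {"rmse": rmse, "mae": mae, "accuracy": accuracy, "mda": mda}

# Illustrative values only, not taken from the AAPL data.
metrics = forecast_metrics([100.0, 102.0, 101.0, 105.0], [99.0, 103.0, 104.0, 106.0])
print(metrics)
```

In this toy example the forecast gets two of the three day-over-day directions right, so MDA is about 66.7%.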


#### 5. Forecast for validation period

In [9]:
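# Prophet's default frequency is daily (calendar days), so 60 periods comfortably cover the 30 trading days of validation
future = model.make_future_dataframe(periods=60)
forecast_1 = model.predict(future)

# Note: these are the last 30 calendar days of the 60-day horizon, so they do not line up with the validation dates;
# the next cell aligns forecast and actuals on `ds` instead
forecast_validation_1 = forecast_1[['ds', 'yhat']].iloc[-validation_size:]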


In [10]:
# Align forecast with actual trading days in validation set
forecast_filtered_1 = forecast_1[forecast_1['ds'].isin(validation_data['ds'])]

# Merge with the actual validation data for comparison
validation_data_1 = validation_data.merge(forecast_filtered_1, on='ds', how='left')

# Drop any rows with NaN values
validation_data_1.dropna(inplace=True)
validation_data_1[["ds", "y", "yhat"]].tail(10)


,ds,y,yhat
20,2024-08-23,226.839996,195.936366
21,2024-08-26,227.179993,196.446714
22,2024-08-27,228.029999,196.408473
23,2024-08-28,226.490005,196.430744
24,2024-08-29,229.789993,196.370836
25,2024-08-30,229.0,196.302662
26,2024-09-03,222.770004,196.208799
27,2024-09-04,220.850006,196.094691
28,2024-09-05,222.380005,195.905065
29,2024-09-06,220.820007,195.71587
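
The forecast above is produced on calendar days and then filtered down to the trading days in the validation set. Here is a minimal sketch of that alignment using only the standard library. The helper names `calendar_horizon` and `align_to_trading_days` are hypothetical, and the forecast values are made up for illustration.

```python
from datetime import date, timedelta

def calendar_horizon(last_train_day, last_validation_day):
    """Number of daily periods needed to reach the last validation day."""
    return (last_validation_day - last_train_day).days

def align_to_trading_days(forecast, trading_days):
    """Keep only forecast entries whose date is an actual trading day."""
    wanted = set(trading_days)
    return {day: value for day, value in forecast.items() if day in wanted}

last_train_day = date(2024, 8, 29)                     # a Thursday
trading_days = [date(2024, 8, 30), date(2024, 9, 3)]   # market closed in between

periods = calendar_horizon(last_train_day, trading_days[-1])

# A daily forecast covers every calendar day, weekends and holidays included.
daily_forecast = {
    last_train_day + timedelta(days=k): 200.0 + k for k in range(1, periods + 1)
}
aligned = align_to_trading_days(daily_forecast, trading_days)
print(periods, sorted(aligned))
```

In the notebook, the same idea would replace the fixed `periods=60` with something like `(validation_data['ds'].max() - train_data['ds'].max()).days`, so the horizon is exactly as long as the validation period needs.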


In [11]:
# Evaluation metrics
rmse = round(np.sqrt(mean_squared_error(validation_data_1['y'], validation_data_1['yhat'])), 2)
mae = round(mean_absolute_error(validation_data_1['y'], validation_data_1['yhat']), 2)
# Note: despite its name, `mape` holds accuracy, i.e. (1 - MAPE) as a percentage
mape = round((1 - np.mean(np.abs((validation_data_1['y'] - validation_data_1['yhat']) / validation_data_1['y'])))*100, 2)

# Mean Directional Accuracy (MDA)
mda = round(np.mean((np.sign(validation_data_1['y'].diff()) == np.sign(validation_data_1['yhat'].diff())).astype(int)) * 100, 2)

# Print evaluation metrics
print(f"RMSE of Validation: {rmse}")
print(f"MAE of Validation: {mae}")
print(f"Accuracy(1-MAPE) of Validation: {mape:.2f}%")
print(f"MDA of Validation: {mda:.2f}%")

# Plot the actual vs forecasted values for the validation period
def plot_forecast_vs_actual(validation_data_1):
    fig = go.Figure()

    # Add the actual closing prices to the plot
    fig.add_trace(go.Scatter(x=validation_data_1['ds'], 
                             y=validation_data_1['y'], 
                             mode='lines', 
                             name='Actual',
                             line=dict(color='red')))

    # Add the forecasted values (yhat) to the plot
    fig.add_trace(go.Scatter(x=validation_data_1['ds'], 
                             y=validation_data_1['yhat'], 
                             mode='lines', 
                             name='Forecast',
                             line=dict(color='blue', dash='dash')))

    # Set the layout of the plot
    fig.update_layout(title='Forecast vs Actuals for Validation Period',
                      xaxis_title='Date',
                      yaxis_title='Close Price',
                      legend_title='Legend')

    # Display the plot
    fig.show()

# Call the function to plot the forecast vs actual values
plot_forecast_vs_actual(validation_data_1)

RMSE of Validation: 26.77
MAE of Validation: 26.29
Accuracy(1-MAPE) of Validation: 88.17%
MDA of Validation: 56.67%


## Simple Recursive Method for forecasting using Prophet
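This is a simple and commonly used forecasting technique: the model makes a one-step-ahead forecast, and that forecast is then fed back as an input for the next prediction. Note that a Prophet model without extra regressors predicts from the date alone, so unless the model is refit on the extended data, the fed-back values do not change later predictions. This is why the results below closely match the simple model.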
#### Let's Dive in!!
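
Before the Prophet version, it helps to see the feedback loop on a model that really does depend on the previous value. The sketch below is a toy AR(1) model fitted by least squares in plain Python. It is not Prophet, and the names `fit_ar1` and `recursive_forecast` are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b

def recursive_forecast(series, steps):
    """Forecast `steps` points, feeding each prediction back as the next input."""
    a, b = fit_ar1(series)
    history = list(series)
    forecasts = []
    for _ in range(steps):
        next_value = a + b * history[-1]  # one-step-ahead forecast
        forecasts.append(next_value)
        history.append(next_value)        # the prediction becomes the next input
    return forecasts

toy_series = [1.0, 2.0, 4.0, 8.0, 16.0]
toy_forecasts = recursive_forecast(toy_series, 3)
print(toy_forecasts)
```

Because each step starts from the previous prediction, any error in one step is carried into every later step. That is the error accumulation listed in the table above.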

In [12]:
# Train the Prophet model
def train_prophet(train_df):
    recursive_model = Prophet()
    return recursive_model.fit(train_df)

# Make predictions with recursive technique 
def make_predictions(recursive_model, df, forecast_period):
    # Create a future dataframe of business days (Mon-Fri); note that freq='B' does not skip market holidays
    future = pd.date_range(start=df['ds'].max(), periods=forecast_period + 1, freq='B')[1:]
    future = pd.DataFrame(future, columns=['ds'])
    predictions = pd.DataFrame()  # Initialize as an empty dataframe

    for i in range(forecast_period):
        forecast = recursive_model.predict(future.head(1))  # Only forecast the next point
        predictions = pd.concat([predictions, forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]])

        next_point = pd.DataFrame({
            'ds': [future['ds'].iloc[0]],
            'y': [forecast['yhat'].iloc[0]]
        })
        df = pd.concat([df, next_point]).reset_index(drop=True)
        # Note: recursive_model is not refit here, so the appended value does not influence later predictions

        # Recreate the future dataframe with business days (trading days)
        future = pd.date_range(start=df['ds'].max(), periods=2, freq='B')[1:]
        future = pd.DataFrame(future, columns=['ds'])

    return predictions


In [13]:
# Train Prophet model on training data
recursive_model = train_prophet(train_data)

23:29:58 - cmdstanpy - INFO - Chain [1] start processing
23:29:59 - cmdstanpy - INFO - Chain [1] done processing


In [14]:

# # Predict on training data for model fitment
# train_forecast_2 = recursive_model.predict(train_data[['ds']])

# # Evaluation metrics for training set
# rmse_train = round(np.sqrt(mean_squared_error(train_data['y'], train_forecast_2['yhat'])), 2)
# mae_train = round(mean_absolute_error(train_data['y'], train_forecast_2['yhat']), 2)
# mape_train = round((1 - np.mean(np.abs((train_data['y'] - train_forecast_2['yhat']) / train_data['y'])))*100, 2)
# mda_train = round(np.mean((np.sign(train_data['y'].diff()) == np.sign(train_forecast_2['yhat'].diff())).astype(int)) * 100, 2)

# # Print evaluation metrics for training set
# print(f"RMSE of Training: {rmse_train}")
# print(f"MAE of Training: {mae_train}")
# print(f"Accuracy(1-MAPE) of Training: {mape_train:.2f}%")
# print(f"MDA of Training: {mda_train:.2f}%")

# # Create a Plotly figure to visualize the model fit on training data
# fig = make_subplots()

# # Add actual data trace (training data)
# fig.add_trace(go.Scatter(
#     x=train_data['ds'], 
#     y=train_data['y'], 
#     mode='lines', 
#     name='Actual (Training)',
#     line=dict(color='red')
# ))

# # Add forecasted data trace (training fitment)
# fig.add_trace(go.Scatter(
#     x=train_forecast_2['ds'], 
#     y=train_forecast_2['yhat'], 
#     mode='lines', 
#     name='Forecast (Training Fit)',
#     line=dict(color='blue', dash='dash')
# ))

# # Update layout
# fig.update_layout(
#     title="Prophet Model Fit on Training Data",
#     xaxis_title="Date",
#     yaxis_title="Close Price",
#     legend=dict(x=0.01, y=0.99),
#     hovermode="x unified"
# )

# # Show the plot
# fig.show()

In [15]:
# Make predictions for the validation data using the recursive technique
forecast_period = validation_size  # Using the validation size for the forecast period
forecast_2 = make_predictions(recursive_model, train_data, forecast_period)

In [16]:
# Align forecast with actual trading days in the validation set
forecast_filtered_2 = forecast_2[forecast_2['ds'].isin(validation_data['ds'])]

# Merge forecast with the actual validation data for comparison
validation_data_2 = validation_data.merge(forecast_filtered_2, on='ds', how='left')

# Drop any rows with NaN values, if any
validation_data_2.dropna(inplace=True)

validation_data_2.tail(10)

,ds,y,yhat,yhat_lower,yhat_upper
19,2024-08-22,224.529999,195.870157,186.004518,205.427793
20,2024-08-23,226.839996,195.936366,186.691153,205.702194
21,2024-08-26,227.179993,196.446714,186.859588,206.682606
22,2024-08-27,228.029999,196.408473,186.799718,206.102923
23,2024-08-28,226.490005,196.430744,186.579182,206.681869
24,2024-08-29,229.789993,196.370836,186.595265,206.262313
25,2024-08-30,229.0,196.302662,186.434747,205.686834
26,2024-09-03,222.770004,196.208799,186.409658,205.840639
27,2024-09-04,220.850006,196.094691,187.129478,206.109512
28,2024-09-05,222.380005,195.905065,186.224732,205.552784
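
The table above stops at 2024-09-05 (row 28), one row short of the validation set. A likely reason is that `freq='B'` generates every Monday to Friday, including market holidays such as 2 September 2024 (US Labor Day), so a fixed number of business days ends one trading day early. The sketch below reproduces the effect with the standard library; `next_business_days` is a hypothetical helper, and the trading days are the ones shown in the validation output.

```python
from datetime import date, timedelta

def next_business_days(start, n):
    """Return the n weekdays (Mon-Fri) that follow `start`, ignoring holidays."""
    days, current = [], start
    while len(days) < n:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days.append(current)
    return days

# Trading days from the validation output around the Labor Day weekend.
trading_days = [
    date(2024, 8, 29), date(2024, 8, 30), date(2024, 9, 3),
    date(2024, 9, 4), date(2024, 9, 5), date(2024, 9, 6),
]

business_days = next_business_days(date(2024, 8, 28), len(trading_days))
matched = [d for d in business_days if d in trading_days]
missing = [d for d in trading_days if d not in business_days]
print("Matched:", len(matched), "| Missing from forecast:", missing)
```

One way around this, offered as a suggestion rather than what the notebook does, is to forecast directly on `validation_data['ds']` instead of generating dates.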


In [17]:

# Evaluation metrics for validation set
rmse_2 = round(np.sqrt(mean_squared_error(validation_data_2['y'], validation_data_2['yhat'])), 2)
mae_2 = round(mean_absolute_error(validation_data_2['y'], validation_data_2['yhat']), 2)
mape_2 = round((1 - np.mean(np.abs((validation_data_2['y'] - validation_data_2['yhat']) / validation_data_2['y'])))*100, 2)
mda_2 = round(np.mean((np.sign(validation_data_2['y'].diff()) == np.sign(validation_data_2['yhat'].diff())).astype(int)) * 100, 2)

# Print evaluation metrics
print(f"RMSE of Validation: {rmse_2}")
print(f"MAE of Validation: {mae_2}")
print(f"Accuracy(1-MAPE) of Validation: {mape_2:.2f}%")
print(f"MDA of Validation: {mda_2:.2f}%")

# Plot the actual vs forecasted values for the validation period
def plot_forecast_vs_actual(validation_data_2):
    fig = go.Figure()

    # Add the actual closing prices to the plot
    fig.add_trace(go.Scatter(x=validation_data_2['ds'], 
                             y=validation_data_2['y'], 
                             mode='lines', 
                             name='Actual',
                             line=dict(color='red')))

    # Add the forecasted values (yhat) to the plot
    fig.add_trace(go.Scatter(x=validation_data_2['ds'], 
                             y=validation_data_2['yhat'], 
                             mode='lines', 
                             name='Forecast',
                             line=dict(color='blue', dash='dash')))

    # Set the layout of the plot
    fig.update_layout(title='Forecast vs Actuals for Validation Period',
                      xaxis_title='Date',
                      yaxis_title='Close Price',
                      legend_title='Legend')

    # Display the plot
    fig.show()

# Call the function to plot the forecast vs actual values
plot_forecast_vs_actual(validation_data_2)


RMSE of Validation: 26.83
MAE of Validation: 26.33
Accuracy(1-MAPE) of Validation: 88.16%
MDA of Validation: 55.17%


## Direct-Recursive Hybrid
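
As described in the table at the top, a direct-recursive hybrid forecasts the first few steps directly and the remaining steps recursively. The sketch below illustrates that idea with least-squares lines on a toy series: one model per horizon for the direct part, then a single one-step model fed with its own output. It is not the Prophet-plus-linear-regression averaging used in the cells that follow, and the names `fit_line` and `hybrid_forecast` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / var_x
    return mean_y - b * mean_x, b

def hybrid_forecast(series, steps, direct_steps):
    """Direct models for the first `direct_steps` horizons, then one-step recursion."""
    forecasts = []
    # Direct part: one model per horizon h, mapping y[t] to y[t + h].
    for h in range(1, min(direct_steps, steps) + 1):
        a, b = fit_line(series[:-h], series[h:])
        forecasts.append(a + b * series[-1])
    # Recursive part: a single one-step model fed with its own output.
    a, b = fit_line(series[:-1], series[1:])
    last = forecasts[-1] if forecasts else series[-1]
    for _ in range(steps - len(forecasts)):
        last = a + b * last
        forecasts.append(last)
    return forecasts

toy_series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
hybrid = hybrid_forecast(toy_series, steps=4, direct_steps=2)
print(hybrid)
```

The direct steps never consume a previous prediction, so they cannot compound errors. The recursive steps can, which is why the table says this approach reduces but does not eliminate error accumulation.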

In [18]:

# Train the Prophet model
def train_prophet(train_df):
    recursive_direct_model = Prophet()
    return recursive_direct_model.fit(train_df)

# Train a Linear Regression model on the time index (the direct component of the hybrid)
def train_direct_model(train_df):
    X = np.arange(len(train_df)).reshape(-1, 1)  # Create an index feature
    y = train_df['y'].values
    direct_model = LinearRegression()
    direct_model.fit(X, y)
    return direct_model

# Make predictions with recursive technique combined with a direct fine-tuning model
def make_hybrid_predictions(recursive_direct_model, direct_model, df, forecast_period):
    future = pd.date_range(start=df['ds'].max(), periods=forecast_period + 1, freq='B')[1:]
    future = pd.DataFrame(future, columns=['ds'])
    predictions = pd.DataFrame()  # Initialize as an empty dataframe

    for i in range(forecast_period):
        forecast = recursive_direct_model.predict(future.head(1))  # Only forecast the next point
        next_point = pd.DataFrame({
            'ds': [future['ds'].iloc[0]],
            'y': [forecast['yhat'].iloc[0]]
        })
        
        # Add fine-tuning using the direct model
        X_future = np.array([[len(df)]])  # df grows by one row per iteration, so len(df) is already the next index
        fine_tuned_value = direct_model.predict(X_future)[0]
        next_point['y'] = (next_point['y'] + fine_tuned_value) / 2  # Average both predictions

        # Note: `predictions` stores Prophet's own forecast; the averaged value is only appended to df
        predictions = pd.concat([predictions, forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]])
        df = pd.concat([df, next_point]).reset_index(drop=True)

        # Recreate the future dataframe with business days (trading days)
        future = pd.date_range(start=df['ds'].max(), periods=2, freq='B')[1:]
        future = pd.DataFrame(future, columns=['ds'])

    return predictions


In [19]:

# Train Prophet model on training data
recursive_model = train_prophet(train_data)

# Train Linear Regression (Direct Model) on training data
direct_model = train_direct_model(train_data)

# Predict on the training data to check the model fit
train_forecast_3 = recursive_model.predict(train_data[['ds']])



23:30:00 - cmdstanpy - INFO - Chain [1] start processing
23:30:01 - cmdstanpy - INFO - Chain [1] done processing


In [20]:

# # Evaluation metrics for training set
# rmse_train_3 = round(np.sqrt(mean_squared_error(train_data['y'], train_forecast_3['yhat'])), 2)
# mae_train_3 = round(mean_absolute_error(train_data['y'], train_forecast_3['yhat']), 2)
# mape_train_3 = round((1 - np.mean(np.abs((train_data['y'] - train_forecast_3['yhat']) / train_data['y'])))*100, 2)
# mda_train_3 = round(np.mean((np.sign(train_data['y'].diff()) == np.sign(train_forecast_3['yhat'].diff())).astype(int)) * 100, 2)

# # Print evaluation metrics for training set
# print(f"RMSE of Training: {rmse_train_3}")
# print(f"MAE of Training: {mae_train_3}")
# print(f"Accuracy(1-MAPE) of Training: {mape_train_3:.2f}%")
# print(f"MDA of Training: {mda_train_3:.2f}%")

# # Create a Plotly figure to visualize the model fit on training data
# fig = make_subplots()

# # Add actual data trace (training data)
# fig.add_trace(go.Scatter(
#     x=train_data['ds'], 
#     y=train_data['y'], 
#     mode='lines', 
#     name='Actual (Training)',
#     line=dict(color='red')
# ))

# # Add forecasted data trace (training fitment)
# fig.add_trace(go.Scatter(
#     x=train_forecast_3['ds'], 
#     y=train_forecast_3['yhat'], 
#     mode='lines', 
#     name='Forecast (Training Fit)',
#     line=dict(color='blue', dash='dash')
# ))

# # Update layout
# fig.update_layout(
#     title="Prophet Model Fit on Training Data",
#     xaxis_title="Date",
#     yaxis_title="Close Price",
#     legend=dict(x=0.01, y=0.99),
#     hovermode="x unified"
# )

# # Show the plot
# fig.show()



In [21]:
# Make predictions for the validation data using the hybrid technique
forecast_period = validation_size  # Forecast horizon equals the validation size, as in the recursive run
forecast_3 = make_hybrid_predictions(recursive_model, direct_model, train_data, forecast_period)

# Align forecast with actual trading days in the validation set
forecast_filtered_3 = forecast_3[forecast_3['ds'].isin(validation_data['ds'])]

# Merge forecast with the actual validation data for comparison
validation_data_3 = validation_data.merge(forecast_filtered_3, on='ds', how='left')

# Drop any rows with NaN values, if any
validation_data_3.dropna(inplace=True)

validation_data_3.tail(10)


,ds,y,yhat,yhat_lower,yhat_upper
19,2024-08-22,224.529999,195.870157,185.665453,205.224761
20,2024-08-23,226.839996,195.936366,186.482981,205.316795
21,2024-08-26,227.179993,196.446714,186.323636,206.038231
22,2024-08-27,228.029999,196.408473,186.945375,205.69866
23,2024-08-28,226.490005,196.430744,187.091143,206.444837
24,2024-08-29,229.789993,196.370836,186.628986,205.772571
25,2024-08-30,229.0,196.302662,187.336354,206.638673
26,2024-09-03,222.770004,196.208799,186.138505,205.47503
27,2024-09-04,220.850006,196.094691,186.245118,206.146483
28,2024-09-05,222.380005,195.905065,185.701852,205.538711


In [22]:

# Evaluation metrics for validation set
rmse_3 = round(np.sqrt(mean_squared_error(validation_data_3['y'], validation_data_3['yhat'])), 2)
mae_3 = round(mean_absolute_error(validation_data_3['y'], validation_data_3['yhat']), 2)
mape_3 = round((1 - np.mean(np.abs((validation_data_3['y'] - validation_data_3['yhat']) / validation_data_3['y'])))*100, 2)
mda_3 = round(np.mean((np.sign(validation_data_3['y'].diff()) == np.sign(validation_data_3['yhat'].diff())).astype(int)) * 100, 2)

# Print evaluation metrics
print(f"RMSE of Validation: {rmse_3}")
print(f"MAE of Validation: {mae_3}")
print(f"Accuracy(1-MAPE) of Validation: {mape_3:.2f}%")
print(f"MDA of Validation: {mda_3:.2f}%")

# Plot the actual vs forecasted values for the validation period
def plot_forecast_vs_actual(validation_data_3):
    fig = go.Figure()

    # Add the actual closing prices to the plot
    fig.add_trace(go.Scatter(x=validation_data_3['ds'], 
                             y=validation_data_3['y'], 
                             mode='lines', 
                             name='Actual',
                             line=dict(color='red')))

    # Add the forecasted values (yhat) to the plot
    fig.add_trace(go.Scatter(x=validation_data_3['ds'], 
                             y=validation_data_3['yhat'], 
                             mode='lines', 
                             name='Forecast',
                             line=dict(color='blue', dash='dash')))

    # Set the layout of the plot
    fig.update_layout(title='Forecast vs Actuals for Validation Period',
                      xaxis_title='Date',
                      yaxis_title='Close Price',
                      legend_title='Legend')

    # Display the plot
    fig.show()

# Call the function to plot the forecast vs actual values
plot_forecast_vs_actual(validation_data_3)


RMSE of Validation: 26.83
MAE of Validation: 26.33
Accuracy(1-MAPE) of Validation: 88.16%
MDA of Validation: 55.17%


## Rolling Window Method

In [23]:

# Define window size
window_size = 1440  # Number of most recent rows (trading days) used for training; tune this for your data


# Train Prophet model on the rolling window data
def train_prophet(train_df):
    model = Prophet()
    return model.fit(train_df)

# Function for rolling window prediction
def rolling_window_forecast(data, window_size, validation_size):
    rolling_predictions = pd.DataFrame()
    
    for i in range(validation_size):
        # Define rolling window range
        train_end = len(data) - validation_size + i
        train_start = train_end - window_size
        train_data = data.iloc[train_start:train_end]
        
        # Train the model on the current window
        model = train_prophet(train_data)
        
        # Forecast the next day
        future = pd.DataFrame({'ds': [data['ds'].iloc[train_end]]})
        forecast = model.predict(future)
        
        # Store predictions
        rolling_predictions = pd.concat([rolling_predictions, forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']]])
        
        print(f"Rolling Window {i+1}/{validation_size} | Train Start: {train_start}, Train End: {train_end}")
        
    return rolling_predictions



In [24]:
# Apply rolling window forecasting
rolling_predictions = rolling_window_forecast(prophet_data, window_size, validation_size)

# Combine rolling predictions with validation data
validation_data = prophet_data[-validation_size:].reset_index(drop=True)
validation_data_4 = validation_data.merge(rolling_predictions, on='ds', how='left')

# Drop NaN values if any
validation_data_4.dropna(inplace=True)
validation_data_4.tail()

23:30:04 - cmdstanpy - INFO - Chain [1] start processing
23:30:04 - cmdstanpy - INFO - Chain [1] done processing


Rolling Window 1/30 | Train Start: 2224, Train End: 3664


...


23:30:23 - cmdstanpy - INFO - Chain [1] start processing
23:30:23 - cmdstanpy - INFO - Chain [1] done processing


Rolling Window 30/30 | Train Start: 2253, Train End: 3693


,ds,y,yhat,yhat_lower,yhat_upper
25,2024-08-30,229.0,215.051785,204.854171,225.29975
26,2024-09-03,222.770004,213.993626,203.498762,224.165151
27,2024-09-04,220.850006,213.887316,204.05732,224.232198
28,2024-09-05,222.380005,212.938028,203.792189,223.071181
29,2024-09-06,220.820007,212.428324,202.715547,222.399894
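
The train start and end values printed above follow from simple index arithmetic. The sketch below isolates that logic in plain Python so it can be checked without fitting any model. `rolling_windows` is a hypothetical helper, and the guard against an oversized window is an addition that the notebook function does not have.

```python
def rolling_windows(n_rows, window_size, validation_size):
    """Yield (train_start, train_end, target_index) for each one-step forecast."""
    first_start = n_rows - validation_size - window_size
    if first_start < 0:
        # A negative start would be read by iloc as counting from the end of the frame.
        raise ValueError("window_size is larger than the available history")
    for i in range(validation_size):
        train_end = n_rows - validation_size + i
        # Rows train_start..train_end-1 are used for training; row train_end is forecast.
        yield train_end - window_size, train_end, train_end

# 3694 rows in total: 3664 training rows plus 30 validation rows.
windows = list(rolling_windows(n_rows=3694, window_size=1440, validation_size=30))
print(windows[0], windows[-1])
```

The first and last windows are `(2224, 3664, 3664)` and `(2253, 3693, 3693)`, matching the bounds logged for windows 1 and 30.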


In [25]:


# Evaluation metrics for validation set
rmse_4 = round(np.sqrt(mean_squared_error(validation_data_4['y'], validation_data_4['yhat'])), 2)
mae_4 = round(mean_absolute_error(validation_data_4['y'], validation_data_4['yhat']), 2)
mape_4 = round((1 - np.mean(np.abs((validation_data_4['y'] - validation_data_4['yhat']) / validation_data_4['y'])))*100, 2)
mda_4 = round(np.mean((np.sign(validation_data_4['y'].diff()) == np.sign(validation_data_4['yhat'].diff())).astype(int)) * 100, 2)

# Print evaluation metrics
print(f"RMSE of Validation: {rmse_4}")
print(f"MAE of Validation: {mae_4}")
print(f"Accuracy(1-MAPE) of Validation: {mape_4:.2f}%")
print(f"MDA of Validation: {mda_4:.2f}%")

# Plot the actual vs forecasted values for the validation period
def plot_forecast_vs_actual(validation_data_4):
    fig = go.Figure()

    # Add the actual closing prices to the plot
    fig.add_trace(go.Scatter(x=validation_data_4['ds'], 
                             y=validation_data_4['y'], 
                             mode='lines', 
                             name='Actual',
                             line=dict(color='red')))

    # Add the forecasted values (yhat) to the plot
    fig.add_trace(go.Scatter(x=validation_data_4['ds'], 
                             y=validation_data_4['yhat'], 
                             mode='lines', 
                             name='Forecast',
                             line=dict(color='blue', dash='dash')))

    # Set the layout of the plot
    fig.update_layout(title='Forecast vs Actuals for Validation Period',
                      xaxis_title='Date',
                      yaxis_title='Close Price',
                      legend_title='Legend')

    # Display the plot
    fig.show()

# Call the function to plot the forecast vs actual values
plot_forecast_vs_actual(validation_data_4)


RMSE of Validation: 11.83
MAE of Validation: 11.16
Accuracy(1-MAPE) of Validation: 94.99%
MDA of Validation: 53.33%


#### Evaluation Metrics from Simple Model

In [26]:

# RMSE of Validation: 26.77
# MAE of Validation: 26.29
# Accuracy(1-MAPE) of Validation: 88.17%
# MDA of Validation: 56.67%

#### Evaluation Metrics from Recursive Model

In [27]:

# RMSE of Validation: 26.83
# MAE of Validation: 26.33
# Accuracy(1-MAPE) of Validation: 88.16%
# MDA of Validation: 55.17%

#### Evaluation Metrics from Direct-Recursive Model

In [28]:

# RMSE of Validation: 26.83
# MAE of Validation: 26.33
# Accuracy(1-MAPE) of Validation: 88.16%
# MDA of Validation: 55.17%


#### Evaluation Metrics from Rolling Window Method

In [1]:
# RMSE of Validation: 11.83
# MAE of Validation: 11.16
# Accuracy(1-MAPE) of Validation: 94.99%
# MDA of Validation: 53.33% 
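
To compare the four methods side by side, the validation metrics printed earlier can be collected in one place. The numbers below are copied from the outputs above; the dictionary layout and variable names are just one possible arrangement.

```python
# Validation metrics copied from the notebook outputs above.
results = {
    "Simple": {"rmse": 26.77, "mae": 26.29, "accuracy": 88.17, "mda": 56.67},
    "Recursive": {"rmse": 26.83, "mae": 26.33, "accuracy": 88.16, "mda": 55.17},
    "Direct-Recursive": {"rmse": 26.83, "mae": 26.33, "accuracy": 88.16, "mda": 55.17},
    "Rolling Window": {"rmse": 11.83, "mae": 11.16, "accuracy": 94.99, "mda": 53.33},
}

best_rmse = min(results, key=lambda name: results[name]["rmse"])
best_mda = max(results, key=lambda name: results[name]["mda"])

print(f"{'Method':<18}{'RMSE':>8}{'MAE':>8}{'Acc %':>8}{'MDA %':>8}")
for name, m in results.items():
    print(f"{name:<18}{m['rmse']:>8.2f}{m['mae']:>8.2f}{m['accuracy']:>8.2f}{m['mda']:>8.2f}")
print(f"Lowest RMSE: {best_rmse} | Highest MDA: {best_mda}")
```

Keep in mind that this is not a like-for-like comparison. The rolling window method refits on fresh data and forecasts only one day ahead at each step, while the other three forecast the whole validation period from a single fit. All four MDA values also sit between 53% and 57%.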

Time Series Data & Basic Modeling Techniques: https://medium.com/@datagyani/how-to-analyze-time-series-data-dbb1567ffc0d


Let me know in the comments if you're interested in a video on extensive testing techniques for forecasting models and how to generate future forecasts using various inference methods.

#### Topic of upcoming videos 

<strong>How can we optimize Prophet model output using hyperparameters and regressors extracted from the series? How can we use regressors like hyperparameters for forecasting?</strong>

<center>**************************** <em> It will definitely boost your model performance drastically. </em>   ****************************</center>


## <center><strong>Don't Forget to Like, Subscribe and Share</strong></center>