# Prophet Model for Spare Part Demand Forecasting

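This notebook applies Prophet (originally released as Facebook Prophet) to daily spare part demand. The intended use is long-term forecasting (30-90 days); the evaluation below covers a 30-day horizon.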

## Objectives
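1. Load and prepare data for Prophet
2. Hold out the last 30 days as a test set
3. Train a Prophet model with yearly and weekly seasonality and Indian holidays
4. Generate forecasts
5. Evaluate model performance on the test set
6. Run time-series cross-validation
7. Save the model and metrics for deployment
8. Forecast the 30 days beyond the available data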

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from prophet import Prophet
from prophet.diagnostics import cross_validation, performance_metrics
from prophet.plot import plot_plotly, plot_components_plotly
from sklearn.metrics import mean_absolute_error, mean_squared_error
import pickle
import warnings
warnings.filterwarnings('ignore')

print('Libraries loaded successfully!')

Libraries loaded successfully!


## 1. Load and Prepare Data

In [2]:
# Load daily aggregated demand
df = pd.read_csv('../data/processed/daily_demand.csv', parse_dates=['date'])
print(f'Loaded {len(df)} rows')
df.head()

Loaded 730 rows


Unnamed: 0,date,demand_quantity,revenue
0,2022-01-01,5809,15363979.92
1,2022-01-02,5773,15295649.75
2,2022-01-03,14097,37141288.42
3,2022-01-04,9586,25126363.5
4,2022-01-05,9496,24857942.96


In [3]:
# Prepare data for Prophet (requires 'ds' and 'y' columns)
prophet_df = df[['date', 'demand_quantity']].copy()
prophet_df.columns = ['ds', 'y']
prophet_df['ds'] = pd.to_datetime(prophet_df['ds'])

print(f'Prophet data shape: {prophet_df.shape}')
print(f'Date range: {prophet_df["ds"].min()} to {prophet_df["ds"].max()}')
prophet_df.head()

Prophet data shape: (730, 2)
Date range: 2022-01-01 00:00:00 to 2023-12-31 00:00:00


Unnamed: 0,ds,y
0,2022-01-01,5809
1,2022-01-02,5773
2,2022-01-03,14097
3,2022-01-04,9586
4,2022-01-05,9496
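
Prophet tolerates missing days, but gaps or duplicated dates in a daily series are usually worth knowing about before fitting. The check below is a minimal sketch on a toy frame; `find_missing_days` is a hypothetical helper, not part of this notebook or of Prophet.

```python
import pandas as pd

def find_missing_days(df, date_col="ds"):
    """Return the calendar days absent from a daily series (hypothetical helper)."""
    days = pd.to_datetime(df[date_col]).dt.normalize()
    expected = pd.date_range(days.min(), days.max(), freq="D")
    return expected.difference(pd.DatetimeIndex(days))

# Toy frame with one gap (2022-01-03 is missing) and one duplicated day.
toy = pd.DataFrame({
    "ds": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-02", "2022-01-04"]),
    "y": [5809, 5773, 5773, 9586],
})
missing = find_missing_days(toy)
n_duplicates = int(toy["ds"].duplicated().sum())
print(list(missing.strftime("%Y-%m-%d")), n_duplicates)
```

On the real data the same call would be `find_missing_days(prophet_df)`; 730 rows spanning 2022-01-01 to 2023-12-31 suggests there are no gaps, since that range contains exactly 730 days.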


In [4]:
# Visualize the time series
fig = px.line(prophet_df, x='ds', y='y', title='Daily Demand Time Series')
fig.update_layout(xaxis_title='Date', yaxis_title='Demand')
fig.show()

## 2. Train-Test Split

In [5]:
# Split data: use last 30 days for testing
test_days = 30
train_df = prophet_df[:-test_days]
test_df = prophet_df[-test_days:]

print(f'Training set: {len(train_df)} days ({train_df["ds"].min()} to {train_df["ds"].max()})')
print(f'Test set: {len(test_df)} days ({test_df["ds"].min()} to {test_df["ds"].max()})')

Training set: 700 days (2022-01-01 00:00:00 to 2023-12-01 00:00:00)
Test set: 30 days (2023-12-02 00:00:00 to 2023-12-31 00:00:00)
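
The positional slice above relies on `prophet_df` being sorted by date with one row per day. A date-based split avoids that assumption. This is a sketch only; `split_last_n_days` is a hypothetical helper and the frame below is synthetic.

```python
import pandas as pd

def split_last_n_days(df, n_days, date_col="ds"):
    """Hold out the final n_days calendar days as a test set (hypothetical helper)."""
    df = df.sort_values(date_col).reset_index(drop=True)
    cutoff = df[date_col].max() - pd.Timedelta(days=n_days)
    train = df[df[date_col] <= cutoff]
    test = df[df[date_col] > cutoff]
    return train, test

# Synthetic daily frame covering the notebook's date range, deliberately shuffled.
dates = pd.date_range("2022-01-01", "2023-12-31", freq="D")
toy = pd.DataFrame({"ds": dates, "y": range(len(dates))}).sample(frac=1, random_state=0)

train, test = split_last_n_days(toy, 30)
print(len(train), train["ds"].max().date(), len(test), test["ds"].min().date())
```

For complete daily data this gives the same 700/30 split as the slice, with the boundary between 2023-12-01 and 2023-12-02.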


## 3. Train Prophet Model

In [6]:
# Initialize Prophet model
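model = Prophet(
    seasonality_mode='multiplicative',  # Seasonal swings scale with the trend level
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,            # Data is daily, so there is no intraday pattern to fit
    changepoint_prior_scale=0.05,       # Trend flexibility (Prophet's default)
    seasonality_prior_scale=10.0        # Seasonality regularization strength (Prophet's default)
)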

# Add country holidays (India)
model.add_country_holidays(country_name='IN')

print('Prophet model initialized with:')
print('- Multiplicative seasonality')
print('- Yearly + Weekly seasonality')
print('- Indian holidays')

Prophet model initialized with:
- Multiplicative seasonality
- Yearly + Weekly seasonality
- Indian holidays
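
With `seasonality_mode='multiplicative'`, seasonal terms scale the trend (roughly `trend * (1 + seasonal)`) instead of being added to it. The toy comparison below uses made-up numbers, not the fitted model, to show the practical difference: the weekend dip stays a fixed size in the additive case and grows with the trend in the multiplicative case.

```python
import numpy as np

t = np.arange(28)                               # four synthetic weeks
trend = np.linspace(5000.0, 10000.0, t.size)    # made-up rising trend
weekly = np.where(t % 7 >= 5, -0.4, 0.1)        # made-up relative effect: weekend dip

additive = trend + 5000.0 * weekly              # swing has a fixed size
multiplicative = trend * (1.0 + weekly)         # swing grows with the trend

# Size of the weekend dip on the first and last weekend day in the window.
first_dip_add = trend[5] - additive[5]
last_dip_add = trend[27] - additive[27]
first_dip_mult = trend[5] - multiplicative[5]
last_dip_mult = trend[27] - multiplicative[27]
print(first_dip_add, last_dip_add, round(first_dip_mult, 1), last_dip_mult)
```

Whether multiplicative mode suits a given series depends on whether its seasonal swings actually widen as demand rises; comparing both modes in cross-validation is one way to check.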


In [7]:
# Fit the model
print('Training Prophet model...')
model.fit(train_df)
print('Model trained successfully!')

Training Prophet model...


14:42:55 - cmdstanpy - INFO - Chain [1] start processing
14:42:56 - cmdstanpy - INFO - Chain [1] done processing


Model trained successfully!


## 4. Generate Forecast

In [8]:
# Create future dataframe for prediction
future = model.make_future_dataframe(periods=test_days + 30, freq='D')  # +30 for future forecast
print(f'Forecast dataframe: {len(future)} days')
future.tail()

Forecast dataframe: 760 days


Unnamed: 0,ds
755,2024-01-26
756,2024-01-27
757,2024-01-28
758,2024-01-29
759,2024-01-30


In [9]:
# Generate predictions
forecast = model.predict(future)
print(f'Forecast generated: {len(forecast)} predictions')
forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(10)

Forecast generated: 760 predictions


Unnamed: 0,ds,yhat,yhat_lower,yhat_upper
750,2024-01-21,5692.858889,4285.710225,7159.139293
751,2024-01-22,10393.02529,8967.18665,11840.078519
752,2024-01-23,10195.598116,8783.22049,11664.091011
753,2024-01-24,10106.218932,8662.161938,11564.205428
754,2024-01-25,10129.658324,8617.939398,11586.928265
755,2024-01-26,10751.001682,9320.40724,12241.267445
756,2024-01-27,5663.333491,4227.841221,7138.100375
757,2024-01-28,5727.30015,4154.652509,7227.032177
758,2024-01-29,10442.618237,8896.383326,12043.549419
759,2024-01-30,10265.330098,8858.981948,11831.272127


In [10]:
# Interactive forecast plot
fig = plot_plotly(model, forecast)
fig.update_layout(title='Prophet Demand Forecast')
fig.show()

In [11]:
# Plot components (trend, seasonality)
fig = plot_components_plotly(model, forecast)
fig.show()

## 5. Model Evaluation

In [12]:
# Get predictions for test period
test_forecast = forecast[forecast['ds'].isin(test_df['ds'])][['ds', 'yhat', 'yhat_lower', 'yhat_upper']]
test_forecast = test_forecast.merge(test_df, on='ds')

print('Test Period Predictions vs Actuals:')
test_forecast[['ds', 'y', 'yhat', 'yhat_lower', 'yhat_upper']].head(10)

Test Period Predictions vs Actuals:


Unnamed: 0,ds,y,yhat,yhat_lower,yhat_upper
0,2023-12-02,5668,5014.865287,3485.412851,6422.543489
1,2023-12-03,5572,5052.788139,3545.516896,6535.444067
2,2023-12-04,9216,9742.810164,8231.86577,11171.477894
3,2023-12-05,9257,9538.201585,8198.273904,11132.047607
4,2023-12-06,9216,9444.41643,7868.874339,10967.460839
5,2023-12-07,9355,9465.086041,8101.930797,10912.897445
6,2023-12-08,10103,10617.331971,9125.692444,12037.294121
7,2023-12-09,5644,4992.53539,3537.022953,6340.91075
8,2023-12-10,5691,5051.481312,3592.313415,6545.909646
9,2023-12-11,9192,9759.571828,8319.540304,11373.332542


In [13]:
# Calculate metrics
y_true = test_forecast['y']
y_pred = test_forecast['yhat']

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100

print('='*50)
print('PROPHET MODEL EVALUATION METRICS')
print('='*50)
print(f'MAE  (Mean Absolute Error):     {mae:.2f}')
print(f'RMSE (Root Mean Squared Error): {rmse:.2f}')
print(f'MAPE (Mean Absolute % Error):   {mape:.2f}%')
print('='*50)

PROPHET MODEL EVALUATION METRICS
MAE  (Mean Absolute Error):     464.64
RMSE (Root Mean Squared Error): 491.74
MAPE (Mean Absolute % Error):   6.20%
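
The MAPE formula above divides by `y_true`, so it breaks down on days with zero demand. The sketch below wraps the three metrics in one function and skips zero actuals when computing MAPE; `regression_metrics` is a hypothetical helper, and skipping zeros is one possible convention among several.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE and MAPE (percent); MAPE skips rows where the actual is zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    nonzero = y_true != 0
    if nonzero.any():
        mape = np.mean(np.abs(err[nonzero] / y_true[nonzero])) * 100
    else:
        mape = np.nan
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mape": float(mape),
    }

# First five test-period rows from the table above (2023-12-02 to 2023-12-06).
actual = [5668, 5572, 9216, 9257, 9216]
predicted = [5014.865287, 5052.788139, 9742.810164, 9538.201585, 9444.41643]
sample = regression_metrics(actual, predicted)
print({k: round(v, 2) for k, v in sample.items()})
```

These five rows are only a subset, so the values differ from the full 30-day metrics reported above.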


In [14]:
# Visualize actual vs predicted
fig = go.Figure()

fig.add_trace(go.Scatter(x=test_forecast['ds'], y=test_forecast['y'],
                         mode='lines+markers', name='Actual', line=dict(color='blue')))

fig.add_trace(go.Scatter(x=test_forecast['ds'], y=test_forecast['yhat'],
                         mode='lines+markers', name='Predicted', line=dict(color='orange')))

fig.add_trace(go.Scatter(x=test_forecast['ds'], y=test_forecast['yhat_upper'],
                         mode='lines', name='Upper Bound', line=dict(dash='dash', color='lightgray')))

fig.add_trace(go.Scatter(x=test_forecast['ds'], y=test_forecast['yhat_lower'],
                         mode='lines', name='Lower Bound', line=dict(dash='dash', color='lightgray'),
                         fill='tonexty', fillcolor='rgba(128,128,128,0.2)'))

fig.update_layout(title='Prophet: Actual vs Predicted (Test Period)',
                  xaxis_title='Date', yaxis_title='Demand')
fig.show()

## 6. Cross-Validation

In [15]:
# Perform cross-validation
print('Running cross-validation (this may take a few minutes)...')

cv_results = cross_validation(
    model,
    initial='365 days',   # Initial training period; shorter than the 365.25-day yearly period, which triggers Prophet's warning
    period='30 days',     # Spacing between cutoff dates
    horizon='30 days'     # Forecast horizon
)

print(f'Cross-validation complete: {len(cv_results)} predictions')
cv_results.head()

Seasonality has period of 365.25 days which is larger than initial window. Consider increasing initial.


Running cross-validation (this may take a few minutes)...



Cross-validation complete: 330 predictions


Unnamed: 0,ds,yhat,yhat_lower,yhat_upper,y,cutoff
0,2023-01-06,10070.037244,8643.831682,11461.299782,15518,2023-01-05
1,2023-01-07,5061.096425,3744.535646,6434.769867,5776,2023-01-05
2,2023-01-08,5054.84424,3731.451076,6462.521996,5770,2023-01-05
3,2023-01-09,9070.27002,7704.468682,10410.946571,9479,2023-01-05
4,2023-01-10,8880.995911,7567.276124,10175.206384,9537,2023-01-05
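
The cutoff dates follow from `initial`, `period` and `horizon`: starting from the last training day minus the horizon, cutoffs step back by `period` for as long as at least `initial` days of history remain. The sketch below approximates that procedure in plain pandas; `sketch_cutoffs` is a hypothetical helper, not Prophet's own code, and it assumes gap-free daily data.

```python
import pandas as pd

def sketch_cutoffs(first_day, last_day, initial, period, horizon):
    """Approximate the rolling-origin cutoffs used in cross-validation (hypothetical helper)."""
    first_day, last_day = pd.Timestamp(first_day), pd.Timestamp(last_day)
    initial, period, horizon = (pd.Timedelta(x) for x in (initial, period, horizon))
    cutoffs = []
    cutoff = last_day - horizon                  # latest cutoff that leaves a full horizon
    while cutoff >= first_day + initial:         # keep at least `initial` of history
        cutoffs.append(cutoff)
        cutoff -= period
    return cutoffs[::-1]                         # oldest first

# Training data in this notebook runs from 2022-01-01 to 2023-12-01.
cutoffs = sketch_cutoffs("2022-01-01", "2023-12-01", "365 days", "30 days", "30 days")
print(len(cutoffs), cutoffs[0].date(), cutoffs[-1].date())
```

With the notebook's settings this yields 11 cutoffs starting at 2023-01-05, which agrees with the `cutoff` column above and with 330 predictions (11 folds of 30 days each).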


In [16]:
# Calculate performance metrics from CV
cv_metrics = performance_metrics(cv_results)
cv_metrics

Unnamed: 0,horizon,mse,rmse,mae,mape,mdape,smape,coverage
0,3 days,1668410.0,1291.669593,657.041842,0.062218,0.045538,0.065884,0.939394
1,4 days,2038968.0,1427.924214,737.893797,0.066868,0.04312,0.071577,0.909091
2,5 days,2753088.0,1659.243256,866.284027,0.072923,0.04312,0.079533,0.878788
3,6 days,2783799.0,1668.472094,890.93334,0.073494,0.042944,0.07965,0.878788
4,7 days,1467962.0,1211.59472,648.174708,0.057508,0.035,0.060121,0.939394
5,8 days,729651.2,854.196247,494.485485,0.047074,0.026469,0.047686,0.969697
6,9 days,3192688.0,1786.809499,931.902022,0.073191,0.031227,0.079335,0.878788
7,10 days,2700569.0,1643.340744,830.821445,0.068972,0.03888,0.074378,0.909091
8,11 days,2698872.0,1642.824456,840.606987,0.071024,0.03888,0.076583,0.909091
9,12 days,666690.6,816.51122,511.383233,0.052886,0.038407,0.054554,0.969697
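
The `coverage` column is the share of actuals that fall inside the `[yhat_lower, yhat_upper]` interval. `performance_metrics` averages each metric over a rolling window of the horizon (the `rolling_window` argument, 0.1 by default), which is consistent with the table starting at a 3-day horizon for a 30-day forecast. The sketch below computes raw, unwindowed coverage by hand on the five `cv_results` rows shown earlier.

```python
import pandas as pd

# First five cross-validation rows (cutoff 2023-01-05), copied from cv_results.head().
cv_head = pd.DataFrame({
    "y":          [15518, 5776, 5770, 9479, 9537],
    "yhat":       [10070.037244, 5061.096425, 5054.84424, 9070.27002, 8880.995911],
    "yhat_lower": [8643.831682, 3744.535646, 3731.451076, 7704.468682, 7567.276124],
    "yhat_upper": [11461.299782, 6434.769867, 6462.521996, 10410.946571, 10175.206384],
})

# An actual is "covered" when it lies within the prediction interval.
inside = cv_head["y"].between(cv_head["yhat_lower"], cv_head["yhat_upper"])
coverage = inside.mean()

# Absolute percentage error per row, as a ratio (the scale Prophet reports).
ape = (cv_head["y"] - cv_head["yhat"]).abs() / cv_head["y"]
print(inside.tolist(), coverage, ape.round(3).tolist())
```

Only the first row (2023-01-06, actual 15518) falls outside its interval, so single spikes like this one are a plausible reason MSE and RMSE jump at some horizons while the median error (`mdape`) stays low.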


In [17]:
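# Plot CV metrics over horizon. MAPE is a ratio (about 0.05-0.07) while MAE and RMSE are in
# demand units, so MAPE gets its own figure instead of flattening to zero on a shared axis.
cv_plot = cv_metrics.assign(horizon_days=cv_metrics['horizon'].dt.days)
for cols, label in [(['mae', 'rmse'], 'Error (units)'), (['mape'], 'MAPE (ratio)')]:
    fig = px.line(cv_plot, x='horizon_days', y=cols,
                  title='Cross-Validation Metrics by Forecast Horizon')
    fig.update_layout(yaxis_title=label, xaxis_title='Forecast Horizon (days)')
    fig.show()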

## 7. Save Model

In [18]:
# Save the trained model
import os
os.makedirs('../models', exist_ok=True)

model_path = '../models/prophet_model.pkl'
with open(model_path, 'wb') as f:
    pickle.dump(model, f)

print(f'Model saved to: {model_path}')

Model saved to: ../models/prophet_model.pkl


In [19]:
# Save metrics for comparison
metrics = {
    'model': 'Prophet',
    'mae': mae,
    'rmse': rmse,
    'mape': mape,  # percent
    'cv_mape_mean': cv_metrics['mape'].mean()  # ratio, not percent; multiply by 100 to compare with 'mape'
}

metrics_df = pd.DataFrame([metrics])
metrics_df.to_csv('../models/prophet_metrics.csv', index=False)
print('Metrics saved!')
metrics_df

Metrics saved!


Unnamed: 0,model,mae,rmse,mape,cv_mape_mean
0,Prophet,464.641237,491.740794,6.2035,0.070244


## 8. Future Forecast (Next 30 Days)

In [20]:
# Get the future predictions (beyond test data)
future_forecast = forecast[forecast['ds'] > prophet_df['ds'].max()][['ds', 'yhat', 'yhat_lower', 'yhat_upper']]
future_forecast.columns = ['Date', 'Predicted_Demand', 'Lower_Bound', 'Upper_Bound']

print('Next 30 Days Forecast:')
future_forecast

Next 30 Days Forecast:


Unnamed: 0,Date,Predicted_Demand,Lower_Bound,Upper_Bound
730,2024-01-01,10021.679593,8552.116836,11533.67869
731,2024-01-02,9847.12944,8367.807827,11311.680205
732,2024-01-03,9781.248044,8265.543061,11268.938956
733,2024-01-04,9827.81112,8340.885003,11365.599611
734,2024-01-05,11003.960299,9560.511616,12471.830248
735,2024-01-06,5401.949829,3938.541689,6869.007889
736,2024-01-07,5481.382989,3938.584595,6933.70855
737,2024-01-08,10207.778821,8762.815376,11529.578003
738,2024-01-09,10035.46644,8476.199949,11581.746952
739,2024-01-10,9969.232355,8479.484491,11562.050936


In [21]:
# Save future forecast
future_forecast.to_csv('../data/processed/prophet_forecast.csv', index=False)
print('Future forecast saved to: ../data/processed/prophet_forecast.csv')

Future forecast saved to: ../data/processed/prophet_forecast.csv
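
Prophet's forecasts are continuous and the lower bound is not constrained to be non-negative, whereas spare part orders are whole units. The sketch below shows one possible post-processing step on three rows copied from the forecast above. Rounding up and clipping at zero are assumptions made for illustration, not part of the original pipeline.

```python
import numpy as np
import pandas as pd

# Three rows copied from the future forecast table (Fri, Sat, Sun).
future_head = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-05", "2024-01-06", "2024-01-07"]),
    "Predicted_Demand": [11003.960299, 5401.949829, 5481.382989],
    "Lower_Bound": [9560.511616, 3938.541689, 3938.584595],
    "Upper_Bound": [12471.830248, 6869.007889, 6933.70855],
})

cols = ["Predicted_Demand", "Lower_Bound", "Upper_Bound"]
planning = future_head.copy()
# Assumed convention: clip negatives to zero, then round up to whole units.
planning[cols] = np.ceil(planning[cols].clip(lower=0)).astype(int)
print(planning)
```

Rounding up errs toward overstocking; rounding to the nearest unit would be the neutral alternative.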


## Summary

| Metric | Value |
|--------|-------|
| Model | Prophet |
| Training Period | 700 days (2022-01-01 to 2023-12-01) |
| Test Period | 30 days (2023-12-02 to 2023-12-31) |
| MAE | 464.64 |
| RMSE | 491.74 |
| MAPE | 6.20% |
| CV MAPE (mean over horizons) | 7.02% |

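**Notes:**
- The weekly pattern is strong (about 9,200-10,100 units on weekdays versus about 5,600-5,700 on weekends in the test period) and the forecast reproduces it
- Yearly seasonality is estimated from only two years of history, and the cross-validation `initial` window (365 days) is shorter than one yearly period, so treat the yearly component with caution
- Indian holidays are included via `add_country_holidays`
- Intended for long-term forecasts (30-90 days), but only the 30-day horizon is evaluated in this notebook
- Model saved to `../models/prophet_model.pkl` for deployment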

In [22]:
print('Prophet Model Training Complete!')

Prophet Model Training Complete!
