# Daily Trends Prediction - Neural Prophet Model

This notebook trains a Neural Prophet model to forecast daily interest values.

**Algorithm**: NeuralProphet (an open-source forecasting library inspired by Facebook's Prophet)
- Built on PyTorch
- Combines traditional time series decomposition with neural networks
- Models trend and seasonality out of the box; holidays and events must be added explicitly
- Easy to use API similar to Prophet

**Target**: Forecast interest_value (0-100) for future days
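
The decomposition idea listed above can be illustrated with plain NumPy. This is a minimal sketch on synthetic data, not NeuralProphet's implementation: it builds a series from a linear trend plus a weekly pattern and recovers both parts with least squares on Fourier features. The helper `weekly_fourier` and the synthetic coefficients are assumptions made for illustration.

```python
import numpy as np

def weekly_fourier(t, order=3):
    """Return sin/cos features with a 7-day period for day indices t."""
    cols = []
    for k in range(1, order + 1):
        cols.append(np.sin(2 * np.pi * k * t / 7))
        cols.append(np.cos(2 * np.pi * k * t / 7))
    return np.column_stack(cols)

# Synthetic series: linear trend + weekly seasonality (no noise, for clarity)
t = np.arange(140, dtype=float)
trend_true = 20 + 0.1 * t
season_true = 5 * np.sin(2 * np.pi * t / 7)
y = trend_true + season_true

# Design matrix: intercept, slope, then the Fourier terms
X = np.column_stack([np.ones_like(t), t, weekly_fourier(t)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Split the fitted values back into the two additive components
trend_fit = X[:, :2] @ coef[:2]
season_fit = X[:, 2:] @ coef[2:]

print(f"Estimated intercept: {coef[0]:.2f}, slope: {coef[1]:.3f}")
print(f"Max reconstruction error: {np.abs(trend_fit + season_fit - y).max():.2e}")
```

Because the components are additive, each one can be plotted and inspected on its own, which is what the component plots later in this notebook rely on.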

## 1. Setup and Imports

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from neuralprophet import NeuralProphet
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import warnings
warnings.filterwarnings('ignore')

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (14, 6)

print("Libraries imported successfully")

## 2. Load Processed Data

In [None]:
# Load processed daily trends data
data_path = '../data/processed/daily_trends_processed_latest.csv'

df = pd.read_csv(data_path)
df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['keyword', 'category', 'date'])

print(f"Data shape: {df.shape}")
print(f"Date range: {df['date'].min()} to {df['date'].max()}")
print(f"Keywords: {df['keyword'].nunique()}")

df.head()

## 3. Prepare Data for Neural Prophet

In [None]:
# Neural Prophet requires specific column names: 'ds' (date) and 'y' (target)
# We'll train a model for a specific keyword/category

sample_keyword = df['keyword'].iloc[0]
sample_category = df['category'].iloc[0]

# Filter data
keyword_data = df[
    (df['keyword'] == sample_keyword) & 
    (df['category'] == sample_category)
].copy()

# Prepare for Neural Prophet
prophet_df = keyword_data[['date', 'interest_value']].copy()
prophet_df.columns = ['ds', 'y']
prophet_df = prophet_df.sort_values('ds').reset_index(drop=True)

print(f"Training model for: {sample_keyword} ({sample_category})")
print(f"Data points: {len(prophet_df)}")
print(f"Date range: {prophet_df['ds'].min()} to {prophet_df['ds'].max()}")

prophet_df.head()

## 4. Train/Test Split

In [None]:
# Time-based split (80% train, 20% test)
split_idx = int(len(prophet_df) * 0.8)

train_df = prophet_df[:split_idx].copy()
test_df = prophet_df[split_idx:].copy()

print(f"Train set: {len(train_df)} days")
print(f"Test set: {len(test_df)} days")
print(f"Train period: {train_df['ds'].min()} to {train_df['ds'].max()}")
print(f"Test period: {test_df['ds'].min()} to {test_df['ds'].max()}")

## 5. Initialize and Configure Neural Prophet

In [None]:
# Initialize Neural Prophet model
model = NeuralProphet(
    # Growth settings
    growth='linear',  # or 'discontinuous' for trend changes
    
    # Seasonality
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=False,
    
    # Neural network settings
    n_lags=30,  # Use 30 days of autoregressive lags
    n_forecasts=7,  # Forecast 7 days ahead
    
    # Training settings
    epochs=100,
    batch_size=32,
    learning_rate=0.01,
    
    # Regularization
    trend_reg=1,
    
    # Other settings
    loss_func='MSE',
    normalize='auto'
)

print("Neural Prophet model initialized")
print(f"Autoregressive lags: {model.n_lags}")
print(f"Forecast horizon: {model.n_forecasts} days")
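
To make `n_lags` and `n_forecasts` concrete, the sketch below builds the input and target windows that an autoregressive model of this shape would see. It is an illustration in plain NumPy that assumes a simple sliding window; `make_windows` is a hypothetical helper, not part of NeuralProphet.

```python
import numpy as np

def make_windows(y, n_lags, n_forecasts):
    """Split a 1-D series into (lag inputs, forecast targets) pairs."""
    y = np.asarray(y, dtype=float)
    n_samples = len(y) - n_lags - n_forecasts + 1
    if n_samples <= 0:
        raise ValueError("series is shorter than n_lags + n_forecasts")
    # Each input row holds n_lags consecutive values
    inputs = np.stack([y[i:i + n_lags] for i in range(n_samples)])
    # Each target row holds the n_forecasts values that follow that input
    targets = np.stack(
        [y[i + n_lags:i + n_lags + n_forecasts] for i in range(n_samples)]
    )
    return inputs, targets

series = np.arange(100)
inputs, targets = make_windows(series, n_lags=30, n_forecasts=7)
print(inputs.shape, targets.shape)
print(targets[0])
```

The first target starts at index 30: a window-based model has nothing to predict from for the first `n_lags` observations of any frame it receives.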

## 6. Train Model

In [None]:
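# Train model. test_df is passed as validation_df only to log per-epoch validation
# metrics; nothing in this notebook tunes on it, so it remains the hold-out set.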
print("Training Neural Prophet model...")
metrics = model.fit(train_df, freq='D', validation_df=test_df)

print("\nModel training completed")
print(f"Final training loss: {metrics['Loss'].iloc[-1]:.4f}")

## 7. Training Metrics Visualization

In [None]:
# Plot training metrics
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Loss
axes[0].plot(metrics['Loss'], label='Training Loss', linewidth=2)
if 'Loss_val' in metrics.columns:
    axes[0].plot(metrics['Loss_val'], label='Validation Loss', linewidth=2)
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training and Validation Loss')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# MAE
if 'MAE' in metrics.columns:
    axes[1].plot(metrics['MAE'], label='Training MAE', linewidth=2)
if 'MAE_val' in metrics.columns:
    axes[1].plot(metrics['MAE_val'], label='Validation MAE', linewidth=2)
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('MAE')
axes[1].set_title('Training and Validation MAE')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 8. Make Predictions

In [None]:
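# Make predictions on the test set.
# With n_lags=30 the first 30 rows have no complete lag window, so their
# yhat values are NaN; the evaluation step drops those rows.
forecast = model.predict(test_df)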

print(f"Forecast shape: {forecast.shape}")
print(f"Forecast columns: {forecast.columns.tolist()}")

forecast.head()

## 9. Model Evaluation

In [None]:
# Extract actual and predicted values
# Neural Prophet returns yhat1 for 1-step ahead forecast
actual = forecast['y'].values
predicted = forecast['yhat1'].values

# Remove NaN values
mask = ~np.isnan(actual) & ~np.isnan(predicted)
actual = actual[mask]
predicted = predicted[mask]

# Calculate metrics
mae = mean_absolute_error(actual, predicted)
rmse = np.sqrt(mean_squared_error(actual, predicted))
r2 = r2_score(actual, predicted)
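# MAPE is undefined where actual == 0 (possible on a 0-100 scale),
# so it is computed on the nonzero actuals only
nonzero = actual != 0
if nonzero.any():
    mape = np.mean(np.abs((actual[nonzero] - predicted[nonzero]) / actual[nonzero])) * 100
else:
    mape = np.nan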

print(f"\nTest Set Metrics:")
print(f"  MAE:  {mae:.4f}")
print(f"  RMSE: {rmse:.4f}")
print(f"  R2:   {r2:.4f}")
print(f"  MAPE: {mape:.2f}%")

test_metrics = {'MAE': mae, 'RMSE': rmse, 'R2': r2, 'MAPE': mape}
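
A small worked example shows how these metrics behave, and why dividing by actual values at or near zero distorts MAPE. The numbers are made up for illustration and `safe_mape` is a hypothetical helper.

```python
import numpy as np

def safe_mape(actual, predicted):
    """MAPE in percent over the points where actual != 0; NaN if there are none."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    nonzero = actual != 0
    if not nonzero.any():
        return float("nan")
    ratios = np.abs((actual[nonzero] - predicted[nonzero]) / actual[nonzero])
    return float(np.mean(ratios) * 100)

actual = np.array([0.0, 50.0, 100.0, 80.0])
predicted = np.array([5.0, 45.0, 90.0, 88.0])

mae = float(np.mean(np.abs(actual - predicted)))
rmse = float(np.sqrt(np.mean((actual - predicted) ** 2)))
# Adding a tiny epsilon avoids division by zero but lets one point dominate
naive_mape = float(np.mean(np.abs((actual - predicted) / (actual + 1e-10))) * 100)

print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")
print(f"Epsilon-based MAPE: {naive_mape:.3g}%")
print(f"Zero-safe MAPE: {safe_mape(actual, predicted):.2f}%")
```

MAE and RMSE stay on the scale of the data, so they are the more reliable headline numbers when the series can touch zero.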

## 10. Visualize Predictions

In [None]:
# Plot forecast
fig = model.plot(forecast)
plt.title(f'Neural Prophet Forecast: {sample_keyword} ({sample_category})', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

In [None]:
# Scatter plot: Actual vs Predicted
plt.figure(figsize=(10, 6))
plt.scatter(actual, predicted, alpha=0.5, s=30)
plt.plot([0, 100], [0, 100], 'r--', lw=2, label='Perfect Prediction')
plt.xlabel('Actual Interest Value', fontsize=12)
plt.ylabel('Predicted Interest Value', fontsize=12)
plt.title(f'Actual vs Predicted (R2 = {r2:.4f})', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 11. Component Analysis

In [None]:
# Plot components (trend, seasonality)
fig_comp = model.plot_components(forecast)
plt.tight_layout()
plt.show()

In [None]:
# Plot parameters (if available)
fig_param = model.plot_parameters()
plt.tight_layout()
plt.show()

## 12. Future Forecast

In [None]:
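# Make future forecast.
# With autoregression (n_lags > 0), a single predict call reaches only
# n_forecasts steps past the last observation, so periods is set to the
# model horizon (7 days); a different value is adjusted to n_forecasts.
future = model.make_future_dataframe(prophet_df, periods=model.n_forecasts, n_historic_predictions=True)
forecast_future = model.predict(future)

print(f"Future forecast shape: {forecast_future.shape}")
print(f"Forecast extends to: {forecast_future['ds'].max()}")

# Plot future forecast
fig = model.plot(forecast_future)
plt.title(f'{model.n_forecasts}-Day Future Forecast: {sample_keyword}', fontsize=14, fontweight='bold')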
plt.tight_layout()
plt.show()

## 13. Save Model and Artifacts

In [None]:
# Save model (NeuralProphet exposes save/load as module-level functions)
from neuralprophet import save
save(model, '../models/neuralprophet_daily_model.pt')
print("Model saved to: ../models/neuralprophet_daily_model.pt")

# Save forecast
forecast_future.to_csv('../models/neuralprophet_forecast.csv', index=False)
print(f"Forecast saved to: ../models/neuralprophet_forecast.csv")

# Save metrics
import json
metrics_dict = {
    'test': test_metrics,
    'n_lags': model.n_lags,
    'n_forecasts': model.n_forecasts,
    'keyword': sample_keyword,
    'category': sample_category
}

with open('../models/neuralprophet_daily_metrics.json', 'w') as f:
    json.dump(metrics_dict, f, indent=2)

print(f"Metrics saved to: ../models/neuralprophet_daily_metrics.json")

## 14. Load Saved Model (Example)

In [None]:
# Example: load the saved model for inference
# from neuralprophet import load
# loaded_model = load('../models/neuralprophet_daily_model.pt')
# new_forecast = loaded_model.predict(new_data)

print("To load model: from neuralprophet import load; load('path/to/model.pt')")

## 15. Summary

### Model Performance:
- Algorithm: Neural Prophet
- Autoregressive Lags: 30 days
- Forecast Horizon: 7 days
- Test RMSE: Check output above
- Test R2: Check output above

### Advantages:
- Automatic seasonality detection
- Handles trends and changepoints
- Easy to interpret (decomposition into components)
- Supports uncertainty quantification through quantile regression (not enabled in this notebook)
- Fast training with PyTorch

### Limitations:
- Requires a regularly spaced time series; missing dates must be filled or imputed
- May need tuning for specific patterns
- Less flexible than a custom deep learning architecture
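
The regular-spacing requirement can be checked before training. The sketch below, using pandas only, reindexes a `ds`/`y` frame to a complete daily range and reports which dates were missing. `fill_daily_gaps` is a hypothetical helper, and linear interpolation is just one possible fill strategy.

```python
import pandas as pd

def fill_daily_gaps(df):
    """Reindex a ds/y frame to a complete daily range; missing days get NaN y."""
    indexed = df.sort_values("ds").set_index("ds")
    full_range = pd.date_range(indexed.index.min(), indexed.index.max(), freq="D")
    missing = full_range.difference(indexed.index)
    filled = indexed.reindex(full_range).rename_axis("ds").reset_index()
    return filled, missing

raw = pd.DataFrame({
    "ds": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-05"]),
    "y": [10.0, 12.0, 15.0],
})
filled, missing = fill_daily_gaps(raw)
print(f"Missing dates: {[d.date().isoformat() for d in missing]}")

# Linear interpolation is a simple choice; whether it suits the data is a judgment call
filled["y"] = filled["y"].interpolate(method="linear")
print(filled)
```

Long runs of missing days are better investigated than interpolated, since a straight line across a gap hides any seasonality that occurred in it.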

### Next Steps:
1. Add custom seasonalities (monthly, quarterly)
2. Add holidays, events, and external regressors
3. Tune hyperparameters (n_lags, learning_rate, etc.)
4. Train models for all keywords/categories
5. Compare with LightGBM and LSTM results
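
As a starting point for step 4, the sketch below splits the processed data into one `ds`/`y` frame per keyword/category pair, mirroring the preparation done for the single sample series in section 3. `iter_series` and the `min_points` threshold are assumptions for illustration; the demo frame is made up.

```python
import pandas as pd

def iter_series(df, min_points=60):
    """Yield ((keyword, category), ds/y frame) for each group with enough data."""
    for key, group in df.groupby(["keyword", "category"], sort=True):
        frame = (
            group[["date", "interest_value"]]
            .rename(columns={"date": "ds", "interest_value": "y"})
            .sort_values("ds")
            .reset_index(drop=True)
        )
        # Skip series too short to train on (threshold is an assumption)
        if len(frame) >= min_points:
            yield key, frame

demo = pd.DataFrame({
    "keyword": ["a"] * 3 + ["b"] * 2,
    "category": ["x"] * 5,
    "date": pd.to_datetime(
        ["2024-01-03", "2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"]
    ),
    "interest_value": [30, 10, 20, 5, 6],
})

series = dict(iter_series(demo, min_points=3))
for (keyword, category), frame in series.items():
    # A model would be initialized and fit on `frame` here, as in sections 5 and 6
    print(keyword, category, len(frame))
```

Keeping one model per series matches the single-series workflow of this notebook; whether a shared model across keywords performs better is a separate experiment.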