# Price Forecast Notebook

This notebook reproduces the data loading, feature engineering, and prediction logic from the Way2Agri Price Forecast Django application. It allows for interactive analysis and visualization of the price forecasting model.

## 1. Setup and Imports

First, we import the necessary libraries. We need `os` for file paths, `pandas` for data manipulation, `numpy` for numerical operations, `joblib` to load our trained model, and `matplotlib` for plotting.

In [None]:
import os
import pandas as pd
import numpy as np
import joblib
import matplotlib.pyplot as plt

# Set plot style (the 'seaborn-v0_8-*' style names require matplotlib 3.6 or newer)
plt.style.use('seaborn-v0_8-whitegrid')

## 2. Define Paths and Load Data

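We define the paths to our data file, model, and scaler. Then, we load the historical price data from `price_data.csv`, which is expected to contain a `date` column and a price column named either `price` or `avg_monthly_price`. If the file doesn't exist, we create synthetic demo data (a linear trend plus a yearly sine wave) so the notebook can still run.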

In [None]:
# Define paths relative to the project root
ROOT = '.' # Assuming the notebook is in the project root
MODEL_PATH = os.path.join(ROOT, 'price_model.pkl')
SCALER_PATH = os.path.join(ROOT, 'scaler.pkl')
CSV_PATH = os.path.join(ROOT, 'price_data.csv')

# Load dataset or create a demo dataframe
if os.path.exists(CSV_PATH):
    df = pd.read_csv(CSV_PATH, parse_dates=['date'])
    # Rename column for consistency if needed
    if 'avg_monthly_price' in df.columns:
        df = df.rename(columns={'avg_monthly_price': 'price'})
else:
    print("price_data.csv not found. Creating demo data.")
    rng = pd.date_range(start='2015-01-01', periods=120, freq='MS')
    trend = np.linspace(1000, 2000, len(rng))
    seasonal = 80 * np.sin(2 * np.pi * (rng.month - 1) / 12)
    df = pd.DataFrame({'date': rng, 'price': trend + seasonal})

print("Data loaded successfully.")
df.head()

## 3. Data Cleaning and Preparation

We perform basic data cleaning by sorting the data by date, removing duplicates, and filling any missing price values using time-based interpolation.

In [None]:
df = df.sort_values('date').drop_duplicates('date').reset_index(drop=True)

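# Interpolate missing values; method='time' requires a DatetimeIndex,
# so we index by date for the interpolation and write the values back
if df['price'].isnull().any():
    filled = df.set_index('date')['price'].interpolate(method='time').ffill().bfill()
    df['price'] = filled.to_numpy()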

print("Data cleaned. Shape of dataframe:", df.shape)
df.tail()
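
Time-based interpolation weights each gap by elapsed time, and pandas only accepts `method='time'` on a Series or DataFrame with a `DatetimeIndex`. The sketch below illustrates this on a made-up three-row frame (the `demo` values are assumptions, not rows from `price_data.csv`).

```python
import numpy as np
import pandas as pd

# Made-up monthly prices with one missing value (not from price_data.csv)
demo = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-01', '2024-02-01', '2024-03-01']),
    'price': [100.0, np.nan, 160.0],
})

# method='time' needs a DatetimeIndex, so index by date before interpolating
filled = demo.set_index('date')['price'].interpolate(method='time')

# Write the values back positionally; the frame keeps its integer index
demo['price'] = filled.to_numpy()
print(demo)
```

Because January has 31 days and February 2024 has 29, the filled value lands at about 131 rather than the simple midpoint of 130.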

## 4. Feature Engineering

We create the features the model expects: a time trend `t` (the row's position in chronological order) and cyclical encodings of the month (`month_sin`, `month_cos`), which place December next to January instead of 11 steps away.

In [None]:
df['month'] = df['date'].dt.month
df['t'] = np.arange(len(df))
df['month_sin'] = np.sin(2 * np.pi * (df['month'] - 1) / 12)
df['month_cos'] = np.cos(2 * np.pi * (df['month'] - 1) / 12)

FEATURES = ['t', 'month_sin', 'month_cos']

print("Features created:")
df[['date', 'price'] + FEATURES].tail()
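
To see why the month is encoded as a sine/cosine pair rather than a raw number, the following sketch compares distances between encoded months. The helper name `encode_month` is hypothetical and exists only for this illustration; it uses the same formula as the cell above.

```python
import numpy as np

def encode_month(month):
    """Map a month number (1-12) to a point on the unit circle."""
    angle = 2 * np.pi * (month - 1) / 12
    return np.array([np.sin(angle), np.cos(angle)])

# Distance between encoded months: December sits next to January,
# even though the raw month numbers differ by 11.
dec_to_jan = np.linalg.norm(encode_month(12) - encode_month(1))
jan_to_feb = np.linalg.norm(encode_month(1) - encode_month(2))
jan_to_jul = np.linalg.norm(encode_month(1) - encode_month(7))

print(f"Dec-Jan: {dec_to_jan:.3f}, Jan-Feb: {jan_to_feb:.3f}, Jan-Jul: {jan_to_jul:.3f}")
```

Adjacent months are all the same distance apart, and months six apart (such as January and July) are the farthest from each other.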

## 5. Load Model and Scaler

We load the pre-trained machine learning model and the feature scaler from their respective `.pkl` files. If they don't exist, we'll proceed in a 'demo' mode without a real model.

In [None]:
model = None
scaler = None

if os.path.exists(MODEL_PATH):
    try:
        model = joblib.load(MODEL_PATH)
        print("Model loaded successfully.")
    except Exception as e:
        print(f"Error loading model: {e}")
else:
    print("Model file (price_model.pkl) not found.")

if os.path.exists(SCALER_PATH):
    try:
        scaler = joblib.load(SCALER_PATH)
        print("Scaler loaded successfully.")
    except Exception as e:
        print(f"Error loading scaler: {e}")
else:
    print("Scaler file (scaler.pkl) not found.")

## 6. Make Predictions

### Historical Predictions
We use the model to predict on the historical data, so predicted values can be compared against actuals. If no model is available, or prediction fails, `predicted` is set to a copy of the actual prices. This is a placeholder that lets later cells run, not a real prediction.

In [None]:
X_hist = df[FEATURES].values
if scaler is not None:
    try:
        X_hist = scaler.transform(X_hist)
    except Exception as e:
        print(f"Could not scale historical features: {e}")

if model is not None:
    try:
        df['predicted'] = model.predict(X_hist)
    except Exception as e:
        print(f"Model prediction failed on historical data: {e}")
        df['predicted'] = df['price'] # Fallback
else:
    # If no model, predicted is same as actual for history
    df['predicted'] = df['price']

df[['date', 'price', 'predicted']].tail()
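
If the loaded model was fitted on these same rows, historical predictions will tend to look better than true out-of-sample performance. One common way to check is to hold out the most recent months. The sketch below is an assumption-laden illustration: it uses synthetic data shaped like the demo data and ordinary least squares via `np.linalg.lstsq` as a stand-in, since the contents of `price_model.pkl` are not shown in this notebook.

```python
import numpy as np
import pandas as pd

# Synthetic series mirroring the demo data above (assumption, not real prices)
dates = pd.date_range('2015-01-01', periods=120, freq='MS')
t = np.arange(len(dates))
angle = 2 * np.pi * (dates.month.to_numpy() - 1) / 12
price = np.linspace(1000, 2000, len(dates)) + 80 * np.sin(angle)

# Design matrix: intercept, t, month_sin, month_cos
X = np.column_stack([np.ones(len(t)), t, np.sin(angle), np.cos(angle)])

# Hold out the last 12 months; fit on everything before them
n_test = 12
X_train, X_test = X[:-n_test], X[-n_test:]
y_train, y_test = price[:-n_test], price[-n_test:]

# Ordinary least squares as a stand-in for the real model
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
holdout_mae = np.mean(np.abs(y_test - X_test @ coef))
print(f"Holdout MAE: {holdout_mae:.4f}")
```

The holdout error is essentially zero here only because the synthetic series is noiseless and exactly linear in these features; real price data would not behave this way.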

### Future Forecast
Next, we generate features for the next 12 months and use the model to forecast future prices. If no model is loaded, we'll use a simple linear trend extrapolation for demonstration.

In [None]:
horizon = 12
last_t = df['t'].iloc[-1]
future_dates = pd.date_range(start=df['date'].iloc[-1] + pd.DateOffset(months=1), periods=horizon, freq='MS')

future = pd.DataFrame({'date': future_dates})
future['month'] = future['date'].dt.month
future['t'] = np.arange(last_t + 1, last_t + 1 + len(future))
future['month_sin'] = np.sin(2 * np.pi * (future['month'] - 1) / 12)
future['month_cos'] = np.cos(2 * np.pi * (future['month'] - 1) / 12)

X_future = future[FEATURES].values
if scaler is not None:
    X_future = scaler.transform(X_future)

if model is not None:
    future_preds = model.predict(X_future)
else:
    # Demo forecasting: simple linear trend
    print("No model found. Using simple linear trend for forecast.")
    n_trend = min(12, len(df))
    x_trend = df['t'].values[-n_trend:]
    y_trend = df['price'].values[-n_trend:]
    try:
        coeff = np.polyfit(x_trend, y_trend, 1)
        slope = coeff[0]
    except Exception:
        slope = 0.0
    last_price = df['price'].iloc[-1]
    future_preds = [last_price + slope * (i + 1) for i in range(len(future_dates))]

future['forecast'] = future_preds

print("Future forecast created:")
future[['date', 'forecast']]
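
The demo fallback can be followed by hand on a tiny example. This sketch uses hypothetical values for the last six observations and applies the same idea: fit a straight line with `np.polyfit`, keep only the slope, and step forward from the last observed price.

```python
import numpy as np

# Hypothetical last six observations (t index and price), for illustration only
t_recent = np.array([114, 115, 116, 117, 118, 119])
price_recent = np.array([1950.0, 1960.0, 1970.0, 1980.0, 1990.0, 2000.0])

# Degree-1 polyfit returns [slope, intercept]; only the slope is used
slope, _intercept = np.polyfit(t_recent, price_recent, 1)

# Anchor at the last observed price and step forward by the slope
horizon = 3
forecast = price_recent[-1] + slope * np.arange(1, horizon + 1)
print(forecast)
```

Anchoring at the last observed price rather than the fitted line keeps the forecast continuous with the history, at the cost of ignoring seasonality entirely.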

## 7. Evaluate and Visualize Results

### Calculate Metrics
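We calculate Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) over the last 12 months of historical data (or all rows, if fewer than 12 exist) to get a sense of the model's accuracy. In demo mode, where `predicted` is a copy of `price`, both metrics are trivially zero and say nothing about forecast quality.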

In [None]:
n_eval = min(12, len(df))
y_true = df['price'].values[-n_eval:]
y_pred = df['predicted'].values[-n_eval:]

mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

print(f"Evaluation on last {n_eval} months:")
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
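
MAE and RMSE differ in how they treat large errors: RMSE squares each error before averaging, so a single large miss raises it more than it raises MAE. A minimal sketch on made-up numbers (the helper `mae_rmse` is a hypothetical name for this illustration):

```python
import numpy as np

def mae_rmse(y_true, y_pred):
    """Return (MAE, RMSE) for two equal-length sequences."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(err)), np.sqrt(np.mean(err ** 2))

# Made-up values: two small errors and one large one
y_true = [100.0, 100.0, 100.0]
y_pred = [101.0, 99.0, 110.0]

mae, rmse = mae_rmse(y_true, y_pred)
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

Here the errors are 1, 1, and 10, giving an MAE of 4 and an RMSE of about 5.83; a wide gap between the two often points to a few large misses.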

### Plot the Forecast
Finally, we visualize the historical actual prices, the model's historical predictions, and the future forecast in a single plot.

In [None]:
plt.figure(figsize=(15, 7))

# Plot historical actuals
plt.plot(df['date'], df['price'], label='Historical Actual Price', color='black', linewidth=2)

# Plot historical predictions (if model was available)
if model is not None:
    plt.plot(df['date'], df['predicted'], label='Historical Predicted Price', color='green', linestyle='--')

# Plot forecast
plt.plot(future['date'], future['forecast'], label='Future Forecast', color='red', marker='o', linestyle='--')

plt.title('Price Forecast', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price (₹)', fontsize=12)
plt.legend()
plt.grid(True)
plt.show()