<a href="https://colab.research.google.com/github/noobhacker02/CBT-CIP/blob/main/Project_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# ⏳ Task 3: Time Series Forecasting App - CipherByte Internship

As part of my internship at **CipherByte Technologies**, I built a time series forecasting web application using **SARIMAX** models. The app lets users upload two datasets and produces a 12-month forecast for each of:
- 🚗 **Miles Traveled**
- 🍾 **Alcohol Sales**

## 🔧 Features
- 📂 Upload CSV files:
  - `Miles_Traveled.csv`
  - `Alcohol_Sales.csv`
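- 🧠 Uses SARIMAX (Seasonal ARIMA with eXogenous regressors) for forecasting. No exogenous variables are supplied, so each model is a seasonal ARIMA fitted to a single series.
- 📉 Forecasts a 12-month horizon for both metrics; the final 12 months of each series are held out so the forecast can be checked against actual values.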
- 📊 Displays actual vs. predicted plots.
- 🧮 Provides model evaluation using:
  - **RMSE** (Root Mean Squared Error)
  - **MAE** (Mean Absolute Error)
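
Both metrics follow directly from their definitions. The snippet below is a minimal sketch using only the standard library; the helper names `rmse` and `mae` and the sample numbers are illustrative assumptions, not part of the app, which relies on `scikit-learn` for these values.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)

# Made-up values, purely for illustration
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 112.0, 118.0, 134.0]

print(f"RMSE: {rmse(actual, predicted):.2f} | MAE: {mae(actual, predicted):.2f}")
# prints: RMSE: 2.65 | MAE: 2.50
```

RMSE squares each residual before averaging, so it penalizes large misses more heavily than MAE and is never smaller than MAE on the same data.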

## 💡 How It Works
1. Reads the uploaded time-series CSVs and parses the `DATE` column as dates.
2. Indexes each series by date and sets a month-start frequency (`MS`).
3. Splits into train and test sets (last 12 months as test).
4. Trains separate SARIMAX models on each metric.
5. Plots and compares predicted vs. actual values.
6. Outputs performance metrics and a combined forecast plot.
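
Steps 1 to 3 can be sketched on their own as follows. This is a rough illustration under stated assumptions: the in-memory CSV, the `Value` column name, and the 36-month date range are hypothetical stand-ins for an uploaded file.

```python
import io
import pandas as pd

# Hypothetical stand-in for an uploaded CSV: a DATE column plus one value column
dates = pd.date_range("2015-01-01", periods=36, freq="MS")
csv_text = "DATE,Value\n" + "\n".join(
    f"{d:%Y-%m-%d},{100 + i}" for i, d in enumerate(dates)
)

# Step 1: read the CSV and parse DATE as datetimes
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["DATE"])

# Step 2: index by date and declare a month-start frequency
df = df.set_index("DATE").asfreq("MS")

# Step 3: hold out the last 12 months as the test set
test_size = 12
train, test = df.iloc[:-test_size], df.iloc[-test_size:]

print(len(train), len(test))
# prints: 24 12
print(test.index.min().date(), test.index.max().date())
# prints: 2017-01-01 2017-12-01
```

Note that `asfreq("MS")` inserts `NaN` rows for any month missing from the file, so a quick `df.isna().sum()` check before modelling is a reasonable precaution.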

## 📦 Tech Stack
- `Python`
- `Pandas` – for data manipulation
- `Matplotlib` & `Seaborn` – for plotting
- `scikit-learn` – for error metrics
- `statsmodels` – SARIMAX model
- `Gradio` – for building the interactive web app

## 📷 Sample Output
- **Forecast Plot**:
  - 📘 Blue Line: Actual Miles Traveled
  - 🟠 Orange Line: Actual Alcohol Sales
  - Dashed Lines: Predicted values
- **Evaluation**:
  ```
  Miles Traveled
  RMSE: 123.45 | MAE: 110.23

  Alcohol Sales
  RMSE: 98.76 | MAE: 87.65
  ```

## 🧪 Use Case
Useful for trend forecasting, retail planning, logistics, and alcohol distribution insights.

## 👨‍💻 Developed By
**Talha Shaikh**  
🔗 [LinkedIn](https://www.linkedin.com/in/talha-s-145729339/)  
📌 Project for **#CipherByteTech** Internship

---

> “Forecasting the future—one line of code at a time.”

```python
!pip install gradio pandas matplotlib seaborn scikit-learn statsmodels --quiet

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mean_squared_error, mean_absolute_error
from statsmodels.tsa.statespace.sarimax import SARIMAX
from math import sqrt
import gradio as gr

# Set plot aesthetics
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

# Evaluation function
def evaluate_model(y_true, y_pred):
    rmse = sqrt(mean_squared_error(y_true, y_pred))
    mae = mean_absolute_error(y_true, y_pred)
    return rmse, mae

# Forecasting function (manual SARIMAX config)
def forecast_time_series(miles_file, alcohol_file):
    miles_df = pd.read_csv(miles_file.name, parse_dates=['DATE'])
    alcohol_df = pd.read_csv(alcohol_file.name, parse_dates=['DATE'])

    miles_df.columns = ['Date', 'Miles']
    alcohol_df.columns = ['Date', 'Sales']

    miles_df.set_index('Date', inplace=True)
    alcohol_df.set_index('Date', inplace=True)

    miles_df = miles_df.asfreq('MS')
    alcohol_df = alcohol_df.asfreq('MS')

    def split_data(df, test_size=12):
        return df.iloc[:-test_size], df.iloc[-test_size:]

    miles_train, miles_test = split_data(miles_df)
    alcohol_train, alcohol_test = split_data(alcohol_df)

    # Basic SARIMAX model — these values can be tuned
    miles_model = SARIMAX(miles_train, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
    miles_results = miles_model.fit(disp=False)
    miles_forecast = miles_results.get_forecast(steps=12).predicted_mean

    alcohol_model = SARIMAX(alcohol_train, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
    alcohol_results = alcohol_model.fit(disp=False)
    alcohol_forecast = alcohol_results.get_forecast(steps=12).predicted_mean

    miles_rmse, miles_mae = evaluate_model(miles_test['Miles'], miles_forecast)
    alcohol_rmse, alcohol_mae = evaluate_model(alcohol_test['Sales'], alcohol_forecast)

    # Plotting
    fig, ax = plt.subplots(2, 1, figsize=(14, 10))

    ax[0].plot(miles_df, label="Actual")
    ax[0].plot(miles_forecast.index, miles_forecast, label="Predicted", linestyle='--')
    ax[0].set_title("Miles Traveled - Forecast")
    ax[0].legend()

    ax[1].plot(alcohol_df, label="Actual", color='orange')
    ax[1].plot(alcohol_forecast.index, alcohol_forecast, label="Predicted", linestyle='--', color='red')
    ax[1].set_title("Alcohol Sales - Forecast")
    ax[1].legend()

    plt.tight_layout()
    plot_path = "forecast_plot.png"
    plt.savefig(plot_path)
    plt.close()

    result_text = (
        f"**Miles Traveled**\nRMSE: {miles_rmse:.2f} | MAE: {miles_mae:.2f}\n\n"
        f"**Alcohol Sales**\nRMSE: {alcohol_rmse:.2f} | MAE: {alcohol_mae:.2f}"
    )

    return result_text, plot_path

# Gradio Interface
interface = gr.Interface(
    fn=forecast_time_series,
    inputs=[
        gr.File(label="Upload Miles_Traveled.csv"),
        gr.File(label="Upload Alcohol_Sales.csv")
    ],
    outputs=[
        gr.Markdown(label="Evaluation Results"),
        gr.Image(type="filepath", label="Forecast Plot")
    ],
    title="Time Series Forecasting App - CipherByte Internship",
    description="Upload time-series CSVs to forecast Miles Traveled and Alcohol Sales for the next 12 months using SARIMAX. Developed by Talha Shaikh | [LinkedIn](https://www.linkedin.com/in/talha-s-145729339/) | #cipherbytetech"

)

interface.launch()
```


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in cola

