# Market Forecast Workflow

This notebook documents the pipeline and contains runnable cells to preprocess data, train models, generate forecasts, and visualize results.

Make sure you have installed dependencies (see `requirements.txt`) and that you're in the repo root.

In [None]:
# Setup: add repo root to path and imports
import os, sys
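# Use the current directory if it contains src/; otherwise assume the notebook sits one level below the repo root
repo_root = os.getcwd() if os.path.isdir('src') else os.path.abspath('..')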
sys.path.insert(0, repo_root)
print('Repo root:', repo_root)
print('Python version:', sys.version)


## 1) Inspect raw files

Drop your Excel/CSV files into `data/raw/` before running the next cell.

In [None]:
import glob
raw_files = glob.glob('data/raw/*')
raw_files[:20]
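
Before processing, it can help to see what kinds of files are in `data/raw/`. The following is a minimal sketch using only the standard library; `summarize_raw_files` is a hypothetical helper and is not part of the repo.

```python
from collections import Counter
from pathlib import Path


def summarize_raw_files(raw_dir="data/raw"):
    """Count the files in raw_dir by lowercase extension (hypothetical helper)."""
    raw_path = Path(raw_dir)
    if not raw_path.is_dir():
        # Nothing to count if the directory has not been created yet.
        return Counter()
    return Counter(p.suffix.lower() for p in raw_path.iterdir() if p.is_file())


counts = summarize_raw_files()
for ext, n in sorted(counts.items()):
    print(f"{ext or '(no extension)'}: {n}")
```

An unexpected extension here is a hint to check whether the processing script will pick that file up.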


## 2) Run data processing
This will read spreadsheets and write cleaned CSVs to `data/processed/`.

In [None]:
!python src/data_processing.py --input_dir data/raw --output_dir data/processed
print('Processed files:')
!ls -l data/processed || true


## 3) Train models
This will train Prophet and LightGBM models per processed series and save them under `models/`.

In [None]:
!python src/train_model.py --data_dir data/processed --models_dir models
print('Models saved to models/')
!ls -l models || true


## 4) Generate forecasts
Produces CSV forecasts and PNGs under `reports/`.

In [None]:
!python src/forecast.py --models_dir models --horizon 12 --out_dir reports
!ls -l reports || true
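
Since this step is described as producing both CSV forecasts and PNGs, a quick check that both kinds of file exist can catch a silent failure. This is a sketch only; `missing_outputs` is a hypothetical helper, and the glob patterns are assumptions based on the description above rather than on the script itself.

```python
from pathlib import Path


def missing_outputs(out_dir="reports", patterns=("*.csv", "*.png")):
    """Return the glob patterns that match no file in out_dir (hypothetical helper)."""
    out_path = Path(out_dir)
    if not out_path.is_dir():
        # A missing directory means every expected output is missing.
        return list(patterns)
    return [pattern for pattern in patterns if not any(out_path.glob(pattern))]


missing = missing_outputs()
if missing:
    print("No files matching:", ", ".join(missing))
else:
    print("Found CSV and PNG outputs in reports/")
```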


## 5) Quick visual: load a forecast CSV and plot


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import glob
# Sort so the same file is picked on every run (glob order is not guaranteed)
pf_files = sorted(glob.glob('reports/*__prophet_forecast.csv'))
if pf_files:
    df = pd.read_csv(pf_files[0])
    df['ds'] = pd.to_datetime(df['ds'])
    plt.figure(figsize=(10,4))
    plt.plot(df['ds'], df['yhat'], marker='o')
    plt.title('Sample Prophet Forecast')
    plt.grid(True)
    plt.show()
else:
    print('No forecast CSVs found in reports/')


## 6) Streamlit dashboard (run from terminal)
Open a terminal and run:
```
streamlit run src/streamlit_app.py
```
Then open `http://localhost:8501`.

## 7) How to rerun on new monthly data
1. Drop the new Excel file into `data/raw/`.
2. Re-run the data processing, training, and forecast cells (or `./run_all.sh`).
3. Check `reports/` and open the Streamlit dashboard.
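
The rerun steps can also be scripted from Python. The sketch below chains the three commands used in the cells above and stops at the first failure; it is an illustration, not a description of what `./run_all.sh` does, and the stage names and the `build_commands` and `run_pipeline` helpers are hypothetical.

```python
import subprocess
import sys

# Commands copied from the cells above; the stage names are illustrative only.
PIPELINE = [
    ("process", ["src/data_processing.py", "--input_dir", "data/raw",
                 "--output_dir", "data/processed"]),
    ("train", ["src/train_model.py", "--data_dir", "data/processed",
               "--models_dir", "models"]),
    ("forecast", ["src/forecast.py", "--models_dir", "models",
                  "--horizon", "12", "--out_dir", "reports"]),
]


def build_commands(python=sys.executable):
    """Prefix each script with the interpreter that is running this code."""
    return [(name, [python, *args]) for name, args in PIPELINE]


def run_pipeline(runner=subprocess.run):
    """Run the stages in order; check=True raises on a non-zero exit code."""
    for name, cmd in build_commands():
        print(f"[{name}] {' '.join(cmd)}")
        runner(cmd, check=True)
```

Call `run_pipeline()` from the repo root. The `runner` parameter exists so the sequence can be dry-run with a stand-in function instead of launching the scripts.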