# 📤 03 - Load Final Dataset

This notebook represents the **load stage** of the pipeline.

Here we prepare the final, analytics-ready dataset
that will be consumed by the Streamlit dashboard.

### Input
- Latest file in `../data/processed/`

### Output
- `../data/final/crypto_final.json`

In [None]:
import json
import pandas as pd
from pathlib import Path

## 📥 Load Latest Processed JSON File

In [None]:
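PROCESSED_DIR = Path("../data/processed")

# Sorting is by filename, so the last entry is the latest file only if the
# names sort chronologically (for example, a timestamp in the filename).
processed_files = sorted(PROCESSED_DIR.glob("*.json"))
if not processed_files:
    raise FileNotFoundError(f"No processed JSON files found in {PROCESSED_DIR}")
processed_path = processed_files[-1]

df = pd.read_json(processed_path)
processed_path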
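
Selecting the latest file by name works only when the filenames sort chronologically. If that is not guaranteed, one alternative is to pick the file with the newest modification time. The helper below is a minimal sketch of that idea; `latest_json` is a hypothetical name, not part of the pipeline.

```python
from pathlib import Path


def latest_json(directory: Path) -> Path:
    """Return the most recently modified *.json file in `directory`.

    Hypothetical helper: assumes modification time reflects recency.
    """
    candidates = list(Path(directory).glob("*.json"))
    if not candidates:
        raise FileNotFoundError(f"No JSON files found in {directory}")
    # Compare files by their last-modified timestamp, not by name.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

It could be called as `latest_json(PROCESSED_DIR)` in place of the sorted-name lookup. Note that modification times can change when files are copied or checked out, so a timestamp embedded in the filename is often the more reliable signal.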

## 🧮 Final Preparation

This stage is intentionally lightweight.
Additional features or aggregations can be added later if needed.

In [None]:
df = df.sort_values("timestamp").reset_index(drop=True)
df.head()
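
Because this stage does so little, a few cheap sanity checks before saving can catch problems early. The function below is only a sketch: `check_final_frame` is a hypothetical name, and the specific checks (non-empty frame, no missing timestamps, ascending order) are assumptions about what the dashboard needs.

```python
import pandas as pd


def check_final_frame(df: pd.DataFrame, time_col: str = "timestamp") -> None:
    """Raise if the frame is not ready to be saved (illustrative checks only)."""
    if time_col not in df.columns:
        raise KeyError(f"Missing required column: {time_col}")
    if df.empty:
        raise ValueError("Final dataset is empty")
    if df[time_col].isna().any():
        raise ValueError(f"Column {time_col} contains missing values")
    # After sort_values(time_col), the column should be non-decreasing.
    if not df[time_col].is_monotonic_increasing:
        raise ValueError(f"Column {time_col} is not sorted in ascending order")
```

Calling `check_final_frame(df)` right after the sort would fail fast instead of writing a broken file.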

## 💾 Save Final Dataset

This file is the **single source** used by the Streamlit dashboard.

In [None]:
FINAL_DIR = Path("../data/final")
FINAL_DIR.mkdir(parents=True, exist_ok=True)

final_path = FINAL_DIR / "crypto_final.json"

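# pd.read_json may parse the "timestamp" column into datetime values, which
# json.dump cannot serialize; DataFrame.to_json writes them as ISO 8601 strings.
df.to_json(final_path, orient="records", indent=4, date_format="iso")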

final_path
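
To confirm that the saved file is usable, it can be read back the way a consumer might read it. The snippet below is a sketch under assumptions: `load_final` is a hypothetical helper, and the actual dashboard may load the file differently.

```python
import json
from pathlib import Path

import pandas as pd


def load_final(path: Path) -> pd.DataFrame:
    """Read a records-oriented JSON file (a list of row objects) into a DataFrame."""
    with open(path) as f:
        records = json.load(f)
    # Each list element becomes one row; keys become column names.
    return pd.DataFrame.from_records(records)
```

A quick check such as `len(load_final(final_path)) == len(df)` verifies that no rows were lost on the way to disk.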

# 🎉 Pipeline Completed Successfully

You now have a fully functioning **JSON-based ETL pipeline**:

1. Extract → raw JSON  
2. Transform → processed JSON  
3. Load → final JSON  

You may now launch the dashboard:

streamlit run dashboard/dashboard.py