# 📊 Pipeline Architecture

This document outlines, step by step, the pipeline used in the cryptocurrency liquidity prediction project.

---

## 1. Data Ingestion
- Source: `dataset.csv`
- Method: CSV file uploaded to Google Colab
- Libraries Used: `pandas`
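
A minimal sketch of the loading step. The helper name `load_dataset` and the sample rows are hypothetical; the column names are the ones the later steps refer to. An in-memory buffer stands in for `dataset.csv` so the snippet runs anywhere.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read the raw CSV into a DataFrame; `source` is a path or a file-like object."""
    return pd.read_csv(source)


# Stand-in for dataset.csv with made-up rows (assumption).
# In Colab, pass the uploaded file's path instead: load_dataset("dataset.csv")
sample_csv = io.StringIO(
    "date,open,high,low,close,volume\n"
    "2022-01-01,100.0,110.0,95.0,105.0,1000\n"
    "2022-01-02,105.0,108.0,101.0,102.0,1200\n"
)

df = load_dataset(sample_csv)
print(df.head())
```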

## 2. Data Cleaning & Preprocessing
- Dropped missing values using `dropna()`
- Converted `date` column to datetime format
- Sorted data by date
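
The three cleaning steps could look roughly like this. The function name `clean`, the index reset, and the sample frame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows, parse dates, and sort chronologically."""
    out = df.dropna().copy()
    out["date"] = pd.to_datetime(out["date"])
    # Resetting the index is an assumption; it keeps row labels contiguous.
    return out.sort_values("date").reset_index(drop=True)


# Made-up, unsorted rows with one missing value.
raw = pd.DataFrame(
    {
        "date": ["2022-01-03", "2022-01-01", "2022-01-02"],
        "close": [103.0, np.nan, 102.0],
    }
)

cleaned = clean(raw)
print(cleaned)
```

Sorting before any rolling computation matters, because rolling windows operate on row order rather than on the date values.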

## 3. Feature Engineering
- Created `volatility` as `(high - low) / open`
- Added rolling features:
  - `rolling_mean_close`: 7-day rolling mean of `close`
  - `rolling_volatility`: 14-day rolling standard deviation of `volatility`
- Added `liquidity_ratio = volume / close`
- Dropped rows left with NaN by the rolling windows
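
The feature definitions above can be sketched as follows. The function name `add_features` and the synthetic price data are assumptions. The windows count rows, so "7-day" and "14-day" hold only if the data has one row per day.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add volatility, rolling, and liquidity features, then drop warm-up rows."""
    out = df.copy()
    out["volatility"] = (out["high"] - out["low"]) / out["open"]
    out["rolling_mean_close"] = out["close"].rolling(window=7).mean()
    out["rolling_volatility"] = out["volatility"].rolling(window=14).std()
    out["liquidity_ratio"] = out["volume"] / out["close"]
    # The first 13 rows have no 14-row window yet, so they contain NaN.
    return out.dropna().reset_index(drop=True)


# Synthetic daily prices (assumption), 30 rows.
rng = np.random.default_rng(0)
n = 30
close = 100 + rng.normal(0, 1, n).cumsum()
demo = pd.DataFrame(
    {
        "open": close + rng.normal(0, 0.5, n),
        "high": close + 2.0,
        "low": close - 2.0,
        "close": close,
        "volume": rng.integers(1_000, 5_000, n).astype(float),
    }
)

features = add_features(demo)
print(features.shape)
```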

## 4. Feature Selection
- Selected features: `open`, `high`, `low`, `close`, `volume`, `rolling_mean_close`, `rolling_volatility`, `liquidity_ratio`
- Target: `volatility`
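
One way to express the selection is a pair of constants plus a small helper. The names `FEATURES`, `TARGET`, and `split_xy`, and the single sample row, are hypothetical.

```python
import pandas as pd

FEATURES = [
    "open",
    "high",
    "low",
    "close",
    "volume",
    "rolling_mean_close",
    "rolling_volatility",
    "liquidity_ratio",
]
TARGET = "volatility"


def split_xy(df: pd.DataFrame):
    """Return the feature matrix X and target vector y."""
    return df[FEATURES], df[TARGET]


# One made-up engineered row, including a column that is not selected.
demo = pd.DataFrame(
    [
        {
            "date": "2022-01-15",
            "open": 100.0,
            "high": 104.0,
            "low": 98.0,
            "close": 102.0,
            "volume": 5100.0,
            "volatility": 0.06,
            "rolling_mean_close": 101.0,
            "rolling_volatility": 0.01,
            "liquidity_ratio": 50.0,
        }
    ]
)

X, y = split_xy(demo)
print(X.columns.tolist(), y.name)
```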

## 5. Data Splitting
- Split into training and testing sets using `train_test_split`
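
A sketch of the split on placeholder data. The project's split ratio and seed are not stated here, so `test_size=0.2` and `random_state=42` are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder features and target (assumption), 100 rows.
X = pd.DataFrame(
    {
        "close": np.arange(100, dtype=float),
        "volume": np.arange(100, dtype=float) * 10,
    }
)
y = pd.Series(np.linspace(0.01, 0.05, 100), name="volatility")

# test_size and random_state are assumed values, not taken from the project.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))
```

By default `train_test_split` shuffles rows; passing `shuffle=False` keeps chronological order, which is an option for time-ordered data.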

## 6. Feature Scaling
- Scaled features using `StandardScaler`
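
A minimal scaling sketch on random placeholder data. Fitting the scaler on the training set only and reusing it for the test set is an assumption about how the step was applied; it is the usual way to avoid leaking test statistics into training.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder train/test matrices with columns on different scales (assumption).
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[100.0, 5000.0, 0.05], scale=[10.0, 800.0, 0.01], size=(80, 3))
X_test = rng.normal(loc=[100.0, 5000.0, 0.05], scale=[10.0, 800.0, 0.01], size=(20, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training data
X_test_scaled = scaler.transform(X_test)        # apply the same statistics to test data

print(X_train_scaled.mean(axis=0).round(6), X_train_scaled.std(axis=0).round(6))
```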

## 7. Model Training
- Algorithm: `RandomForestRegressor`
- Trained on scaled features
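
The training step might look like the sketch below. The data is synthetic, and the hyperparameters (`n_estimators=50`, `random_state=42`) are assumptions, since the project's settings are not listed here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the eight selected features and the target (assumption).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 8))
y_train = np.abs(X_train[:, 0]) * 0.05 + rng.normal(0, 0.001, 200)
X_test = rng.normal(size=(50, 8))

scaler = StandardScaler().fit(X_train)

# Hyperparameters below are assumed values.
model = RandomForestRegressor(n_estimators=50, random_state=42)
model.fit(scaler.transform(X_train), y_train)

predictions = model.predict(scaler.transform(X_test))
print(predictions[:5])
```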

## 8. Model Evaluation
- Metrics: RMSE, MAE, R² Score
- Printed evaluation metrics on test data
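
The three metrics can be computed as in this sketch. The helper name `evaluate` and the four sample values are hypothetical. RMSE is taken as the square root of the mean squared error.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def evaluate(y_true, y_pred):
    """Return RMSE, MAE, and R² for a set of predictions."""
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    mae = float(mean_absolute_error(y_true, y_pred))
    r2 = float(r2_score(y_true, y_pred))
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up targets and predictions.
y_true = np.array([0.10, 0.20, 0.30, 0.40])
y_pred = np.array([0.10, 0.25, 0.25, 0.40])

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RMSE and MAE are in the same units as the target, while R² is unitless, so the first two are easier to read against typical `volatility` values.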

---

## Tools and Libraries
- Python
- Pandas, NumPy, Scikit-learn
- Matplotlib, Seaborn
