# ML-Driven-Web-Platform-for-Cryptocurrency-Price-Forecasting - 5 Coins
## Coins: ADA, BNB, BTC, DOGE, ETH
This notebook fetches historical hourly OHLCV data from Binance, calculates technical indicators, and prepares the data for machine learning.

### 1. Imports and Data Fetching Function
This cell imports necessary libraries (`pandas`, `requests`, etc.) and defines the `get_binance_ohlcv` function.
- **`get_binance_ohlcv`**: Connects to the Binance API to download historical OHLCV (Open, High, Low, Close, Volume) data for a given symbol and time range. It paginates in batches of up to 1000 rows per request, moving `startTime` just past the last returned candle until the API returns no more rows.

In [5]:
import pandas as pd
import requests
import time
import os

# Create data directory if it doesn't exist
os.makedirs('data', exist_ok=True)

def get_binance_ohlcv(symbol, interval, start_time, end_time):
    url = "https://api.binance.com/api/v3/klines"
    all_data = []
    start = start_time

    while True:
        params = {
            "symbol": symbol,
            "interval": interval,
            "startTime": start,
            "endTime": end_time,
            "limit": 1000
        }
        try:
            # A timeout keeps a stalled connection from hanging the notebook
            response = requests.get(url, params=params, timeout=10)
            if response.status_code != 200:
                print(f"Error fetching data for {symbol}: {response.text}")
                break
            data = response.json()
        except Exception as e:
            print(f"Exception fetching data for {symbol}: {e}")
            break
            
        if not data:
            break
        all_data.extend(data)
        start = data[-1][0] + 1
        time.sleep(0.2)

    if not all_data:
        return pd.DataFrame()

    df = pd.DataFrame(all_data, columns=[
        "open_time","open","high","low","close","volume","close_time",
        "quote_asset_volume","trades","taker_buy_volume",
        "taker_buy_quote_volume","ignore"
    ])

    df["open_time"] = pd.to_datetime(df["open_time"], unit='ms')
    df["close_time"] = pd.to_datetime(df["close_time"], unit='ms')
    for col in ["open","high","low","close","volume"]:
        df[col] = df[col].astype(float)

    if "ignore" in df.columns:
        df = df.drop(columns=["ignore"])

    return df
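
The pagination cursor can be checked offline. The sketch below is not part of the notebook: `paginate` and `fake_klines` are hypothetical names, and `fake_klines` is a stand-in that assumes the endpoint returns candles whose open time is at or after `startTime`, capped at `limit` rows. It shows that advancing the cursor to the last open time plus 1 ms yields every candle exactly once.

```python
HOUR_MS = 3_600_000  # one hourly candle, in milliseconds


def paginate(fetch_page, start_time, end_time, limit=1000):
    """Collect rows page by page, moving the cursor past the last open_time."""
    rows = []
    start = start_time
    while start <= end_time:
        page = fetch_page(start, end_time, limit)
        if not page:
            break
        rows.extend(page)
        start = page[-1][0] + 1  # same cursor rule as get_binance_ohlcv
    return rows


def fake_klines(start, end, limit):
    """Hypothetical stand-in for the API: one row per hour, at most `limit` rows."""
    t = -(-start // HOUR_MS) * HOUR_MS  # round start up to the next hour boundary
    page = []
    while t <= end and len(page) < limit:
        page.append([t])  # only open_time is modelled here
        t += HOUR_MS
    return page


rows = paginate(fake_klines, 0, 2499 * HOUR_MS)
print(len(rows))  # 2500 hourly candles fetched in three pages
```

Because the cursor moves only 1 ms past the last candle, the next page starts at the following hour boundary, so no candle is skipped or repeated.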

### 2. Data Processing & Feature Engineering Function
This cell defines the `process_coin` function, which is the core logic of the notebook.
- **Fetches Data**: Calls `get_binance_ohlcv` to get hourly (`1h`) candles from 2020-01-01 to 2025-01-01.
- **Calculates Indicators**: Computes SMA and EMA (20 periods), RSI (14 periods, using simple rolling averages of gains and losses), MACD with its signal line, Bollinger Bands, and OBV.
- **Generates ML Features**: Creates lagged close and volume values, percentage returns, and rolling volatility, plus two targets: the next hour's close (`target_next_close`) and a binary up/down label (`target_up_down`).
- **Saves Data**: Drops rows with missing values or zero volume, then exports the processed DataFrame to a CSV file in the `data/` folder.
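
The lag and target steps listed above can be illustrated on a toy series. This is a minimal sketch with made-up prices (the `toy` frame is hypothetical, not Binance data). It shows which rows survive `dropna()` once a backward shift and a forward shift are both present.

```python
import pandas as pd

# Five made-up hourly closes
toy = pd.DataFrame({"close": [10.0, 11.0, 10.5, 12.0, 11.5]})

toy["close_lag_1"] = toy["close"].shift(1)         # previous hour's close
toy["return_1h"] = toy["close"].pct_change(1)      # relative change vs. previous hour
toy["target_next_close"] = toy["close"].shift(-1)  # next hour's close (the label)
toy["target_up_down"] = (toy["target_next_close"] > toy["close"]).astype(int)

# The first row has no lag and the last row has no next close, so both are dropped
ready = toy.dropna().reset_index(drop=True)
print(ready)
```

With a maximum lag of 24 and a one-step-ahead target, the same logic removes the first 24 rows and the last row of the real data.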

In [6]:
def process_coin(symbol_name, ticker):
    print(f"Processing {symbol_name} ({ticker})...")
    
    # Parameters: hourly candles; naive Timestamps are treated as UTC by .timestamp()
    interval = "1h"
    start = int(pd.Timestamp("2020-01-01").timestamp() * 1000)
    end   = int(pd.Timestamp("2025-01-01").timestamp() * 1000)
    
    # Fetch Data
    df = get_binance_ohlcv(ticker, interval, start, end)
    
    if df.empty:
        print(f"No data found for {ticker}")
        return
        
    print(f"Fetched {len(df)} rows for {ticker}")

    # Technical Indicators
    df['SMA_20'] = df['close'].rolling(20).mean()
    df['EMA_20'] = df['close'].ewm(span=20, adjust=False).mean()

    # RSI (14-period, simple rolling averages of gains and losses rather than Wilder's smoothing)
    delta = df['close'].diff()
    gain = delta.clip(lower=0)
    loss = -1 * delta.clip(upper=0)
    avg_gain = gain.rolling(14).mean()
    avg_loss = loss.rolling(14).mean()
    rs = avg_gain / avg_loss
    df['RSI_14'] = 100 - (100 / (1 + rs))

    # MACD
    ema12 = df['close'].ewm(span=12, adjust=False).mean()
    ema26 = df['close'].ewm(span=26, adjust=False).mean()
    df['MACD'] = ema12 - ema26
    df['MACD_signal'] = df['MACD'].ewm(span=9, adjust=False).mean()

    # Bollinger Bands (20-period mean +/- 2 rolling standard deviations)
    rolling_std_20 = df['close'].rolling(20).std()
    df['BBM'] = df['close'].rolling(20).mean()
    df['BBU'] = df['BBM'] + 2 * rolling_std_20
    df['BBL'] = df['BBM'] - 2 * rolling_std_20

    # OBV: cumulative volume, added on up-closes and subtracted on down-closes
    price_diff = df['close'].diff()
    df['OBV'] = ((price_diff > 0) * df['volume']).cumsum() - ((price_diff < 0) * df['volume']).cumsum()

    # ML Features (Lag Features, Returns, Volatility)
    # Lagged close
    for lag in [1, 3, 6, 12, 24]:
        df[f'close_lag_{lag}'] = df['close'].shift(lag)

    # Lagged volume
    for lag in [1, 3, 6, 12, 24]:
        df[f'volume_lag_{lag}'] = df['volume'].shift(lag)

    # Returns
    df['return_1h'] = df['close'].pct_change(1)
    df['return_3h'] = df['close'].pct_change(3)
    df['return_6h'] = df['close'].pct_change(6)

    # Rolling volatility
    for window in [3, 6, 12, 24]:
        df[f'vol_{window}h'] = df['close'].rolling(window).std()

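    # Targets: next hour's close, and a binary label (1 if it is above the current close).
    # The last row has no next close; it is removed by dropna() below.
    df['target_next_close'] = df['close'].shift(-1)
    df['target_up_down'] = (df['target_next_close'] > df['close']).astype(int)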

    df = df.dropna().reset_index(drop=True)
    
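    # Clean Data (remove rows with zero volume, quote volume, or trade count).
    # quote_asset_volume may still be a string here, so cast before comparing with 0.
    cols_to_check = ["volume", "quote_asset_volume", "trades"]
    mask = (df[cols_to_check].astype(float) == 0).any(axis=1)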
    if mask.sum() > 0:
        print(f"Removing {mask.sum()} rows with zeros in {cols_to_check}")
        df = df[~mask]

    # Save to CSV
    output_path = f"data/{symbol_name}_ML_ready.csv"
    df.to_csv(output_path, index=False)
    print(f"Saved processed data to {output_path}. Shape: {df.shape}")
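
`process_coin` averages gains and losses with a plain 14-period rolling mean. Many charting tools instead use Wilder's smoothing, so RSI values from this notebook may not match them exactly. The sketch below is for comparison only and is not the notebook's method. `rsi_sma` mirrors the calculation above, while `rsi_wilder` is a hypothetical helper that approximates Wilder's smoothing with `ewm(alpha=1/period, adjust=False)`. It is seeded with the first value rather than an initial simple average, so early values differ slightly from a textbook Wilder RSI.

```python
import pandas as pd


def rsi_sma(close, period=14):
    """RSI with simple rolling averages, as computed in process_coin."""
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    rs = gain.rolling(period).mean() / loss.rolling(period).mean()
    return 100 - 100 / (1 + rs)


def rsi_wilder(close, period=14):
    """RSI with exponentially smoothed averages (approximation of Wilder's smoothing)."""
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = -delta.clip(upper=0)
    avg_gain = gain.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = loss.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    return 100 - 100 / (1 + avg_gain / avg_loss)


# Deterministic zigzag series (made-up prices) with both gains and losses
close = pd.Series([100 + (i % 5) * 2 - (i % 3) for i in range(60)], dtype=float)
comparison = pd.DataFrame({"rsi_sma": rsi_sma(close), "rsi_wilder": rsi_wilder(close)})
print(comparison.dropna().tail())
```

Either variant gives a bounded 0 to 100 oscillator. What matters for modelling is that the same definition is used for training and for inference.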

### 3. Execution for Each Coin
The following cells call `process_coin` for each of the 5 selected cryptocurrencies. This triggers the data fetching and processing pipeline for each one.

In [7]:
# 1. ADA (Cardano)
process_coin("ADA", "ADAUSDT")

Processing ADA (ADAUSDT)...
Fetched 43817 rows for ADAUSDT
Saved processed data to data/ADA_ML_ready.csv. Shape: (43792, 40)


In [8]:
# 2. BNB (Binance Coin)
process_coin("BNB", "BNBUSDT")

Processing BNB (BNBUSDT)...
Fetched 43817 rows for BNBUSDT
Saved processed data to data/BNB_ML_ready.csv. Shape: (43792, 40)


In [9]:
# 3. BTC (Bitcoin)
process_coin("BTC", "BTCUSDT")

Processing BTC (BTCUSDT)...
Fetched 43817 rows for BTCUSDT
Saved processed data to data/BTC_ML_ready.csv. Shape: (43792, 40)


In [10]:
# 4. DOGE (Dogecoin)
process_coin("DOGE", "DOGEUSDT")

Processing DOGE (DOGEUSDT)...
Fetched 43817 rows for DOGEUSDT
Saved processed data to data/DOGE_ML_ready.csv. Shape: (43792, 40)


In [11]:
# 5. ETH (Ethereum)
process_coin("ETH", "ETHUSDT")

Processing ETH (ETHUSDT)...
Fetched 43817 rows for ETHUSDT
Saved processed data to data/ETH_ML_ready.csv. Shape: (43792, 40)
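
The five cells above repeat the same call with different arguments, so a loop over a name-to-ticker mapping would do the same work. The sketch below is an optional alternative, not part of the notebook. `COINS` and `run_all` are hypothetical names, and the processing function is passed in as an argument so the block runs on its own.

```python
# Display name -> Binance ticker, matching the five cells above
COINS = {
    "ADA": "ADAUSDT",
    "BNB": "BNBUSDT",
    "BTC": "BTCUSDT",
    "DOGE": "DOGEUSDT",
    "ETH": "ETHUSDT",
}


def run_all(process_fn, coins=COINS):
    """Call process_fn(symbol_name, ticker) per coin; return the names that raised."""
    failed = []
    for symbol_name, ticker in coins.items():
        try:
            process_fn(symbol_name, ticker)
        except Exception as exc:
            # Keep going so one failing coin does not stop the others
            print(f"{symbol_name} failed: {exc}")
            failed.append(symbol_name)
    return failed


# Dry run with a stand-in function that only records its arguments
calls = []
failed = run_all(lambda name, ticker: calls.append((name, ticker)))
print(calls, failed)
```

In the notebook itself, the call would be `run_all(process_coin)`.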
