# ETL Pipeline: Download Historical Open Interest Data

## 📊 Overview
This notebook implements an ETL (Extract, Transform, Load) pipeline that downloads historical open interest (OI) data from cryptocurrency exchanges. It requests trading pairs in batches, pauses between batches to stay within rate limits, and caches the results locally for later use.

## 🎯 Objectives
1. **Data Extraction**: Download historical open interest data from exchange APIs
2. **Batch Processing**: Handle multiple trading pairs and timeframes efficiently
3. **Rate Limit Management**: Respect exchange API limits to avoid throttling
4. **Data Caching**: Store downloaded data locally for future use
5. **Visualization**: Preview downloaded data with interactive charts

## 📋 Prerequisites
- Exchange API access (e.g., Binance Perpetual)
- Network connection for API calls
- Sufficient disk space for data storage (~50MB per pair/interval)

## ⚠️ Important Notes on Rate Limiting
Open interest API endpoints typically have stricter rate limits than other endpoints, so configure requests conservatively:
- **Batch Size**: Keep `BATCH_OI_REQUEST` very low (2-5) to avoid hitting limits
- **Sleep Time**: Use a longer `SLEEP_REQUEST` (5+ seconds) for safety
- **Monitor**: Watch for HTTP 429 (Too Many Requests) errors, and lower the batch size or raise the sleep time if they appear
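
The monitoring advice above can be sketched as a retry wrapper with exponential backoff. This is a minimal illustration, not part of `CLOBDataSource`: `RateLimitError`, `fetch_with_backoff`, and the `fetch` callable are hypothetical names standing in for whatever the connector raises and calls when the exchange answers with a 429.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


async def fetch_with_backoff(fetch, *, max_retries=5, base_delay=5.0, sleep=asyncio.sleep):
    """Call `fetch()` and retry with exponential backoff when it is rate limited."""
    for attempt in range(max_retries + 1):
        try:
            return await fetch()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Give up after the final retry
            # Wait base_delay, then 2x, 4x, ... plus up to 1s of random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            await sleep(delay)
```

The `sleep` parameter exists only so the wait can be replaced in tests. The jitter keeps concurrent requests from retrying at the same moment.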

## 📈 Expected Outputs
- Cached OI data in `app/data/cache/oi/`
- Hourly interval data (1h is recommended for OI)
- Interactive charts showing OI trends for data validation

In [1]:
import warnings

from core.data_sources.clob import CLOBDataSource

warnings.filterwarnings("ignore")

# Main class to access central limit order book connectors
clob = CLOBDataSource()

# Open Interest config
CONNECTOR_NAME = "binance_perpetual"
INTERVAL = "1h"  # 1h is recommended for OI data

DAYS = 7  # Number of days of historical data to download

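# Rate limits config (more conservative for OI)
# Note: the guidance above suggests 2-5 pairs per batch; this run uses 20.
# Lower BATCH_OI_REQUEST or raise SLEEP_REQUEST if you see 429 errors.
BATCH_OI_REQUEST = 20  # Number of trading pairs to request in each batch
SLEEP_REQUEST = 5  # Seconds to wait between batches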



In [2]:
# Get trading rules and pairs
trading_rules = await clob.get_trading_rules(CONNECTOR_NAME)
trading_pairs = trading_rules.filter_by_quote_asset("USDT").get_all_trading_pairs()

print(f"Found {len(trading_pairs)} trading pairs")
print(f"First 10 pairs: {trading_pairs[:10]}")

# Since all Binance perpetual pairs support OI data, download all pairs
# You can limit this for faster testing by changing trading_pairs to trading_pairs[:50]
selected_pairs = trading_pairs  # Use all pairs - change to [:50] for testing
print(f"Downloading OI data for {len(selected_pairs)} pairs")

# Download OI data for all pairs
print(f"\nDownloading {INTERVAL} OI data for last {DAYS} days...")
print(f"Batch size: {BATCH_OI_REQUEST}, Sleep time: {SLEEP_REQUEST}s")

all_oi_data = await clob.get_oi_batch_last_days(
    CONNECTOR_NAME, 
    selected_pairs,
    INTERVAL, 
    DAYS, 
    BATCH_OI_REQUEST,
    SLEEP_REQUEST,
)

print(f"\nCompleted downloading OI data for {len(all_oi_data)} trading pairs")

Found 508 trading pairs
First 10 pairs: ['FET-USDT', 'YFI-USDT', '1000XEC-USDT', 'OP-USDT', 'DEGO-USDT', 'PLUME-USDT', 'BTCDOM-USDT', 'AWE-USDT', 'OL-USDT', 'SXP-USDT']
Downloading OI data for 508 pairs

Downloading 1h OI data for last 7 days...
Batch size: 20, Sleep time: 5s
OI Batch 1/26
Start: 0, End: 20
OI Batch 2/26
Start: 20, End: 40
OI Batch 3/26
Start: 40, End: 60
OI Batch 4/26
Start: 60, End: 80
OI Batch 5/26
Start: 80, End: 100
OI Batch 6/26
Start: 100, End: 120
OI Batch 7/26
Start: 120, End: 140
OI Batch 8/26
Start: 140, End: 160
OI Batch 9/26
Start: 160, End: 180
OI Batch 10/26
Start: 180, End: 200
OI Batch 11/26
Start: 200, End: 220
OI Batch 12/26
Start: 220, End: 240
OI Batch 13/26
Start: 240, End: 260
OI Batch 14/26
Start: 260, End: 280
OI Batch 15/26
Start: 280, End: 300
OI Batch 16/26
Start: 300, End: 320
OI Batch 17/26
Start: 320, End: 340
OI Batch 18/26
Start: 340, End: 360
OI Batch 19/26
Start: 360, End: 380
OI Batch 20/26
Start: 380, End: 400
OI Batch 21/26
Start: 
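
The batch log above follows from simple slicing: 508 pairs in batches of 20 give 26 batches, the last one covering indices 500 to 508. The sketch below reproduces that arithmetic and shows one way such a loop could be written. It is not the implementation of `get_oi_batch_last_days`; `batch_bounds`, `run_in_batches`, and `fetch_one` are hypothetical names, and both the concurrency within a batch and the sleep placement (between batches only) are assumptions.

```python
import asyncio


def batch_bounds(n_items, batch_size):
    """Return (start, end) index pairs covering n_items in chunks of batch_size."""
    return [
        (start, min(start + batch_size, n_items))
        for start in range(0, n_items, batch_size)
    ]


async def run_in_batches(items, batch_size, sleep_seconds, fetch_one, sleep=asyncio.sleep):
    """Fetch items batch by batch, pausing between batches (not after the last)."""
    bounds = batch_bounds(len(items), batch_size)
    results = []
    for i, (start, end) in enumerate(bounds, start=1):
        print(f"Batch {i}/{len(bounds)}: items {start} to {end}")
        # Assumption: requests inside one batch run concurrently
        batch = await asyncio.gather(*(fetch_one(item) for item in items[start:end]))
        results.extend(batch)
        if i < len(bounds):
            await sleep(sleep_seconds)
    return results


bounds = batch_bounds(508, 20)
print(len(bounds), bounds[-1])  # 26 (500, 508)
print((len(bounds) - 1) * 5)    # 125 seconds spent sleeping, under the assumption above
```

Under these assumptions, the sleep time alone adds about two minutes to a full run. Dropping the batch size to 5 would raise the batch count to 102 and the total sleep to 505 seconds.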

In [3]:
all_oi_data[0]

| timestamp | sum_open_interest | sum_open_interest_value | cmc_circulating_supply | trading_pair |
|---|---|---|---|---|
| 2025-09-25 15:00:00 | 74139552 | 43028777.33758944 | 2372625449.18820953 | FET-USDT |
| 2025-09-25 16:00:00 | 74443260 | 42978376.4283516 | 2372625449.18820953 | FET-USDT |
| 2025-09-25 17:00:00 | 75027797 | 42615788.696 | 2372625449.18820953 | FET-USDT |
| 2025-09-25 18:00:00 | 76032496 | 42276984.38254656 | 2372625449.18820953 | FET-USDT |
| 2025-09-25 19:00:00 | 76361983 | 43559353.81316818 | 2372625449.18820953 | FET-USDT |
| ... | ... | ... | ... | ... |
| 2025-10-02 10:00:00 | 70469899 | 41809791.0767 | 2373167815.11867237 | FET-USDT |
| 2025-10-02 11:00:00 | 70664643 | 41711023.78919532 | 2373167815.11867237 | FET-USDT |
| 2025-10-02 12:00:00 | 70904796 | 41771885.79211848 | 2373167815.11867237 | FET-USDT |
| 2025-10-02 13:00:00 | 70948397 | 41845364.5506 | 2373167815.11867237 | FET-USDT |
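
Since the preview is indexed by hourly timestamps, one quick validation step is to look for missing hours. The sketch below uses only the standard library. `find_missing_hours` is a hypothetical helper, and the sample reuses the first timestamps of the preview with 17:00 removed to simulate a gap.

```python
from datetime import datetime, timedelta


def find_missing_hours(timestamps):
    """Return the hourly timestamps absent between the first and last entries."""
    seen = set(timestamps)
    missing = []
    current = min(seen)
    last = max(seen)
    while current <= last:
        if current not in seen:
            missing.append(current)
        current += timedelta(hours=1)
    return missing


# First rows of the FET-USDT preview, with 17:00 removed to simulate a gap
sample = [
    datetime(2025, 9, 25, 15),
    datetime(2025, 9, 25, 16),
    datetime(2025, 9, 25, 18),
    datetime(2025, 9, 25, 19),
]
print(find_missing_hours(sample))  # [datetime.datetime(2025, 9, 25, 17, 0)]
```

If the downloaded frames carry a datetime index as the preview suggests, passing `list(df.index)` should work because pandas `Timestamp` subclasses `datetime`. That has not been checked against the actual return type of `get_oi_batch_last_days`.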
