# Bitcoin Options Analysis - Clean & Simple

Extract USD interest rates, BTC funding rates, and forward prices from Bitcoin options using put-call parity regression.

**Analysis Steps:**
- Load Deribit orderbook data
- Apply put-call parity regression across time
- Visualize rate spreads and trends
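
For reference, the regression rests on put-call parity with a continuous carry rate q: P - C = K·e^(-rτ) - S·e^(-qτ). Regressing the synthetic price P - C on the strike K gives a slope of e^(-rτ) and an intercept of -S·e^(-qτ), from which r, q, and F = S·e^((r-q)τ) follow. The sign convention is inferred from the initial guesses built in the fitting cell. The sketch below is a minimal illustration on made-up, noise-free quotes; it uses plain least squares and a hypothetical `fit_parity` helper, not the project's `WLSRegressor`, whose weighting is not shown in this notebook.

```python
import math

def fit_parity(strikes, put_minus_call, spot, tau):
    """Ordinary least squares of (P - C) on K; returns (r, q, F)."""
    n = len(strikes)
    mean_k = sum(strikes) / n
    mean_y = sum(put_minus_call) / n
    sxx = sum((k - mean_k) ** 2 for k in strikes)
    sxy = sum((k - mean_k) * (y - mean_y) for k, y in zip(strikes, put_minus_call))
    slope = sxy / sxx                     # estimates exp(-r * tau)
    intercept = mean_y - slope * mean_k   # estimates -S * exp(-q * tau)
    r = -math.log(slope) / tau
    q = -math.log(-intercept / spot) / tau
    forward = -intercept / slope          # equals S * exp((r - q) * tau)
    return r, q, forward

# Hypothetical inputs, chosen only for illustration
spot, tau, r_true, q_true = 60000.0, 30 / 365.25, 0.20, 0.02
strikes = [55000.0, 57500.0, 60000.0, 62500.0, 65000.0]
y = [k * math.exp(-r_true * tau) - spot * math.exp(-q_true * tau) for k in strikes]

r_hat, q_hat, f_hat = fit_parity(strikes, y, spot, tau)
print(f"r={r_hat:.4f}, q={q_hat:.4f}, F={f_hat:.2f}")
```

With real quotes the points do not lie exactly on a line, which is why the notebook weights the regression and then refines it under constraints.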

In [1]:
# Import libraries and setup
# Load essential libraries for data processing and analysis

import polars as pl
import numpy as np
import plotly.graph_objects as go
from datetime import datetime, timedelta
from IPython.display import display
import sys
import os

# Add project path
current_dir = os.getcwd()
project_root = os.path.dirname(current_dir) if current_dir.endswith('notebooks') else current_dir
if project_root not in sys.path:
    sys.path.append(project_root)

# Import project modules
from btc_options.data_managers.orderbook_deribit_md_manager import OrderbookDeribitMDManager
from btc_options.analytics.weight_least_square_regressor import WLSRegressor
from btc_options.analytics.nonlinear_minimization import NonlinearMinimization

pl.Config.set_tbl_rows(10)
print("✅ Setup complete!")

✅ Setup complete!


In [2]:
# Load data and initialize components
# Configure analysis date and load Deribit data

date_str = "20240301"
data_file = f'../data_orderbook/{date_str}.output.csv.gz'

print(f"📂 Loading: {data_file}")

try:
    # Load data
    df_raw = pl.scan_csv(data_file)
    test_load = df_raw.head(1).collect()
    print(f"✅ Loaded data with {test_load.shape[1]} columns")
    
    # Initialize pipeline
    symbol_manager = OrderbookDeribitMDManager(df_raw, date_str, level=0, normalize_volume=True)
    wls_regressor = WLSRegressor()
    nonlinear_minimizer = NonlinearMinimization()
    
    # Set parameters
    symbol_manager.price_widening_factor = 0.00025
    symbol_manager.future_spread_mult = 0.0005
    nonlinear_minimizer.future_spread_mult = 0.0020
    
    print(f"✅ Initialized with {len(symbol_manager.df_symbol)} symbols")
    print(f"📊 Expiries: {symbol_manager.opt_expiries}")
    
except Exception as e:
    print(f"❌ Error: {e}")
    print("💡 Ensure data files are in ../data_orderbook/ directory")

📂 Loading: ../data_orderbook/20240301.output.csv.gz
✅ Loaded data with 25 columns
✅ Initialized with 1196 symbols
📊 Expiries: ['1MAR24', '2MAR24', '3MAR24', '4MAR24', '8MAR24', '15MAR24', '22MAR24', '29MAR24', '26APR24', '31MAY24', '28JUN24', '27SEP24', '27DEC24']


In [3]:
# Process market data
# Convert raw ticks to regular time intervals

print("🔄 Processing market data...")
df_conflated_md = symbol_manager.get_conflated_md(freq="1m", period="5m")
print(f"✅ Processed: {df_conflated_md.shape}")

🔄 Processing market data...
🔄 Converting order book depth data to BBO format (using level 0)...
✅ Converted 6594554 orderbook rows to 6468865 BBO rows
🔄 Normalizing volume from USD to BTC for futures and perpetuals...
✅ Normalized volumes for 9/9 futures/perpetual symbols
✅ Processed: (1269753, 23)
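
`get_conflated_md(freq="1m", period="5m")` turns irregular ticks into a regular grid. Its exact semantics are not shown in this notebook. One plausible reading is "sample every minute, using the latest quote that is less than five minutes old". The sketch below illustrates that reading only; `conflate_last` and the tick values are hypothetical.

```python
from datetime import datetime, timedelta

def conflate_last(ticks, start, end,
                  freq=timedelta(minutes=1), period=timedelta(minutes=5)):
    """For each grid time t, return the latest tick in (t - period, t], else None."""
    ticks = sorted(ticks)  # (timestamp, price) tuples, oldest first
    out, i, last = [], 0, None
    t = start
    while t <= end:
        # Advance to the most recent tick at or before the grid time
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i]
            i += 1
        fresh = last is not None and t - last[0] < period
        out.append((t, last[1] if fresh else None))
        t += freq
    return out

# Made-up ticks for illustration
ticks = [
    (datetime(2024, 3, 1, 0, 0, 20), 61300.0),
    (datetime(2024, 3, 1, 0, 0, 50), 61310.0),
    (datetime(2024, 3, 1, 0, 2, 10), 61295.0),
]
grid = conflate_last(ticks, datetime(2024, 3, 1, 0, 1), datetime(2024, 3, 1, 0, 8))
for t, price in grid:
    print(t.time(), price)
```

Dropping stale quotes keeps a put and a call that last traded at very different times from being paired in the same parity regression.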


In [4]:
# Run time series regression analysis
# Extract rates and forward prices across multiple timestamps

print("🚀 RUNNING TIME SERIES ANALYSIS")
print("=" * 40)

# Configure analysis
time_interval_minutes = 5
minimum_strikes = 5
wls_regressor.set_printable(False)
nonlinear_minimizer.set_printable(False)

# Storage for results
results = []
successful_fits = {}

# Get time ranges (min/max do not depend on the row order within each group)
time_ranges = df_conflated_md.group_by('expiry').agg([
    pl.col('timestamp').min().alias('start_time'),
    pl.col('timestamp').max().alias('end_time')
]).to_dicts()
start_time_map = {item['expiry']: item for item in time_ranges}

# Process each expiry
for expiry in symbol_manager.opt_expiries:
    successful_fits[expiry] = {'successful': 0, 'total': 0}
    
    # Generate timestamps
    start_time = start_time_map[expiry]['start_time'] + timedelta(minutes=time_interval_minutes)
    if not symbol_manager.is_expiry_today(expiry):
        end_time = start_time.replace(hour=23, minute=59)
    else:
        end_time = start_time.replace(hour=7, minute=0)
    print(f"🔄 Processing: {expiry}; {start_time} to {end_time}")
    
    timestamps = pl.datetime_range(
        start=start_time, 
        end=end_time, 
        interval=f"{time_interval_minutes}m", 
        eager=True
    ).to_list()
    
    initial_guess = None
    
    for ts in timestamps:
        successful_fits[expiry]['total'] += 1
        
        try:
            # Create synthetic option data
            df_chain, df_synthetic = symbol_manager.create_option_synthetic(
                df_conflated_md, expiry=expiry, timestamp=ts
            )
            
            if not df_synthetic.is_empty() and len(df_synthetic) >= minimum_strikes:
                # Get initial guess from WLS if needed
                if initial_guess is None:
                    wls_result = wls_regressor.fit(df_synthetic)
                    initial_guess = (wls_result['r'], wls_result['q'])
                
                # Run constrained optimization
                tau = df_synthetic['tau'][0]
                s0 = df_synthetic['S'][0]
                const_guess = -s0 * np.exp(-initial_guess[1] * tau)
                coef_guess = np.exp(-initial_guess[0] * tau)
                
                result = nonlinear_minimizer.fit(df_synthetic, const_guess, coef_guess)
                
                # Store results
                results.append({
                    'expiry': expiry,
                    'timestamp': ts,
                    'r': result['r'],
                    'q': result['q'], 
                    'F': result['F'],
                    'S': df_synthetic['S'][0],
                    'tau': tau,
                    'r2': result['r2'],
                    'sse': result['sse'],
                    'data_points': len(df_synthetic)
                })
                
                initial_guess = (result['r'], result['q'])
                successful_fits[expiry]['successful'] += 1
                
        except Exception as e:
            print(f"   ❌ Error processing {expiry} at {ts}: {e}")
            continue
    
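    # Guard against an empty timestamp grid, which would otherwise divide by zero
    total_fits = successful_fits[expiry]['total']
    success_rate = (successful_fits[expiry]['successful'] / total_fits * 100) if total_fits else 0.0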
    print(f"   ✅ {successful_fits[expiry]['successful']} fits ({success_rate:.1f}% success)")

# Create results DataFrame
df_results = pl.DataFrame(results).with_columns(
    (pl.col('r') - pl.col('q')).alias('rate_spread')
)

print(f"\n📊 Total fits: {len(df_results)}")
print(f"📈 Avg rate spread: {df_results['rate_spread'].mean()*100:.3f}%")
print(f"📈 Avg R²: {df_results['r2'].mean():.4f}")

🚀 RUNNING TIME SERIES ANALYSIS
🔄 Processing: 1MAR24; 2024-03-01 00:10:00 to 2024-03-01 07:00:00
   ✅ 74 fits (89.2% success)
🔄 Processing: 2MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ✅ 265 fits (92.7% success)
🔄 Processing: 3MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ✅ 264 fits (92.3% success)
🔄 Processing: 4MAR24; 2024-03-01 08:11:00 to 2024-03-01 23:59:00


  warn("omni_normtest is not valid with less than 8 observations; %i "


   ✅ 155 fits (81.6% success)
🔄 Processing: 8MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ❌ Error processing 8MAR24 at 2024-03-01 03:25:00: Optimization failed: Inequality constraints incompatible
   ❌ Error processing 8MAR24 at 2024-03-01 06:20:00: Optimization failed: Positive directional derivative for linesearch


  {'type': 'ineq', 'fun': lambda p: -np.log(p[1]) - self.r_min * tau},
  {'type': 'ineq', 'fun': lambda p: np.log(p[1]) + self.r_max * tau},
  {'type': 'ineq', 'fun': lambda p: -np.log(p[1]) + np.log(-p[0] / spot) - self.minimum_rate * tau},
  {'type': 'ineq', 'fun': lambda p: np.log(p[1]) - np.log(-p[0] / spot) + self.maximum_rate * tau},


   ❌ Error processing 8MAR24 at 2024-03-01 12:00:00: Optimization failed: Inequality constraints incompatible
   ❌ Error processing 8MAR24 at 2024-03-01 22:55:00: Optimization failed: Inequality constraints incompatible
   ✅ 260 fits (90.9% success)
🔄 Processing: 15MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ✅ 264 fits (92.3% success)
🔄 Processing: 22MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ✅ 265 fits (92.7% success)
🔄 Processing: 29MAR24; 2024-03-01 00:10:00 to 2024-03-01 23:59:00
   ❌ Error processing 29MAR24 at 2024-03-01 06:15:00: Optimization failed: Positive directiona
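
The two failure messages in this log, "Inequality constraints incompatible" and "Positive directional derivative for linesearch", are exit messages of SciPy's SLSQP solver. That suggests, but does not confirm, that `NonlinearMinimization` wraps `scipy.optimize.minimize` with `method="SLSQP"`. The four lambdas echoed in the warning bound r between `r_min` and `r_max`, and r - q between `minimum_rate` and `maximum_rate`, written in terms of the fitted coefficients p[0] = -S·e^(-qτ) and p[1] = e^(-rτ). The sketch below re-expresses those four constraints and shows one possible way to keep a warm start strictly inside them. The bound values and both function names are hypothetical, not taken from the project.

```python
import math

def constraint_values(const, coef, spot, tau, r_bounds, spread_bounds):
    """The four inequality expressions from the warning; each must be >= 0."""
    r_tau = -math.log(coef)                       # r * tau
    spread_tau = r_tau + math.log(-const / spot)  # (r - q) * tau
    return [
        r_tau - r_bounds[0] * tau,
        r_bounds[1] * tau - r_tau,
        spread_tau - spread_bounds[0] * tau,
        spread_bounds[1] * tau - spread_tau,
    ]

def feasible_start(r, q, spot, tau, r_bounds, spread_bounds, margin=1e-4):
    """Clip (r, q) into the bounds, then map to (const, coef)."""
    r = min(max(r, r_bounds[0] + margin), r_bounds[1] - margin)
    spread = min(max(r - q, spread_bounds[0] + margin), spread_bounds[1] - margin)
    q = r - spread
    return -spot * math.exp(-q * tau), math.exp(-r * tau)

# Hypothetical bounds and inputs, for illustration only
spot, tau = 61900.0, 7 / 365.25
r_bounds, spread_bounds = (0.0, 0.40), (-0.05, 0.30)

raw = (-spot * math.exp(-0.10 * tau), math.exp(-0.45 * tau))  # r = 45% breaks r_max
clipped = feasible_start(0.45, 0.10, spot, tau, r_bounds, spread_bounds)

raw_values = constraint_values(*raw, spot, tau, r_bounds, spread_bounds)
clipped_values = constraint_values(*clipped, spot, tau, r_bounds, spread_bounds)
print(min(raw_values) >= 0, min(clipped_values) >= 0)
```

Because the loop reuses the previous fit as the next initial guess, a fit that ends on a bound hands the solver a starting point on the edge of the feasible region. Clipping with a small margin is one way to avoid that; whether it reduces the failures seen here is untested.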

In [5]:
# Display sample results
# Show key metrics from regression analysis

print("📋 SAMPLE RESULTS (First 10):")
display_results = df_results.head(10).with_columns([
    (pl.col('r') * 100).round(3).alias('USD_Rate_%'),
    (pl.col('q') * 100).round(3).alias('BTC_Rate_%'),
    (pl.col('rate_spread') * 100).round(3).alias('Spread_%'),
    pl.col('F').round(2).alias('Forward_$'),
    pl.col('S').round(2).alias('Spot_$'),
    pl.col('r2').round(4).alias('R_Squared')
]).select([
    'expiry', 'timestamp', 'USD_Rate_%', 'BTC_Rate_%', 'Spread_%', 
    'Forward_$', 'Spot_$', 'R_Squared', 'data_points'
])

display(display_results)

📋 SAMPLE RESULTS (First 10):


| expiry | timestamp | USD_Rate_% | BTC_Rate_% | Spread_% | Forward_$ | Spot_$ | R_Squared | data_points |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| str | datetime[μs] | f64 | f64 | f64 | f64 | f64 | f64 | i64 |
| 1MAR24 | 2024-03-01 00:10:00 | 9.495 | 9.995 | -0.5 | 61336.45 | 61336.72 | 1.0 | 20 |
| 1MAR24 | 2024-03-01 00:15:00 | 39.996 | 9.996 | 30.0 | 61273.7 | 61257.44 | 0.9999 | 18 |
| 1MAR24 | 2024-03-01 00:20:00 | 25.119 | 9.995 | 15.124 | 61506.21 | 61498.07 | 1.0 | 20 |
| 1MAR24 | 2024-03-01 00:25:00 | 32.173 | 9.995 | 22.178 | 61463.37 | 61451.57 | 1.0 | 17 |
| 1MAR24 | 2024-03-01 00:30:00 | 16.786 | 10.0 | 6.786 | 61743.4 | 61739.81 | 0.9999 | 16 |
| 1MAR24 | 2024-03-01 00:35:00 | 27.647 | 10.0 | 17.647 | 61593.72 | 61584.52 | 0.9999 | 16 |
| 1MAR24 | 2024-03-01 00:40:00 | 19.574 | 10.0 | 9.574 | 61672.87 | 61667.93 | 0.9999 | 15 |
| 1MAR24 | 2024-03-01 00:45:00 | 39.999 | 9.999 | 30.0 | 61629.68 | 61614.38 | 0.9996 | 11 |
| 1MAR24 | 2024-03-01 00:50:00 | 28.914 | 9.999 | 18.915 | 61529.86 | 61520.34 | 0.9997 | 12 |
| 1MAR24 | 2024-03-01 00:55:00 | 40.0 | 10.0 | 30.0 | 61615.08 | 61600.14 | 0.9997 | 12 |

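In this sample, the BTC rate sits at about 10% in every row, and the USD rate reaches about 40% three times, each with a spread of exactly 30%. That pattern may indicate fits stopping on a constraint bound rather than at an interior optimum, in which case the individual r and q values are less informative than the spread and the forward. The sketch below counts such rows; the caps of 40% and 10% are assumptions inferred from the table, not read from the project's configuration.

```python
# (time, USD_Rate_%, BTC_Rate_%) copied from the sample results table
rows = [
    ("00:10", 9.495, 9.995), ("00:15", 39.996, 9.996), ("00:20", 25.119, 9.995),
    ("00:25", 32.173, 9.995), ("00:30", 16.786, 10.0), ("00:35", 27.647, 10.0),
    ("00:40", 19.574, 10.0), ("00:45", 39.999, 9.999), ("00:50", 28.914, 9.999),
    ("00:55", 40.0, 10.0),
]

# Assumed caps in percent, inferred from the table above
R_CAP, Q_CAP, TOL = 40.0, 10.0, 0.01

def pinned(value, cap, tol=TOL):
    """True when a fitted rate is within tol percentage points of the cap."""
    return abs(value - cap) <= tol

r_pinned = [t for t, r, _ in rows if pinned(r, R_CAP)]
q_pinned = [t for t, _, q in rows if pinned(q, Q_CAP)]

print(f"r at cap: {len(r_pinned)}/{len(rows)} rows {r_pinned}")
print(f"q at cap: {len(q_pinned)}/{len(rows)} rows")
```

The same check can be run on the full `df_results` frame to see how many fits per expiry are affected.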

In [6]:
# Multi-panel visualization
# Plot rate spread, USD rate, BTC rate, and basis across time for each expiry

if not df_results.is_empty() and len(df_results) > 1:
    from plotly.subplots import make_subplots
    
    print("📊 MULTI-PANEL VISUALIZATION")
    
    # Create 4-panel subplot
    fig = make_subplots(
        rows=4, cols=1,
        subplot_titles=['Rate Spread (r-q) %', 'USD Rate (r) %', 'BTC Rate (q) %', 'Basis (F/S-1)'],
        vertical_spacing=0.08,
        shared_xaxes=True
    )
    
    # Use a palette with more entries than expiries so that colors do not repeat
    from plotly.colors import qualitative
    colors = qualitative.Dark24
    
    # Plot data for each expiry
    for i, expiry in enumerate(symbol_manager.get_sorted_expiries()):
        expiry_data = df_results.filter(pl.col('expiry') == expiry).sort('timestamp')
        
        if not expiry_data.is_empty():
            color = colors[i % len(colors)]
            times = expiry_data['timestamp'].to_list()
            
            # Calculate basis (F/S - 1)
            basis_data = ((expiry_data['F'] / expiry_data['S']) - 1).to_list()
            
            # Panel data
            panels_data = [
                (expiry_data['rate_spread'] * 100).to_list(),  # Rate spread
                (expiry_data['r'] * 100).to_list(),            # USD rate
                (expiry_data['q'] * 100).to_list(),            # BTC rate  
                basis_data                                     # Basis (F/S-1)
            ]
            
            # Add traces for each panel
            for panel, data in enumerate(panels_data, 1):
                fig.add_trace(go.Scatter(
                    x=times,
                    y=data,
                    mode='lines+markers',
                    name=expiry,
                    line=dict(color=color, width=2),
                    marker=dict(size=4),
                    showlegend=(panel == 1),  # Show legend only for first panel
                    legendgroup=expiry        # Group traces by expiry
                ), row=panel, col=1)
    
    # Update layout
    fig.update_layout(
        title="Bitcoin Options - Multi-Panel Time Series Analysis",
        height=800,
        template='plotly_white',
        legend=dict(orientation="v", y=0.7, x=1.1, xanchor="center", font=dict(size=10))
    )
    
    # Update y-axis titles
    fig.update_yaxes(title_text="Rate Spread %", row=1, col=1)
    fig.update_yaxes(title_text="USD Rate %", row=2, col=1)
    fig.update_yaxes(title_text="BTC Rate %", row=3, col=1)
    fig.update_yaxes(title_text="Basis", row=4, col=1)
    fig.update_xaxes(title_text="Time", row=4, col=1)
    
    fig.show()
    print("✅ Visualization complete!")
    
else:
    print("📊 Insufficient data for visualization")

📊 MULTI-PANEL VISUALIZATION


✅ Visualization complete!


In [12]:
# Term structure of basis across expiries
# Plot basis vs time-to-expiry for a specific datetime

if not df_results.is_empty():
    # Input datetime for term structure analysis
    analysis_datetime = datetime(2024, 3, 1, 12, 0, 0)  # Example: March 1, 2024 at 12:00
    
    print(f"📊 TERM STRUCTURE ANALYSIS at {analysis_datetime}")
    
    # Find data closest to the specified datetime
    tolerance_minutes = 1  # Look within ±1 minute of the target time
    target_time = analysis_datetime
    
    term_structure_data = []
    
    for expiry in symbol_manager.opt_expiries:
        # Get data for this expiry around the target time
        expiry_data = df_results.filter(
            (pl.col('expiry') == expiry) & 
            (pl.col('timestamp') >= target_time - timedelta(minutes=tolerance_minutes)) &
            (pl.col('timestamp') <= target_time + timedelta(minutes=tolerance_minutes))
        ).sort((pl.col('timestamp') - target_time).dt.total_seconds().abs())
        
        if not expiry_data.is_empty():
            # Rows are sorted by distance from the target, so the first row is the closest
            closest_data = expiry_data.head(1)
            
            # Calculate basis and time to expiry
            basis = (closest_data['F'][0] / closest_data['S'][0]) - 1
            tau = closest_data['tau'][0]  # Time to expiry in years
            actual_timestamp = closest_data['timestamp'][0]
            
            term_structure_data.append({
                'expiry': expiry,
                'time_to_expiry_days': tau * 365.25,  # Convert to days
                'basis': basis,
                'basis_pct': basis * 100,
                'forward_price': closest_data['F'][0],
                'spot_price': closest_data['S'][0],
                'timestamp': actual_timestamp,
                'r': closest_data['r'][0],
                'q': closest_data['q'][0]
            })
    
    if term_structure_data:
        # Create DataFrame and sort by time to expiry
        df_term = pl.DataFrame(term_structure_data).sort('time_to_expiry_days')
        
        print(f"Found {len(df_term)} expiries within {tolerance_minutes} minutes of target time")
        
        # Create term structure plot
        fig = go.Figure()
        
        fig.add_trace(go.Scatter(
            x=df_term['time_to_expiry_days'].to_list(),
            y=df_term['basis_pct'].to_list(),
            mode='lines+markers',
            name='Basis',
            line=dict(color='#1f77b4', width=3),
            marker=dict(size=8, color='#1f77b4'),
            text=df_term['expiry'].to_list(),
            textposition='top center',
            hovertemplate=(
                '<b>%{text}</b><br>' +
                'Days to Expiry: %{x:.1f}<br>' +
                'Basis: %{y:.3f}%<br>' +
                'Forward: $%{customdata[0]:.2f}<br>' +
                'Spot: $%{customdata[1]:.2f}<br>' +
                '<extra></extra>'
            ),
            customdata=list(zip(
                df_term['forward_price'].to_list(),
                df_term['spot_price'].to_list()
            ))
        ))
        
        # Add horizontal line at zero
        fig.add_hline(y=0, line_dash="dash", line_color="gray", opacity=0.7)
        
        fig.update_layout(
            title=f"Bitcoin Options - Basis Term Structure at {analysis_datetime.strftime('%Y-%m-%d %H:%M')}",
            xaxis_title="Time to Expiry (Days)",
            yaxis_title="Basis (%)",
            template='plotly_white',
            height=500,
            showlegend=False
        )
        
        # Add annotations for each point
        for i, row in enumerate(df_term.iter_rows(named=True)):
            fig.add_annotation(
                x=row['time_to_expiry_days'],
                y=row['basis_pct'],
                text=row['expiry'],
                showarrow=True,
                arrowhead=2,
                arrowsize=1,
                arrowwidth=1,
                arrowcolor='gray',
                ax=0,
                ay=-30,
                font=dict(size=10)
            )
        
        fig.show()
    else:
        print(f"❌ No data found within {tolerance_minutes} minutes of {analysis_datetime}")
        print("💡 Try adjusting the analysis_datetime or tolerance_minutes")
        
else:
    print("❌ No results available for term structure analysis")

📊 TERM STRUCTURE ANALYSIS at 2024-03-01 12:00:00
Found 8 expiries within 1 minutes of target time


In [13]:

# Display term structure table
print(f"📋 TERM STRUCTURE DATA as of {target_time}:")
display_term = df_term.with_columns([
    pl.col('time_to_expiry_days').round(1).alias('Days_to_Expiry'),
    pl.col('basis_pct').round(3).alias('Basis_%'),
    pl.col('forward_price').round(2).alias('Forward_$'),
    pl.col('spot_price').round(2).alias('Spot_$'),
    (pl.col('r') * 100).round(3).alias('USD_Rate_%'),
    (pl.col('q') * 100).round(3).alias('BTC_Rate_%')
]).select([
    'expiry', 'Days_to_Expiry', 'Basis_%', 'Forward_$', 'Spot_$', 
    'USD_Rate_%', 'BTC_Rate_%', 'timestamp'
])

display(display_term)

# Calculate some term structure metrics
if len(df_term) > 1:
    avg_basis = df_term['basis_pct'].mean()
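    # Endpoint slope: basis change between the shortest and the longest expiry
    basis_span = df_term['basis_pct'][-1] - df_term['basis_pct'][0]
    days_span = df_term['time_to_expiry_days'][-1] - df_term['time_to_expiry_days'][0]
    basis_slope = basis_span / days_span
    
    print("\n📊 TERM STRUCTURE METRICS:")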
    print(f"   Average Basis: {avg_basis:.3f}%")
    print(f"   Basis Slope: {basis_slope:.6f}% per day")
    print(f"   Contango/Backwardation: {'Contango' if avg_basis > 0 else 'Backwardation'}")

print("✅ Term structure analysis complete!")


📋 TERM STRUCTURE DATA as of 2024-03-01 12:00:00:


| expiry | Days_to_Expiry | Basis_% | Forward_$ | Spot_$ | USD_Rate_% | BTC_Rate_% | timestamp |
| --- | --- | --- | --- | --- | --- | --- | --- |
| str | f64 | f64 | f64 | f64 | f64 | f64 | datetime[μs] |
| 2MAR24 | 0.8 | 0.05 | 61934.32 | 61903.31 | 31.938 | 10.0 | 2024-03-01 12:00:00 |
| 3MAR24 | 1.8 | 0.15 | 61996.12 | 61903.31 | 39.827 | 10.0 | 2024-03-01 12:00:00 |
| 4MAR24 | 2.8 | 0.194 | 62056.74 | 61936.63 | 33.107 | 8.144 | 2024-03-01 12:01:00 |
| 15MAR24 | 13.8 | 0.834 | 62419.66 | 61903.31 | 24.369 | 2.451 | 2024-03-01 12:00:00 |
| 22MAR24 | 20.8 | 1.214 | 62655.08 | 61903.31 | 22.501 | 1.353 | 2024-03-01 12:00:00 |
| 29MAR24 | 27.9 | 1.485 | 62822.4 | 61903.31 | 19.485 | 0.158 | 2024-03-01 12:00:00 |
| 31MAY24 | 90.9 | 4.286 | 64556.52 | 61903.31 | 17.13 | 0.266 | 2024-03-01 12:00:00 |
| 28JUN24 | 118.9 | 5.242 | 65148.59 | 61903.31 | 15.715 | 0.021 | 2024-03-01 12:00:00 |



📊 TERM STRUCTURE METRICS:
   Average Basis: 1.682%
   Basis Slope: 0.043973% per day
   Contango/Backwardation: Contango
✅ Term structure analysis complete!
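
The raw basis grows with time to expiry almost by construction, so a slope in percent per day says little about how carry differs across expiries. Annualizing the basis as ln(F/S)/τ puts every expiry on the same scale, and since F = S·e^((r-q)τ) it should roughly reproduce the fitted r - q. For 28JUN24 it comes out close to 15.715% - 0.021% ≈ 15.69%. The sketch below recomputes both quantities from the rounded table values; `annualized_basis_pct` is a hypothetical helper, and the 365.25-day year matches the conversion used in the term structure cell.

```python
import math

# (expiry, days to expiry, basis in percent), copied from the term structure table
term = [
    ("2MAR24", 0.8, 0.05), ("3MAR24", 1.8, 0.15), ("4MAR24", 2.8, 0.194),
    ("15MAR24", 13.8, 0.834), ("22MAR24", 20.8, 1.214), ("29MAR24", 27.9, 1.485),
    ("31MAY24", 90.9, 4.286), ("28JUN24", 118.9, 5.242),
]

def annualized_basis_pct(basis_pct, days, days_per_year=365.25):
    """Continuously compounded implied carry ln(F/S) / tau, in percent."""
    tau = days / days_per_year
    return math.log(1 + basis_pct / 100) / tau * 100

# Endpoint slope, as in the metrics cell, recomputed from the rounded values
slope = (term[-1][2] - term[0][2]) / (term[-1][1] - term[0][1])
print(f"slope: {slope:.6f}% per day")

for expiry, days, basis_pct in term:
    print(f"{expiry:>8}: {annualized_basis_pct(basis_pct, days):6.2f}% annualized")
```

The shortest expiries are the most sensitive to rounding in the table, so their annualized values are only indicative.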


## Analysis Complete!

**✅ Successfully extracted:**
- **USD interest rates (r)** and **BTC funding rates (q)** implied by put-call parity
- **Rate spread (r-q)**, which sets the forward basis through F = S·e^((r-q)τ)
- **Forward prices (F)**, fitted under bounds on r and on r-q

**📊 Pipeline:**
1. Loaded and processed Deribit orderbook data
2. Applied constrained put-call parity regression
3. Generated time series of financial rates

**🎯 Ready for trading analysis and risk management applications**