# 0.1 Options Data & Liquidity Screening

**Objective:** Validate options data access and assess universe viability for the earnings volatility strategy.

This notebook tests:
1. Available options data sources (FMP, Yahoo, CBOE)
2. Earnings calendar from FMP
3. Options chain fetching from CBOE
4. Liquidity gate application (spread ≤15%, OI ≥50)
5. Universe viability assessment

## Conclusions (TL;DR)

### Data Sources
| Source | Status | Notes |
|--------|--------|-------|
| FMP Options | ❌ Not available | 403/404 on all endpoints |
| Yahoo Finance | ❌ Requires auth | 401 errors |
| **CBOE Delayed Quotes** | ✅ Works | Free, 15-20min delay, has bid/ask/OI/IV/Greeks |

### Universe Viability
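- **Pass rate:** 4 of 50 sampled earnings stocks (8%) pass the strict liquidity gates; 2 more pass the marginal gates only
- **During peak earnings:** 200-300 earnings/week → roughly 16-24 tradeable candidates per week at the sampled 8% rate
- **Verdict:** The extrapolation gives ~23 strict-gate candidates over the 14-day window. That sits inside the plan's 10-50 range if the range is read as a per-period count, but it is only ~1.7 per day, which the per-day check in section 7 flags as too small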

### Key Findings
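1. Large-cap names have better liquidity (PNC, MS, and JBHT pass the strict gates; GS and BLK have tight spreads but fail on OI at the selected expiry, and TSM is marginal only)
2. CBOE doesn't cover many smaller names (23 of the 50 sampled symbols, 46%, returned no chain)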
3. Weekly options availability is good for passing names
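4. Spread % is the primary filter: 9 of the 14 symbols rejected on spread or OI fail the spread check (the screen tests spread first, so names that fail both are counted as spread failures)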

In [4]:
import requests
import pandas as pd
from datetime import datetime, timedelta
import os
from dotenv import load_dotenv
import time

load_dotenv()
FMP_KEY = os.getenv('FMP_API_KEY')
# Confirm the key loaded without echoing it into the notebook output
assert FMP_KEY, "FMP_API_KEY is not set; add it to .env"

## 1. Test FMP Options Endpoints

FMP does not expose options data to this API key: every endpoint tried below returns 403 or 404.

In [5]:
# Test FMP options endpoints - all fail
test_symbol = 'AAPL'

endpoints = [
    f'https://financialmodelingprep.com/stable/options-chain?symbol={test_symbol}&apikey={FMP_KEY}',
    f'https://financialmodelingprep.com/api/v3/stock_option_chain?symbol={test_symbol}&apikey={FMP_KEY}',
    f'https://financialmodelingprep.com/api/v4/options/chain?symbol={test_symbol}&apikey={FMP_KEY}',
]

print("Testing FMP options endpoints:")
for url in endpoints:
    r = requests.get(url, timeout=10)  # timeout so a stalled endpoint cannot hang the cell
    endpoint = url.split('?')[0].split('financialmodelingprep.com')[1]
    print(f"  {endpoint}: {r.status_code}")

Testing FMP options endpoints:
  /stable/options-chain: 404
  /api/v3/stock_option_chain: 403
  /api/v4/options/chain: 403


## 2. Test CBOE Delayed Quotes API

CBOE provides free delayed quotes with all fields we need:
- `bid`, `ask` → spread calculation
- `open_interest` → OI filter
- `iv` → implied volatility
- `delta`, `gamma`, `vega`, `theta` → Greeks
- `volume` → trading activity
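
To make the first two bullets concrete, here is a minimal sketch of how `bid`, `ask`, and `open_interest` turn into the quantities the liquidity gate looks at. The helper name `quote_metrics` is hypothetical and the example quote is illustrative; only the field names come from the CBOE payload. Unlike the notebook's own helper further down, which substitutes a 100% sentinel, this sketch returns `None` when there is no two-sided quote.

```python
def quote_metrics(opt: dict) -> dict:
    """Return mid price, spread (% of mid), and OI for one CBOE option record."""
    bid = opt.get("bid") or 0.0
    ask = opt.get("ask") or 0.0
    oi = opt.get("open_interest") or 0.0
    if bid <= 0 or ask <= 0 or ask < bid:
        # No usable two-sided quote: report no mid and no spread.
        return {"mid": None, "spread_pct": None, "oi": oi}
    mid = (bid + ask) / 2
    return {"mid": mid, "spread_pct": (ask - bid) / mid * 100, "oi": oi}


# Illustrative record using the field names listed above
example = {"bid": 158.9, "ask": 162.3, "open_interest": 0.0}
metrics = quote_metrics(example)
print(metrics)
```

Measuring the spread against the mid rather than the bid keeps the percentage symmetric between the two sides of the quote, which matches the "spread % of mid" definition used for the thresholds in this notebook.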

In [6]:
def fetch_cboe_options(symbol: str) -> dict | None:
    """Fetch options chain from CBOE delayed quotes API."""
    url = f"https://cdn.cboe.com/api/global/delayed_quotes/options/{symbol}.json"
    try:
        r = requests.get(url, timeout=10)
        if r.status_code == 200:
            return r.json()
        return None
    except Exception as e:
        print(f"Error fetching {symbol}: {e}")
        return None

# Test with AAPL
data = fetch_cboe_options('AAPL')
if data:
    print(f"CBOE API works! Got {len(data.get('data', {}).get('options', []))} options for AAPL")
    print(f"\nSample option fields: {list(data['data']['options'][0].keys())}")

CBOE API works! Got 2892 options for AAPL

Sample option fields: ['option', 'bid', 'bid_size', 'ask', 'ask_size', 'iv', 'open_interest', 'volume', 'delta', 'gamma', 'vega', 'theta', 'rho', 'theo', 'change', 'open', 'high', 'low', 'tick', 'last_trade_price', 'last_trade_time', 'percent_change', 'prev_day_close']


In [7]:
# Show sample option data
if data:
    sample = data['data']['options'][0]
    print("Sample option:")
    for k, v in sample.items():
        print(f"  {k}: {v}")

Sample option:
  option: AAPL260102C00110000
  bid: 158.9
  bid_size: 112.0
  ask: 162.3
  ask_size: 10.0
  iv: 0.0
  open_interest: 0.0
  volume: 0.0
  delta: 1.0
  gamma: 0.0
  vega: 0.0
  theta: 0.0
  rho: 0.0
  theo: 160.9953
  change: 0.0
  open: 0.0
  high: 0.0
  low: 0.0
  tick: down
  last_trade_price: 163.92
  last_trade_time: 2025-12-29T09:55:34
  percent_change: 0.0
  prev_day_close: 162.825004577637


## 3. Fetch Earnings Calendar from FMP

In [8]:
def fetch_earnings_calendar(from_date: str, to_date: str) -> pd.DataFrame:
    """Fetch earnings calendar from FMP stable endpoint."""
    url = f"https://financialmodelingprep.com/stable/earnings-calendar?from={from_date}&to={to_date}&apikey={FMP_KEY}"
    r = requests.get(url, timeout=10)  # timeout so a stalled request cannot hang the cell
    if r.status_code == 200:
        return pd.DataFrame(r.json())
    return pd.DataFrame()

# Get earnings for next 14 days
today = datetime.now()
from_date = today.strftime('%Y-%m-%d')
to_date = (today + timedelta(days=14)).strftime('%Y-%m-%d')

earnings_df = fetch_earnings_calendar(from_date, to_date)
print(f"Found {len(earnings_df)} earnings announcements from {from_date} to {to_date}")
earnings_df.head()

Found 453 earnings announcements from 2026-01-03 to 2026-01-17


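|   | symbol | date | epsActual | epsEstimated | revenueActual | revenueEstimated | lastUpdated |
|---|--------|------|-----------|--------------|---------------|------------------|-------------|
| 0 | MTB-PJ | 2026-01-16 | | | | | 2026-01-03 |
| 1 | MTB | 2026-01-16 | | 4.46 | | 2473635000.0 | 2026-01-03 |
| 2 | ATVXF | 2026-01-16 | | | | | 2026-01-03 |
| 3 | RF | 2026-01-16 | | 0.61 | | 1940831000.0 | 2026-01-03 |
| 4 | SIFY | 2026-01-16 | | | | 171100000.0 | 2026-01-03 |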


In [9]:
# Drop preferreds, share classes, units, etc.
# Simple heuristic: no dots or dashes in symbol.
# Note: ADRs and OTC tickers without punctuation (e.g. SIFY, ATVXF) still pass this filter.
us_earnings = earnings_df[
    ~earnings_df['symbol'].str.contains(r'[.-]', regex=True, na=False)
].copy()

print(f"US earnings (filtered): {len(us_earnings)}")
print(f"\nSample symbols: {us_earnings['symbol'].head(20).tolist()}")

US earnings (filtered): 291

Sample symbols: ['MTB', 'ATVXF', 'RF', 'SIFY', 'PBAM', 'PNC', 'WBS', 'BOKF', 'HIFS', 'STEC', 'STT', 'CLPS', 'WIT', 'EZGO', 'TPET', 'WAFD', 'INDB', 'KGTHY', 'CCBC', 'CNBB']
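
The sample above still contains OTC foreign tickers, so the dot/dash heuristic alone does not isolate US-listed common stock. One possible extra pass, sketched below, relies on the common ticker convention that five-letter symbols ending in `F` or `Y` are OTC foreign ordinaries or ADRs. The helper name `looks_like_otc_foreign` is hypothetical, and the rule is an assumption about ticker conventions, not something the FMP calendar guarantees.

```python
def looks_like_otc_foreign(symbol: str) -> bool:
    """Heuristic: five-letter tickers ending in F or Y are usually OTC foreign shares or ADRs."""
    return len(symbol) == 5 and symbol[-1] in ("F", "Y")


# A subset of the sample symbols printed above
sample = ["MTB", "ATVXF", "RF", "SIFY", "PBAM", "PNC", "WBS", "BOKF", "KGTHY"]
kept = [s for s in sample if not looks_like_otc_foreign(s)]
print(kept)
```

Exchange-listed foreign names such as SIFY are not caught by this rule; removing those would need a country or exchange field from a profile endpoint, which this notebook does not fetch.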


## 4. Helper Functions for Options Analysis

In [10]:
def fetch_stock_price(symbol: str) -> float | None:
    """Fetch current stock price from FMP."""
    url = f"https://financialmodelingprep.com/stable/quote?symbol={symbol}&apikey={FMP_KEY}"
    try:
        r = requests.get(url, timeout=10)
        if r.status_code == 200:
            data = r.json()
            if data and len(data) > 0:
                return data[0].get('price')
        return None
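    except (requests.RequestException, ValueError, LookupError):
        # Network error, invalid JSON, or an unexpected payload shape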
        return None


def parse_cboe_option_symbol(opt_str: str, ticker: str) -> tuple:
    """
    Parse CBOE option symbol format.
    Format: {SYMBOL}{YYMMDD}{C/P}{STRIKE*1000}
    Example: GS260109C00900000 = GS, 2026-01-09, Call, $900
    """
    after_sym = opt_str[len(ticker):]
    yy = int(after_sym[0:2])
    mm = int(after_sym[2:4])
    dd = int(after_sym[4:6])
    exp = datetime(2000 + yy, mm, dd)
    opt_type = 'C' if after_sym[6] == 'C' else 'P'
    strike = int(after_sym[7:15]) / 1000
    return exp, opt_type, strike


def find_atm_options(options: list, ticker: str, spot_price: float, min_days: int = 3) -> pd.DataFrame:
    """
    Find ATM options with expiry at least min_days out.
    Returns DataFrame with parsed option data.
    """
    parsed = []
    today = datetime.now()
    min_expiry = today + timedelta(days=min_days)
    
    for opt in options:
        try:
            opt_symbol = opt.get('option', '')
            if not opt_symbol.startswith(ticker):
                continue
                
            exp, opt_type, strike = parse_cboe_option_symbol(opt_symbol, ticker)
            
            # Skip expired or too-soon options
            if exp < min_expiry:
                continue
            
            bid = opt.get('bid', 0) or 0
            ask = opt.get('ask', 0) or 0
            oi = opt.get('open_interest', 0) or 0
            iv = opt.get('iv', 0) or 0
            
            if bid > 0 and ask > 0:
                mid = (bid + ask) / 2
                spread_pct = (ask - bid) / mid * 100
            else:
                # No two-sided quote: use a 100% sentinel spread so the gate rejects it
                mid = 0
                spread_pct = 100
            
            parsed.append({
                'symbol': opt_symbol,
                'expiry': exp,
                'type': opt_type,
                'strike': strike,
                'bid': bid,
                'ask': ask,
                'mid': mid,
                'spread_pct': spread_pct,
                'oi': oi,
                'iv': iv,
                'moneyness': abs(strike - spot_price) / spot_price * 100
            })
        except (ValueError, IndexError):
            # Skip symbols that do not match the expected format (e.g. adjusted contracts)
            continue
    
    if not parsed:
        return pd.DataFrame()
    
    df = pd.DataFrame(parsed)
    # Filter to near-ATM (within 5% of spot)
    df = df[df['moneyness'] <= 5]
    return df
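
`parse_cboe_option_symbol` needs the ticker to know where the date starts and relies on the caller to catch malformed symbols. As an alternative sketch, the same `{SYMBOL}{YYMMDD}{C/P}{STRIKE*1000}` layout can be matched with a regular expression that recovers the root on its own and returns `None` for anything that does not fit. The function name `parse_option_symbol` is hypothetical, and the pattern assumes letter-only roots of at most six characters, so adjusted contracts with a digit in the root are rejected.

```python
import re
from datetime import date

# root (1-6 letters), YYMMDD, C or P, strike * 1000 padded to 8 digits
_OPTION_RE = re.compile(r"^([A-Z]{1,6})(\d{2})(\d{2})(\d{2})([CP])(\d{8})$")


def parse_option_symbol(opt_str: str):
    """Return (root, expiry, type, strike), or None if the symbol does not match."""
    m = _OPTION_RE.match(opt_str)
    if m is None:
        return None
    root, yy, mm, dd, opt_type, strike = m.groups()
    try:
        expiry = date(2000 + int(yy), int(mm), int(dd))
    except ValueError:
        # Digits in the date slot that do not form a real calendar date
        return None
    return root, expiry, opt_type, int(strike) / 1000


print(parse_option_symbol("GS260109C00900000"))
```

Returning `None` instead of raising lets the caller filter out non-standard symbols explicitly rather than through a broad `except`.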

## 5. Full Liquidity Screening

In [11]:
# Liquidity gate thresholds (from V1 plan)
SPREAD_THRESHOLD = 15  # max spread % of mid
OI_THRESHOLD = 50      # min open interest

# Looser thresholds for "marginal" candidates
SPREAD_MARGINAL = 20
OI_MARGINAL = 25
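
The two threshold pairs define three outcomes for a candidate: strict pass, marginal pass, or fail. The sketch below restates that decision as a standalone function so the boundaries are easy to check. The name `classify` is hypothetical and the inputs are illustrative values, not results from this notebook.

```python
SPREAD_THRESHOLD, OI_THRESHOLD = 15, 50  # strict gate, same values as above
SPREAD_MARGINAL, OI_MARGINAL = 20, 25    # marginal gate, same values as above


def classify(spread_pct: float, oi: float) -> str:
    """Return 'strict', 'marginal', or 'fail' for one straddle candidate."""
    if spread_pct <= SPREAD_THRESHOLD and oi >= OI_THRESHOLD:
        return "strict"
    if spread_pct <= SPREAD_MARGINAL and oi >= OI_MARGINAL:
        return "marginal"
    return "fail"


# Illustrative inputs: (spread % of mid, open interest)
for spread_pct, oi in [(5.0, 300), (12.0, 40), (18.0, 60), (18.0, 10)]:
    print(spread_pct, oi, classify(spread_pct, oi))
```

Note that a candidate can land in the marginal bucket either because its spread is between 15% and 20% or because its OI is between 25 and 50, so the marginal list mixes two different kinds of weakness.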

In [12]:
def screen_symbol(symbol: str) -> dict:
    """
    Screen a single symbol for options liquidity.
    Returns dict with screening results.
    """
    result = {
        'symbol': symbol,
        'has_options': False,
        'has_price': False,
        'price': None,
        'best_expiry': None,
        'best_spread_pct': None,
        'best_oi': None,
        'best_iv': None,
        'passes_strict': False,
        'passes_marginal': False,
        'rejection_reason': None
    }
    
    # Fetch stock price
    price = fetch_stock_price(symbol)
    if not price or price <= 0:
        result['rejection_reason'] = 'no_price'
        return result
    result['has_price'] = True
    result['price'] = price
    
    # Fetch options
    cboe_data = fetch_cboe_options(symbol)
    if not cboe_data:
        result['rejection_reason'] = 'no_cboe_data'
        return result
    
    options = cboe_data.get('data', {}).get('options', [])
    if not options:
        result['rejection_reason'] = 'no_options_in_chain'
        return result
    result['has_options'] = True
    
    # Find ATM options
    atm_df = find_atm_options(options, symbol, price)
    if atm_df.empty:
        result['rejection_reason'] = 'no_atm_options'
        return result
    
    # Pick the expiry whose ATM call/put pair has the lowest average spread.
    # OI is not part of this selection; it is only checked against the gates afterwards.
    best = None
    for expiry in atm_df['expiry'].unique():
        exp_df = atm_df[atm_df['expiry'] == expiry]
        calls = exp_df[exp_df['type'] == 'C']
        puts = exp_df[exp_df['type'] == 'P']
        
        if calls.empty or puts.empty:
            continue
        
        # Find ATM strike (closest to spot)
        best_call = calls.loc[calls['moneyness'].idxmin()]
        best_put = puts.loc[puts['moneyness'].idxmin()]
        
        avg_spread = (best_call['spread_pct'] + best_put['spread_pct']) / 2
        min_oi = min(best_call['oi'], best_put['oi'])
        avg_iv = (best_call['iv'] + best_put['iv']) / 2
        
        if best is None or avg_spread < best['spread_pct']:
            best = {
                'expiry': expiry,
                'spread_pct': avg_spread,
                'oi': min_oi,
                'iv': avg_iv
            }
    
    if best is None:
        result['rejection_reason'] = 'no_valid_straddle'
        return result
    
    result['best_expiry'] = best['expiry']
    result['best_spread_pct'] = best['spread_pct']
    result['best_oi'] = best['oi']
    result['best_iv'] = best['iv']
    
    # Check liquidity gates
    if best['spread_pct'] <= SPREAD_THRESHOLD and best['oi'] >= OI_THRESHOLD:
        result['passes_strict'] = True
        result['passes_marginal'] = True
    elif best['spread_pct'] <= SPREAD_MARGINAL and best['oi'] >= OI_MARGINAL:
        result['passes_marginal'] = True
        result['rejection_reason'] = 'marginal_liquidity'
    else:
        if best['spread_pct'] > SPREAD_THRESHOLD:
            result['rejection_reason'] = f"spread_too_wide ({best['spread_pct']:.1f}%)"
        else:
            result['rejection_reason'] = f"oi_too_low ({best['oi']})"
    
    return result
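
Because `screen_symbol` chooses the expiry by spread alone, it can settle on a far-dated expiry with a tight quote but little open interest. A possible variant, sketched below, filters expiries on an OI floor and a maximum time to expiry before taking the tightest spread. The function name `pick_expiry`, the `max_days=60` cap, and the candidate data are all hypothetical and are not taken from the V1 plan.

```python
from datetime import date


def pick_expiry(candidates, today, min_oi=50, max_days=60):
    """Pick the tightest-spread expiry among those meeting the OI floor and day cap."""
    eligible = [
        c for c in candidates
        if c["oi"] >= min_oi and 0 <= (c["expiry"] - today).days <= max_days
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c["spread_pct"])


# Hypothetical per-expiry summaries (average ATM spread, min of call/put OI)
today = date(2026, 1, 3)
candidates = [
    {"expiry": date(2026, 1, 16), "spread_pct": 6.0, "oi": 400},
    {"expiry": date(2026, 2, 20), "spread_pct": 4.5, "oi": 120},
    {"expiry": date(2027, 12, 17), "spread_pct": 2.9, "oi": 34},
]
best = pick_expiry(candidates, today)
print(best)
```

With these inputs the far-dated expiry is excluded on both OI and time to expiry, so the selection falls to the nearer expiry with the next-tightest spread. Whether that changes the pass rate on real chains would need a rerun of the screen.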

In [13]:
# Screen a sample of earnings stocks
# Limit to 50 to avoid rate limiting
# Note: this takes the first 50 symbols in calendar order, not a random sample
sample_symbols = us_earnings['symbol'].unique()[:50]
print(f"Screening {len(sample_symbols)} symbols...")

results = []
for i, symbol in enumerate(sample_symbols):
    if i > 0 and i % 10 == 0:
        print(f"  Progress: {i}/{len(sample_symbols)}")
    
    result = screen_symbol(symbol)
    results.append(result)
    time.sleep(0.2)  # Rate limiting

results_df = pd.DataFrame(results)
print("\nScreening complete!")

Screening 50 symbols...
  Progress: 10/50
  Progress: 20/50
  Progress: 30/50
  Progress: 40/50

Screening complete!


## 6. Results Analysis

In [14]:
# Summary statistics
print("=" * 60)
print("SCREENING RESULTS SUMMARY")
print("=" * 60)

total = len(results_df)
has_options = results_df['has_options'].sum()
passes_strict = results_df['passes_strict'].sum()
passes_marginal = results_df['passes_marginal'].sum()
marginal_only = passes_marginal - passes_strict

print(f"\nTotal symbols screened: {total}")
print(f"Have CBOE options data: {has_options} ({has_options/total*100:.1f}%)")
print(f"Pass STRICT gates (spread≤{SPREAD_THRESHOLD}%, OI≥{OI_THRESHOLD}): {passes_strict} ({passes_strict/total*100:.1f}%)")
print(f"Pass MARGINAL gates (spread≤{SPREAD_MARGINAL}%, OI≥{OI_MARGINAL}): {marginal_only} additional")
print(f"Fail liquidity gates: {has_options - passes_marginal}")
print(f"No options data: {total - has_options}")

SCREENING RESULTS SUMMARY

Total symbols screened: 50
Have CBOE options data: 27 (54.0%)
Pass STRICT gates (spread≤15%, OI≥50): 4 (8.0%)
Pass MARGINAL gates (spread≤20%, OI≥25): 2 additional
Fail liquidity gates: 21
No options data: 23


In [15]:
# Rejection reasons breakdown
print("\nREJECTION REASONS:")
rejection_counts = results_df[results_df['rejection_reason'].notna()]['rejection_reason'].value_counts()
for reason, count in rejection_counts.items():
    print(f"  {reason}: {count}")


REJECTION REASONS:
  no_cboe_data: 23
  no_atm_options: 7
  oi_too_low (0.0): 3
  marginal_liquidity: 2
  oi_too_low (19.0): 1
  spread_too_wide (101.4%): 1
  spread_too_wide (17.7%): 1
  spread_too_wide (64.4%): 1
  spread_too_wide (53.5%): 1
  oi_too_low (2.0): 1
  spread_too_wide (21.5%): 1
  spread_too_wide (100.0%): 1
  spread_too_wide (108.5%): 1
  spread_too_wide (95.0%): 1
  spread_too_wide (127.8%): 1
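
Because each rejection reason embeds the measured value, `value_counts()` splits one category into many single-count rows. A small sketch of grouping by category is shown below; the helper name `reason_category` is hypothetical, and the input list is a short subset of the reasons printed above.

```python
import re
from collections import Counter


def reason_category(reason: str) -> str:
    """Strip a trailing parenthetical such as ' (17.7%)' from a rejection reason."""
    return re.sub(r"\s*\(.*\)$", "", reason)


# A short subset of the reasons from the output above
reasons = [
    "spread_too_wide (17.7%)",
    "spread_too_wide (64.4%)",
    "oi_too_low (0.0)",
    "oi_too_low (19.0)",
    "no_cboe_data",
]
counts = Counter(reason_category(r) for r in reasons)
print(counts.most_common())
```

Tallying the full output above by hand in the same way gives `no_cboe_data` 23, `spread_too_wide` 9, `no_atm_options` 7, `oi_too_low` 5, and `marginal_liquidity` 2. On the DataFrame itself, the equivalent would be `results_df['rejection_reason'].str.replace(r"\s*\(.*\)$", "", regex=True).value_counts()`.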


In [16]:
# Show tradeable candidates (pass strict gates)
print("\n" + "=" * 60)
print("TRADEABLE CANDIDATES (Pass Strict Gates)")
print("=" * 60)

tradeable = results_df[results_df['passes_strict']].copy()
if not tradeable.empty:
    tradeable['expiry'] = tradeable['best_expiry'].dt.strftime('%Y-%m-%d')
    display_cols = ['symbol', 'price', 'expiry', 'best_spread_pct', 'best_oi', 'best_iv']
    print(tradeable[display_cols].to_string(index=False))
else:
    print("No symbols pass strict liquidity gates.")


TRADEABLE CANDIDATES (Pass Strict Gates)
symbol  price     expiry  best_spread_pct  best_oi  best_iv
   PNC 211.46 2026-02-20         7.781866    543.0  0.21035
  JBHT 196.78 2026-01-16         5.505952    271.0  0.47685
    MS 181.90 2026-03-20         3.084186    309.0  0.27030
  INFY  18.15 2026-02-20        10.395010     83.0  0.36015


In [17]:
# Show marginal candidates
print("\n" + "=" * 60)
print("MARGINAL CANDIDATES (Pass Looser Gates Only)")
print("=" * 60)

marginal = results_df[(results_df['passes_marginal']) & (~results_df['passes_strict'])].copy()
if not marginal.empty:
    marginal['expiry'] = marginal['best_expiry'].dt.strftime('%Y-%m-%d')
    display_cols = ['symbol', 'price', 'expiry', 'best_spread_pct', 'best_oi', 'best_iv']
    print(marginal[display_cols].to_string(index=False))
else:
    print("No additional marginal candidates.")


MARGINAL CANDIDATES (Pass Looser Gates Only)
symbol  price     expiry  best_spread_pct  best_oi  best_iv
   STT 129.07 2026-06-18        10.400571     41.0  0.25295
   TSM 319.61 2027-12-17         2.911084     34.0  0.42215


In [18]:
# Show failed candidates with options data (for analysis)
print("\n" + "=" * 60)
print("FAILED CANDIDATES (Have Options But Fail Gates)")
print("=" * 60)

failed = results_df[(results_df['has_options']) & (~results_df['passes_marginal'])].copy()
if not failed.empty:
    failed['expiry'] = failed['best_expiry'].dt.strftime('%Y-%m-%d') if failed['best_expiry'].notna().any() else 'N/A'
    display_cols = ['symbol', 'price', 'best_spread_pct', 'best_oi', 'rejection_reason']
    print(failed[display_cols].to_string(index=False))
else:
    print("No failed candidates with options data.")


FAILED CANDIDATES (Have Options But Fail Gates)
symbol    price  best_spread_pct  best_oi         rejection_reason
   MTB  204.040        13.555317      0.0         oi_too_low (0.0)
    RF   27.560        10.722611     19.0        oi_too_low (19.0)
  SIFY   12.285       101.392647      1.0 spread_too_wide (101.4%)
   WBS   63.820        17.708333      1.0  spread_too_wide (17.7%)
  BOKF  118.680        64.432432      0.0  spread_too_wide (64.4%)
  CLPS    1.010              NaN      NaN           no_atm_options
   WIT    2.920              NaN      NaN           no_atm_options
  WAFD   32.130              NaN      NaN           no_atm_options
  INDB   73.190        53.465619      0.0  spread_too_wide (53.5%)
  IIIN   32.410        10.600000      0.0         oi_too_low (0.0)
   BLK 1085.060         4.757987      0.0         oi_too_low (0.0)
  RFIL    5.680              NaN      NaN           no_atm_options
    GS  914.340         4.472311      2.0         oi_too_low (2.0)
   FHN   24.1

## 7. Universe Viability Assessment

In [19]:
print("=" * 60)
print("UNIVERSE VIABILITY ASSESSMENT")
print("=" * 60)

pass_rate = passes_strict / total * 100
total_earnings = len(us_earnings)
estimated_tradeable = total_earnings * (passes_strict / total)

print(f"""
Sample Statistics:
  - Screened: {total} symbols
  - Pass rate (strict): {pass_rate:.1f}%
  
Full Universe Extrapolation:
  - Total US earnings in period: {total_earnings}
  - Estimated tradeable at strict gates: {estimated_tradeable:.0f}
  - Daily average (14 days): {estimated_tradeable/14:.1f}
  
V1 Plan Target: 10-50 tradeable candidates per day

Assessment:
""")

if estimated_tradeable / 14 >= 10:
    print("  ✅ Universe appears VIABLE")
    print("     Daily estimate meets minimum threshold of 10 candidates")
elif estimated_tradeable / 14 >= 5:
    print("  ⚠️  Universe is MARGINAL")
    print("     Daily estimate is below target but potentially workable")
    print("     Consider loosening gates or expanding to more names")
else:
    print("  ❌ Universe may be TOO SMALL")
    print("     Daily estimate is significantly below target")
    print("     May need to loosen liquidity gates or find more data sources")

print(f"""
Notes:
  - Peak earnings season (Jan, Apr, Jul, Oct) will have 2-3x more candidates
  - Current screening is mid-cycle, so lower than average
  - Large-cap names tend to pass more easily
  - CBOE doesn't cover all symbols (~60% missing in sample)
""")

UNIVERSE VIABILITY ASSESSMENT

Sample Statistics:
  - Screened: 50 symbols
  - Pass rate (strict): 8.0%
  
Full Universe Extrapolation:
  - Total US earnings in period: 291
  - Estimated tradeable at strict gates: 23
  - Daily average (14 days): 1.7
  
V1 Plan Target: 10-50 tradeable candidates per day

Assessment:

  ❌ Universe may be TOO SMALL
     Daily estimate is significantly below target
     May need to loosen liquidity gates or find more data sources

Notes:
  - Peak earnings season (Jan, Apr, Jul, Oct) will have 2-3x more candidates
  - Current screening is mid-cycle, so lower than average
  - Large-cap names tend to pass more easily
  - CBOE doesn't cover all symbols (~60% missing in sample)
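
The 8% pass rate rests on 4 passes out of 50 symbols, so the extrapolation carries a wide margin of error. As a rough check, the sketch below computes a 95% Wilson score interval for the pass rate and scales it to the 291 filtered earnings names. The function name `wilson_interval` is hypothetical, and the interval treats the 50 symbols as a random sample, which the first-50 slice used here is not.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


low, high = wilson_interval(4, 50)   # 4 strict passes out of 50 screened
total_earnings = 291                 # filtered US earnings in the 14-day window
print(f"pass rate: {low:.1%} to {high:.1%}")
print(f"tradeable over 14 days: {low * total_earnings:.0f} to {high * total_earnings:.0f}")
print(f"daily average: {low * total_earnings / 14:.1f} to {high * total_earnings / 14:.1f}")
```

Under these assumptions the pass rate could plausibly sit anywhere from roughly 3% to 19%, and even the upper end of the daily average stays below 10 per day. Screening the full list rather than a 50-symbol slice would narrow the range.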



## 8. Next Steps

Based on these results, the recommended next steps are:

### If Universe is Viable:
1. **Set up broker connection** for Phase 0 execution validation
2. **Build real-time screening** that runs daily before market open
3. **Create order logging infrastructure** per V1 plan requirements
4. **Start placing small test orders** to validate fill model

### If Universe is Marginal:
1. **Screen all earnings** (not just sample) to get better statistics
2. **Consider loosening spread gate** from 15% to 18-20%
3. **Add more data sources** (Tradier, Polygon) for better coverage
4. **Focus on peak earnings weeks** when more candidates available

### Data Infrastructure Needs:
- **Historical options data** for backtesting (ORATS, Polygon, or forward collection)
- **Real-time data** for execution (CBOE delayed quotes lag by 15-20 min)
- **Earnings calendar with BMO/AMC timing** (FMP has this)