# 💰 Super Gnosis FREE Data Pipeline Demo

**Cost: $0/month** (saves $450-1,000/month vs paid alternatives)

This notebook demonstrates the complete FREE data pipeline with 10 integrated sources.

## What You'll Learn
1. Fetch market data (yfinance - VIX, SPX, OHLCV)
2. Get options chains with Greeks (Yahoo Finance - FREE!)
3. Fetch macro data (FRED - Fed Funds, Treasury yields)
4. Estimate dark pool pressure (institutional flow)
5. Track short volume (FINRA official data)
6. Monitor retail sentiment (StockTwits)
7. Use unified DataSourceManager (intelligent fallback)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DGator86/V2---Gnosis/blob/main/notebooks/02_FREE_Data_Pipeline_Demo.ipynb)

In [None]:
# Setup
import sys
sys.path.append('/home/user/webapp')

import warnings
warnings.filterwarnings('ignore')

## 1️⃣ yfinance - VIX, SPX, OHLCV (FREE)

In [None]:
from engines.inputs.yfinance_adapter import YFinanceAdapter

adapter = YFinanceAdapter()

# Fetch VIX and SPX
vix = adapter.fetch_vix()
spx = adapter.fetch_spx()

print(f"🌡️  Market Regime:")
print(f"   VIX: {vix:.2f}")
print(f"   SPX: {spx:.2f}")

# Fetch historical OHLCV
df = adapter.fetch_history("SPY", period="5d", interval="1d")
print(f"\n📊 SPY Last 5 Days:")
df.select(['timestamp', 'open', 'high', 'low', 'close', 'volume'])
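
The cell above prints VIX under a "Market Regime" label without saying how a level maps to a regime. A minimal sketch of one possible mapping follows; the function name `classify_vix_regime` and the thresholds 15/25/35 are illustrative assumptions, not values taken from `YFinanceAdapter` or the rest of the pipeline.

```python
def classify_vix_regime(vix: float) -> str:
    """Map a VIX level to a coarse regime label.

    The cut-offs below are hypothetical and chosen only for illustration.
    """
    if vix < 0:
        raise ValueError("VIX cannot be negative")
    if vix < 15:
        return "calm"
    if vix < 25:
        return "normal"
    if vix < 35:
        return "elevated"
    return "stressed"


# Example levels (made up, not fetched from any data source)
for level in (12.5, 18.0, 28.0, 45.0):
    print(f"VIX {level:5.1f} -> {classify_vix_regime(level)}")
```

In the notebook, the value returned by `adapter.fetch_vix()` could be passed straight to this function.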

## 2️⃣ Yahoo Options - FREE Options Chains + Greeks

In [None]:
from engines.inputs.yahoo_options_adapter import YahooOptionsAdapter

options = YahooOptionsAdapter()

# Fetch options chain
print("📋 Fetching SPY options chain...")
chain = options.fetch_options_chain(
    symbol="SPY",
    min_days_to_expiry=15,
    max_days_to_expiry=45
)

print(f"\n✅ Fetched {len(chain)} options")
print(f"\nSample options with Greeks:")
chain.select([
    'option_type', 'strike', 'bid', 'ask',
    'delta', 'gamma', 'theta', 'implied_vol'
]).head(10)
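
The chain reports a `delta` for each contract. One way to sanity-check such a value is to recompute it from the strike, time to expiry, and `implied_vol`. The sketch below assumes a European option under Black-Scholes with no dividends; it is not necessarily how `YahooOptionsAdapter` derives its Greeks, and `bs_delta` and its parameters are names introduced here for illustration.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_delta(spot: float, strike: float, years: float,
             rate: float, vol: float, option_type: str = "call") -> float:
    """Black-Scholes delta for a European option on a non-dividend-paying asset."""
    if years <= 0 or vol <= 0:
        raise ValueError("years and vol must be positive")
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (
        vol * math.sqrt(years)
    )
    call_delta = norm_cdf(d1)
    # Without dividends, put delta is call delta minus one
    return call_delta if option_type == "call" else call_delta - 1.0


# Illustrative inputs, not market data
call = bs_delta(spot=100.0, strike=100.0, years=1.0, rate=0.0, vol=0.20)
put = bs_delta(spot=100.0, strike=100.0, years=1.0, rate=0.0, vol=0.20,
               option_type="put")
print(f"call delta: {call:.4f}, put delta: {put:.4f}")
```

Comparing this figure with the reported `delta` for a few near-the-money contracts gives a quick plausibility check. Small differences are expected, because SPY pays dividends and its options are American-style.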

## 3️⃣ FRED - Macro Economic Data (FREE)

In [None]:
import os
from engines.inputs.fred_adapter import FREDAdapter

# Note: Requires FREE API key from fred.stlouisfed.org
fred_key = os.getenv("FRED_API_KEY")

if fred_key:
    fred = FREDAdapter(api_key=fred_key)
    
    macro = fred.fetch_macro_regime_data(lookback_days=365)
    
    print("💰 Macro Economic Data:")
    print(f"   Fed Funds Rate: {macro['fed_funds_rate']:.2f}%")
    print(f"   10Y Treasury: {macro['treasury_10y']:.2f}%")
    print(f"   2Y Treasury: {macro['treasury_2y']:.2f}%")
    print(f"   Yield Curve Slope: {macro['yield_curve_slope']:+.2f}%")
    print(f"   Unemployment: {macro['unemployment']:.1f}%")
    print(f"   BAA Credit Spread: {macro['baa_spread']:.2f}%")
    print(f"   Recession Probability: {macro['recession_probability']:.1%}")
else:
    print("⚠️  FRED_API_KEY not set. Get free key at: https://fred.stlouisfed.org/")
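
The macro output includes a yield curve slope alongside the 10Y and 2Y Treasury yields. A short sketch of how such a slope is commonly computed is shown here, assuming it is the 10-year yield minus the 2-year yield in percentage points. The notebook does not confirm that `FREDAdapter` uses this definition, and the function names and sample yields are hypothetical.

```python
def yield_curve_slope(treasury_10y: float, treasury_2y: float) -> float:
    """Return the 10Y minus 2Y spread in percentage points (assumed definition)."""
    return treasury_10y - treasury_2y


def is_inverted(treasury_10y: float, treasury_2y: float) -> bool:
    """A curve is called inverted when the short yield exceeds the long yield."""
    return yield_curve_slope(treasury_10y, treasury_2y) < 0


# Illustrative yields, not values fetched from FRED
slope = yield_curve_slope(4.25, 4.75)
print(f"Slope: {slope:+.2f}%  inverted: {is_inverted(4.25, 4.75)}")
```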

## 4️⃣ Dark Pool - Institutional Flow Estimation (FREE)

In [None]:
from engines.inputs.dark_pool_adapter import DarkPoolAdapter

dark_pool = DarkPoolAdapter()

# Estimate dark pool pressure
pressure = dark_pool.estimate_dark_pool_pressure("SPY")

print("🏊 Dark Pool Metrics:")
print(f"   Dark Pool Ratio: {pressure['dark_pool_ratio']:.2%}")
print(f"   Net Dark Buying: {pressure['net_dark_buying']:+.3f}")
print(f"   Accumulation Score: {pressure['accumulation_score']:.3f}")
print(f"   Distribution Score: {pressure['distribution_score']:.3f}")

if pressure['accumulation_score'] > 0.6:
    print("\n   🟢 Signal: Strong institutional accumulation")
elif pressure['distribution_score'] > 0.6:
    print("\n   🔴 Signal: Strong institutional distribution")
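
The cell above prints a signal only when one of the scores exceeds 0.6, so nothing is shown in the neutral case. The same rule can be wrapped in a small function that always returns a label. This is a sketch: `classify_flow` is a hypothetical helper, and only the 0.6 threshold and the check order come from the cell above.

```python
def classify_flow(accumulation_score: float, distribution_score: float,
                  threshold: float = 0.6) -> str:
    """Label institutional flow, checking accumulation first as the cell above does."""
    if accumulation_score > threshold:
        return "accumulation"
    if distribution_score > threshold:
        return "distribution"
    return "neutral"


# Illustrative scores, not adapter output
for acc, dist in ((0.72, 0.10), (0.20, 0.65), (0.40, 0.35)):
    print(f"acc={acc:.2f} dist={dist:.2f} -> {classify_flow(acc, dist)}")
```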

## 5️⃣ Short Volume - FINRA Official Data (FREE)

In [None]:
from engines.inputs.short_volume_adapter import ShortVolumeAdapter
from datetime import date, timedelta

short_vol = ShortVolumeAdapter()

# Fetch recent short volume (3-day delay)
test_date = date.today() - timedelta(days=3)
data = short_vol.fetch_short_volume("SPY", date=test_date)

if data:
    print(f"📊 Short Volume for {test_date}:")
    print(f"   Short Volume: {data['short_volume']:,}")
    print(f"   Total Volume: {data['total_volume']:,}")
    print(f"   Short Ratio: {data['short_ratio']:.2%}")
else:
    print(f"⚠️  Data not available for {test_date} (3-day delay)")

# Calculate squeeze pressure
print("\n🚀 Short Squeeze Analysis:")
squeeze = short_vol.calculate_short_squeeze_pressure("SPY", lookback_days=10)

print(f"   Avg Short Ratio: {squeeze['avg_short_ratio']:.2%}")
print(f"   Squeeze Pressure: {squeeze['squeeze_pressure']:.3f}")
print(f"   Covering Signal: {squeeze['covering_signal']}")
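
Subtracting three calendar days from today can land on a Saturday or Sunday, when no short volume file exists. The sketch below steps back to the nearest weekday and shows the short ratio as short volume divided by total volume. Both helpers are hypothetical and are not part of `ShortVolumeAdapter`. Market holidays are not handled, and the volumes are made up.

```python
from datetime import date, timedelta


def latest_weekday_on_or_before(day: date) -> date:
    """Step back from a Saturday or Sunday to the preceding Friday."""
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def short_ratio(short_volume: int, total_volume: int) -> float:
    """Share of total volume that was sold short; 0.0 if there was no volume."""
    if total_volume <= 0:
        return 0.0
    return short_volume / total_volume


sunday = date(2024, 6, 9)
print(latest_weekday_on_or_before(sunday))      # the preceding Friday
print(f"{short_ratio(4_500_000, 10_000_000):.2%}")
```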

## 6️⃣ StockTwits - Retail Sentiment (FREE)

In [None]:
from engines.inputs.stocktwits_adapter import StockTwitsAdapter

stocktwits = StockTwitsAdapter(use_cache=False)

try:
    # Fetch sentiment
    sentiment = stocktwits.fetch_sentiment("SPY", limit=30)
    
    print("💬 StockTwits Sentiment:")
    print(f"   Total Messages: {sentiment.total_messages}")
    print(f"   Bullish: {sentiment.bullish_messages} ({sentiment.bullish_messages/max(1, sentiment.total_messages):.1%})")
    print(f"   Bearish: {sentiment.bearish_messages} ({sentiment.bearish_messages/max(1, sentiment.total_messages):.1%})")
    print(f"   Sentiment Score: {sentiment.sentiment_score:+.3f} (-1 to +1)")
    print(f"   Confidence: {sentiment.confidence:.3f}")
    print(f"   Trending: {'🔥 YES' if sentiment.is_trending else 'No'}")
    
    # Multi-symbol
    print("\n📊 Multi-Symbol Sentiment:")
    df = stocktwits.fetch_multi_symbol_sentiment(
        symbols=["SPY", "QQQ", "AAPL", "TSLA"],
        limit_per_symbol=20
    )
    # Inside a try block a bare expression is not displayed, so print explicitly
    print(df.select(['symbol', 'sentiment_score', 'total_messages', 'is_trending']))
    
finally:
    stocktwits.close()
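
The sentiment score above is reported on a -1 to +1 scale. One simple way to get a score in that range from bullish and bearish message counts is the net share of tagged messages, sketched below. The formula is an assumption for illustration only: the notebook does not show how `StockTwitsAdapter` computes `sentiment_score`, and the counts are invented.

```python
def sentiment_score(bullish: int, bearish: int) -> float:
    """Net bullish share of tagged messages, from -1 (all bearish) to +1 (all bullish)."""
    tagged = bullish + bearish
    if tagged == 0:
        return 0.0  # no tagged messages: treat as neutral
    return (bullish - bearish) / tagged


# Illustrative counts, not StockTwits data
print(f"{sentiment_score(18, 6):+.3f}")
print(f"{sentiment_score(0, 0):+.3f}")
```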

## 7️⃣ Unified DataSourceManager (Intelligent Fallback)

In [None]:
import os

from engines.inputs.data_source_manager import DataSourceManager

# Initialize with FREE sources
manager = DataSourceManager(
    fred_api_key=os.getenv("FRED_API_KEY")
)

# Check source status
print("📊 Data Source Status:")
status = manager.get_source_status()

for row in status.iter_rows(named=True):
    icon = "✅" if row["is_available"] else "❌"
    print(f"   {icon} {row['source_type']}: {'Available' if row['is_available'] else 'Not Available'}")

In [None]:
# Fetch unified data (all sources in one call)
print("\n🎯 Fetching unified data for SPY...")
data = manager.fetch_unified_data(
    symbol="SPY",
    include_options=True,
    include_sentiment=True,
    include_macro=True
)

print(f"\n✅ Unified Data Summary:")
print(f"\n   OHLCV:")
print(f"      Close: ${data.close:.2f}")
print(f"      Volume: {data.volume:,}")

if data.vix:
    print(f"\n   Regime:")
    print(f"      VIX: {data.vix:.2f}")
    print(f"      SPX: {data.spx:.2f}")

if data.fed_funds_rate:
    print(f"\n   Macro:")
    print(f"      Fed Funds: {data.fed_funds_rate:.2f}%")
    print(f"      10Y Treasury: {data.treasury_10y:.2f}%")

if data.options_chain_available:
    print(f"\n   Options:")
    print(f"      Available: Yes")
    print(f"      Num Options: {data.num_options}")

if data.stocktwits_sentiment is not None:
    print(f"\n   Sentiment:")
    print(f"      StockTwits: {data.stocktwits_sentiment:+.3f}")

if data.dark_pool_ratio:
    print(f"\n   Dark Pool:")
    print(f"      Ratio: {data.dark_pool_ratio:.2%}")

if data.short_volume_ratio:
    print(f"\n   Short Volume:")
    print(f"      Short Ratio: {data.short_volume_ratio:.2%}")

print(f"\n   📦 Data Sources Used: {', '.join(data.data_sources_used)}")
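
The fallback idea behind a unified manager can be sketched generically: try each source in priority order and return the first usable value, recording why earlier sources were skipped. The code below is a standalone illustration and not `DataSourceManager`'s actual implementation. `fetch_with_fallback` and the three stand-in fetchers are hypothetical, and the price is a made-up number.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Fetcher = Callable[[str], Optional[float]]


def fetch_with_fallback(symbol: str,
                        sources: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, float]:
    """Return (source_name, value) from the first source that yields data."""
    errors: List[str] = []
    for name, fetch in sources:
        try:
            value = fetch(symbol)
        except Exception as exc:  # a failing source should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if value is not None:
            return name, value
        errors.append(f"{name}: no data")
    raise RuntimeError(f"all sources failed for {symbol}: " + "; ".join(errors))


# Stand-in fetchers that simulate an outage, an empty response, and a success
def flaky_primary(symbol: str) -> Optional[float]:
    raise ConnectionError("rate limited")


def empty_secondary(symbol: str) -> Optional[float]:
    return None


def static_backup(symbol: str) -> Optional[float]:
    return 512.34


source, price = fetch_with_fallback(
    "SPY",
    [("primary", flaky_primary), ("secondary", empty_secondary), ("backup", static_backup)],
)
print(f"{source}: {price:.2f}")
```

Keeping the list of error messages makes it easier to see which sources were skipped and why. That is also the kind of information `data_sources_used` reports for the successful ones.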

## 💰 Cost Comparison

In [None]:
import pandas as pd

# FREE sources
free_sources = pd.DataFrame({
    'Source': ['yfinance', 'Yahoo Options', 'FRED', 'StockTwits', 
               'FINRA', 'Dark Pool', 'ta library', 'greekcalc'],
    'Features': ['VIX/SPX/OHLCV', 'Options+Greeks', 'Macro Data', 
                 'Sentiment', 'Short Volume', 'Institutional Flow',
                 '130+ Indicators', 'Greeks Validation'],
    'Cost': ['$0/mo'] * 8
})

print("✅ FREE Sources:")
print(free_sources.to_string(index=False))
print(f"\nTotal FREE: $0/month")

# Paid alternatives
paid_sources = pd.DataFrame({
    'Service': ['Polygon.io', 'CBOE DataShop', 'ORATS', 'Quiver Quant'],
    'Cost': ['$249/mo', '$100-500/mo', '$99-299/mo', '$50-200/mo']
})

print("\n❌ Paid Alternatives:")
print(paid_sources.to_string(index=False))
print(f"\nTotal Paid: $450-1,000+/month")
print(f"\n💵 YOUR SAVINGS: $450-1,000/month ($5,400-12,000/year)")

## 🎉 Summary

You've explored:
- ✅ 8 FREE data sources (saves $450-1,000/month)
- ✅ VIX, SPX, OHLCV from yfinance
- ✅ Options chains with Greeks from Yahoo Finance
- ✅ Macro data from FRED
- ✅ Dark pool institutional flow estimation
- ✅ FINRA short volume tracking
- ✅ StockTwits retail sentiment
- ✅ Unified DataSourceManager with fallback

## 🚀 Next Steps

1. **Get Optional API Keys** (all FREE):
   - FRED: https://fred.stlouisfed.org/
   - IEX Cloud: https://iexcloud.io/ (50K messages/mo free)
   - Reddit: https://www.reddit.com/prefs/apps (for WSB sentiment)

2. **Integrate with ML Pipeline**:
   - Use unified data in feature engineering
   - Train models with macro + sentiment + options

3. **Deploy to Production**:
   - Set up DataSourceManager with all keys
   - Enable intelligent fallback
   - Monitor data quality

**Total Cost: $0/month! 🎉**