# NASDAQ Data Link Integration with CustomData

This notebook demonstrates how to fetch professional-grade market data from NASDAQ Data Link (formerly Quandl) and integrate it into Zipline using CustomData.

## Why NASDAQ Data Link?

- **Professional Quality**: Cleaned, adjusted data from reliable sources
- **Comprehensive Coverage**: Stocks, futures, options, forex, and more
- **Historical Depth**: Data going back decades
- **Corporate Actions**: Automatic adjustments for splits and dividends
- **API Reliability**: Enterprise-grade API with SLA guarantees

## Prerequisites

### 1. Get Your API Key

1. Sign up at [NASDAQ Data Link](https://data.nasdaq.com/)
2. Navigate to Account Settings → API Key
3. Copy your API key

**Pricing:**
- **Free Tier**: Limited dataset access with rate limits (unauthenticated requests are capped at 50 calls/day)
- **Premium**: Full access to premium datasets, higher rate limits

### 2. Set Up API Key

**Option A: Environment Variable (Recommended)**
```bash
# Add to .env file
NASDAQ_DATA_LINK_API_KEY=your_api_key_here
```

**Option B: Direct in Notebook** (for testing only)
```python
import os
os.environ['NASDAQ_DATA_LINK_API_KEY'] = 'your_api_key_here'
```

### 3. Install Package

In [None]:
# Install nasdaq-data-link if not already installed
!pip install nasdaq-data-link -q

## Step 1: Import Libraries and Configure

In [None]:
import os
import nasdaqdatalink
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
from datetime import datetime, timedelta
from dotenv import load_dotenv

# Zipline imports
from zipline.pipeline.data import create_custom_db, insert_custom_data, from_db
from zipline.pipeline.data import query_custom_data, list_custom_dbs, get_custom_db_info
from zipline.pipeline import Pipeline
from zipline.pipeline.engine import SimplePipelineEngine
from zipline.pipeline.loaders.custom_db_loader import DatabaseCustomDataLoader
from zipline.utils.calendar_utils import get_calendar

warnings.filterwarnings('ignore')

# Set up matplotlib
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print("✓ Libraries imported successfully!")

## Step 2: Configure API Key

Load API key from environment or set it directly:

In [None]:
# Try to load from .env file
load_dotenv()

# Get API key from environment
api_key = os.getenv('NASDAQ_DATA_LINK_API_KEY')

# Method 3 (testing only): uncomment the next line and add your key
# api_key = 'YOUR_API_KEY_HERE'

if not api_key:
    print("⚠️  API key not found in environment!")
    print("\nPlease set your API key using one of these methods:\n")
    print("Method 1: Environment variable")
    print("  export NASDAQ_DATA_LINK_API_KEY='your_key_here'\n")
    print("Method 2: .env file")
    print("  Add to .env: NASDAQ_DATA_LINK_API_KEY=your_key_here\n")
    print("Method 3: Set directly in this cell (NOT recommended for production)")
    print("  Uncomment the api_key line above this check and add your key\n")
    raise ValueError("API key required to continue")

# Configure NASDAQ Data Link
nasdaqdatalink.ApiConfig.api_key = api_key

print("✓ API key configured successfully!")
print(f"  Key preview: {api_key[:8]}...{api_key[-4:]}")
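
The preview above slices the key directly, so a short or malformed key would be printed almost in full. A small guard can avoid that. This is only a sketch: `mask_key` is a hypothetical helper (not part of `nasdaqdatalink`), and the default of 8 leading and 4 trailing characters simply mirrors the slice used above.

```python
def mask_key(key, head=8, tail=4):
    """Return a preview of `key` that hides it entirely when it is too short."""
    # Hypothetical helper: only show the head/tail when the key is longer
    # than the visible part, so short keys are never printed in full.
    if len(key) <= head + tail:
        return '*' * len(key)
    return f"{key[:head]}...{key[-tail:]}"


print(mask_key('abcdefgh12345678wxyz'))  # abcdefgh...wxyz
print(mask_key('short'))                 # *****
```

In the cell above, this would replace the inline slice with `mask_key(api_key)`.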

## Step 3: Define Stock Universe

NASDAQ Data Link uses database codes for different data sources:
- **WIKI**: Historical stock prices (FREE - but discontinued)
- **EOD**: End of Day US Stock Prices (Premium)
- **SF1**: Core US Fundamentals Data (Premium)

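For this example, we'll use **EOD** for premium users or **WIKI** for free-tier testing. WIKI stopped updating in 2018, so the 2022-2023 date range defined below returns no rows from it; pick an earlier range when testing with WIKI.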

In [None]:
# Choose your data source
# 'WIKI' - Free (historical data, no longer updated)
# 'EOD' - Premium (current end-of-day prices)
DATA_SOURCE = 'EOD'  # Change to 'WIKI' if you don't have premium access

# Define stock universe with NASDAQ Data Link codes
stocks = {
    'AAPL': {'name': 'Apple Inc.', 'sid': 1},
    'MSFT': {'name': 'Microsoft Corporation', 'sid': 2},
    'GOOGL': {'name': 'Alphabet Inc.', 'sid': 3},
    'AMZN': {'name': 'Amazon.com Inc.', 'sid': 4},
    'TSLA': {'name': 'Tesla Inc.', 'sid': 5},
    'NVDA': {'name': 'NVIDIA Corporation', 'sid': 6},
    'META': {'name': 'Meta Platforms Inc.', 'sid': 7},
    'JPM': {'name': 'JPMorgan Chase & Co.', 'sid': 8},
    'V': {'name': 'Visa Inc.', 'sid': 9},
    'WMT': {'name': 'Walmart Inc.', 'sid': 10},
}

# Date range
start_date = '2022-01-01'
end_date = '2023-12-31'

# Database directory
db_dir = '/data/custom_databases'

print(f"Data Source: {DATA_SOURCE}")
print(f"Stocks: {len(stocks)}")
print(f"Date Range: {start_date} to {end_date}")
print(f"Database Directory: {db_dir}")

# Create reverse mappings
ticker_to_sid = {ticker: info['sid'] for ticker, info in stocks.items()}
sid_to_ticker = {info['sid']: ticker for ticker, info in stocks.items()}
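
The reverse mapping `sid_to_ticker` silently keeps only one ticker if two entries share a `sid`. A quick check before building the mappings can catch that. The sketch below assumes the same `stocks` layout as above; `validate_universe` is a hypothetical helper name.

```python
def validate_universe(stocks):
    """Raise ValueError if any sid is not a positive int or is used twice."""
    seen = {}
    for ticker, info in stocks.items():
        sid = info['sid']
        if not isinstance(sid, int) or sid <= 0:
            raise ValueError(f"{ticker}: sid must be a positive integer, got {sid!r}")
        if sid in seen:
            raise ValueError(f"sid {sid} is assigned to both {seen[sid]} and {ticker}")
        seen[sid] = ticker
    return seen  # sid -> ticker, the same shape as sid_to_ticker


example = {
    'AAPL': {'name': 'Apple Inc.', 'sid': 1},
    'MSFT': {'name': 'Microsoft Corporation', 'sid': 2},
}
print(validate_universe(example))  # {1: 'AAPL', 2: 'MSFT'}
```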

## Step 4: Test API Connection

Let's verify the API is working by fetching a single stock:

In [None]:
print("Testing API connection...\n")

try:
    # Test with Apple stock
    test_ticker = 'AAPL'
    test_code = f"{DATA_SOURCE}/{test_ticker}"
    
    print(f"Fetching: {test_code}")
    
    # Fetch a short date range (the first trading days of January) as a test
    test_data = nasdaqdatalink.get(
        test_code,
        start_date='2023-01-01',
        end_date='2023-01-10',
    )
    
    print("✓ API connection successful!\n")
    print(f"Sample data for {test_ticker}:")
    print(test_data.head())
    print(f"\nAvailable columns: {', '.join(test_data.columns)}")
    
except Exception as e:
    print(f"❌ API Error: {e}\n")
    print("Common issues:")
    print("1. Invalid API key")
    print("2. No access to premium datasets (try DATA_SOURCE='WIKI')")
    print("3. Rate limit exceeded")
    print("4. Network connection issues")
    raise
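
For the rate-limit and network cases listed above, retrying with a growing delay is a common mitigation. The following is a minimal sketch, not the library's own mechanism: `fetch_with_retry` and `flaky_fetch` are hypothetical names, the fetcher is passed in as a plain callable, and the broad `except Exception` should be narrowed to the rate-limit or connection error your client actually raises.

```python
import time


def fetch_with_retry(fetch, code, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(code), retrying with exponential backoff on any exception."""
    for attempt in range(retries + 1):
        try:
            return fetch(code)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


# Usage with a stand-in fetcher that fails twice before succeeding.
calls = []

def flaky_fetch(code):
    calls.append(code)
    if len(calls) < 3:
        raise RuntimeError("simulated rate limit")
    return f"data for {code}"

delays = []  # record the delays instead of actually sleeping
result = fetch_with_retry(flaky_fetch, 'EOD/AAPL', sleep=delays.append)
print(result, delays)  # data for EOD/AAPL [1.0, 2.0]
```

In the notebook, `fetch` could be a lambda that wraps `nasdaqdatalink.get` with the desired `start_date` and `end_date`.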

## Step 5: Fetch Historical Data

Download complete historical data for all stocks:

In [None]:
print(f"Downloading data from NASDAQ Data Link ({DATA_SOURCE})...\n")

all_data = []
failed_tickers = []

for ticker, info in stocks.items():
    sid = info['sid']
    company_name = info['name']
    
    print(f"  [{sid}/{len(stocks)}] Fetching {ticker} ({company_name})...", end=" ")
    
    try:
        # Construct NASDAQ Data Link code
        nasdaq_code = f"{DATA_SOURCE}/{ticker}"
        
        # Fetch data
        df = nasdaqdatalink.get(
            nasdaq_code,
            start_date=start_date,
            end_date=end_date,
        )
        
        if df.empty:
            print("❌ No data available")
            failed_tickers.append(ticker)
            continue
        
        # Standardize column names (different sources use different naming)
        # EOD uses: Open, High, Low, Close, Volume, Dividend, Split, Adj_Open, Adj_High, Adj_Low, Adj_Close, Adj_Volume
        # WIKI uses: Open, High, Low, Close, Volume, Ex-Dividend, Split Ratio, Adj. Open, Adj. High, Adj. Low, Adj. Close, Adj. Volume
        
        column_mapping = {
            'Adj. Open': 'Adj_Open',
            'Adj. High': 'Adj_High',
            'Adj. Low': 'Adj_Low',
            'Adj. Close': 'Adj_Close',
            'Adj. Volume': 'Adj_Volume',
        }
        df = df.rename(columns=column_mapping)
        
        # Add metadata
        df['sid'] = sid
        df['ticker'] = ticker
        
        all_data.append(df)
        print(f"✓ {len(df)} days")
        
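    # The nasdaq-data-link package defines its errors in data_link_error
    # (quandl_error was the module name in the older quandl package)
    except nasdaqdatalink.errors.data_link_error.NotFoundError:
        print(f"❌ Not found in {DATA_SOURCE}")
        failed_tickers.append(ticker)
    except nasdaqdatalink.errors.data_link_error.ForbiddenError:
        print("❌ Access denied (premium data)")
        failed_tickers.append(ticker)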
    except Exception as e:
        print(f"❌ Error: {e}")
        failed_tickers.append(ticker)

print(f"\n✓ Successfully downloaded {len(all_data)}/{len(stocks)} stocks")
if failed_tickers:
    print(f"  Failed: {', '.join(failed_tickers)}")
print(f"  Total data points: {sum(len(df) for df in all_data):,}")

if not all_data:
    raise ValueError("No data was downloaded. Check your API access and data source.")

# Display sample data
print(f"\nSample data for {all_data[0]['ticker'].iloc[0]}:")
print(all_data[0][['Open', 'High', 'Low', 'Close', 'Volume']].head())
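
The `column_mapping` in the loop covers only the five adjusted columns. An alternative is to normalize every name by rule, which handles both naming conventions listed in the comments above. This is a sketch under that assumption; `normalize_column` is a hypothetical helper.

```python
import re


def normalize_column(name):
    """Lowercase a column name and collapse punctuation/spaces into underscores."""
    return re.sub(r'[^0-9a-z]+', '_', name.lower()).strip('_')


for raw in ['Adj. Close', 'Adj_Close', 'Ex-Dividend', 'Split Ratio', 'Volume']:
    print(f"{raw!r} -> {normalize_column(raw)!r}")
# 'Adj. Close' -> 'adj_close'
# 'Adj_Close' -> 'adj_close'
# 'Ex-Dividend' -> 'ex_dividend'
# 'Split Ratio' -> 'split_ratio'
# 'Volume' -> 'volume'
```

It could be applied with `df.rename(columns=normalize_column)`; any code that reads the frame afterwards would then need the lowercase names (for example `df['open']` instead of `df['Open']`).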

## Step 6: Create CustomData Database

Set up a database for NASDAQ Data Link market data:

In [None]:
print("Creating CustomData database for NASDAQ market data...\n")

# Create database with comprehensive columns
create_custom_db(
    'nasdaq-market-data',
    columns={
        # Raw OHLCV
        'open': float,
        'high': float,
        'low': float,
        'close': float,
        'volume': float,
        # Adjusted OHLCV (splits/dividends)
        'adj_open': float,
        'adj_high': float,
        'adj_low': float,
        'adj_close': float,
        'adj_volume': float,
    },
    bar_size='1d',
    db_dir=db_dir,
)

print("✓ Database 'nasdaq-market-data' created successfully!")

# Display database info
db_info = get_custom_db_info('nasdaq-market-data', db_dir=db_dir)
print("\nDatabase Information:")
for key, value in db_info.items():
    print(f"  {key}: {value}")

## Step 7: Insert Data into Database

Convert and insert the NASDAQ data:

In [None]:
print("Inserting NASDAQ data into database...\n")

for i, df in enumerate(all_data, 1):
    sid = df['sid'].iloc[0]
    ticker = df['ticker'].iloc[0]
    
    print(f"  [{i}/{len(all_data)}] Inserting {ticker} (SID {sid})...", end=" ")
    
    try:
        # Create MultiIndex DataFrame (field, sid)
        data = pd.DataFrame({
            ('open', sid): df['Open'],
            ('high', sid): df['High'],
            ('low', sid): df['Low'],
            ('close', sid): df['Close'],
            ('volume', sid): df['Volume'],
        })
        
        # Add adjusted columns if available
        if 'Adj_Open' in df.columns:
            data[('adj_open', sid)] = df['Adj_Open']
            data[('adj_high', sid)] = df['Adj_High']
            data[('adj_low', sid)] = df['Adj_Low']
            data[('adj_close', sid)] = df['Adj_Close']
            data[('adj_volume', sid)] = df['Adj_Volume']
        else:
            # If no adjusted data, use raw data
            data[('adj_open', sid)] = df['Open']
            data[('adj_high', sid)] = df['High']
            data[('adj_low', sid)] = df['Low']
            data[('adj_close', sid)] = df['Close']
            data[('adj_volume', sid)] = df['Volume']
        
        # Set MultiIndex
        data.columns = pd.MultiIndex.from_tuples(data.columns, names=['field', 'sid'])
        
        # Insert into database
        insert_custom_data(
            'nasdaq-market-data',
            data,
            mode='update',
            db_dir=db_dir,
        )
        
        print(f"✓ {len(data)} rows")
        
    except Exception as e:
        print(f"❌ Error: {e}")

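print("\n✓ Insert step finished (check the log above for any per-stock errors)")

# Update database info
db_info = get_custom_db_info('nasdaq-market-data', db_dir=db_dir)
row_count = db_info.get('row_count', 'N/A')
if isinstance(row_count, int):
    # Only numbers accept the thousands separator; the 'N/A' fallback would raise
    row_count = f"{row_count:,}"
print(f"\nTotal rows in database: {row_count}")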
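
The insert loop builds one `(field, sid)` frame per stock. To see what that column layout looks like for several stocks at once, the sketch below combines toy per-sid frames with plain pandas. Whether `insert_custom_data` accepts more than one sid per call is an assumption that this document does not confirm; `to_field_sid_frame` is a hypothetical helper and the numbers are made up.

```python
import pandas as pd


def to_field_sid_frame(frames):
    """Combine {sid: DataFrame of fields} into one frame with (field, sid) columns."""
    wide = pd.concat(frames, axis=1)  # columns arrive as (sid, field)
    wide.columns = wide.columns.set_names(['sid', 'field'])
    # Swap the levels so the layout matches the (field, sid) tuples used above
    return wide.swaplevel(0, 1, axis=1).sort_index(axis=1)


dates = pd.to_datetime(['2023-01-03', '2023-01-04'])
frames = {
    1: pd.DataFrame({'close': [10.0, 11.0], 'volume': [100.0, 120.0]}, index=dates),
    2: pd.DataFrame({'close': [20.0, 19.0], 'volume': [300.0, 310.0]}, index=dates),
}
wide = to_field_sid_frame(frames)
print(wide)
```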

## Step 8: Query and Verify Data

Verify the data was inserted correctly:

In [None]:
# Query data for a specific stock
test_ticker = list(ticker_to_sid.keys())[0]
test_sid = ticker_to_sid[test_ticker]

print(f"Querying data for {test_ticker} (SID {test_sid})...\n")

result = query_custom_data(
    'nasdaq-market-data',
    start_date='2023-01-01',
    end_date='2023-01-31',
    sids=[test_sid],
    columns=['close', 'adj_close', 'volume'],
    db_dir=db_dir,
)

print(f"Retrieved {len(result)} rows\n")
print(result.head(10))

# Compare raw vs adjusted
if 'adj_close' in result.columns:
    print(f"\nAdjustment Analysis:")
    avg_diff = ((result['close'] - result['adj_close']) / result['close'] * 100).abs().mean()
    print(f"  Average difference between Close and Adj_Close: {avg_diff:.2f}%")
    print("  (This accounts for splits, dividends, and other corporate actions)")

## Step 9: Build Pipeline with NASDAQ Data

Create a comprehensive pipeline:

In [None]:
from zipline.pipeline.factors import CustomFactor, SimpleMovingAverage

print("Loading NASDAQ data into Pipeline...\n")

# Load the dataset
NASDAQData = from_db('nasdaq-market-data', db_dir=db_dir)

print("✓ NASDAQData dataset loaded")
print(f"  Available columns: {', '.join([c for c in dir(NASDAQData) if not c.startswith('_')])}")

# Custom Factors
class AdjustedReturn(CustomFactor):
    """Calculate return using adjusted close prices"""
    inputs = [NASDAQData.adj_close]
    window_length = 2
    
    def compute(self, today, assets, out, adj_close):
        out[:] = (adj_close[-1] - adj_close[-2]) / adj_close[-2]

class AdjustedVolatility(CustomFactor):
    """Calculate volatility using adjusted close"""
    inputs = [NASDAQData.adj_close]
    window_length = 20
    
    def compute(self, today, assets, out, adj_close):
        returns = np.diff(adj_close, axis=0) / adj_close[:-1]
        out[:] = np.std(returns, axis=0)

class TrueRange(CustomFactor):
    """Calculate True Range (ATR component)"""
    inputs = [NASDAQData.high, NASDAQData.low, NASDAQData.close]
    window_length = 2
    
    def compute(self, today, assets, out, high, low, close):
        tr1 = high[-1] - low[-1]
        tr2 = np.abs(high[-1] - close[-2])
        tr3 = np.abs(low[-1] - close[-2])
        out[:] = np.maximum(tr1, np.maximum(tr2, tr3))

class RelativeVolume(CustomFactor):
    """Volume relative to 20-day average"""
    inputs = [NASDAQData.volume]
    window_length = 20
    
    def compute(self, today, assets, out, volume):
        avg_volume = np.mean(volume[:-1], axis=0)
        out[:] = volume[-1] / avg_volume

print("\n✓ Custom factors defined")
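
Each `compute` method receives its inputs as 2-D arrays with one row per session (oldest first) and one column per asset. The toy example below runs the same arithmetic as `AdjustedReturn` and `AdjustedVolatility` on a hand-made window outside of Zipline; the prices are invented for illustration.

```python
import numpy as np

# Toy window: 3 sessions (rows) x 2 assets (columns), oldest row first.
adj_close = np.array([
    [100.0, 50.0],
    [110.0, 45.0],
    [121.0, 54.0],
])

# Same arithmetic as AdjustedReturn.compute: last row vs. the row before it.
daily_return = (adj_close[-1] - adj_close[-2]) / adj_close[-2]

# Same arithmetic as AdjustedVolatility.compute: std of per-session returns.
returns = np.diff(adj_close, axis=0) / adj_close[:-1]
volatility = np.std(returns, axis=0)

print(daily_return)  # approximately [0.1, 0.2]
print(volatility)    # approximately [0.0, 0.15]
```

Note that `np.std` defaults to the population standard deviation (`ddof=0`); pass `ddof=1` if a sample estimate is preferred.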

## Step 10: Create Advanced Pipeline

Build a pipeline with professional technical indicators:

In [None]:
print("Building advanced pipeline...\n")

# Define the pipeline
pipeline = Pipeline(
    columns={
        # Price data (use adjusted for accuracy)
        'close': NASDAQData.close.latest,
        'adj_close': NASDAQData.adj_close.latest,
        'volume': NASDAQData.volume.latest,
        'adj_volume': NASDAQData.adj_volume.latest,
        
        # Moving averages (using adjusted close)
        'sma_10': SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=10),
        'sma_20': SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=20),
        'sma_50': SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=50),
        'sma_200': SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=200),
        
        # Returns and volatility
        'daily_return': AdjustedReturn(),
        'volatility_20d': AdjustedVolatility(),
        
        # Volume analysis
        'relative_volume': RelativeVolume(),
        
        # Technical indicators
        'true_range': TrueRange(),
        
        # Trading signals
        'golden_cross': (
            SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=50) >
            SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=200)
        ),
        'above_sma20': NASDAQData.adj_close.latest > SimpleMovingAverage(inputs=[NASDAQData.adj_close], window_length=20),
        'high_volume_spike': RelativeVolume() > 2.0,  # Volume > 2x average
    }
)

print(f"✓ Pipeline created with {len(pipeline.columns)} columns")
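
The `golden_cross` column above flags the state SMA50 > SMA200 on each day, not the day on which the crossover happens. If the crossover event itself is of interest, it can be located from the two series afterwards. A minimal sketch with plain lists and invented values (`crossover_days` is a hypothetical helper):

```python
def crossover_days(fast, slow):
    """Return indices i where fast crosses above slow between i-1 and i."""
    return [
        i for i in range(1, len(fast))
        if fast[i] > slow[i] and fast[i - 1] <= slow[i - 1]
    ]


sma_50 = [98.0, 99.0, 101.0, 102.0, 99.5, 101.5]
sma_200 = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
print(crossover_days(sma_50, sma_200))  # [2, 5]
```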

## Step 11: Run Pipeline

Execute the pipeline for analysis:

In [None]:
print("Setting up Pipeline engine...\n")

# Get trading calendar
trading_calendar = get_calendar('NYSE')

# Create loader
loader = DatabaseCustomDataLoader(
    dataset=NASDAQData,
    db_path=f"{db_dir}/nasdaq-market-data.db",
)

def get_loader(column):
    if column.dataset == NASDAQData:
        return loader
    raise ValueError(f"No loader for {column}")

engine = SimplePipelineEngine(
    get_loader=get_loader,
    asset_finder=None,
    default_domain=None,
)

print("✓ Engine ready\n")

# Define analysis period
analysis_start = pd.Timestamp('2023-06-01', tz='UTC')
analysis_end = pd.Timestamp('2023-06-30', tz='UTC')

trading_days = trading_calendar.sessions_in_range(analysis_start, analysis_end)

print(f"Analysis period: {analysis_start.date()} to {analysis_end.date()}")
print(f"Trading days: {len(trading_days)}\n")

# Run pipeline
print("Running pipeline...")
results = engine.run_pipeline(
    pipeline,
    start_date=analysis_start,
    end_date=analysis_end,
)

print("\n✓ Pipeline completed!")
print(f"  Result shape: {results.shape}")
print(f"  Columns: {len(results.columns)}\n")

print("Sample results:")
print(results.head(10))

## Step 12: Advanced Analysis

Analyze the results with professional metrics:

In [None]:
print("\n=== NASDAQ DATA ANALYSIS ===\n")

# Analysis by stock
for sid, ticker in sorted(sid_to_ticker.items()):
    if sid in results.index.get_level_values(1):
        stock_data = results.xs(sid, level=1)
        
        print(f"{ticker} (SID {sid}):")
        print(f"  Adj Close: ${stock_data['adj_close'].mean():.2f} avg")
        print(f"  Daily Return: {stock_data['daily_return'].mean()*100:.3f}% avg")
        print(f"  Volatility: {stock_data['volatility_20d'].mean():.4f}")
        print(f"  Relative Volume: {stock_data['relative_volume'].mean():.2f}x avg")
        print(f"  Golden Cross: {stock_data['golden_cross'].any()}")
        print(f"  Days above SMA20: {stock_data['above_sma20'].sum()}/{len(stock_data)}")
        print()
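
The volatility printed above is a daily standard deviation. To compare it with the annual figures usually quoted, it is commonly scaled by the square root of the number of trading days. The sketch below assumes 252 trading days per year and independent daily returns; `annualize_volatility` is a hypothetical helper.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumption: a common convention for US equities


def annualize_volatility(daily_vol, periods=TRADING_DAYS_PER_YEAR):
    """Scale a daily standard deviation to an annual figure (sqrt-of-time rule)."""
    return daily_vol * math.sqrt(periods)


print(f"{annualize_volatility(0.02):.4f}")  # 0.3175
```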

## Step 13: Visualizations

Create professional charts:

In [None]:
# Select stocks for visualization
plot_tickers = list(ticker_to_sid.keys())[:4]

# Plot 1: Price with multiple moving averages
fig, axes = plt.subplots(2, 2, figsize=(16, 10))
fig.suptitle('NASDAQ Data - Price Trends with Moving Averages', fontsize=16, fontweight='bold')

for idx, ticker in enumerate(plot_tickers):
    if ticker not in ticker_to_sid:
        continue
        
    sid = ticker_to_sid[ticker]
    if sid not in results.index.get_level_values(1):
        continue
    
    ax = axes[idx // 2, idx % 2]
    stock_data = results.xs(sid, level=1)
    
    # Plot adjusted close and moving averages
    ax.plot(stock_data.index, stock_data['adj_close'], label='Adj Close', linewidth=2)
    ax.plot(stock_data.index, stock_data['sma_10'], label='SMA 10', linestyle='--', alpha=0.7)
    ax.plot(stock_data.index, stock_data['sma_20'], label='SMA 20', linestyle='--', alpha=0.7)
    ax.plot(stock_data.index, stock_data['sma_50'], label='SMA 50', linestyle='--', alpha=0.7)
    
    # Highlight golden cross if present
    if stock_data['golden_cross'].any():
        ax.axhline(y=stock_data['sma_200'].iloc[-1], color='gold', linestyle=':', 
                   linewidth=2, label='SMA 200 (Golden Cross)', alpha=0.6)
    
    ax.set_title(f'{ticker} - Professional Grade Data', fontweight='bold')
    ax.set_xlabel('Date')
    ax.set_ylabel('Price ($)')
    ax.legend()
    ax.grid(True, alpha=0.3)
    ax.tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

In [None]:
# Plot 2: Volume analysis with spikes
fig, axes = plt.subplots(2, 1, figsize=(16, 10))
fig.suptitle('Volume Analysis - NASDAQ Data', fontsize=16, fontweight='bold')

# Absolute volume
ax1 = axes[0]
for ticker in plot_tickers:
    if ticker in ticker_to_sid:
        sid = ticker_to_sid[ticker]
        if sid in results.index.get_level_values(1):
            stock_data = results.xs(sid, level=1)
            ax1.plot(stock_data.index, stock_data['adj_volume'] / 1e6, 
                    label=ticker, linewidth=2, marker='o', markersize=3)

ax1.set_title('Adjusted Trading Volume', fontweight='bold')
ax1.set_ylabel('Volume (Millions)')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Relative volume
ax2 = axes[1]
for ticker in plot_tickers:
    if ticker in ticker_to_sid:
        sid = ticker_to_sid[ticker]
        if sid in results.index.get_level_values(1):
            stock_data = results.xs(sid, level=1)
            ax2.plot(stock_data.index, stock_data['relative_volume'], 
                    label=ticker, linewidth=2)

ax2.axhline(y=2.0, color='red', linestyle='--', linewidth=2, 
           label='Spike Threshold (2x)', alpha=0.6)
ax2.set_title('Relative Volume (vs 20-day Average)', fontweight='bold')
ax2.set_xlabel('Date')
ax2.set_ylabel('Relative Volume')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## Step 14: Trading Signals

Generate professional trading signals:

In [None]:
print("\n" + "="*70)
print("PROFESSIONAL TRADING SIGNALS - NASDAQ DATA")
print("="*70 + "\n")

latest_date = results.index.get_level_values(0).max()
latest_data = results.xs(latest_date, level=0)

print(f"Signal Date: {latest_date.date()}\n")

# Golden Cross signals (SMA50 > SMA200)
golden_cross_stocks = latest_data[latest_data['golden_cross'] == True]
print("🏆 GOLDEN CROSS SIGNALS (SMA50 > SMA200):")
if len(golden_cross_stocks) > 0:
    for sid in golden_cross_stocks.index:
        if sid in sid_to_ticker:
            ticker = sid_to_ticker[sid]
            price = golden_cross_stocks.loc[sid, 'adj_close']
            ret = golden_cross_stocks.loc[sid, 'daily_return'] * 100
            print(f"  {ticker}: ${price:.2f} (Return: {ret:+.2f}%)")
else:
    print("  None")

# Strong momentum (above SMA20 + positive return)
strong_momentum = latest_data[
    (latest_data['above_sma20'] == True) & 
    (latest_data['daily_return'] > 0)
]
print(f"\n📈 STRONG MOMENTUM ({len(strong_momentum)} stocks):")
for sid in strong_momentum.index:
    if sid in sid_to_ticker:
        ticker = sid_to_ticker[sid]
        ret = strong_momentum.loc[sid, 'daily_return'] * 100
        vol = strong_momentum.loc[sid, 'relative_volume']
        print(f"  {ticker}: +{ret:.2f}% (Rel Vol: {vol:.2f}x)")

# Volume spikes
volume_spikes = latest_data[latest_data['high_volume_spike'] == True]
print(f"\n🔊 VOLUME SPIKES (>2x average, {len(volume_spikes)} stocks):")
for sid in volume_spikes.index:
    if sid in sid_to_ticker:
        ticker = sid_to_ticker[sid]
        rel_vol = volume_spikes.loc[sid, 'relative_volume']
        ret = volume_spikes.loc[sid, 'daily_return'] * 100
        print(f"  {ticker}: {rel_vol:.2f}x volume (Return: {ret:+.2f}%)")

# High volatility warnings
high_vol_threshold = latest_data['volatility_20d'].quantile(0.75)
high_volatility = latest_data[latest_data['volatility_20d'] > high_vol_threshold]
print(f"\n⚠️  HIGH VOLATILITY ALERT ({len(high_volatility)} stocks):")
for sid in high_volatility.index:
    if sid in sid_to_ticker:
        ticker = sid_to_ticker[sid]
        vol = high_volatility.loc[sid, 'volatility_20d']
        print(f"  {ticker}: {vol:.4f}")

print("\n" + "="*70)

## Step 15: Data Quality Report

Verify data quality (important for professional use):

In [None]:
print("\n" + "="*70)
print("DATA QUALITY REPORT")
print("="*70 + "\n")

print(f"Data Source: NASDAQ Data Link ({DATA_SOURCE})")
print(f"Date Range: {start_date} to {end_date}")
print(f"Stocks: {len(sid_to_ticker)}\n")

# Check for adjustment differences
print("Adjustment Analysis:")
for sid, ticker in sorted(sid_to_ticker.items()):
    if sid in results.index.get_level_values(1):
        stock_data = results.xs(sid, level=1)
        
        # Calculate adjustment factor
        adj_factor = (stock_data['close'] / stock_data['adj_close']).mean()
        
        print(f"  {ticker}: Adjustment factor = {adj_factor:.4f}")
        
        if abs(adj_factor - 1.0) > 0.01:
            print("    ⚠️  Significant adjustments detected (splits/dividends)")

# Data completeness
print("\nData Completeness:")
expected_days = len(trading_days)
for sid, ticker in sorted(sid_to_ticker.items()):
    if sid in results.index.get_level_values(1):
        stock_data = results.xs(sid, level=1)
        actual_days = len(stock_data)
        completeness = (actual_days / expected_days) * 100
        
        print(f"  {ticker}: {actual_days}/{expected_days} days ({completeness:.1f}%)")
        
        if completeness < 95:
            print("    ⚠️  Missing data detected")

print("\n" + "="*70)
print("✓ Quality check complete!")
print("="*70)
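
The completeness percentage shows how many sessions are missing but not which ones. Listing the missing dates is a small set difference. The sketch below uses plain `date` objects with invented values; in the notebook, `expected` would come from `trading_days` and `actual` from a stock's result index (`missing_sessions` is a hypothetical helper).

```python
from datetime import date


def missing_sessions(expected, actual):
    """Return the expected session dates that have no row in `actual`, sorted."""
    return sorted(set(expected) - set(actual))


expected = [date(2023, 6, 1), date(2023, 6, 2), date(2023, 6, 5), date(2023, 6, 6)]
actual = [date(2023, 6, 1), date(2023, 6, 5), date(2023, 6, 6)]
print(missing_sessions(expected, actual))  # [datetime.date(2023, 6, 2)]
```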

## Production Workflow

### Daily Update Script

Here's how to set up automated daily updates:

```python
# update_nasdaq_data.py
import os
import nasdaqdatalink
import pandas as pd
from datetime import datetime, timedelta
from zipline.pipeline.data import insert_custom_data

# Configure API
nasdaqdatalink.ApiConfig.api_key = os.getenv('NASDAQ_DATA_LINK_API_KEY')

# Your stock universe
stocks = {'AAPL': 1, 'MSFT': 2}  # ... extend with the rest of your universe

def update_daily():
    """Fetch and update yesterday's data"""
    yesterday = (datetime.now() - timedelta(days=1)).strftime('%Y-%m-%d')
    
    for ticker, sid in stocks.items():
        try:
            # Fetch data
            df = nasdaqdatalink.get(
                f'EOD/{ticker}',
                start_date=yesterday,
                end_date=yesterday,
            )
            
            # Format and insert
            # ... (same formatting as above)
            
            print(f"Updated {ticker}")
        except Exception as e:
            print(f"Error updating {ticker}: {e}")

if __name__ == '__main__':
    update_daily()
```
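
One caveat with the script above: subtracting one calendar day lands on a Sunday when the job runs on a Monday, so the request returns no rows. A possible refinement is to step back to the most recent weekday. This sketch ignores exchange holidays, which would need a trading calendar; `previous_weekday` is a hypothetical helper.

```python
from datetime import date, timedelta


def previous_weekday(today):
    """Return the most recent Monday-Friday date strictly before `today`."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


print(previous_weekday(date(2023, 6, 5)))  # Monday    -> 2023-06-02
print(previous_weekday(date(2023, 6, 7)))  # Wednesday -> 2023-06-06
```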

### Cron Job Setup

Schedule updates for weekdays after the market close. Cron uses the server's local time zone, so adjust the hour if the server is not on Eastern Time:

```bash
# Edit crontab
crontab -e

# Add this line (runs at 17:00 server time, Monday-Friday)
0 17 * * 1-5 cd /path/to/zipline && python update_nasdaq_data.py
```

### Docker Deployment

Add to your `docker-compose.yml`:

```yaml
services:
  zipline-jupyter:
    environment:
      - NASDAQ_DATA_LINK_API_KEY=${NASDAQ_DATA_LINK_API_KEY}
```

## Best Practices

1. **Always use adjusted prices** for returns and analysis
2. **Monitor API rate limits** - NASDAQ has strict limits
3. **Cache data** - Don't re-download historical data
4. **Validate data quality** - Check for gaps and anomalies
5. **Handle corporate actions** - Adjusted data accounts for this
6. **Backup databases** - Regular backups of your custom databases
7. **Version your universe** - Track which stocks you're analyzing
8. **Log everything** - Keep logs of data fetches and errors
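
As an illustration of the caching advice, the sketch below stores each fetched result in a local file and reuses it on the next call. It is deliberately simplified: rows are plain dictionaries saved as JSON, `cached_fetch` and `fake_fetch` are hypothetical names, and the values are invented. For DataFrames, a format such as Parquet or CSV would be a more natural fit.

```python
import json
import os
import tempfile


def cached_fetch(code, fetch, cache_dir):
    """Return fetch(code), reusing a JSON file in cache_dir when one exists."""
    path = os.path.join(cache_dir, code.replace('/', '_') + '.json')
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    rows = fetch(code)
    with open(path, 'w') as fh:
        json.dump(rows, fh)
    return rows


calls = []

def fake_fetch(code):
    # Stand-in for a real API call; records how often it is invoked.
    calls.append(code)
    return [{'date': '2023-01-03', 'close': 100.0}]

with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_fetch('EOD/AAPL', fake_fetch, cache_dir)
    second = cached_fetch('EOD/AAPL', fake_fetch, cache_dir)

print(len(calls), first == second)  # 1 True
```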

## Next Steps

- Add fundamental data from NASDAQ Data Link SF1 database
- Implement custom factors for your strategies
- Build backtesting framework
- Set up alerting for trading signals
- Integrate with broker API for live trading

## Resources

- [NASDAQ Data Link Documentation](https://docs.data.nasdaq.com/)
- [NASDAQ Data Link Python Package](https://github.com/Nasdaq/data-link-python)
- [Available Datasets](https://data.nasdaq.com/search)
- [CustomData Documentation](../docs/CUSTOM_DATA.md)
- [Database Storage Guide](../docs/CUSTOM_DATA_DATABASE.md)