# Sharadar Equity Prices - Professional Backtesting

This notebook demonstrates how to use Sharadar premium data with zipline-reloaded for professional-grade backtesting.

## Why Sharadar?

- **Institutional Quality**: Best-in-class data accuracy
- **Point-in-Time**: Historical values as they were known at the time
- **Comprehensive**: All US equities (8,000+ tickers)
- **Corporate Actions**: Fully adjusted for splits and dividends
- **Professional Support**: Enterprise-grade support from NASDAQ

## Prerequisites

1. **Sharadar Subscription**: [Subscribe at NASDAQ Data Link](https://data.nasdaq.com/databases/SFA)
2. **API Key**: Get your key from Account Settings
3. **Bundle Ingested**: Run the setup first (see below)

## Step 1: Setup Sharadar Bundle

First, make sure you've ingested the Sharadar bundle. Run this **once** from the terminal:

```bash
# Set your API key
export NASDAQ_DATA_LINK_API_KEY='your_key_here'

# Setup Sharadar bundle with specific tickers (for testing)
python /scripts/manage_data.py setup --source sharadar \
    --tickers AAPL,MSFT,GOOGL,AMZN,TSLA,META,NVDA,NFLX \
    --name sharadar

# OR setup all US equities (production, takes 10-30 minutes)
python /scripts/manage_data.py setup --source sharadar --name sharadar-all

# Verify
zipline bundles
```
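
Before ingesting, it can help to confirm from Python that the API key is visible to the process. The sketch below is a minimal pre-flight check. `require_api_key` is a hypothetical helper (not part of zipline or the management script), and it assumes only the `NASDAQ_DATA_LINK_API_KEY` variable name used above.

```python
import os


def require_api_key(env=None, name="NASDAQ_DATA_LINK_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for this notebook, not a zipline function.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    # Treat the placeholder from the shell example as "not set".
    if not key or key == "your_key_here":
        raise RuntimeError(f"{name} is not set; export it before ingesting.")
    return key


# Example with an explicit mapping so the check is easy to try out.
print(require_api_key({"NASDAQ_DATA_LINK_API_KEY": "abc123"}))
```

Calling `require_api_key()` with no arguments reads from the real process environment.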

## Step 2: Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
from datetime import datetime

# Zipline imports
from zipline import run_algorithm
from zipline.api import (
    order_target_percent,
    symbol,
    record,
    set_commission,
    set_slippage,
    schedule_function,
    date_rules,
    time_rules,
)
from zipline.finance import commission, slippage

# Pipeline imports
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import SimpleMovingAverage, Returns, AverageDollarVolume
from zipline.api import attach_pipeline, pipeline_output

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print("✓ Libraries imported successfully!")

## Step 3: Simple Buy-and-Hold Strategy

Let's start with a simple strategy using Sharadar data:

In [None]:
def initialize(context):
    """Called once at start of backtest"""
    # Portfolio of tech stocks
    context.stocks = [
        symbol('AAPL'),
        symbol('MSFT'),
        symbol('GOOGL'),
        symbol('AMZN'),
    ]
    
    # Set commission and slippage to realistic values
    set_commission(commission.PerShare(cost=0.001, min_trade_cost=1.0))
    set_slippage(slippage.VolumeShareSlippage())

def handle_data(context, data):
    """Called every trading day"""
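    # Equal weight across all stocks; targets are re-applied on every bar,
    # so the portfolio is rebalanced daily rather than strictly held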
    weight = 1.0 / len(context.stocks)
    
    for stock in context.stocks:
        if data.can_trade(stock):
            order_target_percent(stock, weight)
    
    # Record portfolio value
    record(portfolio_value=context.portfolio.portfolio_value)

# Run backtest
print("Running backtest with Sharadar data...\n")

results = run_algorithm(
    start=pd.Timestamp('2022-01-01', tz='UTC'),
    end=pd.Timestamp('2023-12-31', tz='UTC'),
    initialize=initialize,
    handle_data=handle_data,
    capital_base=100000,
    bundle='sharadar',  # Using Sharadar bundle
)

print("✓ Backtest complete!\n")
print(f"Start Value: ${results['portfolio_value'].iloc[0]:,.2f}")
print(f"End Value: ${results['portfolio_value'].iloc[-1]:,.2f}")
print(f"Total Return: {(results['portfolio_value'].iloc[-1] / results['portfolio_value'].iloc[0] - 1) * 100:.2f}%")
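
Because `handle_data` calls `order_target_percent` on every bar, the strategy is pulled back to equal weights each day rather than bought once and left alone. The pure-Python sketch below uses made-up returns and a hypothetical `drifted_weights` helper to show how far weights drift when positions are simply held, which is the gap that daily re-targeting closes.

```python
def drifted_weights(weights, period_returns):
    """Weights after one period if positions are held without rebalancing."""
    grown = [w * (1.0 + r) for w, r in zip(weights, period_returns)]
    total = sum(grown)
    return [g / total for g in grown]


# Four equally weighted positions and illustrative (made-up) period returns.
weights = [0.25, 0.25, 0.25, 0.25]
returns = [0.10, -0.05, 0.00, 0.20]

held = drifted_weights(weights, returns)
print([round(w, 4) for w in held])
```

Re-targeting 25% per name trades the difference between `held` and `weights`. A strict buy-and-hold variant would place its orders once, for example on the first bar only.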

## Step 4: Analyze Results

In [None]:
# Calculate performance metrics
returns = results['returns']
cumulative_returns = (1 + returns).cumprod() - 1

print("Performance Metrics:")
print(f"  Total Return: {cumulative_returns.iloc[-1] * 100:.2f}%")
print(f"  Annual Return: {((cumulative_returns.iloc[-1] + 1) ** (252 / len(returns)) - 1) * 100:.2f}%")
print(f"  Volatility: {returns.std() * np.sqrt(252) * 100:.2f}%")
print(f"  Sharpe Ratio: {returns.mean() / returns.std() * np.sqrt(252):.2f}")
print(f"  Max Drawdown: {(results['portfolio_value'] / results['portfolio_value'].cummax() - 1).min() * 100:.2f}%")

# Plot portfolio value
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))

# Portfolio value
ax1.plot(results.index, results['portfolio_value'], linewidth=2)
ax1.set_title('Portfolio Value Over Time (Sharadar Data)', fontsize=14, fontweight='bold')
ax1.set_ylabel('Portfolio Value ($)')
ax1.grid(True, alpha=0.3)
ax1.axhline(y=100000, color='red', linestyle='--', alpha=0.5, label='Initial Capital')
ax1.legend()

# Cumulative returns
ax2.plot(cumulative_returns.index, cumulative_returns * 100, linewidth=2, color='green')
ax2.set_title('Cumulative Returns', fontsize=14, fontweight='bold')
ax2.set_xlabel('Date')
ax2.set_ylabel('Return (%)')
ax2.grid(True, alpha=0.3)
ax2.axhline(y=0, color='red', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()
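
The metrics in this step follow standard formulas: annualized return compounds the total return to a 252-day year, volatility and Sharpe ratio scale daily statistics by √252 (a zero risk-free rate is assumed here), and maximum drawdown is the worst decline from a running peak. The sketch below applies them to a short made-up return series in pure Python so the arithmetic can be checked without a bundle. `performance_summary` is a hypothetical helper.

```python
import math
import statistics


def performance_summary(daily_returns, periods_per_year=252):
    """Compute basic metrics from a list of simple daily returns."""
    # Equity starts at 1.0, and the running peak starts at that initial value.
    equity, peak, max_dd = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = min(max_dd, equity / peak - 1.0)

    total = equity - 1.0
    annual = (1.0 + total) ** (periods_per_year / len(daily_returns)) - 1.0
    stdev = statistics.stdev(daily_returns)  # sample std, like pandas .std()
    scale = math.sqrt(periods_per_year)
    return {
        "total": total,
        "annual": annual,
        "vol": stdev * scale,
        "sharpe": statistics.mean(daily_returns) / stdev * scale,
        "max_drawdown": max_dd,
    }


# Made-up returns for illustration only.
summary = performance_summary([0.01, -0.02, 0.015, 0.005, -0.01])
for name, value in summary.items():
    print(f"{name:>12}: {value: .4f}")
```

The values are fractions, so multiply by 100 before printing them with a percent sign.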

## Step 5: Advanced Strategy with Pipeline

Now let's build a more sophisticated momentum strategy using Pipeline:

In [None]:
def make_pipeline():
    """Create a pipeline for stock selection"""
    # Price and volume factors
    close_price = USEquityPricing.close.latest
    volume = USEquityPricing.volume.latest
    
    # Moving averages
    sma_50 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=50)
    sma_200 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=200)
    
    # Momentum (20-day return)
    momentum = Returns(window_length=20)
    
    # Dollar volume (liquidity filter)
    dollar_volume = AverageDollarVolume(window_length=30)
    
    # Filters
    # 1. Liquid stocks (top 500 by dollar volume)
    liquid = dollar_volume.top(500)
    # 2. Bullish trend (price > SMA50 > SMA200)
    bullish = (close_price > sma_50) & (sma_50 > sma_200)
    # 3. Strong momentum (top 100 names among liquid, bullish stocks)
    strong_momentum = momentum.top(100, mask=liquid & bullish)
    
    return Pipeline(
        columns={
            'close': close_price,
            'sma_50': sma_50,
            'sma_200': sma_200,
            'momentum': momentum,
            'dollar_volume': dollar_volume,
        },
        screen=strong_momentum,
    )

def initialize_advanced(context):
    """Advanced strategy initialization"""
    # Attach pipeline
    attach_pipeline(make_pipeline(), 'momentum_strategy')
    
    # Rebalance monthly
    schedule_function(
        rebalance,
        date_rules.month_start(),
        time_rules.market_open(hours=1),
    )
    
    # Set commission and slippage
    set_commission(commission.PerShare(cost=0.001, min_trade_cost=1.0))
    set_slippage(slippage.VolumeShareSlippage())
    
    # Maximum positions
    context.max_positions = 10

def rebalance(context, data):
    """Rebalance portfolio monthly"""
    # Get pipeline output
    pipeline_data = pipeline_output('momentum_strategy')
    
    # Select top stocks by momentum
    top_stocks = pipeline_data.nlargest(context.max_positions, 'momentum')
    
    # Calculate weights
    weight = 1.0 / len(top_stocks) if len(top_stocks) > 0 else 0
    
    # Rebalance
    for stock in context.portfolio.positions:
        if stock not in top_stocks.index:
            order_target_percent(stock, 0)
    
    for stock in top_stocks.index:
        if data.can_trade(stock):
            order_target_percent(stock, weight)
    
    # Record metrics
    record(
        num_positions=len(context.portfolio.positions),
        portfolio_value=context.portfolio.portfolio_value,
    )

print("Running advanced momentum strategy with Sharadar data...\n")

results_advanced = run_algorithm(
    start=pd.Timestamp('2022-01-01', tz='UTC'),
    end=pd.Timestamp('2023-12-31', tz='UTC'),
    initialize=initialize_advanced,
    capital_base=100000,
    bundle='sharadar',
)

print("✓ Advanced backtest complete!\n")
print(f"Total Return: {(results_advanced['portfolio_value'].iloc[-1] / results_advanced['portfolio_value'].iloc[0] - 1) * 100:.2f}%")
print(f"Sharpe Ratio: {results_advanced['returns'].mean() / results_advanced['returns'].std() * np.sqrt(252):.2f}")
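
The pipeline screen combines three conditions: membership in the most liquid names, a price > SMA50 > SMA200 trend, and a top-N momentum rank within that mask. `rebalance` then equal-weights the best `max_positions` names. The sketch below mirrors that selection order on a handful of made-up rows in plain Python. It illustrates the filtering logic only, not how zipline evaluates a Pipeline, and `select_targets` is a hypothetical helper.

```python
def select_targets(rows, liquid_n, momentum_n, max_positions):
    """Return {ticker: weight} following liquid -> bullish -> momentum."""
    # 1. Keep the `liquid_n` names with the highest dollar volume.
    by_liquidity = sorted(rows, key=lambda r: r["dollar_volume"], reverse=True)
    liquid = by_liquidity[:liquid_n]
    # 2. Keep names in an uptrend: close > SMA50 > SMA200.
    bullish = [r for r in liquid if r["close"] > r["sma_50"] > r["sma_200"]]
    # 3. Rank the survivors by momentum and keep the strongest.
    ranked = sorted(bullish, key=lambda r: r["momentum"], reverse=True)
    chosen = ranked[:min(momentum_n, max_positions)]
    weight = 1.0 / len(chosen) if chosen else 0.0
    return {r["ticker"]: weight for r in chosen}


# Made-up rows for illustration only.
rows = [
    {"ticker": "AAA", "close": 110, "sma_50": 100, "sma_200": 90,
     "momentum": 0.08, "dollar_volume": 9e8},
    {"ticker": "BBB", "close": 95, "sma_50": 100, "sma_200": 90,
     "momentum": 0.12, "dollar_volume": 8e8},   # below its SMA50
    {"ticker": "CCC", "close": 60, "sma_50": 55, "sma_200": 50,
     "momentum": 0.05, "dollar_volume": 7e8},
    {"ticker": "DDD", "close": 20, "sma_50": 18, "sma_200": 15,
     "momentum": 0.20, "dollar_volume": 1e6},   # too illiquid
]

print(select_targets(rows, liquid_n=3, momentum_n=100, max_positions=2))
```

With the eight-ticker test bundle from Step 1, `top(500)` and `top(100)` cannot exclude anything, so the trend filter does all of the screening until a larger universe is ingested.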

## Step 6: Compare Strategies

In [None]:
# Calculate cumulative returns
simple_returns = (1 + results['returns']).cumprod() - 1
advanced_returns = (1 + results_advanced['returns']).cumprod() - 1

# Plot comparison
fig, ax = plt.subplots(figsize=(14, 8))

ax.plot(simple_returns.index, simple_returns * 100, 
        label='Buy & Hold', linewidth=2, alpha=0.8)
ax.plot(advanced_returns.index, advanced_returns * 100, 
        label='Momentum Strategy', linewidth=2, alpha=0.8)

ax.set_title('Strategy Comparison (Sharadar Data)', fontsize=16, fontweight='bold')
ax.set_xlabel('Date', fontsize=12)
ax.set_ylabel('Cumulative Return (%)', fontsize=12)
ax.legend(fontsize=12)
ax.grid(True, alpha=0.3)
ax.axhline(y=0, color='red', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()

# Print comparison table
print("\nStrategy Comparison:")
print("="*60)
print(f"{'Metric':<30} {'Buy & Hold':>15} {'Momentum':>15}")
print("="*60)

metrics = {
    'Total Return (%)': [
        simple_returns.iloc[-1] * 100,
        advanced_returns.iloc[-1] * 100,
    ],
    'Volatility (%)': [
        results['returns'].std() * np.sqrt(252) * 100,
        results_advanced['returns'].std() * np.sqrt(252) * 100,
    ],
    'Sharpe Ratio': [
        results['returns'].mean() / results['returns'].std() * np.sqrt(252),
        results_advanced['returns'].mean() / results_advanced['returns'].std() * np.sqrt(252),
    ],
    'Max Drawdown (%)': [
        (results['portfolio_value'] / results['portfolio_value'].cummax() - 1).min() * 100,
        (results_advanced['portfolio_value'] / results_advanced['portfolio_value'].cummax() - 1).min() * 100,
    ],
}

for metric, values in metrics.items():
    print(f"{metric:<30} {values[0]:>15.2f} {values[1]:>15.2f}")

print("="*60)

## Step 7: Daily Update Workflow

To keep your Sharadar data current, set up daily updates:

```bash
# Manual update
python /scripts/manage_data.py update --bundle sharadar

# Or automate with cron (18:00 in the server's local time zone, Monday-Friday;
# set the server or cron time zone to US Eastern for 6 PM ET)
0 18 * * 1-5 python /scripts/manage_data.py update --bundle sharadar
```
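
Cron evaluates its schedule in the server's local time zone, so a job meant for 6 PM Eastern needs either a server set to that zone or a guard inside the script. The sketch below is one possible guard using the standard-library `zoneinfo` module (Python 3.9+). `should_run_update` is a hypothetical helper, and it ignores market holidays.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def should_run_update(now_utc, earliest_hour=18):
    """True on weekdays at or after `earliest_hour` Eastern time."""
    now_et = now_utc.astimezone(EASTERN)
    is_weekday = now_et.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and now_et.hour >= earliest_hour


print(should_run_update(datetime.now(timezone.utc)))
```

Converting from UTC keeps the check correct across daylight-saving changes, which a fixed UTC hour in cron would not.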

## Best Practices with Sharadar Data

### 1. Data Quality
- Sharadar provides **point-in-time data** - values are as they were known historically
- Corporate actions (splits, dividends) are already adjusted
- Point-in-time records help avoid look-ahead bias, which makes the data well suited to backtesting
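
Point-in-time means a backtest on a given date should only see values that had been published by then. The sketch below illustrates the idea on made-up records in plain Python: each record carries the date it became known, and `as_of` (a hypothetical helper) returns the latest value known on or before the query date. The record layout is an assumption for illustration, not Sharadar's schema.

```python
from bisect import bisect_right
from datetime import date


def as_of(records, query_date):
    """Latest value whose `known_on` date is <= query_date, else None.

    `records` is a list of (known_on, value) tuples sorted by `known_on`.
    """
    known_dates = [known_on for known_on, _ in records]
    idx = bisect_right(known_dates, query_date)
    return records[idx - 1][1] if idx else None


# Made-up values: one figure published in February, one in May.
records = [(date(2023, 2, 3), 1.88), (date(2023, 5, 5), 1.52)]

print(as_of(records, date(2023, 3, 1)))   # sees only the February value
print(as_of(records, date(2023, 1, 15)))  # nothing published yet
```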

### 2. Performance
- Start with a subset of tickers for development
- Use the full dataset for production
- Bundle data is optimized with bcolz compression

### 3. Updates
- Sharadar updates daily after market close
- Run bundle updates after 6 PM ET
- Keep 3-5 recent ingestions as backup
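
Keeping only the most recent ingestions can be automated. zipline's own `zipline clean` command has a `--keep-last` option for this. The sketch below shows the same selection logic in plain Python, assuming ingestion folders are named with sortable ISO-style timestamps. `ingestions_to_prune` is a hypothetical helper that only returns names and deletes nothing.

```python
def ingestions_to_prune(names, keep_last=3):
    """Return ingestion names to remove, oldest first, keeping the newest."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    ordered = sorted(names)  # ISO-style timestamps sort chronologically
    return ordered[:max(len(ordered) - keep_last, 0)]


# Made-up folder names for illustration.
names = [
    "2024-01-05T23;10;00", "2024-01-02T23;10;00",
    "2024-01-04T23;10;00", "2024-01-03T23;10;00",
]
print(ingestions_to_prune(names, keep_last=3))
```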

### 4. Fundamental Data
- Sharadar also provides the SF1 (fundamentals) table
- Use CustomData to integrate fundamentals into Pipeline
- Combine pricing + fundamentals for sophisticated strategies

## Next Steps

1. **Expand Universe**: Add more tickers or use `sharadar-all` for full coverage
2. **Add Fundamentals**: Integrate SF1 fundamental data via CustomData
3. **Advanced Factors**: Build custom Pipeline factors
4. **Risk Management**: Add position sizing and stop-losses
5. **Live Trading**: Connect to broker API for live execution

## Resources

- [Sharadar Documentation](https://data.nasdaq.com/databases/SFA/documentation)
- [Sharadar Setup Guide](../docs/SHARADAR_GUIDE.md)
- [Bundle System](../docs/BUNDLES.md)
- [Pipeline Tutorial](https://zipline.ml4trading.io/beginner-tutorial.html)
- [ML4Trading Book](https://www.ml4trading.io/) - Excellent resource using Sharadar