# Backtesting with Sharadar Bundle

This notebook demonstrates how to use Sharadar data bundles for professional-grade backtesting.

## Why Use Bundles?

- **Persistent Storage**: Data is ingested once and stored on disk in bcolz format
- **Fast Access**: Backtests read from the local bundle instead of calling the API on each run
- **Point-in-Time**: Historical data as it was known at the time, which helps avoid look-ahead bias
- **Corporate Actions**: Splits and dividends are handled automatically
- **Institutional Quality**: Sharadar is a curated commercial dataset intended for professional use

## Prerequisites

1. NASDAQ Data Link API key with Sharadar subscription
2. Sharadar bundle ingested (see Step 1 below)

## Step 1: Set Up the Sharadar Bundle

First-time setup (run in terminal):

```bash
# Set your API key
export NASDAQ_DATA_LINK_API_KEY='your_key_here'

# Test with specific tickers (recommended)
python scripts/manage_sharadar.py ingest --tickers AAPL,MSFT,GOOGL,AMZN,TSLA

# Or download the full database (~13,000 tickers)
python scripts/manage_sharadar.py ingest --all

# Check status
python scripts/manage_sharadar.py status
```
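
Before starting a long ingest, it can help to confirm that the key is actually visible to the Python process. The sketch below is an illustration only: `get_api_key` is a hypothetical helper (it is not part of `manage_sharadar.py`), and it assumes the key is read from the `NASDAQ_DATA_LINK_API_KEY` environment variable exported above.

```python
import os


def get_api_key(env_var: str = "NASDAQ_DATA_LINK_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; the placeholder check assumes the
    literal 'your_key_here' from the setup snippet was never replaced.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == "your_key_here":
        raise RuntimeError(
            f"{env_var} is not set (or still holds the placeholder value)."
        )
    return key


if __name__ == "__main__":
    try:
        key = get_api_key()
        # Avoid printing the key itself; show only its length.
        print(f"API key found ({len(key)} characters)")
    except RuntimeError as exc:
        print(f"Setup problem: {exc}")
```

Running this in the same shell session used for the `export` command shows whether the variable was exported correctly before any data is downloaded.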

## Step 2: Import Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings

from zipline import run_algorithm
from zipline.api import (
    order_target_percent,
    symbol,
    record,
    set_commission,
    set_slippage,
)
from zipline.finance import commission, slippage

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print("✓ Libraries imported")

## Step 3: Simple Buy-and-Hold Strategy

In [None]:
def initialize(context):
    """Called once at start of backtest"""
    # Portfolio of tech stocks
    context.stocks = [
        symbol('AAPL'),
        symbol('MSFT'),
        symbol('GOOGL'),
    ]
    
    # Set realistic commission and slippage
    set_commission(commission.PerShare(cost=0.001, min_trade_cost=1.0))
    set_slippage(slippage.VolumeShareSlippage())

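def handle_data(context, data):
    """Called on every bar (once per trading day with daily data)"""
    # Rebalance to equal weights on every call. Note that this trades small
    # amounts each day to correct drift, so it is not a strict buy-and-hold.
    weight = 1.0 / len(context.stocks)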
    
    for stock in context.stocks:
        if data.can_trade(stock):
            order_target_percent(stock, weight)
    
    # Record metrics
    record(
        portfolio_value=context.portfolio.portfolio_value,
        cash=context.portfolio.cash,
    )

# Run backtest
print("Running backtest...\n")

results = run_algorithm(
    start=pd.Timestamp('2022-01-01', tz='UTC'),
    end=pd.Timestamp('2023-12-31', tz='UTC'),
    initialize=initialize,
    handle_data=handle_data,
    capital_base=100000,
    bundle='sharadar',
)

print("✓ Backtest complete!")
# Measure against the starting capital (capital_base above), consistent with Step 4
print(f"Total Return: {(results['portfolio_value'].iloc[-1] / 100000 - 1) * 100:.2f}%")

## Step 4: Analyze Results

In [None]:
# Calculate performance metrics
print("\n" + "="*70)
print("PERFORMANCE SUMMARY")
print("="*70 + "\n")

initial_value = 100000  # must match capital_base passed to run_algorithm
final_value = results['portfolio_value'].iloc[-1]
total_return = (final_value - initial_value) / initial_value

print(f"Initial Capital:  ${initial_value:,.2f}")
print(f"Final Value:      ${final_value:,.2f}")
print(f"Total Return:     {total_return*100:+.2f}%")
print(f"Total P/L:        ${final_value - initial_value:+,.2f}")
print()

# Risk metrics from daily returns, annualized with 252 trading days
# (the Sharpe ratio here assumes a zero risk-free rate)
returns = results['returns']
print(f"Sharpe Ratio:     {returns.mean() / returns.std() * np.sqrt(252):.2f}")
print(f"Max Drawdown:     {(results['portfolio_value'] / results['portfolio_value'].cummax() - 1).min()*100:.2f}%")
print(f"Volatility:       {returns.std() * np.sqrt(252) * 100:.2f}%")
print()
print(f"Trading Days:     {len(results)}")
print("\n" + "="*70)
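
The metric expressions above are repeated again in Step 7, so it may be convenient to wrap them in small functions. This is a sketch under the same assumptions as the cell above (daily returns, 252 trading days per year); the function names and the `risk_free_rate` parameter are illustrative and do not come from Zipline.

```python
import numpy as np
import pandas as pd

TRADING_DAYS_PER_YEAR = 252  # annualization factor assumed throughout


def annualized_volatility(returns: pd.Series) -> float:
    """Standard deviation of daily returns, scaled to a yearly figure."""
    return float(returns.std() * np.sqrt(TRADING_DAYS_PER_YEAR))


def sharpe_ratio(returns: pd.Series, risk_free_rate: float = 0.0) -> float:
    """Annualized Sharpe ratio; risk_free_rate is an annual rate (default 0)."""
    excess = returns - risk_free_rate / TRADING_DAYS_PER_YEAR
    std = excess.std()
    # Guard against a flat return series, where the ratio is undefined.
    if np.isnan(std) or np.isclose(std, 0.0):
        return float("nan")
    return float(excess.mean() / std * np.sqrt(TRADING_DAYS_PER_YEAR))


def max_drawdown(values: pd.Series) -> float:
    """Largest peak-to-trough decline of a value series, as a negative fraction."""
    return float((values / values.cummax() - 1).min())


# Toy portfolio values (not backtest output): peak of 110, trough of 99
toy_values = pd.Series([100.0, 110.0, 99.0, 120.0])
print(f"Max Drawdown: {max_drawdown(toy_values):.2%}")  # -10.00%
```

With the backtest output, the equivalent calls would be `sharpe_ratio(results['returns'])` and `max_drawdown(results['portfolio_value'])`.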

## Step 5: Visualize Performance

In [None]:
fig, axes = plt.subplots(3, 1, figsize=(14, 12))
fig.suptitle('Backtest Results (Sharadar Data)', fontsize=16, fontweight='bold')

# Portfolio value
ax1 = axes[0]
ax1.plot(results.index, results['portfolio_value'], linewidth=2)
ax1.axhline(y=100000, color='gray', linestyle='--', alpha=0.5, label='Initial Capital')
ax1.set_title('Portfolio Value', fontweight='bold')
ax1.set_ylabel('Value ($)')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Cumulative returns
ax2 = axes[1]
cumulative_returns = (1 + results['returns']).cumprod() - 1
ax2.plot(results.index, cumulative_returns * 100, linewidth=2, color='green')
ax2.set_title('Cumulative Returns', fontweight='bold')
ax2.set_ylabel('Return (%)')
ax2.grid(True, alpha=0.3)
ax2.axhline(y=0, color='red', linestyle='--', alpha=0.5)

# Drawdown
ax3 = axes[2]
drawdown = (results['portfolio_value'] / results['portfolio_value'].cummax() - 1) * 100
ax3.fill_between(results.index, 0, drawdown, alpha=0.3, color='red')
ax3.plot(results.index, drawdown, linewidth=2, color='darkred')
ax3.set_title('Drawdown', fontweight='bold')
ax3.set_xlabel('Date')
ax3.set_ylabel('Drawdown (%)')
ax3.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("✓ Charts displayed")

## Step 6: Inspect Your Data

Check what data is available in your bundle:

In [None]:
# Run in terminal for detailed inspection:
# python scripts/inspect_bundle.py
# python scripts/inspect_bundle.py --ticker AAPL

print("Bundle inspection commands:")
print("  python scripts/inspect_bundle.py")
print("  python scripts/inspect_bundle.py --ticker AAPL")
print("\nOr check status:")
print("  python scripts/manage_sharadar.py status")

## Step 7: Save Results

Export results for further analysis:

In [None]:
# Save to pickle
results.to_pickle('backtest_results.pkl')
print("✓ Saved results to backtest_results.pkl")

# Export returns
results['returns'].to_csv('backtest_returns.csv')
print("✓ Saved returns to backtest_returns.csv")

# Export summary metrics
metrics = pd.DataFrame({
    'metric': ['Total Return', 'Sharpe Ratio', 'Max Drawdown', 'Volatility'],
    'value': [
        f"{total_return*100:.2f}%",
        f"{returns.mean() / returns.std() * np.sqrt(252):.2f}",
        f"{(results['portfolio_value'] / results['portfolio_value'].cummax() - 1).min()*100:.2f}%",
        f"{returns.std() * np.sqrt(252) * 100:.2f}%",
    ]
})
metrics.to_csv('backtest_metrics.csv', index=False)
print("✓ Saved metrics to backtest_metrics.csv")
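
The exported files can be loaded back in a later session. The sketch below uses a small synthetic series in place of `results['returns']` so that it runs on its own; the values and the temporary directory are stand-ins, and the `index_col=0, parse_dates=True` arguments assume the layout that `Series.to_csv` writes (dates in the first column, a `returns` header).

```python
import tempfile
from pathlib import Path

import pandas as pd

# Synthetic stand-in for results['returns'] (illustrative values only)
returns = pd.Series(
    [0.0, 0.01, -0.005],
    index=pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"]),
    name="returns",
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "backtest_returns.csv"
    returns.to_csv(csv_path)

    # Read the dates back as the index and select the single data column
    loaded = pd.read_csv(csv_path, index_col=0, parse_dates=True)["returns"]

print(loaded)
```

For the pickle file, `pd.read_pickle('backtest_results.pkl')` restores the full results DataFrame. Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.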

## Bundle Management

### Check Available Bundles

In [None]:
!zipline bundles

### Clean Old Ingestions

```bash
# Clean old ingestions, keep last 3
python scripts/manage_sharadar.py clean --keep-last 3
```
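
To preview what a "keep last 3" policy would remove, the selection logic can be sketched as a dry run. This is not the implementation of `manage_sharadar.py`: `ingestions_to_remove` is a hypothetical helper, and it assumes each ingestion lives in its own timestamp-named subdirectory of the bundle directory, so that sorting by name orders them from oldest to newest. Nothing is deleted.

```python
import tempfile
from pathlib import Path


def ingestions_to_remove(bundle_dir: Path, keep_last: int = 3) -> list[Path]:
    """Return the ingestion directories that fall outside the newest keep_last.

    Assumes one subdirectory per ingestion, named so that lexicographic
    order matches chronological order (for example ISO-style timestamps).
    """
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    ingestions = sorted(
        (p for p in bundle_dir.iterdir() if p.is_dir()), key=lambda p: p.name
    )
    if keep_last == 0:
        return ingestions
    return ingestions[:-keep_last]


# Demonstration on a throwaway directory with made-up ingestion names
with tempfile.TemporaryDirectory() as tmp:
    bundle_dir = Path(tmp)
    for stamp in ["2023-01-01", "2023-02-01", "2023-03-01", "2023-04-01", "2023-05-01"]:
        (bundle_dir / stamp).mkdir()

    removable = [p.name for p in ingestions_to_remove(bundle_dir, keep_last=3)]

print(removable)  # ['2023-01-01', '2023-02-01']
```

Zipline's own `zipline clean` command also accepts a `--keep-last` option, which may be an alternative if the project script is unavailable.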

## Summary

### What We've Accomplished

1. ✅ Set up the Sharadar bundle with institutional-quality data
2. ✅ Ran a buy-and-hold backtest
3. ✅ Analyzed performance with key metrics
4. ✅ Visualized portfolio performance and drawdown
5. ✅ Exported results for further analysis

### Data Flow

```
NASDAQ Data Link (Sharadar)
    ↓
Bundle Ingestion (manage_sharadar.py)
    ↓
Bundle Storage (~/.zipline/data/sharadar/)
    ↓
Backtest (run_algorithm)
    ↓
Results & Analysis
```

### Best Practices

1. **Start Small**: Test with a few tickers before downloading the full database
2. **Realistic Costs**: Always set commission and slippage
3. **Version Control**: Keep multiple bundle ingestions as backup
4. **Point-in-Time**: Point-in-time data helps avoid look-ahead bias, but strategy logic must still use only information available on each bar
5. **Analysis**: Use pyfolio for detailed performance analysis (see `analyze_backtest_results.ipynb`)
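
To make the "Realistic Costs" point concrete, the arithmetic behind a per-share commission with a minimum charge can be sketched as follows. The defaults mirror the `PerShare(cost=0.001, min_trade_cost=1.0)` settings from Step 3, but `per_share_commission` is a hypothetical helper and a simplification: it treats each order as a single fill and is not Zipline's actual commission code.

```python
def per_share_commission(
    shares: int,
    cost_per_share: float = 0.001,
    min_trade_cost: float = 1.0,
) -> float:
    """Simplified commission for one order: per-share fee with a minimum.

    Assumes the whole order fills at once; buys and sells are charged alike.
    """
    if shares == 0:
        return 0.0
    return max(abs(shares) * cost_per_share, min_trade_cost)


# Under these settings the minimum dominates below 1,000 shares
for shares in (50, 200, 1000, 5000):
    print(f"{shares:>5} shares -> ${per_share_commission(shares):.2f}")
```

Under this simplified model, small orders pay the full minimum, so a strategy that places many small rebalancing trades can accumulate noticeable costs even when the per-share rate looks negligible.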

### Next Steps

- Explore advanced strategies in `06_sharadar_professional_backtesting.ipynb`
- Learn Pipeline API in `07_pipeline_research.ipynb`
- Analyze results with pyfolio in `analyze_backtest_results.ipynb`

## Resources

- [Sharadar Documentation](https://data.nasdaq.com/databases/SFA/documentation)
- [Zipline Documentation](https://zipline.ml4trading.io/)
- [Bundle System Guide](https://zipline.ml4trading.io/bundles.html)
- [TradingAlgorithm API](https://zipline.ml4trading.io/api-reference.html)