# S&P 500 Index Simulation & Counterfactual Analysis

This notebook performs quantitative simulation and "what-if" analysis on the S&P 500 index.

**Key Features:**

- Scrape current S&P 500 constituents
- Simulate equal-weighted index performance
- Perform counterfactual analysis (e.g., "S&P 500 without Magnificent 7")
- Calculate comprehensive performance metrics
- Compare against benchmark

**Methodology:**

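The S&P 500 is market-cap weighted, but accurate historical market-cap data is not freely available. This analysis therefore uses an **equal-weighted proxy**: every constituent receives the same weight, so the portfolio's daily return is the simple average of the constituents' daily returns. This is a common simplification for analyzing constituent contribution, but it gives each mega-cap stock the same 1/N weight as any other name, so it understates their influence on the official cap-weighted index.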
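To make the difference between the two weighting schemes concrete, here is a minimal, dependency-free sketch on made-up numbers. The tickers, market caps, returns, and function names are all hypothetical and are not part of this project's `core` modules.

```python
# Toy one-day example with three hypothetical stocks (values are made up).
market_caps = {"MEGA": 3000.0, "MID": 300.0, "SMALL": 30.0}  # e.g. in $bn
daily_returns = {"MEGA": 0.02, "MID": -0.01, "SMALL": 0.00}


def equal_weighted_return(returns):
    """Simple average: every constituent gets weight 1/N."""
    return sum(returns.values()) / len(returns)


def cap_weighted_return(returns, caps):
    """Weight each constituent by its share of total market cap."""
    total_cap = sum(caps[ticker] for ticker in returns)
    return sum(returns[ticker] * caps[ticker] / total_cap for ticker in returns)


ew = equal_weighted_return(daily_returns)
cw = cap_weighted_return(daily_returns, market_caps)
print(f"Equal-weighted return: {ew:.4%}")
print(f"Cap-weighted return:   {cw:.4%}")
```

In this toy case the cap-weighted return is dominated by the largest stock, while the equal-weighted return treats all three names alike. That gap is why results from the proxy should not be read as results for the official index.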


In [None]:
import sys
import os

# Add src directory to path
sys.path.insert(0, os.path.abspath('../src'))

import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import yfinance as yf

from core.market_sim import get_sp500_tickers, analyze_index_exclusion
from core.performance_metrics import (
    generate_performance_report,
    print_performance_report,
    calculate_max_drawdown
)
from config import DEFAULT_START_DATE, DEFAULT_RISK_FREE_RATE

pd.set_option('display.precision', 4)
pd.set_option('display.float_format', '{:.4f}'.format)


## 1. Configuration

Define simulation parameters:


In [None]:
# Simulation Configuration
START_DATE = "2020-01-01"  # Analysis start date
RISK_FREE_RATE = 0.02      # Annual risk-free rate (2%)

# Define companies to exclude ("The Magnificent 7").
# Seven companies but eight tickers: Alphabet has two listed share classes.
MAGNIFICENT_SEVEN = [
    'AAPL',   # Apple
    'MSFT',   # Microsoft
    'GOOGL',  # Alphabet (Class A)
    'GOOG',   # Alphabet (Class C)
    'AMZN',   # Amazon
    'NVDA',   # NVIDIA
    'META',   # Meta (Facebook)
    'TSLA'    # Tesla
]

print("Simulation Parameters:")
print(f"  Start Date: {START_DATE}")
print(f"  Exclusion List: {', '.join(MAGNIFICENT_SEVEN)}")
print(f"  Risk-Free Rate: {RISK_FREE_RATE:.1%}")


## 2. Fetch S&P 500 Constituents


In [None]:
# Scrape current S&P 500 constituents from Wikipedia
sp500_tickers = get_sp500_tickers()

print(f"\nTotal S&P 500 constituents fetched: {len(sp500_tickers)}")
print(f"\nSample tickers: {sp500_tickers[:10]}")


## 3. Run Counterfactual Simulation

This compares three scenarios:

1. **S&P 500 Benchmark (^GSPC)**: The official index
2. **Equal-Weighted Baseline**: All 500+ constituents equally weighted
3. **Equal-Weighted Modified**: Excluding specified companies
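The baseline and modified scenarios differ only in which tickers enter the daily average. The sketch below illustrates that idea on made-up data. It is an assumption about the general technique, not the actual code inside `analyze_index_exclusion`, and the tickers and helper names are hypothetical.

```python
# Made-up daily returns for four hypothetical tickers over two days.
day_returns = [
    {"AAA": 0.10, "BBB": 0.00, "CCC": -0.02, "DDD": 0.00},
    {"AAA": 0.05, "BBB": 0.01, "CCC": 0.00, "DDD": 0.02},
]


def equal_weight_return(returns_by_ticker, exclude=()):
    """Average the returns of all tickers that are not excluded."""
    excluded = set(exclude)
    kept = [r for ticker, r in returns_by_ticker.items() if ticker not in excluded]
    if not kept:
        raise ValueError("No tickers left after exclusion")
    return sum(kept) / len(kept)


def cumulative_return(daily_returns):
    """Compound a sequence of daily returns into a total return."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


baseline_total = cumulative_return(
    [equal_weight_return(day) for day in day_returns]
)
modified_total = cumulative_return(
    [equal_weight_return(day, exclude=["AAA"]) for day in day_returns]
)
print(f"Baseline total return: {baseline_total:.4%}")
print(f"Modified total return: {modified_total:.4%}")
```

Averaging the returns each day implies a portfolio that is rebalanced to equal weights daily, and removing k of N names simply raises each remaining weight from 1/N to 1/(N-k).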


In [None]:
# Run the simulation
fig_sim = analyze_index_exclusion(
    exclusion_list=MAGNIFICENT_SEVEN,
    start_date=START_DATE
)

if fig_sim:
    fig_sim.show()
else:
    print("Simulation failed. Check error messages above.")


## 4. Detailed Performance Metrics

Calculate comprehensive risk and return statistics for each scenario:
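For orientation, the sketch below shows one common way to compute annualized return, annualized volatility, and the Sharpe ratio from daily returns. It assumes 252 trading days per year and a constant risk-free rate. The function names are hypothetical, and `generate_performance_report` may use different conventions.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_return(daily_returns):
    """Geometric annualization of compounded daily returns."""
    growth = math.prod(1.0 + r for r in daily_returns)
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def annualized_volatility(daily_returns):
    """Sample standard deviation of daily returns, scaled by sqrt(252)."""
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(daily_returns, risk_free_rate=0.02):
    """Annualized mean excess return divided by annualized volatility."""
    daily_rf = risk_free_rate / TRADING_DAYS
    excess = [r - daily_rf for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(TRADING_DAYS)


sample_returns = [0.01, -0.005, 0.003, 0.007, -0.002]  # made-up data
print(f"Annualized return:     {annualized_return(sample_returns):.2%}")
print(f"Annualized volatility: {annualized_volatility(sample_returns):.2%}")
print(f"Sharpe ratio:          {sharpe_ratio(sample_returns):.2f}")
```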


In [None]:
# Download data manually for detailed analysis
print("Downloading data for detailed analysis...")

all_tickers = sp500_tickers + ['^GSPC']
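# Request adjusted prices explicitly: newer yfinance releases default to
# auto_adjust=True and then no longer return an 'Adj Close' column.
data = yf.download(all_tickers, start=START_DATE, auto_adjust=True, progress=False)['Close']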
data = data.dropna(axis=1, how='all')

# Calculate daily returns; fill_method=None avoids forward-filling missing prices
returns = data.pct_change(fill_method=None).dropna(how='all')

# Define portfolios
benchmark_returns = returns['^GSPC']
stock_tickers = [t for t in sp500_tickers if t in returns.columns]
baseline_returns = returns[stock_tickers].mean(axis=1)
modified_tickers = [t for t in stock_tickers if t not in MAGNIFICENT_SEVEN]
modified_returns = returns[modified_tickers].mean(axis=1)

print(f"Valid stock tickers: {len(stock_tickers)}")
print(f"Modified portfolio tickers: {len(modified_tickers)}")


### 4.1 S&P 500 Benchmark Performance


In [None]:
report_benchmark = generate_performance_report(
    benchmark_returns,
    risk_free_rate=RISK_FREE_RATE
)
print_performance_report(report_benchmark)


### 4.2 Equal-Weighted Baseline Performance


In [None]:
report_baseline = generate_performance_report(
    baseline_returns,
    benchmark_returns=benchmark_returns,
    risk_free_rate=RISK_FREE_RATE
)
print_performance_report(report_baseline)


### 4.3 Modified Portfolio Performance (Excluding Magnificent 7)


In [None]:
report_modified = generate_performance_report(
    modified_returns,
    benchmark_returns=benchmark_returns,
    risk_free_rate=RISK_FREE_RATE
)
print_performance_report(report_modified)


## 5. Comparative Analysis

Side-by-side comparison of key metrics:
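The Sortino and Calmar ratios in the table are less familiar than the Sharpe ratio, and their definitions vary between sources. The sketch below uses one common convention (downside deviation measured against the daily risk-free rate, 252 trading days per year). The helper names are hypothetical and may not match what `core.performance_metrics` computes.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year


def downside_deviation(daily_returns, daily_target=0.0):
    """Root-mean-square of returns below the target; gains count as zero."""
    shortfalls = [min(r - daily_target, 0.0) ** 2 for r in daily_returns]
    return math.sqrt(sum(shortfalls) / len(daily_returns))


def sortino_ratio(daily_returns, risk_free_rate=0.02):
    """Annualized mean excess return per unit of downside deviation."""
    daily_rf = risk_free_rate / TRADING_DAYS
    mean_excess = sum(r - daily_rf for r in daily_returns) / len(daily_returns)
    dd = downside_deviation(daily_returns, daily_rf)
    if dd == 0.0:
        # No observation fell below the target, so the ratio is unbounded.
        return float("inf") if mean_excess > 0 else float("nan")
    return mean_excess / dd * math.sqrt(TRADING_DAYS)


def calmar_ratio(annualized_return, max_drawdown):
    """Annualized return divided by the magnitude of the maximum drawdown."""
    if max_drawdown == 0.0:
        return float("nan")
    return annualized_return / abs(max_drawdown)


sample_returns = [0.01, -0.02, 0.015, -0.005, 0.004]  # made-up data
print(f"Sortino ratio: {sortino_ratio(sample_returns):.2f}")
print(f"Calmar ratio:  {calmar_ratio(0.10, -0.25):.2f}")
```

Unlike the Sharpe ratio, the Sortino ratio penalizes only downside moves, and the Calmar ratio relates return to the single worst peak-to-trough loss.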


In [None]:
# Create comparison table
comparison_data = {
    'Metric': [
        'Total Return',
        'Annualized Return',
        'Annualized Volatility',
        'Sharpe Ratio',
        'Sortino Ratio',
        'Calmar Ratio',
        'Maximum Drawdown'
    ],
    'S&P 500 Benchmark': [
        f"{report_benchmark['Total_Return']:.2%}",
        f"{report_benchmark['Annualized_Return']:.2%}",
        f"{report_benchmark['Annualized_Volatility']:.2%}",
        f"{report_benchmark['Sharpe_Ratio']:.2f}",
        f"{report_benchmark['Sortino_Ratio']:.2f}",
        f"{report_benchmark['Calmar_Ratio']:.2f}",
        f"{report_benchmark['Max_Drawdown']:.2%}"
    ],
    'Equal-Weighted Baseline': [
        f"{report_baseline['Total_Return']:.2%}",
        f"{report_baseline['Annualized_Return']:.2%}",
        f"{report_baseline['Annualized_Volatility']:.2%}",
        f"{report_baseline['Sharpe_Ratio']:.2f}",
        f"{report_baseline['Sortino_Ratio']:.2f}",
        f"{report_baseline['Calmar_Ratio']:.2f}",
        f"{report_baseline['Max_Drawdown']:.2%}"
    ],
    'Modified (Ex-Mag7)': [
        f"{report_modified['Total_Return']:.2%}",
        f"{report_modified['Annualized_Return']:.2%}",
        f"{report_modified['Annualized_Volatility']:.2%}",
        f"{report_modified['Sharpe_Ratio']:.2f}",
        f"{report_modified['Sortino_Ratio']:.2f}",
        f"{report_modified['Calmar_Ratio']:.2f}",
        f"{report_modified['Max_Drawdown']:.2%}"
    ]
}

df_comparison = pd.DataFrame(comparison_data)
display(df_comparison)


## 6. Drawdown Analysis

Visualize the drawdown periods for each scenario:
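A drawdown series measures, at each date, how far the portfolio sits below its running peak. The sketch below shows the usual calculation on made-up returns. It is an illustration only: `calculate_max_drawdown` may differ in details such as whether the starting capital counts as the first peak, and `drawdown_series` is a hypothetical name.

```python
def drawdown_series(daily_returns):
    """Return the drawdown (<= 0) after each daily return."""
    wealth = 1.0
    running_peak = 1.0  # assumption: starting capital counts as the first peak
    drawdowns = []
    for r in daily_returns:
        wealth *= 1.0 + r
        running_peak = max(running_peak, wealth)
        drawdowns.append(wealth / running_peak - 1.0)
    return drawdowns


sample_returns = [0.10, -0.50, 0.20, 0.70]  # made-up data
dd = drawdown_series(sample_returns)
max_dd = min(dd)  # the most negative value is the maximum drawdown
print([f"{x:.1%}" for x in dd])
print(f"Maximum drawdown: {max_dd:.1%}")
```

The series returns to zero whenever the portfolio reaches a new high, which is why the plotted areas touch the x-axis at recovery points.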


In [None]:
# Calculate drawdowns
dd_benchmark = calculate_max_drawdown(benchmark_returns)['Drawdown_Series']
dd_baseline = calculate_max_drawdown(baseline_returns)['Drawdown_Series']
dd_modified = calculate_max_drawdown(modified_returns)['Drawdown_Series']

# Create drawdown plot
fig_dd = go.Figure()

fig_dd.add_trace(go.Scatter(
    x=dd_benchmark.index,
    y=dd_benchmark * 100,
    name='S&P 500 Benchmark',
    line=dict(color='black', width=2),
    fill='tozeroy'
))

fig_dd.add_trace(go.Scatter(
    x=dd_baseline.index,
    y=dd_baseline * 100,
    name='Equal-Weighted Baseline',
    line=dict(color='blue', width=2),
    fill='tozeroy'
))

fig_dd.add_trace(go.Scatter(
    x=dd_modified.index,
    y=dd_modified * 100,
    name='Modified (Ex-Mag7)',
    line=dict(color='red', width=2),
    fill='tozeroy'
))

fig_dd.update_layout(
    title='Drawdown Comparison',
    xaxis_title='Date',
    yaxis_title='Drawdown (%)',
    hovermode='x unified',
    legend=dict(yanchor="bottom", y=0.01, xanchor="left", x=0.01)
)

fig_dd.show()


## 7. Rolling Returns

Compare rolling 1-year (252 trading day) returns for each scenario:


In [None]:
# Calculate rolling 1-year returns
window = 252
rolling_benchmark = benchmark_returns.rolling(window).apply(lambda x: (1 + x).prod() - 1)
rolling_baseline = baseline_returns.rolling(window).apply(lambda x: (1 + x).prod() - 1)
rolling_modified = modified_returns.rolling(window).apply(lambda x: (1 + x).prod() - 1)

# Create rolling returns plot
fig_rolling = go.Figure()

fig_rolling.add_trace(go.Scatter(
    x=rolling_benchmark.index,
    y=rolling_benchmark * 100,
    name='S&P 500 Benchmark',
    line=dict(color='black', width=2)
))

fig_rolling.add_trace(go.Scatter(
    x=rolling_baseline.index,
    y=rolling_baseline * 100,
    name='Equal-Weighted Baseline',
    line=dict(color='blue', width=2)
))

fig_rolling.add_trace(go.Scatter(
    x=rolling_modified.index,
    y=rolling_modified * 100,
    name='Modified (Ex-Mag7)',
    line=dict(color='red', width=2)
))

fig_rolling.update_layout(
    title='Rolling 1-Year Returns',
    xaxis_title='Date',
    yaxis_title='Rolling 1-Year Return (%)',
    hovermode='x unified',
    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01)
)

fig_rolling.show()


## Summary and Insights

This simulation estimates how much the Magnificent 7 contribute to the performance of an equal-weighted proxy for the S&P 500, with the official cap-weighted index shown as a benchmark.

**Key Observations:**

1. **Total Return**: Compare the baseline and ex-Mag7 portfolios to see how the proxy performs with and without mega-cap tech
2. **Risk-Adjusted Performance**: Sharpe and Sortino ratios show return earned per unit of total and downside risk
3. **Drawdowns**: Maximum drawdown shows the largest peak-to-trough loss over the period
4. **Rolling Performance**: Rolling 1-year returns show how relative performance changes over time

**Limitations:**

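- Uses equal weighting (not market-cap weighting), so each excluded stock removes only about 1/N of the portfolio rather than its actual index weight
- Applies the current constituent list to the whole period (survivorship bias)
- Does not account for historical index rebalancing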

**Next Steps:**

- Experiment with different exclusion lists
- Analyze different time periods
- Consider sector-level exclusions
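
As a starting point for sector-level exclusions, the sketch below builds an exclusion list from a ticker-to-sector mapping. The mapping here is a small hand-written example and `tickers_in_sectors` is a hypothetical helper. A real run would need sector labels for all constituents from a data source of your choice.

```python
# Small illustrative ticker -> sector mapping (hand-written, not fetched).
SECTOR_BY_TICKER = {
    "AAPL": "Information Technology",
    "MSFT": "Information Technology",
    "XOM": "Energy",
    "CVX": "Energy",
    "JPM": "Financials",
}


def tickers_in_sectors(sector_by_ticker, sectors):
    """Return the sorted tickers whose sector matches any requested sector."""
    wanted = {s.casefold() for s in sectors}
    return sorted(
        ticker
        for ticker, sector in sector_by_ticker.items()
        if sector.casefold() in wanted
    )


energy_exclusions = tickers_in_sectors(SECTOR_BY_TICKER, ["Energy"])
print(energy_exclusions)
```

The resulting list could then be passed as `exclusion_list` in the same way `MAGNIFICENT_SEVEN` is used in Section 3.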
