# Simulation Analysis

This notebook analyzes results from the Options Market Maker starter simulation. It expects a CSV named `simulation_output.csv` in a `results/` directory (the loader tries `../results/` first, then `./results/`) with the following columns:
- `time_step`
- `spot`
- `option_price`
- `delta`
- `underlying_pos`
- `cash`
- `pnl`

The notebook computes hedging error statistics, plots time series (spot, option price, delta, underlying position, P&L), plots a histogram and time series of the delta error, and saves a small CSV of summary metrics.

Requirements:
```bash
pip install pandas matplotlib numpy seaborn
```

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

sns.set_theme(style='darkgrid')  # set_theme is the current name for sns.set (seaborn >= 0.11)
%matplotlib inline

In [None]:
# CONFIG: update if your option notional differs from the simulator's
OPTION_NOTIONAL = 100.0

# Path to CSV: try ../results first, then ./results
candidates = [Path('../results/simulation_output.csv'),
              Path('results/simulation_output.csv')]
csv_path = next((p for p in candidates if p.exists()), None)
if csv_path is None:
    raise FileNotFoundError('simulation_output.csv not found in ../results or ./results; check your path')

df = pd.read_csv(csv_path)
df.head()

## Quick sanity checks

In [None]:
print('rows, cols:', df.shape)
print('columns:', df.columns.tolist())
print('time steps range:', df['time_step'].min(), '->', df['time_step'].max())
print('spot range:', df['spot'].min(), '->', df['spot'].max())
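
Beyond printing shapes and ranges, it can help to check explicitly that every expected column is present and free of missing values before computing anything. The following is a minimal sketch on a made-up frame; `check_schema` and `EXPECTED_COLUMNS` are hypothetical names introduced here, not part of the simulator.

```python
import pandas as pd

# Columns the notebook expects, copied from the list at the top.
EXPECTED_COLUMNS = ['time_step', 'spot', 'option_price', 'delta',
                    'underlying_pos', 'cash', 'pnl']

def check_schema(frame, expected=EXPECTED_COLUMNS):
    """Return (missing_columns, columns_with_nans) for a results frame."""
    missing = [c for c in expected if c not in frame.columns]
    present = [c for c in expected if c in frame.columns]
    with_nans = [c for c in present if frame[c].isna().any()]
    return missing, with_nans

# Toy frame standing in for the real CSV (values are made up):
# it lacks `pnl` and has one missing `spot` value.
toy = pd.DataFrame({
    'time_step': [0, 1, 2],
    'spot': [100.0, 101.0, None],
    'option_price': [5.0, 5.4, 5.1],
    'delta': [0.50, 0.55, 0.52],
    'underlying_pos': [-50.0, -55.0, -52.0],
    'cash': [0.0, 10.0, 12.0],
})
missing, with_nans = check_schema(toy)
print('missing columns:', missing)
print('columns with NaNs:', with_nans)
```

On the real data, the same call would be `check_schema(df)`; both lists should come back empty.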

## Derived quantities

We compute:
- `delta_target` = target underlying to neutralize option delta = `- delta * OPTION_NOTIONAL`
- `delta_error` = `underlying_pos - delta_target` (how far the hedger is from perfect delta neutrality)
- RMSE and summary stats for hedging error
- Per-step P&L changes (`pnl_change`) and simple performance stats
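
To make the definitions above concrete, here is a small worked example on three made-up rows. It assumes the same `OPTION_NOTIONAL` of 100 as the config cell; the numbers are illustrative only and do not come from the simulator.

```python
import numpy as np
import pandas as pd

OPTION_NOTIONAL = 100.0  # assumed to match the CONFIG cell

# Made-up rows: option delta and the hedger's underlying position.
toy = pd.DataFrame({
    'delta': [0.50, 0.25, 0.75],
    'underlying_pos': [-50.0, -23.0, -76.0],
})

# Target position that offsets the option delta: -50, -25, -75.
toy['delta_target'] = -toy['delta'] * OPTION_NOTIONAL
# Distance from perfect neutrality: 0, +2, -1.
toy['delta_error'] = toy['underlying_pos'] - toy['delta_target']

# RMSE = sqrt(mean(error^2)) = sqrt((0 + 4 + 1) / 3)
toy_rmse = float(np.sqrt(np.mean(toy['delta_error'] ** 2)))
print(toy[['delta_target', 'delta_error']])
print('RMSE:', round(toy_rmse, 4))
```

A positive `delta_error` means the hedger holds more underlying than the target; a negative value means it holds less.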

In [None]:
df = df.copy()
df['delta_target'] = -df['delta'] * OPTION_NOTIONAL
df['delta_error'] = df['underlying_pos'] - df['delta_target']
df['pnl_change'] = df['pnl'].diff().fillna(0.0)
df['abs_delta_error'] = df['delta_error'].abs()

def rmse(x):
    return np.sqrt(np.mean(np.array(x)**2))

hedge_rmse = rmse(df['delta_error'])
hedge_mean = df['delta_error'].mean()
hedge_std = df['delta_error'].std()

pnl_total = df['pnl'].iloc[-1]
pnl_mean = df['pnl_change'].mean()
pnl_vol = df['pnl_change'].std()

print(f'Hedging RMSE = {hedge_rmse:.6f}')
print(f'Hedging mean error = {hedge_mean:.6f}, std = {hedge_std:.6f}')
print(f'Final PnL = {pnl_total:.6f}, mean pnl change = {pnl_mean:.6e}, pnl vol = {pnl_vol:.6e}')

## Time-series plots
Plot spot, option price, delta, underlying position, and P&L on aligned subplots.

In [None]:
fig, axes = plt.subplots(5, 1, figsize=(12, 14), sharex=True)

df.plot(x='time_step', y='spot', ax=axes[0], marker='o')
axes[0].set_ylabel('Spot')
axes[0].set_title('Underlying Spot Price')

df.plot(x='time_step', y='option_price', ax=axes[1], marker='o')
axes[1].set_ylabel('Option Price')
axes[1].set_title('Model Option Price (mark-to-market)')

df.plot(x='time_step', y='delta', ax=axes[2], marker='o')
axes[2].set_ylabel('Delta (per contract)')
axes[2].set_title('Option Delta')

df.plot(x='time_step', y='underlying_pos', ax=axes[3], marker='o')
axes[3].set_ylabel('Underlying Position')
axes[3].set_title('Hedger Underlying Position')

df.plot(x='time_step', y='pnl', ax=axes[4], marker='o')
axes[4].set_ylabel('PnL')
axes[4].set_title('Mark-to-Market P&L')

axes[-1].set_xlabel('time_step')  # label only the bottom (shared) x-axis
plt.tight_layout()
plt.show()

## Hedging error diagnostics
Histogram and time series of the delta error.

In [None]:
fig, axes = plt.subplots(2,1, figsize=(12,8))
sns.histplot(df['delta_error'], ax=axes[0], kde=True)
axes[0].set_title('Histogram of Delta Error (underlying_pos - delta_target)')
axes[0].set_xlabel('delta_error')

df.plot(x='time_step', y='delta_error', ax=axes[1], marker='o')
axes[1].axhline(0.0, color='k', linestyle='--')
axes[1].set_title('Delta Error over Time')
axes[1].set_xlabel('time_step')
plt.tight_layout()
plt.show()

## P&L attribution (basic)
We perform a very simple decomposition:
- `underlying_val` = mark-to-market value of the underlying position (`underlying_pos * spot`)
- `cash` already includes execution cash flows and costs
- `option_liability` = mark-to-market of the option (we assume a short position of one option, so it is a liability)

Note: this starter sim writes only limited fields. For full attribution, the simulator should record trade-by-trade fills, notional, and separate execution costs.
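
With only these fields, one thing that can still be done is to split each step's change in the attribution total into its three components by taking first differences. The sketch below uses made-up numbers and the same short-one-option assumption as above; the `contrib` frame and its column names are hypothetical.

```python
import pandas as pd

# Made-up path: constant underlying position, no trading, so cash is flat.
toy = pd.DataFrame({
    'spot': [100.0, 102.0, 101.0],
    'underlying_pos': [50.0, 50.0, 50.0],
    'cash': [-4995.0, -4995.0, -4995.0],
    'option_price': [5.0, 6.0, 5.5],
})
toy['underlying_val'] = toy['underlying_pos'] * toy['spot']
toy['option_liability'] = toy['option_price']  # assumes short one option
toy['simple_attrib'] = (toy['underlying_val'] + toy['cash']
                        - toy['option_liability'])

# Per-step contributions; the first row has no predecessor, so it is set to 0.
contrib = pd.DataFrame({
    'underlying': toy['underlying_val'].diff(),
    'cash': toy['cash'].diff(),
    'option': -toy['option_liability'].diff(),
}).fillna(0.0)
contrib['total'] = contrib.sum(axis=1)
print(contrib)
```

By construction, `contrib['total']` equals the step-to-step change in `simple_attrib`, so the three columns show which leg drove each move.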

In [None]:
df['underlying_val'] = df['underlying_pos'] * df['spot']
df['option_liability'] = df['option_price']
df['simple_attrib'] = df['underlying_val'] + df['cash'] - df['option_liability']

plt.figure(figsize=(10,4))
plt.plot(df['time_step'], df['simple_attrib'], marker='o')
plt.title('Simple P&L Attribution (underlying + cash - option_liability)')
plt.xlabel('time_step')
plt.ylabel('value')
plt.grid(True)
plt.show()

print('Final simple attribution:', df['simple_attrib'].iloc[-1])

## Summary metrics table
Save a small CSV with summary metrics for record keeping.

In [None]:
summary = {
    'hedge_rmse': [hedge_rmse],
    'hedge_mean_error': [hedge_mean],
    'hedge_std_error': [hedge_std],
    'final_pnl': [pnl_total],
    'pnl_mean_change': [pnl_mean],
    'pnl_vol': [pnl_vol]
}
summary_df = pd.DataFrame(summary)
# Write next to the input CSV so this works whichever results/ path was found above
summary_df.to_csv(csv_path.parent / 'summary_metrics.csv', index=False)
summary_df

## Next steps / suggestions
- Add columns in the simulator for `trade_qty`, `trade_price`, `tx_cost` so you can compute realized hedging costs.
- Log option inventory changes and market maker quote fills separately to compute spread income.
- Sweep hedger thresholds and transaction cost parameters; run multiple seeds and compute distribution of final P&L.
- Use this notebook as a template for plotting multiple simulation runs on the same axes for comparison.
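
As a starting point for the multi-seed suggestion, the sketch below summarizes the distribution of final P&L across several runs. It is only an illustration: `final_pnl_stats` is a hypothetical helper, the run labels are made up, and the per-seed frames are synthetic random walks standing in for real simulator output.

```python
import numpy as np
import pandas as pd

def final_pnl_stats(runs):
    """Summarize final P&L; `runs` maps a run label to a results frame."""
    finals = pd.Series({label: frame['pnl'].iloc[-1]
                        for label, frame in runs.items()})
    return finals, finals.describe()

# Synthetic stand-ins for per-seed output (random walks, not real results).
runs = {
    f'seed_{seed}': pd.DataFrame({
        'time_step': np.arange(50),
        'pnl': np.cumsum(np.random.default_rng(seed).normal(0.0, 1.0, size=50)),
    })
    for seed in range(5)
}

finals, stats = final_pnl_stats(runs)
print(finals)
print(stats[['mean', 'std', 'min', 'max']])
```

With real output, each frame would instead come from `pd.read_csv` on one run's results file, and the same `runs` dictionary could be looped over to draw every run's `pnl` on shared axes.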