# Trader Behavior vs Market Sentiment — Full Notebook

This Colab notebook reproduces the full analysis:
1. Load the historical trades and Fear & Greed Index data.
2. Clean & preprocess.
3. Merge by date and compute metrics.
4. Generate visualizations.
5. Summarize insights.


In [None]:
# Install libraries if missing (Colab)
!pip install -q pandas numpy matplotlib
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


In [None]:
# Load datasets
# Replace with your file paths in Colab (after uploading or mounting Drive)
historical = pd.read_csv('historical_data.csv')
fg = pd.read_csv('fear_greed_index.csv')
print('Historical shape:', historical.shape)
print('Fear & Greed shape:', fg.shape)
# Show each preview as its own table rather than as a tuple of text reprs
display(historical.head())
display(fg.head())

In [None]:
# Normalize columns
def normalize_cols(df):
    df = df.copy()
    df.columns = (
        df.columns.str.strip().str.lower().str.replace('[^0-9a-zA-Z]+','_',regex=True).str.strip('_')
    )
    return df

hist = normalize_cols(historical)
fg2 = normalize_cols(fg)
hist.head()
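
To see what `normalize_cols` does, the following self-contained sketch applies the same transformation to a few made-up headers. The column names here are illustrative assumptions and may not match the ones in your CSV files.

```python
import pandas as pd

def normalize_cols(df):
    # Same logic as the cell above: trim, lowercase, collapse non-alphanumerics to "_"
    df = df.copy()
    df.columns = (
        df.columns.str.strip()
        .str.lower()
        .str.replace('[^0-9a-zA-Z]+', '_', regex=True)
        .str.strip('_')
    )
    return df

# Hypothetical raw headers, chosen only to exercise spaces, case, and punctuation
demo = pd.DataFrame(columns=[' Closed PnL ', 'Timestamp IST', 'Size (USD)'])
normalized = list(normalize_cols(demo).columns)
print(normalized)
```

The later cells look up columns such as `closed_pnl` and `size_usd` by these normalized names, so it is worth checking `hist.columns` once after this step.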

In [None]:
# Parse timestamps: a median epoch value above 1e12 is treated as milliseconds, otherwise seconds
if 'timestamp' in hist.columns:
    v = pd.to_numeric(hist['timestamp'], errors='coerce')
    unit = 'ms' if v.dropna().median() > 1e12 else 's'
    hist['dt'] = pd.to_datetime(v, unit=unit, utc=True)
elif 'timestamp_ist' in hist.columns:
    hist['dt'] = pd.to_datetime(hist['timestamp_ist'], errors='coerce')
else:
    hist['dt'] = pd.NaT

hist['date'] = hist['dt'].dt.date
hist.head()
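
The seconds-versus-milliseconds heuristic above can be checked in isolation. This is a minimal sketch using made-up epoch values; `parse_epoch` is a hypothetical helper that wraps the same three lines, not part of the notebook.

```python
import pandas as pd

def parse_epoch(series):
    # Heuristic from the cell above: a median above 1e12 is treated as milliseconds
    v = pd.to_numeric(series, errors='coerce')
    unit = 'ms' if v.dropna().median() > 1e12 else 's'
    return pd.to_datetime(v, unit=unit, utc=True)

# Made-up epoch values: the same two instants in seconds and in milliseconds
secs = pd.Series([1_700_000_000, 1_700_086_400])
millis = secs * 1000

parsed_s = parse_epoch(secs)
parsed_ms = parse_epoch(millis)
print(parsed_s.dt.date.tolist())
print((parsed_s == parsed_ms).all())
```

Both inputs should resolve to the same timestamps. If your data mixes units within one column, this median-based rule would not be enough.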

In [None]:
# Compute notional
if 'size_usd' in hist.columns:
    hist['notional'] = hist['size_usd'].abs()
elif 'execution_price' in hist.columns and 'size_tokens' in hist.columns:
    hist['notional'] = (hist['execution_price'].abs() * hist['size_tokens'].abs())
else:
    hist['notional'] = np.nan

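# Parse the Fear & Greed Index (FGI) dates
fg2['date'] = pd.to_datetime(fg2['date'], errors='coerce').dt.date
# Keep one row per date so the left merge below cannot duplicate trades
fg_clean = fg2[['date','classification','value']].dropna(subset=['date']).drop_duplicates(subset='date')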

# Merge
merged = pd.merge(hist, fg_clean, on='date', how='left')
merged['win'] = (merged['closed_pnl'] > 0).astype(int)
merged.head()
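
A left merge silently leaves `classification` empty for trade dates that are missing from the index data. One way to surface this, sketched here on synthetic stand-ins for `hist` and `fg_clean`, is to use the `validate` and `indicator` options of `pd.merge`. All values below are made up.

```python
import pandas as pd
from datetime import date

# Synthetic stand-ins for `hist` and `fg_clean`; all values are made up
trades = pd.DataFrame({
    'date': [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)],
    'closed_pnl': [10.0, -5.0, 0.0, 7.5],
})
sentiment = pd.DataFrame({
    'date': [date(2024, 1, 1), date(2024, 1, 2)],
    'classification': ['Fear', 'Greed'],
})

# validate raises MergeError if a date repeats on the sentiment side;
# indicator adds a `_merge` column flagging trades with no sentiment row
checked = pd.merge(trades, sentiment, on='date', how='left',
                   validate='many_to_one', indicator=True)
n_unmatched = int((checked['_merge'] == 'left_only').sum())
print(f'{n_unmatched} of {len(checked)} trades have no sentiment label')
```

Unmatched trades are dropped by the `groupby('classification')` steps that follow, so a large unmatched count would change the aggregates.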

In [None]:
# Aggregate by sentiment
by_sent = merged.groupby('classification').agg({
    'closed_pnl': ['sum','mean','median','std','count'],
    'notional': ['sum','mean','median','std'],
    'win': ['mean']
})
by_sent

In [None]:
# Daily metrics
daily = merged.groupby(['date','classification']).agg({
    'closed_pnl': ['sum','mean'],
    'notional': ['sum'],
    'win': ['mean']
}).reset_index()
daily.head()
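
The dict-of-lists aggregation above yields two-level column labels such as `('closed_pnl', 'sum')`. If flat column names are preferred, pandas named aggregation is one alternative. The sketch below uses made-up data, and the output names (`pnl_sum`, `win_rate`, and so on) are hypothetical choices, not names used elsewhere in this notebook.

```python
import pandas as pd

# Synthetic stand-in for `merged`; all values are made up
demo = pd.DataFrame({
    'date': ['2024-01-01', '2024-01-01', '2024-01-02'],
    'classification': ['Fear', 'Fear', 'Greed'],
    'closed_pnl': [10.0, -4.0, 3.0],
    'notional': [100.0, 200.0, 50.0],
    'win': [1, 0, 1],
})

# Named aggregation: output_name=(input_column, function)
daily_flat = (
    demo.groupby(['date', 'classification'])
    .agg(
        pnl_sum=('closed_pnl', 'sum'),
        pnl_mean=('closed_pnl', 'mean'),
        notional_sum=('notional', 'sum'),
        win_rate=('win', 'mean'),
    )
    .reset_index()
)
print(daily_flat)
```

With flat names, a pivot would select a column by a plain string such as `values='pnl_sum'` instead of a tuple.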

In [None]:
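# Charts: most dates carry a single sentiment label, so markers keep isolated days visible
pivot_pnl = daily.pivot(index='date', columns='classification', values=('closed_pnl','sum'))
pivot_pnl.plot(figsize=(10,4), marker='o', markersize=3)
plt.title('Daily Realized PnL by Sentiment')
plt.xlabel('Date')
plt.ylabel('Realized PnL')
plt.show()

pivot_notional = daily.pivot(index='date', columns='classification', values=('notional','sum'))
pivot_notional.plot(figsize=(10,4), marker='o', markersize=3)
plt.title('Daily Notional Volume by Sentiment')
plt.xlabel('Date')
plt.ylabel('Notional Volume')
plt.show()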

## Next Steps
- Extend the analysis with statistical tests, e.g. a t-test comparing Greed and Fear days.
- Add regression models for predictive analysis.
- Save the aggregated tables as CSV files and export the charts for reporting.
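
As a starting point for the first item, here is a minimal sketch of Welch's t-test (the unequal-variance variant, chosen here as an assumption since the two groups may differ in size and spread). `welch_t` is a hypothetical helper and the PnL values are made up for illustration.

```python
import pandas as pd

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    a = pd.Series(a, dtype=float).dropna()
    b = pd.Series(b, dtype=float).dropna()
    # Squared standard errors of each sample mean
    va = a.var(ddof=1) / len(a)
    vb = b.var(ddof=1) / len(b)
    t = (a.mean() - b.mean()) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up PnL samples, for illustration only
greed_pnl = [12.0, 8.0, 15.0, 9.0, 11.0]
fear_pnl = [3.0, -2.0, 5.0, 1.0, -1.0]

t_stat, dof = welch_t(greed_pnl, fear_pnl)
print(f't = {t_stat:.2f}, df = {dof:.1f}')
```

In the notebook, the two samples could be taken from `merged['closed_pnl']` filtered by `classification`, assuming the labels in your index file include `Greed` and `Fear`. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.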
