<a href="https://colab.research.google.com/github/Goal48/trader-sentiment-analysis/blob/main/analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


1. Load Data

In [2]:
fear_greed = pd.read_csv("/content/fear_greed_index.csv")
trades = pd.read_csv("/content/historical_data.csv")

Shape of the Datasets

In [3]:
print("Fear & Greed shape:", fear_greed.shape)
print("Trades shape:", trades.shape)

Fear & Greed shape: (2644, 4)
Trades shape: (211224, 16)


 2. Data Quality Checks

In [4]:
print("\nMissing values (Fear & Greed):\n", fear_greed.isna().sum())
print("\nMissing values (Trades):\n", trades.isna().sum())


Missing values (Fear & Greed):
 timestamp         0
value             0
classification    0
date              0
dtype: int64

Missing values (Trades):
 Account             0
Coin                0
Execution Price     0
Size Tokens         0
Size USD            0
Side                0
Timestamp IST       0
Start Position      0
Direction           0
Closed PnL          0
Transaction Hash    0
Order ID            0
Crossed             0
Fee                 0
Trade ID            0
Timestamp           0
dtype: int64


In [5]:
print("\nDuplicates (Fear & Greed):", fear_greed.duplicated().sum())
print("Duplicates (Trades):", trades.duplicated().sum())


Duplicates (Fear & Greed): 0
Duplicates (Trades): 0


3. Timestamp Conversion

In [17]:
fear_greed["date"] = pd.to_datetime(fear_greed["date"]).dt.date
trades["date"] = pd.to_datetime(trades["Timestamp"] / 1000, unit='s').dt.date

4. Merge Datasets

In [7]:
df = trades.merge(
    fear_greed[["date", "classification", "value"]],
    on="date",
    how="inner"
)


In [8]:
# Keep only Fear & Greed for clarity

df = df[df["classification"].isin(["Fear", "Greed"])]

5. Feature Engineering

In [9]:
# Trade-level features
df["is_win"] = df["Closed PnL"] > 0
df["is_long"] = df["Direction"].str.lower().eq("long")

In [10]:
# Trader-day metrics
daily = (
    df.groupby(["Account", "date", "classification"])
    .agg(
        daily_pnl=("Closed PnL", "sum"),
        trades_per_day=("Trade ID", "count"),
        avg_trade_size=("Size USD", "mean"),
        win_rate=("is_win", "mean"),
        long_ratio=("is_long", "mean"),
        avg_fee=("Fee", "mean")
    )
    .reset_index()
)

In [11]:
# Leverage proxy (relative position size)
daily["leverage_proxy"] = daily["avg_trade_size"] / daily["avg_trade_size"].median()

6. Sentiment Performance Analysis

In [12]:
sentiment_summary = (
    daily.groupby("classification")
    .agg(
        avg_pnl=("daily_pnl", "mean"),
        median_pnl=("daily_pnl", "median"),
        avg_win_rate=("win_rate", "mean"),
        avg_trades=("trades_per_day", "mean"),
        avg_leverage=("leverage_proxy", "mean")
    )
)

In [13]:
print("\nPerformance by Sentiment:\n", sentiment_summary)



Performance by Sentiment:
 Empty DataFrame
Columns: [avg_pnl, median_pnl, avg_win_rate, avg_trades, avg_leverage]
Index: []


7. Charts (Saved for Submission)

In [26]:
import os
if not os.path.exists('outputs'):
    os.makedirs('outputs')

In [27]:
plt.figure()
daily.boxplot(column="daily_pnl", by="classification")
plt.title("Daily PnL by Sentiment")
plt.suptitle("")
plt.ylabel("PnL")
plt.savefig("outputs/pnl_by_sentiment.png")
plt.close()

<Figure size 640x480 with 0 Axes>

In [30]:
import os
if not os.path.exists('outputs'):
    os.makedirs('outputs')

In [18]:
df = trades.merge(
    fear_greed[["date", "classification", "value"]],
    on="date",
    how="inner"
)

In [19]:
# Keep only Fear & Greed for clarity

df = df[df["classification"].isin(["Fear", "Greed"])]

In [20]:
# Trade-level features
df["is_win"] = df["Closed PnL"] > 0
df["is_long"] = df["Direction"].str.lower().eq("long")

In [21]:
# Trader-day metrics
daily = (
    df.groupby(["Account", "date", "classification"])
    .agg(
        daily_pnl=("Closed PnL", "sum"),
        trades_per_day=("Trade ID", "count"),
        avg_trade_size=("Size USD", "mean"),
        win_rate=("is_win", "mean"),
        long_ratio=("is_long", "mean"),
        avg_fee=("Fee", "mean")
    )
    .reset_index()
)

In [22]:
# Leverage proxy (relative position size)
daily["leverage_proxy"] = daily["avg_trade_size"] / daily["avg_trade_size"].median()

In [23]:
sentiment_summary = (
    daily.groupby("classification")
    .agg(
        avg_pnl=("daily_pnl", "mean"),
        median_pnl=("daily_pnl", "median"),
        avg_win_rate=("win_rate", "mean"),
        avg_trades=("trades_per_day", "mean"),
        avg_leverage=("leverage_proxy", "mean")
    )
)

In [24]:
print("\nPerformance by Sentiment:\n", sentiment_summary)


Performance by Sentiment:
                       avg_pnl    median_pnl  avg_win_rate  avg_trades  \
classification                                                          
Fear            209372.662205  81389.682515      0.415878  4183.46875   
Greed            99675.516731  35988.376437      0.374074  1134.03125   

                avg_leverage  
classification                
Fear                2.063602  
Greed               2.033235  


In [16]:
print(trades['Timestamp'].head())
print(trades['Timestamp'].dtype)

0    1.730000e+12
1    1.730000e+12
2    1.730000e+12
3    1.730000e+12
4    1.730000e+12
Name: Timestamp, dtype: float64
float64


In [15]:
print('Fear & Greed date range:', fear_greed['date'].min(), 'to', fear_greed['date'].max())
print('Trades date range:', trades['date'].min(), 'to', trades['date'].max())

# Check shape after merge
df_merged = trades.merge(
    fear_greed[['date', 'classification', 'value']],
    on='date',
    how='inner'
)
print(f'Shape of df after merge: {df_merged.shape}')

# Check shape after classification filter
df_filtered = df_merged[df_merged['classification'].isin(['Fear', 'Greed'])]
print(f'Shape of df after classification filter: {df_filtered.shape}')

Fear & Greed date range: 2018-02-01 to 2025-05-02
Trades date range: 1970-01-01 to 1970-01-01
Shape of df after merge: (0, 19)
Shape of df after classification filter: (0, 19)


In [32]:
plt.figure()
sentiment_summary["avg_win_rate"].plot(kind="bar")
plt.title("Average Win Rate by Sentiment")
plt.ylabel("Win Rate")
plt.savefig("outputs/winrate_comparison.png")
plt.close()

In [33]:

plt.figure()
sentiment_summary["avg_trades"].plot(kind="bar")
plt.title("Trades per Day by Sentiment")
plt.ylabel("Trades")
plt.savefig("outputs/trade_frequency.png")
plt.close()


In [34]:
plt.figure()
sentiment_summary["avg_leverage"].plot(kind="bar")
plt.title("Average Leverage Proxy by Sentiment")
plt.ylabel("Leverage Proxy")
plt.savefig("outputs/leverage_distribution.png")
plt.close()

8. Trader Segmentation

In [35]:
# Segment by leverage
daily["leverage_segment"] = np.where(
    daily["leverage_proxy"] > daily["leverage_proxy"].median(),
    "High Leverage",
    "Low Leverage"
)

In [36]:
# Segment by activity
daily["activity_segment"] = np.where(
    daily["trades_per_day"] > daily["trades_per_day"].median(),
    "Frequent",
    "Infrequent"
)


In [37]:
# Segment by consistency
daily["consistency_segment"] = np.where(
    daily["win_rate"] > 0.5,
    "Consistent",
    "Inconsistent"
)


In [38]:
segment_summary = (
    daily.groupby(["classification", "leverage_segment"])
    .agg(
        avg_pnl=("daily_pnl", "mean"),
        avg_win_rate=("win_rate", "mean")
    )
)

In [39]:
print("\nSegment Analysis:\n", segment_summary)


Segment Analysis:
                                        avg_pnl  avg_win_rate
classification leverage_segment                             
Fear           High Leverage     284460.190407      0.404955
               Low Leverage      112831.554518      0.429923
Greed          High Leverage      33404.892705      0.363312
               Low Leverage      151219.335417      0.382445


 9. Actionable Insights Output


In [40]:
print("""
Actionable Rules of Thumb:
1. During Fear days:
   - Reduce leverage for High-Leverage traders
   - Limit trade frequency to avoid drawdowns

2. During Greed days:
   - Allow higher trade frequency only for Consistent traders
   - Maintain position size caps for Inconsistent traders
""")

print("Analysis complete. Charts saved in /outputs folder.")


Actionable Rules of Thumb:
1. During Fear days:
   - Reduce leverage for High-Leverage traders
   - Limit trade frequency to avoid drawdowns

2. During Greed days:
   - Allow higher trade frequency only for Consistent traders
   - Maintain position size caps for Inconsistent traders

Analysis complete. Charts saved in /outputs folder.
