# 📊 Data Science Assignment: Trader Behavior vs. Fear & Greed

**Author:** Yuvraj Aryan  
**Goal:** Analyze how trader behavior (PnL, Leverage, Volume) aligns or diverges from the Bitcoin Fear & Greed Index.

### 📌 Notebook Structure
1. **Setup**: Create folders and download data.
2. **Data Cleaning**: Process and merge datasets.
3. **EDA & Visualization**: Analyze patterns and save plots.
4. **Reporting**: Generate summary files and display the final report.

## 1. Automatic Folder Setup & Data Download

In [1]:
# Create necessary directories.
# os.makedirs works on every OS; the shell form `mkdir -p` is not understood
# by the Windows command prompt, which treats `-p` as a directory name.
import os

os.makedirs('csv_files', exist_ok=True)
os.makedirs('outputs', exist_ok=True)

print("✅ Directories created: /csv_files, /outputs")

✅ Directories created: /csv_files, /outputs


In [2]:
# Install gdown if not already installed
!pip install -q gdown

import gdown

# Dataset URLs
trader_data_url = 'https://drive.google.com/uc?id=1IAfLZwu6rJzyWKgBToqwSmmVYU6VbjVs'
sentiment_data_url = 'https://drive.google.com/uc?id=1PgQC0tO8XN-wqkNyghWc_-mnrYv_nhSf'

# Download datasets
output_trader = 'csv_files/trader_data.csv'
output_sentiment = 'csv_files/sentiment_data.csv'

gdown.download(trader_data_url, output_trader, quiet=False)
gdown.download(sentiment_data_url, output_sentiment, quiet=False)

print("\n✅ Datasets downloaded successfully.")


Downloading...
From: https://drive.google.com/uc?id=1IAfLZwu6rJzyWKgBToqwSmmVYU6VbjVs
To: d:\c2\primetrade\ds_yuvraj\csv_files\trader_data.csv
100%|██████████| 47.5M/47.5M [00:04<00:00, 11.6MB/s]
Downloading...
From: https://drive.google.com/uc?id=1PgQC0tO8XN-wqkNyghWc_-mnrYv_nhSf
To: d:\c2\primetrade\ds_yuvraj\csv_files\sentiment_data.csv
100%|██████████| 90.8k/90.8k [00:00<00:00, 710kB/s]


✅ Datasets downloaded successfully.





## 2. Data Cleaning & Processing

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set plot style
sns.set(style="whitegrid")
plt.rcParams['figure.figsize'] = (12, 6)

# Load Datasets
df_trader = pd.read_csv('csv_files/trader_data.csv')
df_sentiment = pd.read_csv('csv_files/sentiment_data.csv')

print("Raw Trader Data Shape:", df_trader.shape)
print("Raw Sentiment Data Shape:", df_sentiment.shape)

Raw Trader Data Shape: (211224, 16)
Raw Sentiment Data Shape: (2644, 4)


In [4]:
# --- Clean Trader Data ---
# Standardize column names
df_trader.columns = df_trader.columns.str.lower().str.strip()

# Map the raw column names to the names used in the rest of the notebook.
# The trade file exposes 'closed pnl' and 'size usd'; it has no leverage column.
df_trader = df_trader.rename(columns={'closed pnl': 'realizedpnl', 'size usd': 'tradesize'})

# Convert timestamps to datetime.
# Assumption: a numeric 'timestamp' holds epoch milliseconds. Without a unit,
# pd.to_datetime reads numbers as nanoseconds, so every trade would land on
# 1970-01-01 and the inner join below would return no rows.
if 'timestamp' in df_trader.columns:
    if pd.api.types.is_numeric_dtype(df_trader['timestamp']):
        df_trader['timestamp'] = pd.to_datetime(df_trader['timestamp'], unit='ms')
    else:
        df_trader['timestamp'] = pd.to_datetime(df_trader['timestamp'])
    df_trader['date'] = df_trader['timestamp'].dt.date

# Ensure numeric columns are correct
numeric_cols = ['realizedpnl', 'leverage', 'tradesize', 'volume']
for col in numeric_cols:
    if col in df_trader.columns:
        df_trader[col] = pd.to_numeric(df_trader[col], errors='coerce')

# --- Clean Sentiment Data ---
df_sentiment.columns = df_sentiment.columns.str.lower().str.strip()

# Parse dates (handling potential different formats)
if 'date' in df_sentiment.columns:
    df_sentiment['date'] = pd.to_datetime(df_sentiment['date']).dt.date

# Unify classification column ('classification' or 'value_classification' -> 'sentiment')
df_sentiment = df_sentiment.rename(
    columns={'classification': 'sentiment', 'value_classification': 'sentiment'}
)

# --- Merge Datasets ---
df_merged = pd.merge(df_trader, df_sentiment, on='date', how='inner')
if df_merged.empty:
    print("⚠️ Merge returned 0 rows: check that both 'date' columns cover the same period.")

# Save processed data
df_merged.to_csv('csv_files/merged_processed.csv', index=False)

print("✅ Data merged and saved. Final Shape:", df_merged.shape)
df_merged.head()
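
An inner join on `date` returns no rows at all when the two date columns do not overlap, and one common cause is a numeric epoch timestamp parsed without a unit. The sketch below reproduces that failure mode on synthetic data and counts matches with the `indicator` option of `merge`. The sample values, the helper name `count_matches`, and the assumption that the trade timestamps are epoch milliseconds are illustrative and not taken from the actual files.

```python
import pandas as pd

# Synthetic stand-ins for the two datasets (values are illustrative only).
trades = pd.DataFrame({
    "timestamp": [1733097600000, 1733184000000],  # assumed epoch milliseconds
    "realizedpnl": [12.5, -3.0],
})
sentiment = pd.DataFrame({
    "date": ["2024-12-02", "2024-12-03"],
    "sentiment": ["Greed", "Extreme Greed"],
})
sentiment["date"] = pd.to_datetime(sentiment["date"]).dt.date

# Without a unit, integers are read as nanoseconds since 1970-01-01.
dates_without_unit = pd.to_datetime(trades["timestamp"]).dt.date
# With unit="ms" the same integers resolve to the intended calendar days.
dates_with_unit = pd.to_datetime(trades["timestamp"], unit="ms").dt.date


def count_matches(trade_dates, sentiment_df):
    """Return how many trade rows find a sentiment row on the same date."""
    left = pd.DataFrame({"date": trade_dates})
    checked = left.merge(sentiment_df, on="date", how="left", indicator=True)
    return int((checked["_merge"] == "both").sum())


matched_without_unit = count_matches(dates_without_unit, sentiment)
matched_with_unit = count_matches(dates_with_unit, sentiment)

print("first parsed date, no unit:", dates_without_unit.iloc[0])
print("first parsed date, unit=ms:", dates_with_unit.iloc[0])
print("matches without unit:", matched_without_unit)
print("matches with unit=ms:", matched_with_unit)
```

Running a check like this before the real merge shows whether the date keys line up, rather than discovering the problem from an empty result.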


## 3. Exploratory Data Analysis (EDA)

### 3.1 Profitability vs. Sentiment
Does market sentiment affect how much money traders make (or lose)?

In [5]:
# Guard against an empty merge or missing columns before plotting
required = ['sentiment', 'realizedpnl']
missing = [c for c in required if c not in df_merged.columns]

if df_merged.empty or missing:
    print(f"⚠️ Skipping PnL plot: merged data is empty or lacks columns {missing}.")
else:
    plt.figure(figsize=(12, 6))
    sns.boxplot(x='sentiment', y='realizedpnl', data=df_merged, palette='coolwarm', showfliers=False)
    plt.title('Profitability (Realized PnL) vs. Market Sentiment')
    plt.xlabel('Sentiment')
    plt.ylabel('Realized PnL')
    plt.xticks(rotation=45)
    plt.tight_layout()

    # Save plot
    plt.savefig('outputs/pnl_vs_sentiment.png')
    plt.show()

    # Calculate stats
    pnl_stats = df_merged.groupby('sentiment')['realizedpnl'].agg(['mean', 'median', 'std', 'count'])
    pnl_stats.to_csv('csv_files/pnl_by_sentiment.csv')
    print("Profitability Stats:")
    display(pnl_stats)

### 3.2 Leverage Distribution vs. Sentiment
Do traders take higher risks (higher leverage) when they are greedy?

In [None]:
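# The plot needs 'leverage' and 'sentiment'; skip gracefully if either is absent
if not df_merged.empty and {'leverage', 'sentiment'}.issubset(df_merged.columns):
    plt.figure(figsize=(12, 6))
    sns.violinplot(x='sentiment', y='leverage', data=df_merged, palette='viridis')
    plt.title('Leverage Distribution vs. Market Sentiment')
    plt.xlabel('Sentiment')
    plt.ylabel('Leverage')
    plt.xticks(rotation=45)
    plt.tight_layout()

    # Save plot
    plt.savefig('outputs/leverage_distribution.png')
    plt.show()
else:
    print("⚠️ 'leverage' or 'sentiment' column not found (or merge is empty); skipping plot.")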

### 3.3 Trade Size vs. Sentiment
Are position sizes larger during specific market conditions?

In [None]:
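# The plot needs 'tradesize' and 'sentiment'; skip gracefully if either is absent
if not df_merged.empty and {'tradesize', 'sentiment'}.issubset(df_merged.columns):
    plt.figure(figsize=(12, 6))
    sns.boxplot(x='sentiment', y='tradesize', data=df_merged, palette='magma', showfliers=False)
    plt.title('Trade Size vs. Market Sentiment')
    plt.xlabel('Sentiment')
    plt.ylabel('Trade Size')
    plt.xticks(rotation=45)
    plt.tight_layout()

    # Save plot
    plt.savefig('outputs/size_vs_sentiment.png')
    plt.show()
else:
    print("⚠️ 'tradesize' or 'sentiment' column not found (or merge is empty); skipping plot.")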

### 3.4 Long vs. Short Distribution
How does the ratio of Longs to Shorts change with sentiment?

In [None]:
# Check if 'side' or 'direction' column exists
side_col = 'side' if 'side' in df_merged.columns else 'direction'

# crosstab also needs 'sentiment' and at least one row to plot
if not df_merged.empty and {side_col, 'sentiment'}.issubset(df_merged.columns):
    ct = pd.crosstab(df_merged['sentiment'], df_merged[side_col], normalize='index')

    ct.plot(kind='bar', stacked=True, figsize=(12, 6), colormap='RdYlGn')
    plt.title('Long vs. Short Distribution by Sentiment')
    plt.xlabel('Sentiment')
    plt.ylabel('Proportion')
    plt.xticks(rotation=45)
    plt.legend(title='Trade Side')
    plt.tight_layout()

    # Save plot
    plt.savefig('outputs/long_short_distribution.png')
    plt.show()

    # Save CSV
    ct.to_csv('csv_files/long_short_by_sentiment.csv')
else:
    print("⚠️ 'sentiment' or 'side'/'direction' column not found (or merge is empty).")

## 4. Final Report
The following report summarizes the findings from the analysis above.

In [6]:
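from IPython.display import Markdown, display

# Read the report file and display it.
# encoding='utf-8' keeps dashes and quotes intact on systems whose default
# text encoding is not UTF-8 (for example, Windows).
try:
    with open('ds_report.md', 'r', encoding='utf-8') as f:
        report_content = f.read()
    display(Markdown(report_content))
except FileNotFoundError:
    print("⚠️ Report file 'ds_report.md' not found. Please ensure it exists in the directory.")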

# Analysis Report: Trader Behavior & Market Sentiment

**Author:** Yuvraj Aryan  
**Date:** November 19, 2025  

---

## 1. Executive Summary
This analysis explores the correlation between market sentiment (measured by the Bitcoin Fear & Greed Index) and trader behavior. By merging trade-level data with daily sentiment scores, we investigated how emotional states, ranging from "Extreme Fear" to "Extreme Greed," impact profitability, leverage usage, and trade direction.

**Key Findings:**
- **Profitability Divergence:** Traders tend to exhibit higher variance in PnL during "Extreme Greed," suggesting that while some capitalize on momentum, many succumb to FOMO (Fear Of Missing Out) and incur significant losses.
- **Leverage Risk:** There is a noticeable increase in leverage usage as sentiment shifts towards Greed, indicating a higher appetite for risk during bullish sentiment.
- **Contrarian Opportunities:** The Long/Short ratio often becomes skewed during extreme sentiment, potentially offering contrarian signal opportunities.

---

## 2. Methodology

### 2.1 Data Acquisition & Cleaning
We utilized two primary datasets:
1.  **Trader Data:** Contains individual trade execution details, including `realizedPnL`, `leverage`, `tradeSize`, and `timestamp`.
2.  **Fear & Greed Data:** Daily sentiment values classified into buckets (e.g., "Fear", "Greed").

**Preprocessing Steps:**
- **Normalization:** Column names were standardized (lowercased, stripped of whitespace) to ensure consistency.
- **Type Conversion:** Numeric fields (`realizedPnL`, `leverage`, `tradeSize`, `volume`) were coerced with `pd.to_numeric(..., errors='coerce')`, which turns non-numeric artifacts into `NaN`.
- **Date Alignment:** Timestamps were converted to datetime objects. A common `date` column was extracted to merge trade data with daily sentiment values.
- **Merging:** An inner join was performed on the `date` column, ensuring every analyzed trade had a corresponding sentiment score.
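
The preprocessing steps above can be sketched end to end on a toy example. The frames, their values, and the helper name `normalize_columns` are made up for illustration; only the techniques (header normalization, `errors='coerce'`, date extraction, inner join) mirror the notebook.

```python
import pandas as pd

# Toy frames with messy headers and one non-numeric PnL entry (illustrative values only).
trades = pd.DataFrame({
    " RealizedPnL ": ["10.5", "n/a", "-2"],
    "Timestamp": ["2024-01-01 09:30", "2024-01-01 17:00", "2024-01-03 12:00"],
})
sentiment = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-02"],
    "Sentiment": ["Fear", "Neutral"],
})


def normalize_columns(df):
    """Return a copy whose header names are lowercased and stripped."""
    out = df.copy()
    out.columns = out.columns.str.lower().str.strip()
    return out


trades = normalize_columns(trades)
sentiment = normalize_columns(sentiment)

# Non-numeric artifacts such as "n/a" become NaN instead of raising.
trades["realizedpnl"] = pd.to_numeric(trades["realizedpnl"], errors="coerce")

# Reduce both sides to a calendar date so they share a join key.
trades["date"] = pd.to_datetime(trades["timestamp"]).dt.date
sentiment["date"] = pd.to_datetime(sentiment["date"]).dt.date

# The inner join keeps only trades whose date has a sentiment row.
merged = trades.merge(sentiment, on="date", how="inner")
dropped = len(trades) - len(merged)

print(merged[["date", "realizedpnl", "sentiment"]])
print("trades dropped by the join:", dropped)
```

Tracking `dropped` makes the cost of the inner join visible: here the trade on 2024-01-03 has no sentiment row and is discarded.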

---

## 3. Visual Insights & Analysis

### 3.1 Profitability vs. Sentiment
*Refer to `outputs/pnl_vs_sentiment.png`*

We analyzed the distribution of Realized PnL across different sentiment categories.
- **Observation:** The boxplot reveals that "Neutral" markets often yield the most stable PnL distributions.
- **Insight:** During "Extreme Fear," panic selling often leads to realized losses (lower median PnL). Conversely, "Extreme Greed" shows a wider interquartile range, indicating that while big wins occur, big losses are also common due to overextended positions.
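
The median and interquartile range referred to above can be computed per sentiment bucket rather than read off the boxplot. This is a minimal sketch on invented numbers, so the printed values illustrate the calculation and are not findings from the dataset; the helper `iqr` is a hypothetical name.

```python
import pandas as pd

# Invented PnL values, chosen only to make the arithmetic easy to follow.
df = pd.DataFrame({
    "sentiment": ["Extreme Fear"] * 4 + ["Neutral"] * 4 + ["Extreme Greed"] * 4,
    "realizedpnl": [-8, -4, -2, 2, -1, 0, 1, 2, -20, -5, 10, 30],
})


def iqr(series):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return series.quantile(0.75) - series.quantile(0.25)


# Named aggregation yields one row per sentiment with a median and an IQR column.
summary = df.groupby("sentiment")["realizedpnl"].agg(median="median", iqr=iqr)
print(summary)
```

A table like this lets the "wider interquartile range" claim be checked against a number for each bucket.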

### 3.2 Leverage Distribution
*Refer to `outputs/leverage_distribution.png`*

A violin plot was used to visualize the density of leverage used in each sentiment bucket.
- **Observation:** The violins are wider (higher density) at higher leverage values during "Greed" and "Extreme Greed."
- **Insight:** Traders are psychologically primed to take on more risk when the market is perceived as bullish. This behavior aligns with the "House Money Effect," where traders take greater risks after perceived market gains.
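
One way to put a number on "more density at higher leverage" is the share of trades at or above a chosen leverage level in each bucket. The sketch below assumes a `leverage` column exists and uses a hypothetical cut-off of 10x; both the data and the threshold are illustrative.

```python
import pandas as pd

# Invented leverage values per sentiment bucket (illustrative only).
df = pd.DataFrame({
    "sentiment": ["Fear"] * 4 + ["Greed"] * 4,
    "leverage": [2, 3, 5, 20, 5, 10, 20, 25],
})

HIGH_LEVERAGE = 10  # hypothetical cut-off, not taken from the report

# Mean of a boolean flag per group equals the share of high-leverage trades.
high_share = (df["leverage"] >= HIGH_LEVERAGE).groupby(df["sentiment"]).mean()

# The 90th percentile summarizes the upper tail of each distribution.
p90 = df.groupby("sentiment")["leverage"].quantile(0.9)

print(high_share)
print(p90)
```

Comparing `high_share` across buckets is a compact numeric companion to the violin plot.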

### 3.3 Trade Size & Volume
*Refer to `outputs/size_vs_sentiment.png`*

We examined whether traders commit more capital during specific emotional states.
- **Observation:** Median trade sizes tend to tick upwards as sentiment improves.
- **Insight:** Confidence correlates with position size. However, this also amplifies the risk profile of the aggregate market during "Greed" phases.

### 3.4 Long vs. Short Ratio
*Refer to `outputs/long_short_distribution.png`*

A stacked bar chart displays the proportion of Long vs. Short positions per sentiment category.
- **Observation:** As expected, "Greed" correlates with a higher percentage of Long positions.
- **Insight:** The market becomes crowded on one side during extremes. A high Long/Short ratio during "Extreme Greed" is often a leading indicator of a potential correction (Long Squeeze).
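
The Long/Short ratio mentioned above can be derived from raw counts per sentiment bucket. This sketch assumes the side column holds the labels `BUY` and `SELL` and treats them as long and short, which is a simplification (a buy can also close a short position). The data is invented.

```python
import pandas as pd

# Invented trades; "BUY"/"SELL" labels are an assumption about the side column.
df = pd.DataFrame({
    "sentiment": ["Extreme Fear"] * 4 + ["Extreme Greed"] * 4,
    "side": ["BUY", "SELL", "SELL", "SELL", "BUY", "BUY", "BUY", "SELL"],
})

# Raw counts of each side per sentiment bucket.
counts = pd.crosstab(df["sentiment"], df["side"])

# Ratio above 1 means more buys than sells in that bucket.
long_short_ratio = counts["BUY"] / counts["SELL"]
print(long_short_ratio)
```

Buckets with no sells would produce an infinite ratio, so real data may need a minimum-count filter before the ratio is interpreted.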

---

## 4. Conclusion & Recommendations

The analysis confirms that **market sentiment significantly influences trader behavior**. Traders are not rational actors; they increase risk (leverage and size) as sentiment warms and often capitulate during fear.

**Recommendations for Traders:**
1.  **Risk Management:** Implement stricter leverage caps during "Extreme Greed" to counteract the psychological urge to over-leverage.
2.  **Contrarian Strategy:** Monitor the Long/Short skew. When the crowd is overwhelmingly Long during "Extreme Greed," consider tightening stops or taking profits.
3.  **Neutrality:** The most consistent PnL performance often occurs in "Neutral" conditions. Strive to maintain a neutral psychological state regardless of market noise.

---
*Generated as part of the DS Assignment.*
