
# Trader Performance vs Bitcoin Market Sentiment

**Author:** Shiva Mishra (candidate)


**Objective:** Explore the relationship between trader performance (Closed PnL) and market sentiment (Fear/Greed Index). Deliver insights and recommendations.

**Datasets:**
- Historical trader data (trades)
- Bitcoin Fear & Greed Index

**Deliverables in this notebook:**
1. Data cleaning and merging
2. Exploratory Data Analysis (visuals & summary)
3. Statistical testing
4. Simple predictive model baseline
5. Conclusions & recommendations


In [2]:

import pandas as pd
import numpy as np

# Load the raw trade history and the daily Fear & Greed Index
historical_df = pd.read_csv('/mnt/data/historical_data.csv')
fear_greed_df = pd.read_csv('/mnt/data/fear_greed_index.csv')

# Reduce both timestamps to a calendar date so each trade can be joined to that day's index value
historical_df['Date'] = pd.to_datetime(historical_df['Timestamp IST'], format="%d-%m-%Y %H:%M").dt.date
fear_greed_df['Date'] = pd.to_datetime(fear_greed_df['date']).dt.date

def simplify_sentiment(classification):
    """Collapse any label containing 'Fear' or 'Greed' into that bucket; everything else is Neutral."""
    if "Fear" in classification:
        return "Fear"
    elif "Greed" in classification:
        return "Greed"
    else:
        return "Neutral"

fear_greed_df['Sentiment'] = fear_greed_df['classification'].apply(simplify_sentiment)

# Inner join: trades on dates without a sentiment reading are dropped
merged_df = pd.merge(historical_df, fear_greed_df[['Date', 'value', 'Sentiment']], on='Date', how='inner')

# Basic shape
merged_df.shape


ModuleNotFoundError: No module named 'pandas'
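Because the merge above is an inner join, any trade whose date has no sentiment reading disappears without warning. The following is a minimal sketch of how that loss could be measured, using toy data in place of the real CSVs. The `Closed PnL` column name and all values are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-ins for the two inputs (values are invented for illustration).
trades = pd.DataFrame({
    "Timestamp IST": ["01-03-2024 09:15", "01-03-2024 22:40",
                      "02-03-2024 11:05", "05-03-2024 08:00"],
    "Closed PnL": [12.5, -3.0, 0.0, 7.25],  # assumed column name
})
sentiment = pd.DataFrame({
    "date": ["2024-03-01", "2024-03-02"],
    "value": [22, 71],
    "classification": ["Extreme Fear", "Greed"],
})

# Same date normalization as in the notebook cell.
trades["Date"] = pd.to_datetime(trades["Timestamp IST"], format="%d-%m-%Y %H:%M").dt.date
sentiment["Date"] = pd.to_datetime(sentiment["date"]).dt.date

# A left merge with indicator=True keeps every trade and labels the ones
# that found no sentiment row as "left_only".
checked = trades.merge(
    sentiment[["Date", "value", "classification"]],
    on="Date", how="left", indicator=True,
)
unmatched = checked[checked["_merge"] == "left_only"]
print(f"{len(unmatched)} of {len(trades)} trades have no sentiment reading")
```

If the unmatched share is large on the real data, the date formats or time zones of the two sources may be worth a second look before interpreting any results.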


## Exploratory Data Analysis

Below are two key visualizations saved alongside this notebook:

- `boxplot_pnl_by_sentiment.png`: Closed PnL distribution by sentiment (1st–99th percentile)
- `hourly_winrate.png`: Hourly win-rate by sentiment


Also included: `top_coins_by_sentiment.csv` and `sentiment_summary.csv` for tabular inspection.
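The code that produced these figures is not shown in this notebook. As a rough sketch, the two underlying tables could be computed along the following lines. The data here is synthetic, the `Closed PnL` and `hour` column names are assumptions, and filtering to the 1st to 99th percentile range (rather than winsorizing) is one possible reading of the caption.

```python
import numpy as np
import pandas as pd

# Synthetic, heavy-tailed PnL data standing in for merged_df.
rng = np.random.default_rng(0)
n = 2_000
demo = pd.DataFrame({
    "Closed PnL": rng.standard_t(df=3, size=n) * 50,   # assumed column name
    "Sentiment": rng.choice(["Fear", "Greed", "Neutral"], size=n),
    "hour": rng.integers(0, 24, size=n),               # assumed hour-of-day column
})

# Keep only trades inside the 1st to 99th percentile of PnL so that
# extreme outliers do not flatten the boxplot.
lo, hi = demo["Closed PnL"].quantile([0.01, 0.99])
trimmed = demo[demo["Closed PnL"].between(lo, hi)]

# Win rate = share of trades with positive PnL, per hour and sentiment.
demo["is_win"] = (demo["Closed PnL"] > 0).astype(float)
hourly_winrate = (
    demo.groupby(["hour", "Sentiment"])["is_win"].mean().unstack("Sentiment")
)
print(hourly_winrate.head())
```

With matplotlib installed, `trimmed.boxplot(column="Closed PnL", by="Sentiment")` and `hourly_winrate.plot()` would turn these tables into charts similar to the ones described.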




## Statistical Tests (Fear vs Greed)

- t-test statistic: **-1.2383**, p-value: **0.2156**

- Mann-Whitney U statistic: **12370102.0000**, p-value: **0.3362**


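Interpretation: both p-values (0.2156 and 0.3362) are above the conventional 0.05 threshold, so neither test finds a statistically significant difference in Closed PnL between Fear and Greed days. The t-test compares mean PnL; the Mann-Whitney U test compares the two distributions by rank and is less sensitive to outliers.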
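The test code itself is not included here. A minimal sketch of how such a comparison might be run with SciPy is shown below. The samples are synthetic, the Welch variant (`equal_var=False`) is an assumption since the notebook does not say which t-test was used, and `is_significant` is a hypothetical helper.

```python
import numpy as np
from scipy import stats

# Synthetic PnL samples standing in for Fear-day and Greed-day trades.
rng = np.random.default_rng(42)
fear_pnl = rng.normal(loc=40.0, scale=900.0, size=5_000)
greed_pnl = rng.normal(loc=55.0, scale=900.0, size=5_000)

# Welch's t-test (no equal-variance assumption) compares the means.
t_stat, t_p = stats.ttest_ind(fear_pnl, greed_pnl, equal_var=False)

# Mann-Whitney U is rank-based, so it is less affected by extreme PnL values.
u_stat, u_p = stats.mannwhitneyu(fear_pnl, greed_pnl, alternative="two-sided")

def is_significant(p_value, alpha=0.05):
    """Hypothetical helper: True if the null hypothesis is rejected at level alpha."""
    return p_value < alpha

for name, p in [("Welch t-test", t_p), ("Mann-Whitney U", u_p)]:
    verdict = "reject H0" if is_significant(p) else "fail to reject H0"
    print(f"{name}: p = {p:.4f} -> {verdict}")

# Applied to the p-values reported in this notebook:
print(is_significant(0.2156), is_significant(0.3362))
```

On the real data, `fear_pnl` and `greed_pnl` would be the Closed PnL values of `merged_df` filtered by `Sentiment`.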




## Predictive Model (Baseline)

A logistic regression baseline was trained to predict whether a trade is profitable using features: Size USD, sentiment value, hour-of-day, coin (top 10), and side. Model artifacts and a brief report are included as `model_report.csv`.
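The training code is not part of this notebook, so the following is only a sketch of what such a baseline could look like with scikit-learn. The data is random, the column names `Coin`, `Side`, `hour`, and `Closed PnL` are assumptions, and the sketch keeps the top 3 coins instead of the top 10 to suit the small toy dataset.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Random stand-in for the merged trade table (all column names assumed).
rng = np.random.default_rng(7)
n = 1_000
demo = pd.DataFrame({
    "Size USD": rng.lognormal(mean=6, sigma=1, size=n),
    "value": rng.integers(5, 95, size=n),        # sentiment index value
    "hour": rng.integers(0, 24, size=n),
    "Coin": rng.choice(["BTC", "ETH", "SOL", "DOGE"], size=n),
    "Side": rng.choice(["BUY", "SELL"], size=n),
    "Closed PnL": rng.normal(0, 100, size=n),
})

# Keep the most traded coins as their own categories; bucket the rest as OTHER.
top_coins = demo["Coin"].value_counts().nlargest(3).index
demo["coin_group"] = demo["Coin"].where(demo["Coin"].isin(top_coins), "OTHER")

X = demo[["Size USD", "value", "hour", "coin_group", "Side"]]
y = (demo["Closed PnL"] > 0).astype(int)  # 1 = profitable trade

# Scale numeric features and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["Size USD", "value", "hour"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["coin_group", "Side"]),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Since the toy features carry no signal, accuracy here should hover near chance. On real trades, comparing the score against the majority-class rate would show whether the baseline learns anything useful.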

Recommendations: as next steps, try gradient-boosted models (XGBoost/LightGBM) and per-account modeling, both of which may improve on this baseline.

