# 📊 Sentiment-Based Trading Strategy & Backtest

---

## 🔍 Project Overview

In this notebook, we explore a **rule-based trading strategy** driven by financial news sentiment. The goal is to identify market opportunities by analyzing whether positive or negative news impacts short-term stock prices.

We apply **natural language processing (NLP)** techniques to extract sentiment scores from financial news headlines and content. Based on sentiment thresholds, we generate buy/sell signals and evaluate the strategy using historical price data.

This project combines concepts from:

- **Natural Language Processing (NLP)**
- **Quantitative Trading**
- **Backtesting & Strategy Evaluation**

---

## 🎯 Objectives

- Load and preprocess financial news data with stock tickers and timestamps
- Apply sentiment analysis using off-the-shelf tools: lexicon-based scorers (TextBlob, VADER) or a pre-trained Transformer model
- Design a simple rule-based trading strategy:
    - **Long if sentiment > 0.9**
    - **Avoid or short if sentiment < 0.1**
- Backtest the strategy using daily price returns
- Evaluate performance using:
    - **Total Return**
    - **Sharpe Ratio**
    - **Max Drawdown**
    - **Win Rate** (based on 1-day, 3-day, 5-day returns after the news)
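
The rule and the metrics listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual backtest: it assumes a sentiment score already scaled to [0, 1], simple per-period returns, 252 trading periods per year, and a zero risk-free rate. The function names (`signal_from_score`, `evaluate`) and all sample numbers are hypothetical.

```python
import numpy as np
import pandas as pd

def signal_from_score(score, long_th=0.9, short_th=0.1):
    """Map a sentiment score in [0, 1] to a position: +1 long, -1 short, 0 flat."""
    if score > long_th:
        return 1
    if score < short_th:
        return -1
    return 0

def evaluate(strategy_returns, periods_per_year=252):
    """Summarize per-period strategy returns (risk-free rate assumed to be 0)."""
    r = pd.Series(strategy_returns, dtype=float)
    equity = (1 + r).cumprod()                    # growth of 1 unit of capital
    total_return = equity.iloc[-1] - 1
    std = r.std(ddof=1)
    sharpe = np.sqrt(periods_per_year) * r.mean() / std if std > 0 else float("nan")
    peak = equity.cummax().clip(lower=1.0)        # running peak, including the starting capital
    max_drawdown = (equity / peak - 1).min()      # most negative peak-to-trough loss
    nonzero = r[r != 0]                           # periods with a nonzero strategy return
    win_rate = (nonzero > 0).mean() if len(nonzero) else float("nan")
    return {"total_return": total_return, "sharpe": sharpe,
            "max_drawdown": max_drawdown, "win_rate": win_rate}

# Made-up scores and next-period asset returns, for illustration only
scores = [0.95, 0.50, 0.05, 0.92]
asset_returns = [0.02, -0.01, -0.03, -0.01]
positions = [signal_from_score(s) for s in scores]
strategy_returns = [p * a for p, a in zip(positions, asset_returns)]
stats = evaluate(strategy_returns)
```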

---

## ⚙️ Tools and Libraries

- `pandas`, `numpy` for data manipulation
- `textblob`, `vaderSentiment`, or `transformers` for sentiment scoring
- `yfinance` for historical price data
- `matplotlib`, `seaborn` for data visualization
- Optional: `backtesting.py` or custom backtesting logic

---

## 📌 Note

This is a simplified prototype model for educational purposes, developed as part of the **QuantCU Camp**. In real-world trading, more robust models and risk management techniques are required.

---

## 👨‍💻 Let’s get started!

### - Install useful libraries

In [None]:
%pip install transformers torch pandas matplotlib seaborn scikit-learn tqdm

### - Sentiment Analysis using FinancialBERT (Hugging Face Transformers)

In [None]:
from transformers import BertTokenizer, BertForSequenceClassification, pipeline
import pandas as pd

# Load a model trained for financial news sentiment
pretrained_model = 'ahmedrachid/FinancialBERT-Sentiment-Analysis'
model = BertForSequenceClassification.from_pretrained(pretrained_model, num_labels=3)
tokenizer = BertTokenizer.from_pretrained(pretrained_model)

# Build the sentiment-analysis pipeline
nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)


### - Prepare Data


In [None]:
data = {
    'headline': [
        "Company A reports record profits for Q1",
        "Company B faces lawsuit over product failure",
        "Neutral outlook expected for Company C"
    ],
    'price_before': [100, 150, 200],
    'price_after': [110, 140, 200]
}

headlines_df = pd.DataFrame(data)
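
The toy table above hard-codes one "before" and one "after" price per headline. With real data, those prices would come from a daily price series, for example closing prices fetched with `yfinance` (not called here). The sketch below uses made-up prices and a hypothetical helper, `forward_returns`, to show one way of computing the 1-, 3-, and 5-day returns that follow a news date. It assumes each news date is a trading day present in the price index.

```python
import pandas as pd

def forward_returns(close, news_dates, horizons=(1, 3, 5)):
    """Simple returns from each news date to h trading days later."""
    out = pd.DataFrame(index=pd.DatetimeIndex(news_dates))
    for h in horizons:
        fwd = close.shift(-h) / close - 1          # return from day t to day t+h
        # NaN when the date is missing from `close` or the horizon runs past the data
        out[f"ret_{h}d"] = fwd.reindex(out.index)
    return out

# Made-up closing prices on consecutive business days, for illustration only
dates = pd.bdate_range("2024-01-01", periods=8)
close = pd.Series([100, 101, 103, 102, 105, 107, 106, 110], index=dates, dtype=float)
rets = forward_returns(close, [dates[0], dates[4]])
```

News published after the market close would normally be assigned to the next trading day before this step, so that the return window does not start before the news was known.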

### - Sentiment analysis


In [None]:
# Score all headlines in one batched call (much faster than DataFrame.apply)
sentences = list(headlines_df['headline'])
results = nlp(sentences, batch_size=8)

# Attach the predicted labels to the DataFrame
headlines_df['sentiment'] = [r['label'] for r in results]
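
The cell above keeps only the predicted label. Since the objectives describe thresholds on a numeric score, one option is to keep the pipeline's confidence as well. The sketch below works on a hand-written list shaped like the pipeline output (dicts with `label` and `score` keys), so it runs without downloading the model. The helper `signed_score` and the sample values are hypothetical, and the mapping to a signed score is an assumption rather than something the model defines.

```python
import pandas as pd

def signed_score(result):
    """Turn {'label', 'score'} into a value in [-1, 1]: +score, -score, or 0 for neutral."""
    label = result["label"].lower()    # label casing can differ between model configs
    if label == "positive":
        return result["score"]
    if label == "negative":
        return -result["score"]
    return 0.0

# Hand-written stand-in for `results`; not real model output
fake_results = [
    {"label": "positive", "score": 0.98},
    {"label": "negative", "score": 0.91},
    {"label": "neutral", "score": 0.87},
]
demo_df = pd.DataFrame({
    "sentiment": [r["label"] for r in fake_results],
    "sentiment_score": [signed_score(r) for r in fake_results],
})
```

If a [0, 1] scale is preferred for the 0.9 / 0.1 thresholds, `(score + 1) / 2` rescales the signed value.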


### - Analysis

In [None]:
# Price change between the two observations
headlines_df['price_change'] = headlines_df['price_after'] - headlines_df['price_before']

# Direction of the price move: up = 1, down = -1, unchanged = 0
headlines_df['price_direction'] = headlines_df['price_change'].apply(lambda x: 1 if x > 0 else (-1 if x < 0 else 0))

# Convert the sentiment label to an expected direction.
# Labels are lowercased first because their casing depends on the model config.
def sentiment_to_direction(sent):
    sent = sent.lower()
    return 1 if sent == 'positive' else (-1 if sent == 'negative' else 0)

headlines_df['expected_direction'] = headlines_df['sentiment'].apply(sentiment_to_direction)
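
Before the full report, agreement between the two direction columns can be summarized in a single number: the share of non-neutral calls that matched the actual move. The sketch below runs on a small hand-built frame with the same column names; `hit_rate` is a hypothetical helper and the values are made up.

```python
import pandas as pd

def hit_rate(df):
    """Share of non-neutral calls whose expected direction matched the price move."""
    calls = df[df["expected_direction"] != 0]      # ignore neutral headlines
    if calls.empty:
        return float("nan")
    return (calls["expected_direction"] == calls["price_direction"]).mean()

# Made-up directions, for illustration only
demo = pd.DataFrame({
    "expected_direction": [1, -1, 0, 1],
    "price_direction":    [1, -1, 0, -1],
})
rate = hit_rate(demo)
```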


### - Report


In [None]:
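from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Fix the class order so the tick labels stay aligned even if a class is absent
directions = [-1, 0, 1]

# Confusion matrix: rows = sentiment-implied direction, columns = actual price direction
conf_mat = confusion_matrix(headlines_df['expected_direction'], headlines_df['price_direction'], labels=directions)
sns.heatmap(conf_mat, annot=True, fmt="d", xticklabels=["Down", "Flat", "Up"], yticklabels=["Negative news", "Neutral news", "Positive news"])
plt.xlabel("Actual price direction")
plt.ylabel("Direction implied by sentiment model")
plt.title("Confusion Matrix: Sentiment vs Real Price Movement")
plt.show()

# Performance report: the actual price move is the ground truth, sentiment is the prediction
print(classification_report(y_true=headlines_df['price_direction'], y_pred=headlines_df['expected_direction'],
                            labels=directions, target_names=["Down", "Flat", "Up"], zero_division=0))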



In [None]:
print(headlines_df[['headline', 'sentiment', 'price_before', 'price_after', 'price_direction', 'expected_direction']])