<a href="https://colab.research.google.com/github/leomercanti/Course_Advanced_Investing_with_AI/blob/main/Advanced_Investing_with_AI_Module_2_Deep_Learning_and_NLP_in_Financial_Markets.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Course: Advanced Investing with AI**

## Module 2: Deep Learning and NLP in Financial Markets

(If you haven't checked out Module 1 yet, find it [here](https://colab.research.google.com/drive/15iRO6g-AyE2vGtdodh4xZ5RLmAcPcNV_))

<br>

**Learning Goals:**

- Develop a deep understanding of time series modeling using LSTMs and GRUs.
- Implement NLP techniques to analyze financial news, social media sentiment, and earnings calls.
- Apply deep learning techniques to predict stock prices and market movements.
- Use state-of-the-art models like BERT for sentiment analysis in financial contexts.


### 2.1 Core Readings and Resources

- **Textbook:** "Deep Learning for Finance" by Patrick Hebron

  - Chapter 2: Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) - Provides an in-depth explanation of time series forecasting using deep learning models.

- **Research Papers:**

  - “Sentiment Analysis of Financial News using Deep Learning” – Focus on how NLP models can be applied to financial texts.
  - “Stock Market Prediction using LSTM Networks” – Shows how LSTMs can model complex stock market behaviors.

- **Optional:**

  - “Attention Is All You Need” by Vaswani et al. – Essential paper for understanding the Transformer architecture, which forms the backbone of models like BERT.

### 2.2 Key Topics Overview

**Time Series Forecasting with LSTMs**

- **Why LSTMs?** Feed-forward neural networks treat every input independently, so they struggle with sequential data such as stock prices, where the order of observations matters. LSTM (Long Short-Term Memory) networks address this by carrying a memory (cell) state from one time step to the next, which makes them a common choice for predicting prices from historical data.

- **Main Concepts:**

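  - Gated Memory Cells: LSTMs use input, forget, and output gates to decide what is written to, kept in, and read from the cell state, so important information can be retained across long sequences.
  - Vanishing Gradient Problem: In plain RNNs, gradients shrink as they are propagated back through many time steps. The LSTM's additive cell-state update mitigates this, making long-term dependencies in financial data easier to learn.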

- **Use Case:** Predicting the next day’s stock price using the past 60 days of historical prices.

- **Hands-On Example:** Predicting Stock Prices Using LSTMs

In [None]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

In [None]:
# Download historical data
data = yf.download("AAPL", start="2016-01-01", end="2024-09-01")

In [None]:
# Preprocess data (using close price for simplicity)
close_prices = data['Close'].values
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(close_prices.reshape(-1, 1))
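
The cell above fits the scaler on the full price history. If a later portion of the data is held out for testing, the scaler has already seen that period's minimum and maximum. The sketch below shows one way to avoid that, fitting min-max scaling on the earlier (training) rows only. It uses plain NumPy; the synthetic `prices` array and the `train_fraction` value are assumptions for illustration, not part of the original notebook.

```python
import numpy as np

def fit_min_max(train):
    """Return the (min, max) of the training values only."""
    return float(train.min()), float(train.max())

def apply_min_max(values, lo, hi):
    """Scale values with a previously fitted (lo, hi) pair."""
    return (values - lo) / (hi - lo)

# Synthetic prices standing in for data['Close'] (illustrative only)
prices = np.array([10.0, 12.0, 11.0, 14.0, 13.0, 16.0, 18.0, 17.0, 20.0, 22.0])

train_fraction = 0.8  # assumed split: earlier rows train, later rows test
split = int(len(prices) * train_fraction)
train, test = prices[:split], prices[split:]

lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)  # may fall outside [0, 1]

print(train_scaled.min(), train_scaled.max())
print(test_scaled)
```

With scikit-learn the same idea is `scaler.fit(train)` followed by `scaler.transform(test)`. Test values can land outside the 0 to 1 range, which is expected when later prices exceed anything seen in training.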

In [None]:
# Inspect data
print(data.tail())

In [None]:
# Prepare training data (using past 60 days to predict the next day)
sequence_length = 60
X_train, y_train = [], []

for i in range(sequence_length, len(scaled_data)):
    X_train.append(scaled_data[i-sequence_length:i, 0])
    y_train.append(scaled_data[i, 0])

X_train, y_train = np.array(X_train), np.array(y_train)
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
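
To make the windowing step easier to inspect, here is a small sketch of the same idea on a ten-element toy series. It uses NumPy's `sliding_window_view` in place of the explicit loop; the helper name `make_windows` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, window):
    """Return (X, y) where X[k] = series[k:k+window] and y[k] = series[k+window]."""
    series = np.asarray(series, dtype=float)
    windows = sliding_window_view(series, window)  # shape (n - window + 1, window)
    X = windows[:-1]       # drop the last window: it has no next value to predict
    y = series[window:]    # the value that follows each remaining window
    return X[..., np.newaxis], y  # add a feature axis: (samples, window, 1)

series = np.arange(10)
X, y = make_windows(series, window=3)

print(X.shape, y.shape)   # (7, 3, 1) (7,)
print(X[0, :, 0], y[0])   # first window and its target
```

The trailing axis of size 1 is the "features" dimension that the LSTM layer expects: one value (the scaled close) per time step.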

In [None]:
# Build LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], 1)))
model.add(LSTM(units=50))
model.add(Dense(units=1))  # Predicting next closing price

In [None]:
# Compile and train the model
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, epochs=10, batch_size=32)
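
The cell above trains on every available window, so no data is left over to judge the predictions. A minimal sketch of one common remedy follows: split the windows chronologically and compare the model's error against a naive "tomorrow equals today" baseline. It uses NumPy only; the function names and the toy ramp series are assumptions for illustration.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Keep time order: earlier samples for training, later ones for testing."""
    split = int(len(X) * train_fraction)
    return X[:split], y[:split], X[split:], y[split:]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def persistence_forecast(X):
    """Naive baseline: predict that the next value equals the last value in each window."""
    return X[:, -1, 0]

# Toy data shaped like X_train / y_train: a ramp with window length 5
series = np.arange(20, dtype=float)
window = 5
X = np.stack([series[i - window:i] for i in range(window, len(series))])[..., np.newaxis]
y = series[window:]

X_tr, y_tr, X_te, y_te = chronological_split(X, y)
baseline_rmse = rmse(y_te, persistence_forecast(X_te))
print(X_tr.shape, X_te.shape, baseline_rmse)
```

In the notebook, the model would be fitted on `X_tr, y_tr` and its `rmse` on `X_te` compared with `baseline_rmse`. A model that cannot beat the persistence baseline has not learned anything useful about the series.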

In [None]:
# Predict future prices
test_data = scaled_data[-sequence_length:]  # Last 60 days for prediction
test_data = np.reshape(test_data, (1, sequence_length, 1))
predicted_price = model.predict(test_data)
predicted_price = scaler.inverse_transform(predicted_price)

print(f"Predicted next day price: ${predicted_price[0][0]:.2f}")
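
The gated memory cell described under Main Concepts can be written out in a few lines. The sketch below implements a single LSTM time step in NumPy with the standard gate equations. The weights are random and the gate ordering (input, forget, candidate, output) is a convention chosen for this sketch, so it illustrates the mechanics rather than reproducing what Keras has trained.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*hidden, n_inputs), U: (4*hidden, hidden), b: (4*hidden,)
    Gate order in the stacked weights: input, forget, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:hidden])                 # input gate: how much new info to write
    f = sigmoid(z[hidden:2 * hidden])        # forget gate: how much old state to keep
    g = np.tanh(z[2 * hidden:3 * hidden])    # candidate values to write
    o = sigmoid(z[3 * hidden:4 * hidden])    # output gate: how much state to expose
    c_t = f * c_prev + i * g                 # additive cell-state update
    h_t = o * np.tanh(c_t)                   # hidden state passed to the next step
    return h_t, c_t

rng = np.random.default_rng(0)
hidden, n_inputs = 4, 1
W = rng.normal(scale=0.1, size=(4 * hidden, n_inputs))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h = np.zeros(hidden)
c = np.zeros(hidden)
for price in [0.10, 0.12, 0.11, 0.15, 0.14]:  # illustrative scaled prices
    h, c = lstm_step(np.array([price]), h, c, W, U, b)

print(h.shape, c.shape)
```

The line `c_t = f * c_prev + i * g` is the part that helps with vanishing gradients: the previous cell state is carried forward by elementwise scaling and addition rather than being squashed through a nonlinearity at every step.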

**NLP and Sentiment Analysis in Financial Markets**

- **Why NLP in Finance?** Market sentiment derived from news articles, social media, or earnings calls can heavily influence stock price movements. NLP techniques, particularly sentiment analysis, can gauge the mood of the market and provide signals for trading decisions.

- **Main Concepts:**

  - **Text Preprocessing:** Tokenization, stop-word removal, stemming, and lemmatization.
  - **Word Embeddings:** Representing text as vectors using Word2Vec, GloVe, or more advanced models like BERT.
  - **Sentiment Analysis:** Using models like BERT, DistilBERT, or simple LSTM networks to classify the sentiment of news articles.

- **Hands-On Example:** Sentiment Analysis Using BERT

In [None]:
from transformers import BertTokenizer, BertForSequenceClassification
import torch

In [None]:
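# Load pre-trained BERT encoder and tokenizer.
# Note: the 2-class classification head added here is randomly initialized;
# it must be fine-tuned before its outputs carry any meaning.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)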

In [None]:
# Example financial headline
headline = "Apple posts record profits, stock soars"

In [None]:
# Tokenization
inputs = tokenizer(headline, return_tensors='pt', padding=True, truncation=True, max_length=64)

In [None]:
# Sentiment prediction (inference only: no dropout, no gradient tracking)
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
sentiment = torch.softmax(logits, dim=1).numpy()

In [None]:
# Output: sentiment probabilities
print(f"Sentiment probabilities: {sentiment}")

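- **Sentiment Score Interpretation:** The output holds one probability per class (two here). Note that `bert-base-uncased` ships without a trained classification head: `BertForSequenceClassification` initializes it randomly, so the probabilities printed above are not meaningful until the model is fine-tuned on labeled financial text (as in Section 2.3) or replaced with a checkpoint already fine-tuned for financial sentiment, such as FinBERT. Once the classifier is trained, the positive-class probability can serve as one input to buy/sell decisions.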
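
Turning class probabilities into a trading action needs an explicit decision rule. The sketch below shows one simple possibility: a softmax over logits followed by a confidence threshold. It assumes class index 0 is negative and index 1 is positive, and the `threshold` of 0.6 and the example logits are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sentiment_signal(probs, threshold=0.6):
    """Map [p_negative, p_positive] to 'buy', 'sell', or 'hold'."""
    p_neg, p_pos = probs
    if p_pos >= threshold:
        return "buy"
    if p_neg >= threshold:
        return "sell"
    return "hold"  # not confident enough in either direction

# Hypothetical logits for three headlines (not real model output)
logits = np.array([[-1.2, 2.3], [0.1, 0.2], [1.5, -0.5]])
probs = softmax(logits)
signals = [sentiment_signal(p) for p in probs]
print(signals)
```

The "hold" band matters in practice: acting on near-50/50 predictions mostly adds trading costs without adding information.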

**Applications in Finance:**

- **News Sentiment Trading:** Aggregate sentiment from multiple news sources or social media and build sentiment-driven trading algorithms.
- **Earnings Call Analysis:** Use NLP to extract sentiment and key phrases from earnings call transcripts, which can help explain or anticipate stock price reactions.
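
As a rough illustration of the aggregation step in news sentiment trading, the sketch below averages per-headline scores into one score per date and ticker. The input tuples, the helper name `daily_sentiment`, and the probabilities are all hypothetical; in practice the positive-class probability would come from a fine-tuned classifier.

```python
from collections import defaultdict
from statistics import mean

def daily_sentiment(scored_headlines):
    """Average sentiment per (date, ticker).

    Each item is (date, ticker, p_positive). The probability is mapped
    from [0, 1] to [-1, 1] so that 0 means neutral.
    """
    buckets = defaultdict(list)
    for date, ticker, p_pos in scored_headlines:
        buckets[(date, ticker)].append(2.0 * p_pos - 1.0)
    return {key: mean(values) for key, values in buckets.items()}

# Hypothetical classifier outputs for four headlines
scored = [
    ("2024-08-01", "AAPL", 0.9),
    ("2024-08-01", "AAPL", 0.7),
    ("2024-08-01", "TSLA", 0.2),
    ("2024-08-02", "AAPL", 0.5),
]
scores = daily_sentiment(scored)
for key, value in sorted(scores.items()):
    print(key, round(value, 3))
```

A plain average treats every headline equally. Weighting by source reliability or recency is a natural extension, but it is a design choice rather than something this module prescribes.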

### 2.3 Advanced Concepts: Time Series and Transformer Models

**Why Transformers in Finance?**
Transformers such as BERT, GPT, and RoBERTa have reshaped NLP. Their self-attention mechanism lets every token in the input attend to every other token, and the architecture parallelizes well enough to be pre-trained on very large text corpora. In finance, they are used for tasks like market sentiment analysis and financial document comprehension.

- **Main Concepts:**
  - **Attention Mechanism:** The core of transformers that allows the model to focus on different parts of the input text and understand context better.
  - **Pre-Trained Models:** Using pre-trained models (like BERT) and fine-tuning them for finance-specific tasks.

- **Hands-On Example:** Financial News Classification Using BERT

In [None]:
from transformers import BertTokenizer, BertForSequenceClassification
from torch.optim import AdamW
from torch.utils.data import DataLoader, TensorDataset
import torch

In [None]:
# Example dataset: A collection of financial news headlines
news_data = [
    ("Apple releases new product line, stock surges", 1),  # Positive
    ("Tesla faces lawsuits over safety concerns, shares dip", 0),  # Negative
    # Add more headlines...
]

In [None]:
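# Preprocessing the data
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # defined here so this section runs on its own
texts, labels = zip(*news_data)
inputs = tokenizer(list(texts), return_tensors='pt', padding=True, truncation=True, max_length=64)
labels = torch.tensor(labels)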

In [None]:
# Create dataset and dataloader (shuffle so each epoch sees batches in a new order)
dataset = TensorDataset(inputs['input_ids'], inputs['attention_mask'], labels)
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)

In [None]:
# Fine-tuning BERT for classification
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
optimizer = AdamW(model.parameters(), lr=5e-5)

In [None]:
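# Training loop (simplified: no validation set, scheduler, or GPU placement)
model.train()  # from_pretrained returns the model in eval mode; enable dropout for training
for epoch in range(3):
    for batch in dataloader:
        input_ids, attention_mask, batch_labels = batch  # avoid shadowing the full `labels` tensor
        outputs = model(input_ids, attention_mask=attention_mask, labels=batch_labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
model.eval()  # switch back to inference mode once training is done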

In [None]:
# Model inference on new data
test_headline = "Tesla unveils revolutionary battery technology, stock skyrockets"
test_inputs = tokenizer(test_headline, return_tensors='pt', padding=True, truncation=True, max_length=64)
model.eval()  # inference mode: disables dropout
with torch.no_grad():  # gradients are not needed for prediction
    output = model(**test_inputs)
predicted_sentiment = torch.softmax(output.logits, dim=1).numpy()

print(f"Predicted Sentiment: {predicted_sentiment}")

- **Practical Uses:** Once fine-tuned on a sufficiently large labeled dataset (the two headlines above only illustrate the mechanics), BERT can classify new headlines as positive or negative, and its predictions can be integrated into trading algorithms.
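
The attention mechanism listed under Main Concepts in this section can be sketched directly from its definition in Vaswani et al.: softmax(QKᵀ / √d_k) V. The NumPy version below is a single attention head with no learned projections (queries, keys, and values are all the raw token vectors), which is a simplifying assumption. The random `tokens` matrix stands in for real embeddings.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                        # similarity of each query to each key
    scores = scores - scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 tokens, 8-dimensional stand-in embeddings

context, weights = scaled_dot_product_attention(tokens, tokens, tokens)
print(context.shape, weights.shape)
print(weights.sum(axis=1))
```

Row `k` of `weights` shows how much token `k` draws on every token in the sequence when building its context-aware representation. BERT stacks many such heads, each with its own learned projections.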

### 2.4 End of Module Assignments and Practice (Optional)

- **Assignment 1:** Implement an LSTM model for predicting stock prices using at least 60 days of past price data. Tune the hyperparameters (e.g., number of LSTM units, epochs, and batch size) for optimal performance.

- **Assignment 2:** Download financial news articles related to specific stocks and perform sentiment analysis using a pre-trained BERT model. Analyze the sentiment’s correlation with stock price movements.
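
As a starting point for the correlation analysis in Assignment 2, the sketch below computes the Pearson correlation between each day's sentiment score and the following day's return. The function names and the five-day toy series are assumptions for illustration, and the numbers are invented rather than real market data.

```python
import numpy as np

def next_day_returns(close):
    """Simple return from day t to day t+1 for each t except the last."""
    close = np.asarray(close, dtype=float)
    return close[1:] / close[:-1] - 1.0

def sentiment_return_correlation(sentiment, close):
    """Pearson correlation between day-t sentiment and the return from t to t+1."""
    returns = next_day_returns(close)
    s = np.asarray(sentiment, dtype=float)[:-1]  # the last day has no next-day return
    return float(np.corrcoef(s, returns)[0, 1])

# Invented daily closes and sentiment scores in [-1, 1]
close = [100.0, 102.0, 101.0, 104.0, 103.0]
sentiment = [0.5, -0.2, 0.6, -0.1, 0.3]

corr = sentiment_return_correlation(sentiment, close)
print(round(corr, 3))
```

Aligning sentiment on day t with the return from t to t+1 avoids look-ahead: the signal is only compared with price moves that happened after the news was available.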

By the end of **Module 2**, you should be familiar with using **deep learning models**, such as **LSTMs**, to predict stock prices based on time series data and applying **NLP** techniques like **BERT** for sentiment analysis on financial news and reports. You’ve learned how to extract insights from both structured time series data and unstructured text data, which are critical for making more informed investment decisions.

This knowledge sets the stage for even more advanced AI techniques, including **reinforcement learning** and **automated trading**, which you will explore in the next module.