# `#OOTT`: Real-Time Sentiment-Based Trading Strategy

This final experiment aims to integrate previously developed techniques into a real-world financial context — specifically, oil trading. The strategy centers on real-time sentiment classification of tweets containing the `#OOTT` (Organization of Oil Traders on Twitter) hashtag, a well-followed tag among energy market participants.

Using Natural Language Processing (NLP), tweets are classified into **positive**, **neutral**, or **negative** sentiment categories. Based on this classification, a simple sentiment-driven trading signal is constructed:

* A **+1 long position** is triggered when the average sentiment over a given window exceeds a positive threshold.
* A **-1 short position** is triggered when the average sentiment falls below a negative threshold.

This approach tests the feasibility of applying real-time NLP classification to market-relevant social media data, offering a proof of concept for sentiment-based trading signals in the energy sector.


In [3]:
# Install required packages
!pip install -q pandas numpy snscrape transformers tensorflow nltk yfinance matplotlib seaborn tqdm

import pandas as pd
import numpy as np
import datetime
import re
import os

# Twitter scraping
import snscrape.modules.twitter as sntwitter  

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import pipeline
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers import pipeline
import tensorflow as tf
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter
import yfinance as yf
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm
tqdm.pandas()
import warnings
warnings.filterwarnings('ignore')




[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3.11 -m pip install --upgrade pip[0m


---
# Data Collection

To demonstrate a proof of concept for applying Natural Language Processing (NLP) to trading contexts, synthetic oil market tweets were generated to replicate the style and content of posts typically found under the `#OOTT` hashtag on X (formerly Twitter). This approach facilitates model development and testing in the absence of accessible, large-scale, real-world data. The data is stored in `mock_oott_tweets.csv`

The method used for synthetic data generation is a template-based text synthesis technique. A curated set of tweet-like sentence templates was created to reflect domain-relevant language and typical market commentary. For each synthetic tweet, a template was randomly selected and combined with randomized metadata—such as timestamps, usernames, like counts, and retweet counts—to produce a semantically realistic and chronologically varied dataset.

As described by ChatGPT-4o, the approach can be summarized as follows:

> The method is a rule-based synthetic generation approach, often referred to as template-based data synthesis. A fixed set of tweet-like sentence templates were manually crafted to reflect domain-relevant language. Each tweet is generated by randomly sampling a template and combining it with randomized metadata (timestamp, username, likes, retweets) to simulate realistic engagement and chronology.

While access to real tweet data through the X (Twitter) API would offer higher fidelity and greater variability, that option was not pursued due to rate limitations, restrictive usage terms, and cost considerations that conflict with open-source and low-budget development objectives.

#### Alternative Approaches for Synthetic Data Generation

Several more sophisticated techniques exist for synthetic text generation but were not utilized in this context due to their respective constraints:

- **Generative Adversarial Networks (GANs)**  
  GANs perform well in continuous domains but are poorly suited to natural language due to the discrete and sequential nature of text. Basic GAN architectures are generally unable to capture linguistic coherence without substantial architectural modification.

- **SeqGAN (Sequence Generative Adversarial Network)**  
  A reinforcement learning-based variant of GANs designed for text. While more appropriate for language modeling, SeqGAN requires a large corpus of real tweets to train effectively and remains sensitive to hyperparameter choices and instability during training.

- **Fine-Tuning GPT-2**  
  Fine-tuning a pretrained language model such as GPT-2 on a labeled corpus of real `#OOTT` tweets would likely yield highly realistic outputs. However, this approach still depends on access to a sufficiently large, domain-specific dataset and computing resources for training.

In summary, template-based generation provides a practical and interpretable framework for simulating financial social media data in resource-constrained environments, supporting the development of NLP-based trading signals and classification models.

> Note: The synthetic nature of this data means that Sharpe ratios will be completely inaccurate, but still computed for a "proof of concept"

---

# Define Models

### Model Selection Overview

The following four Transformer-based models are defined for sentiment classification of oil-related tweets. Each model has been chosen to represent a distinct category of applicability, allowing for comparison across general-purpose, tweet-specific, and finance-domain-specialized language models.

| Model Key    | Model Name                                                          | Domain Focus    | Description                                                                                                                            |
| ------------ | ------------------------------------------------------------------- | --------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `distilbert` | `distilbert-base-uncased`                                           | General-purpose | A distilled version of BERT that provides strong baseline performance with reduced computational cost. Used as the control model.      |
| `cardiffnlp` | `cardiffnlp/twitter-roberta-base-sentiment`                         | Tweets          | A RoBERTa-based model trained on over 60M English tweets with sentiment labels. Optimized for short-form social media text.            |
| `bertweet`   | `finiteautomata/bertweet-base-sentiment-analysis`                   | Tweets          | Based on the BERTweet architecture, this model is pretrained and fine-tuned on Twitter data specifically for sentiment analysis tasks. |
| `finroberta` | `mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis` | Finance         | A lightweight DistilRoBERTa model fine-tuned on financial news headlines, intended to capture tone in economically relevant language.  |

These models are instantiated and managed through a centralized dictionary structure, enabling modular experimentation across zero-shot inference, pseudo-labeling, and fine-tuning tasks.


In [4]:
models = {
    "distilbert": {
        "id": "distilbert-base-uncased",
        "type": "general",
        "tokenizer": None,
        "model": None,
        "pipeline": None
    },
    "cardiffnlp": {
        "id": "cardiffnlp/twitter-roberta-base-sentiment",
        "type": "tweets",
        "tokenizer": None,
        "model": None,
        "pipeline": None
    },
    "bertweet": {
        "id": "finiteautomata/bertweet-base-sentiment-analysis",
        "type": "tweets",
        "tokenizer": None,
        "model": None,
        "pipeline": None
    },
    "finroberta": {
        "id": "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis",
        "type": "finance",
        "tokenizer": None,
        "model": None,
        "pipeline": None
    }
}


# Load Data & Construct Helper Functions 

In [10]:
df = pd.read_csv("mock_oott_tweets.csv")

models = load_all_models(models)

def load_all_models(model_registry):
    for name, meta in model_registry.items():
        print(f"Loading model: {name} ({meta['id']})")
        tokenizer = AutoTokenizer.from_pretrained(meta["id"])
        model = TFAutoModelForSequenceClassification.from_pretrained(meta["id"])
        sentiment_pipe = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
        meta["tokenizer"] = tokenizer
        meta["model"] = model
        meta["pipeline"] = sentiment_pipe
    return model_registry

def run_zero_shot_sentiment(model_registry, df, text_column="text", sample_size=100):
    results = {}
    sample_df = df.sample(sample_size).reset_index(drop=True)

    for name, meta in model_registry.items():
        print(f"\nRunning sentiment with {name}...")
        preds = []
        for text in sample_df[text_column]:
            try:
                pred = meta["pipeline"](text[:512])[0]  # Truncate if needed
                preds.append(pred["label"])
            except Exception as e:
                preds.append("ERROR")
                print(f"Error: {e}")
        results[name] = preds

    return sample_df.assign(**results)

def summarize_sentiments(df, model_registry):
    from collections import Counter
    for name in model_registry:
        if name in df.columns:
            print(f"\n{name} Sentiment Distribution:")
            print(Counter(df[name]))

def prepare_tf_dataset(df, tokenizer, max_len=128, batch_size=8):
    tokens = tokenizer(
        list(df['text']),
        max_length=max_len,
        padding='max_length',
        truncation=True,
        return_tensors='tf'
    )

    dataset = tf.data.Dataset.from_tensor_slices((
        dict(tokens),
        df['label'].values
    ))
    return dataset.shuffle(100).batch(batch_size)



Loading model: distilbert (distilbert-base-uncased)


Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertForSequenceClassification: ['vocab_transform.weight', 'vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_projector.bias', 'vocab_layer_norm.bias']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFDistilBertForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']
You should 

Loading model: cardiffnlp (cardiffnlp/twitter-roberta-base-sentiment)


KeyboardInterrupt: 

# Experiment 1: Few-Shot Learning

In [9]:
import pandas as pd
import numpy as np
import tensorflow as tf

# Load the dataset
df = pd.read_csv("mock_oott_tweets.csv")

# Sample 50 tweets for few-shot learning
few_shot_data = df.sample(n=50, random_state=42).reset_index(drop=True)

# Rename 'content' column to 'text' for consistency
few_shot_data = few_shot_data.rename(columns={"content": "text"})

# Assign random binary labels (0 = negative, 1 = positive) for now
few_shot_data["label"] = np.random.randint(0, 2, size=len(few_shot_data))

# Quick look
print(few_shot_data[["text", "label"]].head())

# Example: train DistilBERT on this few-shot sample
model_name = "distilbert"
tokenizer = models[model_name]["tokenizer"]
model = models[model_name]["model"]

# Prepare few-shot dataset
fewshot_ds = prepare_tf_dataset(few_shot_data, tokenizer)

# Fine-tune function (defined earlier)
history = fine_tune_few_shot(model, fewshot_ds, epochs=5)



                                                text  label
0  Geopolitical tensions in the Gulf fueling crud...      1
1  Crude oil inventories fell again this week. #OOTT      0
2  Geopolitical tensions in the Gulf fueling crud...      1
3  Russian exports reportedly down 300kbpd. Watch...      1
4  Geopolitical tensions in the Gulf fueling crud...      0


TypeError: 'NoneType' object is not callable