# Baseline Experiment
- Hypothesis (h1a): Sentiment outputs, mapped to direction, can predict short-term exchange rate movements.
- Objective: Establish a baseline for comparison before testing the capabilities of hyped-up LLMs.
- Value Proposition: This is the first known study on applying language models to trading in emerging currency markets, especially in a multilingual context.
- Why sentiment analysis as baseline: 
    - FinBERT is widely cited in financial NLP literature
    - Outperforms general BERT and lexicon-based models on tasks like financial sentiment classification
    - Traditional ML methods rely on sparse inputs or static word embeddings (like Word2Vec) which don't capture context
    - Sentiment analysis is commonly used to generate trading signals. However, I believe the market does not move based on whether a piece of text reads as happy or sad, so I expect the experiments that follow to outperform this baseline. The goal here is to rule sentiment analysis out; predicting directional movement directly is the better approach.
    - It was used as a benchmark in a very similar paper: https://doi.org/10.1016/j.mlwa.2023.100508

- Independent Variable (Predictor):
    - Text: headline / article content
    - Category: FinBERT sentiment output
    - Binary Label: heuristic mapping (positive -> 1, negative -> -1; bullish or bearish in commercial terms)
    - (POSSIBLY CONSIDER as a control var/experiment?): Multi-class Label: neutral (0) label defined by a threshold (minimum exchange rate % change)

- Dependent Variable (Ground Truth):
    - Directional Movement: binary direction of exchange rate following news timestamp (time frame TBD)
    - (POSSIBLY CONSIDER?): percent change in exchange rate over defined window? Measures profitability...
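
The two candidate targets above can be sketched as follows. This is a minimal illustration rather than the notebook's implementation: the function names `pct_change` and `direction_label` and the `neutral_threshold` parameter are hypothetical, and the 0.05% threshold in the example is an arbitrary placeholder.

```python
def pct_change(rate_at_t: float, rate_after: float) -> float:
    """Percent change in the exchange rate between the news timestamp and a later time."""
    return (rate_after - rate_at_t) / rate_at_t * 100.0


def direction_label(rate_at_t: float, rate_after: float, neutral_threshold: float = 0.0) -> int:
    """Return 1 (rate rose), -1 (rate fell), or 0 (move within the neutral band).

    neutral_threshold is a hypothetical minimum absolute percent change;
    with the default of 0.0 only an exactly unchanged rate is labeled 0.
    """
    change = pct_change(rate_at_t, rate_after)
    if abs(change) <= neutral_threshold:
        return 0
    return 1 if change > 0 else -1


# Illustrative values only, not taken from the dataset
print(direction_label(6.1800, 6.1850))                          # rate rose
print(direction_label(6.1800, 6.1801, neutral_threshold=0.05))  # inside the neutral band
```

Keeping the threshold as a parameter makes it easy to compare the binary setup (threshold 0) with the multi-class variant without changing the rest of the analysis.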

- Dataset Creation Process:
    - News Data: The dataset contains 4,630 headlines in total, but only 3,519 of them fall before **Timestamp[2024-12-30 17:38:00]**, the latest timestamp that still allows a t+20 analysis (exchange rate data ends at 17:58:00). The remaining headlines cannot be used until minute-level exchange rates are available for December 30, 2024, to January 15, 2025.
        - Dataset Creation Process: Bom Dia Mercado (BDM) → Eli formatted the news data into an Excel file → preprocess.ipynb → export to repo → final dataset
    - FX Rate Data: Minute-level time series of USD/BRL exchange rates, synchronized with news timestamps and stored as pandas datetime objects in ISO format.
        - Dataset Creation Process: Bloomberg → retrieve USD/BRL exchange rates as an Excel file → preprocess.ipynb → export to repo → final dataset
    - Final Dataset: experimental_dataset.csv
        - News released outside of market hours is assigned the last available exchange rate. The process is described in preprocess.ipynb.
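
Taking the last available exchange rate for each headline amounts to a backward as-of join. The sketch below shows one way to do this with `pandas.merge_asof`. It is an assumption about how such a step could work, not a copy of preprocess.ipynb, and the column names (`Timestamp`, `Headline`, `Rate`) and all values are invented for illustration.

```python
import pandas as pd

# Toy minute-level FX quotes (values invented for illustration)
fx = pd.DataFrame({
    "Timestamp": pd.to_datetime([
        "2024-12-27 17:57:00", "2024-12-27 17:58:00", "2024-12-30 09:00:00",
    ]),
    "Rate": [6.19, 6.20, 6.18],
})

# Toy headlines: one during market hours, one after the Friday close
news = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2024-12-27 17:57:30", "2024-12-28 10:15:00"]),
    "Headline": ["during market hours", "released on a Saturday"],
})

# merge_asof requires both frames to be sorted by the join key
aligned = pd.merge_asof(
    news.sort_values("Timestamp"),
    fx.sort_values("Timestamp"),
    on="Timestamp",
    direction="backward",  # take the most recent quote at or before the headline
)
print(aligned)
```

With `direction="backward"`, the Saturday headline picks up Friday's final quote, which matches the rule described above for news released outside market hours.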

- Methodology: Sentiment Analysis
    - Encoder-only (representation) model: FinBERT-PT-BR is a BERT-based model for Brazilian Portuguese financial text, the counterpart of FinBERT, which is itself a finance-domain BERT model.
- Model:
    - HuggingFace Transformers Model: lucas-leme/FinBERT-PT-BR
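
As a rough sketch of what happens to a headline once the model has scored it: the classifier emits one logit per class, softmax turns the logits into probabilities, and the most probable class becomes the label. The logits below are made up, and the class order (0 = POSITIVE, 1 = NEGATIVE, 2 = NEUTRAL) is the one assumed by `pred_mapper` later in this notebook, so it should be checked against the checkpoint's `config.json` before relying on it.

```python
import math

# Class order as assumed by this notebook; verify against the checkpoint's id2label
ID2LABEL = {0: "POSITIVE", 1: "NEGATIVE", 2: "NEUTRAL"}
LABEL_TO_SIGNAL = {"POSITIVE": 1, "NEGATIVE": -1, "NEUTRAL": 0}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_signal(logits):
    """Pick the most probable class and map it to a directional signal."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = ID2LABEL[best]
    return label, LABEL_TO_SIGNAL[label], probs[best]


# Made-up logits for a single headline
label, signal, confidence = logits_to_signal([0.2, 2.1, -0.5])
print(label, signal, round(confidence, 3))
```

The returned probability is not used in this baseline, but it could serve as a confidence filter if low-certainty predictions need to be excluded later.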

- RESULTS:

Notes:
    - Dataset used in this experiment: experimental_dataset.csv with 3,519 news headlines
    - Only HEADLINES are used, not ARTICLE CONTENT or COMMENTS, because past research has found these to be noisy and not useful for prediction.
    - The prediction horizon runs from t+1 to t+20 minutes: for each increment k, the directional movement is the sign of (exchange rate at t+k minus exchange rate at t).
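
The horizon computation in the last note can be sketched as below. This is an illustration under assumptions: the `Rate` column and the toy values are hypothetical, the series is assumed to have exactly one row per minute with no gaps, and only the `Forward Return t+k` column naming is taken from the analysis cells later in this notebook.

```python
import numpy as np
import pandas as pd

# Toy minute-level series (values invented); one row per minute, no gaps
rates = pd.DataFrame(
    {"Rate": [6.180, 6.182, 6.181, 6.181, 6.185]},
    index=pd.date_range("2024-12-30 17:00", periods=5, freq="min"),
)

horizons = range(1, 4)  # the notebook uses t+1 through t+20
for k in horizons:
    # Rate at t+k minus rate at t; the last k rows have no future value and become NaN
    diff = rates["Rate"].shift(-k) - rates["Rate"]
    # np.sign gives 1 (up), -1 (down), 0 (unchanged); NaN stays NaN
    rates[f"Forward Return t+{k}"] = np.sign(diff)

print(rates)
```

Because `shift(-k)` moves by rows rather than by clock time, this only corresponds to k minutes when the series has no missing minutes; otherwise the lookup would need to be done on timestamps.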

In [None]:
'''
Doc 1 Methodology: Load lucas-leme/FinBERT-PT-BR / tokenizer → tokenize headlines → FinBERT → sentiment output

HuggingFace notes
- BERT is an architecture while lucas-leme/FinBERT-PT-BR is a checkpoint
- import the model specific class from the transformers library
- call from_pretrained() from the above class to download the model's weights (pytorch_model.bin) and configuration settings (config.json)
- tokenizer is a class from the transformers library that finds the tokenizer specified in the checkpoint and fully preprocesses input text
'''


## Load Model from HuggingFace (only need to do once)

In [None]:
from transformers import AutoTokenizer, BertForSequenceClassification

# Load from HuggingFace
tokenizer = AutoTokenizer.from_pretrained("lucas-leme/FinBERT-PT-BR")
model = BertForSequenceClassification.from_pretrained("lucas-leme/FinBERT-PT-BR")

# Save locally (same directory the loading cell below reads from)
model.save_pretrained("../../checkpoints/exp001")
tokenizer.save_pretrained("../../checkpoints/exp001")

## Load Model from Local Storage

In [None]:
from transformers import AutoTokenizer, BertForSequenceClassification, pipeline
import pandas as pd

local_path = "../../checkpoints/exp001"

df = pd.read_csv("")  # TODO: set to the location of experimental_dataset.csv

model = BertForSequenceClassification.from_pretrained(local_path)
tokenizer = AutoTokenizer.from_pretrained(local_path)

In [None]:
model = BertForSequenceClassification.from_pretrained(
    local_path,
    trust_remote_code=True,
    local_files_only=True
)
tokenizer = AutoTokenizer.from_pretrained(
    local_path,
    trust_remote_code=True,
    local_files_only=True
)
finbert_pipeline = pipeline(
    task='text-classification',
    model=model,
    tokenizer=tokenizer
)
# mapping predictions
pred_mapper = {
    0: "POSITIVE",
    1: "NEGATIVE", 
    2: "NEUTRAL"
}

In [None]:
# Sentiment Analysis
# Map each FinBERT-PT-BR label to a directional signal
label_to_signal = {
    pred_mapper[0]: 1,   # POSITIVE
    pred_mapper[1]: -1,  # NEGATIVE
    pred_mapper[2]: 0,   # NEUTRAL
}

results = []
for headline in df['Headline']:
    result = finbert_pipeline(headline)[0]
    # An unexpected label raises KeyError instead of silently reusing the previous value
    results.append(label_to_signal[result['label']])

# save predictions to the dataframe
df['Prediction'] = results

In [None]:
df.to_csv("../../results/exp001/exp001.csv", index=False)

## Analyze Colab Results
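
Before the analysis cells, here is a short sketch of how directional accuracy relates to the `[[TN, FP], [FN, TP]]` layout that `confusion_matrix` returns when called with `labels=[-1, 1]`. The counts are invented for illustration and are not results from this experiment, and `accuracy_from_confusion` is a hypothetical helper.

```python
def accuracy_from_confusion(cm):
    """Directional accuracy from a 2x2 matrix laid out as [[TN, FP], [FN, TP]].

    Rows are the true direction (-1, then 1); columns are the predicted direction.
    """
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    if total == 0:
        return float("nan")  # no non-neutral observations to score
    return (tn + tp) / total


# Invented counts, not results from this experiment
example_cm = [[40, 10], [20, 30]]
print(accuracy_from_confusion(example_cm))  # (40 + 30) / 100
```

The same unpacking works on the 2x2 NumPy arrays stored in `conf_matrices` below.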

In [None]:
import pandas as pd
df = pd.read_csv("../../results/exp001/exp001.csv")
preds = df['Prediction']
counts = preds.value_counts()
print(counts)
print(f"total vals (it checks out - good): {counts.sum()}")

In [None]:
from sklearn.metrics import confusion_matrix

# Directional accuracy is binary, so drop all neutral (0) predictions
filtered_df = df[df["Prediction"] != 0].copy()

forward_return_cols = [col for col in df.columns if col.startswith("Forward Return t+")]

# With labels=[-1, 1], each matrix is laid out as
# [[TN, FP],
#  [FN, TP]]
conf_matrices = {}

for col in forward_return_cols:
    y_true = filtered_df[col]
    y_pred = filtered_df["Prediction"]

    # Keep only -1 and 1 in the ground truth (drop 0s and missing values if present)
    mask = (y_true != 0) & y_true.notna()
    y_true_filtered = y_true[mask]
    y_pred_filtered = y_pred[mask]

    # Confusion matrix with labels fixed to [-1, 1]
    cm = confusion_matrix(y_true_filtered, y_pred_filtered, labels=[-1, 1])
    conf_matrices[col] = cm

# Display the confusion matrix for every horizon
for k, v in conf_matrices.items():
    print(f"Confusion matrix for {k}:\n{v}\n")

In [None]:
# Directional accuracy per horizon

filtered_df = df[df["Prediction"] != 0].copy()  # drop neutral predictions

forward_return_cols = [col for col in df.columns if col.startswith("Forward Return t+")]
accuracies = {}

for col in forward_return_cols:
    y_true = filtered_df[col]
    y_pred = filtered_df["Prediction"]
    # Score only rows where the ground truth is -1 or 1 (drop 0s and missing values)
    mask = (y_true != 0) & y_true.notna()
    accuracy = (y_true[mask] == y_pred[mask]).mean()
    accuracies[col] = accuracy

accuracy_df = pd.DataFrame.from_dict(accuracies, orient='index', columns=['Accuracy'])
accuracy_df.index.name = 'Horizon'
accuracy_df.reset_index(inplace=True)

display(accuracy_df)
