This notebook runs inference with FinBERT and reports performance metrics on the test split of FinancialPhraseBank-v1.0/Sentences_50Agree.txt.

Note: To run this, first get the ProsusAI/finbert repo (download it or clone it into your working environment), then place this notebook in the folder that contains the repo.

For example, if you have the structure "Your Project Folder"/finBERT-master/

put this notebook in "Your Project Folder" (Stock_Prediction in my case).

This notebook was run in Google Colab.

In [1]:
%cd drive/MyDrive/Stock_Prediction/finBERT-master/

/content/drive/MyDrive/Stock_Prediction/finBERT-master


In [2]:
! python scripts/datasets.py --data_path /content/drive/MyDrive/Stock_Prediction/finBERT-master/scripts/FinancialPhraseBank-v1.0/Sentences_50Agree.txt

  data = pd.read_csv(args.data_path, sep='.@', names=['text','label'], encoding='ISO-8859-1')
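
The line echoed above is the source line that pandas attaches to a warning. A multi-character `sep` such as '.@' is treated as a regular expression, so pandas most likely warned here that it was falling back to the Python parsing engine. As a regex, '.@' also matches the one character before the '@'. As a rough alternative, the sketch below splits each line on its last '@' in plain Python. It assumes every line has the form `sentence@label`; `parse_phrasebank_line` and the sample lines are hypothetical and not part of the finBERT repo.

```python
def parse_phrasebank_line(line):
    """Split one 'sentence@label' line on its last '@' (assumed file format)."""
    text, sep, label = line.rstrip("\n").rpartition("@")
    if not sep:
        raise ValueError(f"no '@' separator in line: {line!r}")
    return text.strip(), label.strip().lower()


# Hypothetical sample lines in the assumed format (not taken from the dataset).
sample_lines = [
    "Profit rose compared with the previous year .@positive\n",
    "Contact us at info@example.com for details .@neutral\n",
]

rows = [parse_phrasebank_line(line) for line in sample_lines]
for text, label in rows:
    print(label, "|", text)
```

Splitting on the last '@' keeps any earlier '@' inside the sentence intact, which a plain `split("@")` would not.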


In [3]:
!pwd

/content/drive/MyDrive/Stock_Prediction/finBERT-master


In [4]:
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import TextClassificationPipeline
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
import torch
import pandas as pd
from tqdm import tqdm

# Load FinBERT (ProsusAI/finbert)
model_name = "ProsusAI/finbert"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the test split (tab-separated file with 'text' and 'label' columns)
test_df = pd.read_csv("/content/drive/MyDrive/Stock_Prediction/finBERT-master/data/sentiment_data/test.csv", sep="\t")  # Change this path if needed
texts = test_df["text"].tolist()
true_labels = test_df["label"].tolist()  # String labels: 'positive', 'neutral', 'negative'

# Create pipeline for classification
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer, device=0 if torch.cuda.is_available() else -1, return_all_scores=False)

# Map FinBERT's outputs to integer labels
label_map = {'positive': 2, 'neutral': 1, 'negative': 0}

# Predict sentiment
pred_labels = []
for text in tqdm(texts):
    pred = pipeline(text)[0]['label'].lower()  # lowercased: 'positive', 'neutral' or 'negative'
    pred_labels.append(label_map[pred])


Device set to use cpu
100%|██████████| 970/970 [03:05<00:00,  5.23it/s]


ValueError: Mix of label input types (string and number)
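
scikit-learn raises this error when the true labels and the predicted labels do not share one type. Here `true_labels` holds the strings read from test.csv, while `pred_labels` holds the integers produced through `label_map`. The next cell fixes this by mapping the strings to integers. The sketch below shows a slightly more defensive version that accepts either form; `to_int_labels` is a hypothetical helper name, not something from the notebook or from scikit-learn.

```python
label_map = {"positive": 2, "neutral": 1, "negative": 0}


def to_int_labels(labels, mapping=label_map):
    """Return integer class ids for labels given as strings or as ints."""
    valid_ids = set(mapping.values())
    out = []
    for label in labels:
        if isinstance(label, str):
            # Normalise case and whitespace before the lookup.
            out.append(mapping[label.strip().lower()])
        elif label in valid_ids:
            out.append(int(label))
        else:
            raise ValueError(f"unknown label: {label!r}")
    return out


# Hypothetical mixed input, similar to what triggers the error above.
mixed = ["Positive", 1, "negative", 2]
ints = to_int_labels(mixed)
print(ints)
```

Unlike the one-line conversion in the next cell, this version can be run more than once on the same list without failing on labels that are already integers.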

In [5]:
# Convert the string labels from test.csv to integers so they match pred_labels
true_labels = [label_map[label.lower()] for label in true_labels]
# Compute performance metrics
accuracy = accuracy_score(true_labels, pred_labels)
precision, recall, f1, _ = precision_recall_fscore_support(true_labels, pred_labels, average='macro')

print(f"Accuracy: {accuracy:.4f}")
print(f"Precision (macro): {precision:.4f}")
print(f"Recall (macro): {recall:.4f}")
print(f"F1 Score (macro): {f1:.4f}")

Accuracy: 0.8392
Precision (macro): 0.8062
Recall (macro): 0.8612
F1 Score (macro): 0.8290
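
To make the `average='macro'` numbers above easier to read, here is a minimal pure-Python sketch of macro averaging: compute precision, recall and F1 per class, then take the unweighted mean over classes. The function name `macro_scores` and the toy labels are made up for illustration; scores for a class with no predictions or no true samples are set to 0 here.

```python
def macro_scores(y_true, y_pred):
    """Unweighted mean of per-class precision, recall and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels for illustration only (not the FinBERT predictions).
toy_true = [0, 0, 1, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 2, 2]
precision, recall, f1 = macro_scores(toy_true, toy_pred)
print(f"{precision:.4f} {recall:.4f} {f1:.4f}")
```

Macro F1 is the mean of the per-class F1 scores, not the harmonic mean of macro precision and macro recall. That is why the reported F1 of 0.8290 differs from the roughly 0.8328 that the harmonic mean of 0.8062 and 0.8612 would give.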


In [10]:
# prompt: create a csv that shows the original test.csv, along with the pred_labels

import pandas as pd

# Assumes 'test_df', 'label_map' and 'pred_labels' are defined in the cells above

# Create a new DataFrame with original data and predictions
results_df = test_df.copy()
# Invert label_map to turn the integer predictions back into label strings
inv_label_map = {idx: name for name, idx in label_map.items()}
pred_labels_mapped = [inv_label_map[label] for label in pred_labels]
results_df['pred_labels_mapped'] = pred_labels_mapped
results_df['pred_labels'] = pred_labels_mapped  # same string labels; duplicates the column above

# Save the DataFrame to a CSV file
results_df.to_csv("results.csv", index=False, sep="\t")


In [11]:
# prompt: display results_df

# Assuming 'results_df' is already defined from the previous code

display(results_df)


Unnamed: 0.1,Unnamed: 0,text,label,pred_labels_mapped,pred_labels
0,2303,The Bristol Port Company has sealed a one mill...,positive,positive,positive
1,2736,A paper mill in the central Maine town of Madi...,neutral,neutral,neutral
2,2790,"ALEXANDRIA , Va. , Oct. 23 -- Hans-Otto Scheck...",neutral,neutral,neutral
3,2799,Altona stated that the private company of Alto...,neutral,neutral,neutral
4,2554,Registration is required,neutral,neutral,neutral
...,...,...,...,...,...
965,2343,"Raute , headquartered in Nastola , Finland , i...",neutral,neutral,neutral
966,4841,LONDON MarketWatch -- Share prices ended lower...,negative,negative,negative
967,4693,Net sales decreased to EUR 220.5 mn from EUR 4...,negative,negative,negative
968,3534,As a result some 20 persons will no longer be ...,negative,neutral,neutral
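
Row 968 above is a negative sentence predicted as neutral. A quick way to see which confusions occur is to count (true label, predicted label) pairs. The sketch below does this with `collections.Counter` on only the nine rows visible in the truncated display. On the full data the pairs could presumably be built with `zip(results_df["label"], results_df["pred_labels"])` instead.

```python
from collections import Counter

# (label, pred_labels) pairs copied from the nine rows shown in the display.
shown_pairs = [
    ("positive", "positive"),
    ("neutral", "neutral"),
    ("neutral", "neutral"),
    ("neutral", "neutral"),
    ("neutral", "neutral"),
    ("neutral", "neutral"),
    ("negative", "negative"),
    ("negative", "negative"),
    ("negative", "neutral"),
]

# Count every (true, predicted) combination, then keep only the mismatches.
confusion = Counter(shown_pairs)
errors = {pair: n for pair, n in confusion.items() if pair[0] != pair[1]}

for (true_label, pred_label), n in sorted(confusion.items()):
    print(f"true={true_label:<8} pred={pred_label:<8} count={n}")
```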


In [13]:
from sklearn.metrics import classification_report

print(classification_report(true_labels, pred_labels))

              precision    recall  f1-score   support

           0       0.74      0.92      0.82       128
           1       0.91      0.82      0.86       575
           2       0.77      0.84      0.80       267

    accuracy                           0.84       970
   macro avg       0.81      0.86      0.83       970
weighted avg       0.85      0.84      0.84       970
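
The two average rows can be checked by hand from the per-class rows: "macro avg" is the plain mean over the three classes, while "weighted avg" weights each class by its support. The sketch below redoes that arithmetic from the rounded values printed above, so it can only agree with the report to two decimals. `average_rows` is a hypothetical helper written for this check.

```python
# class id -> (precision, recall, f1, support), copied from the report above.
report_rows = {
    0: (0.74, 0.92, 0.82, 128),
    1: (0.91, 0.82, 0.86, 575),
    2: (0.77, 0.84, 0.80, 267),
}


def average_rows(rows, weighted):
    """Average precision, recall and F1 over classes, optionally by support."""
    total_support = sum(r[3] for r in rows.values())
    result = []
    for i in range(3):  # 0 = precision, 1 = recall, 2 = f1
        if weighted:
            result.append(sum(r[i] * r[3] for r in rows.values()) / total_support)
        else:
            result.append(sum(r[i] for r in rows.values()) / len(rows))
    return tuple(result)


macro = average_rows(report_rows, weighted=False)
weighted = average_rows(report_rows, weighted=True)
print("macro   ", [round(x, 2) for x in macro])
print("weighted", [round(x, 2) for x in weighted])
```

Because the neutral class (label 1) has 575 of the 970 test sentences, it dominates the weighted averages, while the macro averages treat all three classes equally.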

