### Using FinBERT for Sentiment Analysis

Here we use FinBERT to score the sentiment of the news articles in `media.csv`. We chose FinBERT because it is a pre-trained model trained specifically on financial news. In the notebook `SA(NLP).ipynb` we compared it with several other models and found that it performed best.

## Loading the data

In [1]:
import pandas as pd

media_data = pd.read_csv("../test_data/media.csv", index_col = 0, parse_dates = ['pub_date'])

In [2]:
from transformers import set_seed

set_seed(2023)



## Using ProsusAI/finbert

See https://huggingface.co/ProsusAI/finbert

In [5]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer_finbert = AutoTokenizer.from_pretrained("ProsusAI/finbert")

model_finbert = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")

In [6]:
from transformers import pipeline

analyzer_finbert = pipeline("sentiment-analysis", model= model_finbert , tokenizer = tokenizer_finbert)



In [7]:
## Convert the analyzer_finbert result for a text into a signed score:
## if the label is positive, return the score
## if the label is negative, return the negative of the score
## otherwise (neutral), return 0

def analyzer_finbert_score(text):
    result = analyzer_finbert(text)
    if result[0].get('label') == 'positive':
        return result[0].get('score')
    if result[0].get('label') == 'negative':
        return -result[0].get('score')
    else:
        return 0
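
For reference, a `sentiment-analysis` pipeline returns a list with one dict per input, each holding a `label` and a `score`. The sketch below applies the same sign convention to a result of that shape without loading the model. The helper name `signed_score` and the sample dicts are made up for illustration; they are not real model output.

```python
# Hypothetical helper: same sign convention as analyzer_finbert_score,
# but applied to an already computed pipeline result.
SIGN_BY_LABEL = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}

def signed_score(result):
    """Map [{'label': ..., 'score': ...}] to a value in [-1, 1]."""
    top = result[0]
    # Unknown labels fall back to 0, like the else branch above.
    return SIGN_BY_LABEL.get(top["label"], 0.0) * top["score"]

# Made-up results in the pipeline's output shape (not real model output).
print(signed_score([{"label": "positive", "score": 0.75}]))  # 0.75
print(signed_score([{"label": "negative", "score": 0.5}]))   # -0.5
print(signed_score([{"label": "neutral", "score": 0.9}]))    # 0.0
```

Keeping the sign logic separate from the model call makes it easy to check without downloading FinBERT.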

In [8]:
## Bin the result of analyzer_finbert_score into five classes:
## if the score is greater than 0.9, return 2
## if the score is in (0.5, 0.9], return 1
## if the score is in (-0.5, 0.5], return 0
## if the score is in (-0.9, -0.5], return -1
## if the score is -0.9 or less, return -2

def analyzer_finbert_bin(text):
    score = analyzer_finbert_score(text)
    if score > 0.9:
        return 2
    if score > 0.5:
        return 1
    if score > -0.5:
        return 0
    if score > -0.9:
        return -1
    else:
        return -2
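
Because the bin depends only on the signed score, the thresholds can also be applied to a score directly, without calling the model. A minimal sketch using the standard-library `bisect` module; the name `bin_score` is hypothetical, and the thresholds are copied from `analyzer_finbert_bin`.

```python
from bisect import bisect_left

# Thresholds copied from analyzer_finbert_bin; bins run from -2 to 2.
THRESHOLDS = [-0.9, -0.5, 0.5, 0.9]

def bin_score(score):
    """Bin a signed score; a score equal to a threshold falls in the lower bin."""
    # bisect_left counts the thresholds strictly below the score (0 to 4),
    # which matches the strict ">" comparisons in analyzer_finbert_bin.
    return bisect_left(THRESHOLDS, score) - 2

for s in (0.95, 0.9, 0.6, 0.0, -0.5, -0.7, -0.95):
    print(s, bin_score(s))
```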


In [9]:
## Apply analyzer_finbert_score to the text of each article and record the sentiment scores in a new column

# initialize a list to store the sentiment scores
sentiment_scores = []

# loop through each article
for text in media_data['text']:
    # score the article text and store the signed score
    sentiment_scores.append(analyzer_finbert_score(text))

# add the sentiment scores to the media data
media_data['finbert-sentiment-score'] = sentiment_scores
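
The loop above calls the pipeline once per article. Pipelines also accept a list of texts, so the scoring could be done in one batched call. The sketch below assumes that approach; `score_texts` and `fake_analyzer` are hypothetical names, and `truncation=True` and `batch_size=16` are assumptions rather than settings used in this notebook.

```python
def score_texts(texts, analyzer, batch_size=16):
    """Return one signed score per text using a single batched pipeline call."""
    # truncation=True is an assumption: it asks the tokenizer to cut inputs
    # that exceed the model's maximum length instead of raising an error.
    results = analyzer(list(texts), batch_size=batch_size, truncation=True)
    signs = {"positive": 1.0, "negative": -1.0}
    # Any other label (neutral) maps to 0, as in analyzer_finbert_score.
    return [signs.get(r["label"], 0.0) * r["score"] for r in results]

def fake_analyzer(texts, **kwargs):
    # Stand-in for analyzer_finbert so the sketch runs without the model.
    return [{"label": "negative", "score": 0.5} for _ in texts]

print(score_texts(["first article", "second article"], fake_analyzer))  # [-0.5, -0.5]
```

With the real pipeline, the call would look like `media_data['finbert-sentiment-score'] = score_texts(media_data['text'], analyzer_finbert)`.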

In [10]:
## Alternatively, apply analyzer_finbert_bin to the text of each article and record the binned sentiment in a new column

# initialize a list to store the sentiment bins
sentiment_bins = []

# loop through each article
for text in media_data['text']:
    # score the article text and store the resulting bin
    sentiment_bins.append(analyzer_finbert_bin(text))

# add the sentiment bins to the media data
media_data['finbert-bin'] = sentiment_bins
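
Since `analyzer_finbert_bin` runs the model a second time on every article, the bins could instead be derived from the scores already stored in `finbert-sentiment-score`. A sketch with `pandas.cut`, using made-up scores in place of the real column:

```python
import numpy as np
import pandas as pd

# Made-up scores standing in for media_data['finbert-sentiment-score'].
scores = pd.Series([0.95, 0.6, 0.0, -0.7, -0.95])

# Right-closed intervals reproduce the strict ">" comparisons in
# analyzer_finbert_bin: (0.9, inf] -> 2, (0.5, 0.9] -> 1, and so on.
edges = [-np.inf, -0.9, -0.5, 0.5, 0.9, np.inf]
bins = pd.cut(scores, bins=edges, labels=[-2, -1, 0, 1, 2]).astype(int)
print(bins.tolist())  # [2, 1, 0, -1, -2]
```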

In [11]:
media_data.sample(10)

| | pub_date | abstract | lead_paragraph | snippet | headline.main | text | Polarity | Sentiment | NLP_fin-sentiment-text | finbert-sentiment-score | finbert-bin |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 56 | 2022-05-20 15:50:25+00:00 | Flush with cash, Facebook, Apple, Amazon, Micr... | SAN FRANCISCO — Apple, Amazon, Microsoft and t... | Flush with cash, Facebook, Apple, Amazon, Micr... | Big Tech Is Getting Clobbered on Wall Street. ... | Flush with cash, Facebook, Apple, Amazon, Micr... | 0.870781 | 0.830735 | -0.974612 | -0.915605 | -2 |
| 12 | 2022-09-16 09:00:25+00:00 | Need to find a restaurant or figure out how to... | When Ja’Kobi Moore decided to apply this year ... | Need to find a restaurant or figure out how to... | For Gen Z, TikTok Is the New Search Engine | Need to find a restaurant or figure out how to... | -0.590656 | 0.774193 | -0.999481 | 0.0 | 0 |
| 68 | 2022-04-08 14:34:29+00:00 | New research finds companies are starting to r... | As a middle school student in New York, Shekin... | New research finds companies are starting to r... | A 4-Year Degree Isn’t Quite the Job Requiremen... | New research finds companies are starting to r... | 0.214991 | 0.123951 | -0.683587 | 0.0 | 0 |
| 0 | 2022-10-25 20:37:03+00:00 | Google’s parent company reported earnings that... | Even Alphabet, the parent company of Google an... | Google’s parent company reported earnings that... | Alphabet’s Profit Drops 27 Percent From a Year... | Google’s parent company reported earnings that... | -0.478889 | 0.504897 | -0.998828 | -0.973848 | -2 |
| 82 | 2022-01-20 21:10:22+00:00 | Mustafa Suleyman, who played a key role in the... | Mustafa Suleyman, a pioneer in the field of ar... | Mustafa Suleyman, who played a key role in the... | DeepMind co-founder leaves Google after a rock... | Mustafa Suleyman, who played a key role in the... | 0.812955 | 0.340731 | -0.999767 | 0.0 | 0 |
| 66 | 2022-04-12 18:59:03+00:00 | The company’s first consumer protection lawsui... | In a first for the tech giant, Google filed a ... | The company’s first consumer protection lawsui... | In a First, Google Goes After Puppy Fraud in C... | The company’s first consumer protection lawsui... | -0.6178 | 0.8901 | -0.999104 | -0.626287 | -1 |
| 91 | 2022-01-10 15:00:05+00:00 | The latest tranche totals about 200. They are ... | Google wrongly claimed attorney-client privile... | The latest tranche totals about 200. They are ... | Google must turn over more documents in a labo... | The latest tranche totals about 200. They are ... | -0.762656 | 0.638658 | -0.999734 | -0.827862 | -1 |
| 44 | 2022-06-16 09:00:33+00:00 | A video producer claims he was fired after he ... | OREGON HOUSE, Calif. — In a tiny town in the f... | A video producer claims he was fired after he ... | How a Religious Sect Landed Google in a Lawsuit | A video producer claims he was fired after he ... | 0.776261 | 0.86566 | -0.931622 | -0.839244 | -1 |
| 46 | 2022-06-13 02:13:37+00:00 | The tech giant admitted no wrongdoing as it re... | Google has settled a class-action lawsuit that... | The tech giant admitted no wrongdoing as it re... | Google Agrees to Pay $118 Million to Settle Pa... | The tech giant admitted no wrongdoing as it re... | -0.059274 | 0.316655 | 0.998235 | 0.0 | 0 |
| 20 | 2022-08-30 20:57:56+00:00 | The social media app is not available on Googl... | Google said on Tuesday that it would not distr... | The social media app is not available on Googl... | Google Says Trump’s Truth Social Must Scrub Vi... | The social media app is not available on Googl... | 0.409295 | 0.261195 | -0.998796 | -0.778782 | -1 |


In [12]:
media_data.describe()

| | Polarity | Sentiment | NLP_fin-sentiment-text | finbert-sentiment-score | finbert-bin |
|---|---|---|---|---|---|
| count | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| mean | -0.013616 | 0.549705 | -0.53743 | -0.333282 | -0.59 |
| std | 0.595266 | 0.269949 | 0.79021 | 0.477305 | 0.853927 |
| min | -0.998627 | 0.029718 | -0.99988 | -0.975459 | -2.0 |
| 25% | -0.48127 | 0.315349 | -0.998546 | -0.833797 | -1.0 |
| 50% | -0.021043 | 0.582934 | -0.981465 | 0.0 | 0.0 |
| 75% | 0.522078 | 0.769434 | -0.656894 | 0.0 | 0.0 |
| max | 0.974263 | 0.994465 | 0.999672 | 0.886748 | 1.0 |


## Saving the results

In [13]:
## save the media data
media_data.to_csv("../test_data/media.csv")