## finBERT Model
This notebook runs the FinBERT model to add a sentiment label and a sentiment score to each news item in the stocks_news.csv file.

### 1. Installing transformers

The install cell below runs the following:
*   `transformers[torch]`: This installs the Hugging Face Transformers library, which is a popular library for Natural Language Processing (NLP) tasks. The `[torch]` extra specifies that the installation should include the PyTorch dependencies, which are needed to use the library with PyTorch.
*   `torchvision`: This installs torchvision, which is a PyTorch library that provides datasets, transforms, and models for computer vision tasks.
*   `torchaudio`: This installs torchaudio, which is a PyTorch library for audio processing.
*   The `%%capture` cell magic ensures that the output of the installation process is not displayed in the notebook cell.

In [1]:
%%capture
%pip install 'transformers[torch]' torchvision torchaudio

### 2. Importing the libraries

In [2]:
from transformers import BertForSequenceClassification, BertTokenizer, pipeline
import torch
import torch.nn.functional as F
import pandas as pd
import matplotlib.pyplot as plt

2023-07-30 00:41:40.827529: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


### 3. Model and Tokenizer Initialization
`BertForSequenceClassification`: This is a class from the transformers library that represents a BERT model with a sequence classification head on top. In this notebook, the specific model being used is 'ProsusAI/finbert', which is a BERT model pre-trained on financial data and fine-tuned for sentiment analysis of financial news text.

`BertTokenizer`: This is a class from the transformers library that represents a BERT tokenizer. The tokenizer is responsible for converting text data into numerical tokens that can be understood by the BERT model. It segments text into subwords (or words) and converts them into numerical representations suitable for feeding into the BERT model.
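To make the subword idea concrete, here is a minimal sketch of greedy longest-match-first splitting, the general approach used by WordPiece tokenizers. It is not the `BertTokenizer` implementation: the vocabulary `TOY_VOCAB` and the helpers `wordpiece` and `encode` are hypothetical names invented for this illustration, and the real tokenizer additionally handles punctuation and special tokens such as `[CLS]` and `[SEP]`.

```python
# Toy vocabulary (hypothetical); the real FinBERT vocabulary is far larger.
TOY_VOCAB = {"[UNK]": 0, "apple": 1, "down": 2, "##grade": 3,
             "##d": 4, "to": 5, "neutral": 6}


def wordpiece(word, vocab):
    """Split one lowercase word into subword pieces, longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no piece fits: the whole word is unknown
        pieces.append(match)
        start = end
    return pieces


def encode(text, vocab):
    """Lowercase, split on whitespace, and map every piece to its integer ID."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    return tokens, [vocab[token] for token in tokens]


tokens, ids = encode("Apple downgraded to Neutral", TOY_VOCAB)
print(tokens)  # ['apple', 'down', '##grade', '##d', 'to', 'neutral']
print(ids)     # [1, 2, 3, 4, 5, 6]
```

The word "downgraded" is not in the toy vocabulary as a whole, so it is split into three pieces that are. The model only ever sees the integer IDs.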

By using from_pretrained() for both BertForSequenceClassification and BertTokenizer, the code is loading the pre-trained 'ProsusAI/finbert' model and its tokenizer. This allows you to use the pre-trained model for sentiment analysis on financial news headlines.

In [3]:
model = BertForSequenceClassification.from_pretrained('ProsusAI/finbert')
tokenizer = BertTokenizer.from_pretrained('ProsusAI/finbert')

model.save_pretrained("../..")
tokenizer.save_pretrained("../..")

In [4]:
stocks_news = pd.read_csv('../finbert_stocks_input.csv')
stocks_news.head(10)

Unnamed: 0,related,datetime,headline,id,source,summary,url,datetime_norm,Open,High,Low,Close,Adj Close,Volume
0,AAPL,1672719360,Apple downgraded to Neutral from Outperform at...,118097919,Thefly.com,Looking for stock market analysis and research...,https://finnhub.io/api/news?id=c94bafbd39d76d2...,2023-01-03,130.279999,130.899994,124.169998,125.07,124.706833,112117500
1,AAPL,1672806957,China's 2025 Strategy Emerges As Top 2023 Inve...,118110299,SeekingAlpha,The most important investment story of 2023 wi...,https://finnhub.io/api/news?id=f6acfb521bd94a6...,2023-01-04,126.889999,128.660004,125.080002,126.360001,125.993095,89113600
2,AAPL,1672892100,BMW Takes Cues From Apple With Radical Interio...,118133082,Yahoo,(Bloomberg) -- BMW AG’s latest prototype could...,https://finnhub.io/api/news?id=60dae185ca4d6d6...,2023-01-05,127.129997,127.769997,124.760002,125.019997,124.656982,80962700
3,AAPL,1672988286,Covid chaos in China and a new Swiss haven,118167357,Yahoo,"This is Kenji from Hong Kong, where the schedu...",https://finnhub.io/api/news?id=e3cb80be9a64b4f...,2023-01-06,126.010002,130.289993,124.889999,129.619995,129.243622,87754700
4,AAPL,1673293335,Apple's VP services Stern to depart - Insider,118194379,Reuters,"Apple Inc's vice president of services, Peter ...",https://finnhub.io/api/news?id=576e97b88e07030...,2023-01-09,130.470001,133.410004,129.889999,130.149994,129.772079,70790800
5,AAPL,1673398441,Apple to begin making in-house screens from 20...,118194376,Reuters,Apple Inc is planning to start using its own c...,https://finnhub.io/api/news?id=f6aa021fdc3dc7a...,2023-01-10,130.259995,131.259995,128.119995,130.729996,130.350403,63896200
6,AAPL,1673414160,Apple Is Moving More of Its Supply Chain In-Ho...,118224930,MarketWatch,Apple’s apparent desire to move more of its co...,https://finnhub.io/api/news?id=616aff14a647bd2...,2023-01-11,131.25,133.509995,130.460007,133.490005,133.102386,69458900
7,AAPL,1673496480,Taiwan Semiconductor stock rises after chipmak...,118222444,MarketWatch,Contract chipmaker Taiwan Semiconductor Manufa...,https://finnhub.io/api/news?id=cd04aa27af0e266...,2023-01-12,133.880005,134.259995,131.440002,133.410004,133.022614,71379600
8,AAPL,1673591031,"Apple chief Tim Cook takes over 40% pay cut, s...",118232127,Yahoo,"Pay cut based on ‘shareholder feedback, Apple’...",https://finnhub.io/api/news?id=bc3b35bd469962a...,2023-01-13,132.029999,134.919998,131.660004,134.759995,134.368698,57809700
9,AAPL,1673928420,Foxconn Replaces iPhone Business Chief After T...,118286413,Yahoo,(Bloomberg) -- Key Apple Inc. manufacturing pa...,https://finnhub.io/api/news?id=d5aefc49ae150f9...,2023-01-17,134.830002,137.289993,134.130005,135.940002,135.545273,63646600


### 4. Stock News FinBERT Model Prediction
Here, inference runs on PyTorch through the Transformers `pipeline`. Note that the cells below score only the `headline` column; the `headline` and `summary` columns are not combined into a single `news_description` field in this notebook.
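The code cells in this notebook do not build a `news_description` field. If one were wanted, a minimal sketch could look like the following. The helper name `build_news_description` and the `". "` separator are assumptions for illustration, not something taken from the notebook.

```python
def build_news_description(headline, summary, sep=". "):
    """Join headline and summary, skipping parts that are missing or blank.

    Non-string values (None, or the float NaN that pandas uses for
    missing cells) are ignored.
    """
    parts = []
    for text in (headline, summary):
        if isinstance(text, str) and text.strip():
            parts.append(text.strip())
    return sep.join(parts)


print(build_news_description("Covid chaos in China and a new Swiss haven",
                             "This is Kenji from Hong Kong"))
```

With a DataFrame, the helper could be applied row by row, for example through `DataFrame.apply` with `axis=1`, and the resulting column passed to the pipeline in place of `headline`. Longer inputs may exceed the model's maximum sequence length, so truncation would need to be considered.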

In [5]:
print(len(stocks_news))

638


In [6]:
stocks_sentiment = stocks_news.copy()

stocks_sentiment.head(10)

Unnamed: 0,related,datetime,headline,id,source,summary,url,datetime_norm,Open,High,Low,Close,Adj Close,Volume
0,AAPL,1672719360,Apple downgraded to Neutral from Outperform at...,118097919,Thefly.com,Looking for stock market analysis and research...,https://finnhub.io/api/news?id=c94bafbd39d76d2...,2023-01-03,130.279999,130.899994,124.169998,125.07,124.706833,112117500
1,AAPL,1672806957,China's 2025 Strategy Emerges As Top 2023 Inve...,118110299,SeekingAlpha,The most important investment story of 2023 wi...,https://finnhub.io/api/news?id=f6acfb521bd94a6...,2023-01-04,126.889999,128.660004,125.080002,126.360001,125.993095,89113600
2,AAPL,1672892100,BMW Takes Cues From Apple With Radical Interio...,118133082,Yahoo,(Bloomberg) -- BMW AG’s latest prototype could...,https://finnhub.io/api/news?id=60dae185ca4d6d6...,2023-01-05,127.129997,127.769997,124.760002,125.019997,124.656982,80962700
3,AAPL,1672988286,Covid chaos in China and a new Swiss haven,118167357,Yahoo,"This is Kenji from Hong Kong, where the schedu...",https://finnhub.io/api/news?id=e3cb80be9a64b4f...,2023-01-06,126.010002,130.289993,124.889999,129.619995,129.243622,87754700
4,AAPL,1673293335,Apple's VP services Stern to depart - Insider,118194379,Reuters,"Apple Inc's vice president of services, Peter ...",https://finnhub.io/api/news?id=576e97b88e07030...,2023-01-09,130.470001,133.410004,129.889999,130.149994,129.772079,70790800
5,AAPL,1673398441,Apple to begin making in-house screens from 20...,118194376,Reuters,Apple Inc is planning to start using its own c...,https://finnhub.io/api/news?id=f6aa021fdc3dc7a...,2023-01-10,130.259995,131.259995,128.119995,130.729996,130.350403,63896200
6,AAPL,1673414160,Apple Is Moving More of Its Supply Chain In-Ho...,118224930,MarketWatch,Apple’s apparent desire to move more of its co...,https://finnhub.io/api/news?id=616aff14a647bd2...,2023-01-11,131.25,133.509995,130.460007,133.490005,133.102386,69458900
7,AAPL,1673496480,Taiwan Semiconductor stock rises after chipmak...,118222444,MarketWatch,Contract chipmaker Taiwan Semiconductor Manufa...,https://finnhub.io/api/news?id=cd04aa27af0e266...,2023-01-12,133.880005,134.259995,131.440002,133.410004,133.022614,71379600
8,AAPL,1673591031,"Apple chief Tim Cook takes over 40% pay cut, s...",118232127,Yahoo,"Pay cut based on ‘shareholder feedback, Apple’...",https://finnhub.io/api/news?id=bc3b35bd469962a...,2023-01-13,132.029999,134.919998,131.660004,134.759995,134.368698,57809700
9,AAPL,1673928420,Foxconn Replaces iPhone Business Chief After T...,118286413,Yahoo,(Bloomberg) -- Key Apple Inc. manufacturing pa...,https://finnhub.io/api/news?id=d5aefc49ae150f9...,2023-01-17,134.830002,137.289993,134.130005,135.940002,135.545273,63646600


In [7]:
print(stocks_sentiment['headline'].isnull().sum())
print(len(stocks_sentiment['headline']))

0
638


In [8]:
stock_sentiment_predictor = pipeline(task='sentiment-analysis', model=model, tokenizer=tokenizer)

# Get a list of headlines from the DataFrame
stock_headline = stocks_sentiment['headline'].tolist()

# Use the stock_sentiment_predictor pipeline to predict sentiments for all headlines
stock_predictions = stock_sentiment_predictor(stock_headline)
# print(stock_predictions)
print(len(stock_predictions))

# Extract the predicted sentiment labels and sentiment scores using list comprehensions
sentiments = [pred['label'] for pred in stock_predictions]
scores = [round(pred['score'], 3) for pred in stock_predictions]


Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.


638


In [10]:
print(stock_predictions)

[{'label': 'negative', 'score': 0.5127232074737549}, {'label': 'positive', 'score': 0.7960140705108643}, {'label': 'neutral', 'score': 0.8397179841995239}, {'label': 'neutral', 'score': 0.7734225392341614}, {'label': 'negative', 'score': 0.9159166812896729}, {'label': 'neutral', 'score': 0.8729947209358215}, {'label': 'negative', 'score': 0.9399288892745972}, {'label': 'negative', 'score': 0.9666069746017456}, {'label': 'negative', 'score': 0.962792158126831}, {'label': 'negative', 'score': 0.6591130495071411}, {'label': 'positive', 'score': 0.9440958499908447}, {'label': 'neutral', 'score': 0.8667579293251038}, {'label': 'neutral', 'score': 0.9506423473358154}, {'label': 'positive', 'score': 0.8969438076019287}, {'label': 'neutral', 'score': 0.5101176500320435}, {'label': 'positive', 'score': 0.40891119837760925}, {'label': 'neutral', 'score': 0.8587100505828857}, {'label': 'neutral', 'score': 0.8877654075622559}, {'label': 'neutral', 'score': 0.8271477818489075}, {'label': 'neutral',
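Each `score` above is the probability of the winning label. For a single-label classifier with three classes such as this one, the pipeline applies a softmax to the model's raw logits and reports the largest value. The sketch below reproduces that step in plain Python. The logits are made-up numbers, and the label order in `LABELS` is an assumption for illustration; the authoritative mapping is `model.config.id2label`.

```python
import math

# Assumed label order for illustration; check model.config.id2label.
LABELS = ["positive", "negative", "neutral"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


example_logits = [-1.0, 2.0, 0.5]  # made-up values, not model output
pred = top_prediction(example_logits)
print(pred["label"], round(pred["score"], 3))
```

Because the score is the probability of the top class only, a value close to 1/3 means the model was nearly undecided among the three labels.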

In [11]:
# Add the predicted sentiment labels and scores as new `sentiment` and `score` columns
stocks_sentiment['sentiment'] = sentiments
stocks_sentiment['score'] = scores

In [12]:
stocks_sentiment.head(20)

Unnamed: 0,related,datetime,headline,id,source,summary,url,datetime_norm,Open,High,Low,Close,Adj Close,Volume,sentiment,score
0,AAPL,1672719360,Apple downgraded to Neutral from Outperform at...,118097919,Thefly.com,Looking for stock market analysis and research...,https://finnhub.io/api/news?id=c94bafbd39d76d2...,2023-01-03,130.279999,130.899994,124.169998,125.07,124.706833,112117500,negative,0.513
1,AAPL,1672806957,China's 2025 Strategy Emerges As Top 2023 Inve...,118110299,SeekingAlpha,The most important investment story of 2023 wi...,https://finnhub.io/api/news?id=f6acfb521bd94a6...,2023-01-04,126.889999,128.660004,125.080002,126.360001,125.993095,89113600,positive,0.796
2,AAPL,1672892100,BMW Takes Cues From Apple With Radical Interio...,118133082,Yahoo,(Bloomberg) -- BMW AG’s latest prototype could...,https://finnhub.io/api/news?id=60dae185ca4d6d6...,2023-01-05,127.129997,127.769997,124.760002,125.019997,124.656982,80962700,neutral,0.84
3,AAPL,1672988286,Covid chaos in China and a new Swiss haven,118167357,Yahoo,"This is Kenji from Hong Kong, where the schedu...",https://finnhub.io/api/news?id=e3cb80be9a64b4f...,2023-01-06,126.010002,130.289993,124.889999,129.619995,129.243622,87754700,neutral,0.773
4,AAPL,1673293335,Apple's VP services Stern to depart - Insider,118194379,Reuters,"Apple Inc's vice president of services, Peter ...",https://finnhub.io/api/news?id=576e97b88e07030...,2023-01-09,130.470001,133.410004,129.889999,130.149994,129.772079,70790800,negative,0.916
5,AAPL,1673398441,Apple to begin making in-house screens from 20...,118194376,Reuters,Apple Inc is planning to start using its own c...,https://finnhub.io/api/news?id=f6aa021fdc3dc7a...,2023-01-10,130.259995,131.259995,128.119995,130.729996,130.350403,63896200,neutral,0.873
6,AAPL,1673414160,Apple Is Moving More of Its Supply Chain In-Ho...,118224930,MarketWatch,Apple’s apparent desire to move more of its co...,https://finnhub.io/api/news?id=616aff14a647bd2...,2023-01-11,131.25,133.509995,130.460007,133.490005,133.102386,69458900,negative,0.94
7,AAPL,1673496480,Taiwan Semiconductor stock rises after chipmak...,118222444,MarketWatch,Contract chipmaker Taiwan Semiconductor Manufa...,https://finnhub.io/api/news?id=cd04aa27af0e266...,2023-01-12,133.880005,134.259995,131.440002,133.410004,133.022614,71379600,negative,0.967
8,AAPL,1673591031,"Apple chief Tim Cook takes over 40% pay cut, s...",118232127,Yahoo,"Pay cut based on ‘shareholder feedback, Apple’...",https://finnhub.io/api/news?id=bc3b35bd469962a...,2023-01-13,132.029999,134.919998,131.660004,134.759995,134.368698,57809700,negative,0.963
9,AAPL,1673928420,Foxconn Replaces iPhone Business Chief After T...,118286413,Yahoo,(Bloomberg) -- Key Apple Inc. manufacturing pa...,https://finnhub.io/api/news?id=d5aefc49ae150f9...,2023-01-17,134.830002,137.289993,134.130005,135.940002,135.545273,63646600,negative,0.659


In [13]:
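def modify_scores(row):
    # Fold label and confidence into one signed score:
    # neutral -> 0, negative -> -score, positive -> +score
    if row['sentiment'] == 'neutral':
        return 0
    elif row['sentiment'] == 'negative':
        return -row['score']
    else:
        return row['score']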

# Apply the modify_scores function to the DataFrame
stocks_sentiment['score'] = stocks_sentiment.apply(modify_scores, axis=1)
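As a worked example of this mapping, the sketch below applies the same rule to the first three label and score pairs shown in the table above. The standalone function `signed_score` is a hypothetical name used only here; it mirrors `modify_scores` but takes plain values instead of a DataFrame row.

```python
def signed_score(sentiment, score):
    """Same rule as modify_scores, applied to plain values."""
    if sentiment == "neutral":
        return 0.0
    if sentiment == "negative":
        return -score
    return score


# First three (sentiment, score) pairs from the output table above.
rows = [("negative", 0.513), ("positive", 0.796), ("neutral", 0.84)]
signed = [signed_score(sentiment, score) for sentiment, score in rows]
print(signed)  # [-0.513, 0.796, 0.0]
```

Because the cell above overwrites `score` in place, running it a second time would negate the already negative values and turn them positive. Writing the result to a separate column would avoid that.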

In [14]:
stocks_sentiment.to_csv('finbert_stocks_output.csv', index=False)