## Codebook training: Session 4

[Section 1: Get News Headlines and Stories](#Section-1:-Get-News-Headlines-and-Stories)<br>
[Section 2: Text Sentiment Classification with NLP](#Section-2:-Text-Sentiment-Classification-with-NLP)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [2.1 Key Principles behind the NLP](#2.1-Key-Principles-behind-the-NLP)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [2.2 Transformers: Text classification using FinBert](#2.2-Transformers:-Text-classification-using-FinBert)<br>
[Section 3: NLP text classification usecase](#Section-3-NLP-text-classification-usecase)

In [None]:
!pip install refinitiv.data

In [62]:
import refinitiv.data as rd
from refinitiv.data.content import news

In [63]:
rd.open_session()

<refinitiv.data.session.Definition object at 0x23899e92a70 {name='default'}>

## Section 1: Get News Headlines and Stories

### 1.1 Get News Headlines

In [64]:
help(news.headlines.Definition)

Help on class Definition in module refinitiv.data._data.content.news.headlines:

class Definition(refinitiv.data._data.content.news._news_data_provider_layer.NewsDataProviderLayer)
 |  Definition(query: str, count: int = 10, date_from: Union[str, datetime.timedelta, Tuple[datetime.datetime, datetime.date]] = None, date_to: Union[str, datetime.timedelta, Tuple[datetime.datetime, datetime.date]] = None, sort_order: refinitiv.data._data.content.news.sort_order.SortOrder = <SortOrder.new_to_old: 'newToOld'>, extended_params: dict = None)
 |  
 |  This class describes parameters to retrieve data for news headlines.
 |  
 |  Parameters
 |  ----------
 |  query: str
 |      The user search query.
 |  
 |  count: int, optional
 |      Count to limit number of headlines. Min value is 0. Default: 10
 |  
 |  date_from: str or timedelta, optional
 |      Beginning of date range.
 |      String format is: '%Y-%m-%dT%H:%M:%S'. e.g. '2016-01-20T15:04:05'.
 |  
 |  date_to: str or timedelta, optional

In [68]:
response = news.headlines.Definition("Apple").get_data()
response.data.df

Unnamed: 0,versionCreated,text,storyId,sourceCode
2022-04-19 10:30:32.377,2022-04-19 10:30:32.377000+00:00,Apple iPad Air 5th Gen review: Best keeps on g...,urn:newsml:reuters.com:20220419:nNRAkdynaz:1,NS:INDIAE
2022-04-19 09:39:28.771,2022-04-19 09:39:28.771000+00:00,Apple's iPhone 14 series could come in only tw...,urn:newsml:reuters.com:20220419:nNRAkdwsv2:1,NS:INDIAE
2022-04-19 08:27:02.711,2022-04-19 08:27:02.711000+00:00,Beschäftigte in Apple-Store in New York wollen...,urn:newsml:reuters.com:20220419:nApa1HV61a:1,NS:APA
2022-04-19 07:39:32.000,2022-04-19 07:39:32+00:00,"US STOCKS-Уолл-стрит снизилась к закрытию, инв...",urn:newsml:reuters.com:20220419:nL5N2WH1E6:1,NS:RTRS
2022-04-19 05:10:15.358,2022-04-19 07:08:24.307000+00:00,Refinitiv Newscasts - How To Manage a Crypto I...,urn:newsml:reuters.com:20220419:nRTV6MvsfY:9,NS:REALV
2022-04-19 05:58:09.523,2022-04-19 05:58:09.523000+00:00,"iPhone SE 2022 review: Ageing, but going stron...",urn:newsml:reuters.com:20220419:nNRAkdrk5o:1,NS:BUSSTA
2022-04-19 01:50:31.984,2022-04-19 01:50:31.984000+00:00,通脹重壓之下，蘋果(AAPL.US)紐約零售店員工希望時薪漲至30美元,urn:newsml:reuters.com:20220419:nZTCZnRRPa:1,NS:ZTCJ
2022-04-19 01:50:31.694,2022-04-19 01:50:31.694000+00:00,通胀重压之下，苹果(AAPL.US)纽约零售店员工希望时薪涨至30美元,urn:newsml:reuters.com:20220419:nZTCv8X1Ga:1,NS:ZTCJ
2022-04-19 01:24:46.371,2022-04-19 01:24:46.371000+00:00,United States : Apple helps suppliers rapidly ...,urn:newsml:reuters.com:20220419:nNRAkdnrhv:1,NS:ECLPCM
2022-04-19 01:19:15.000,2022-04-19 01:19:15+00:00,DJIA:ตลาดหุ้นนิวยอร์ค:หุ้นสหรัฐร่วงลงขณะตลาดรอ...,urn:newsml:reuters.com:20220419:nL3N2WH092:1,NS:RTRS


In [69]:
response.data.raw

{'headlines': [{'displayDirection': 'LeftToRight',
   'documentType': 'Story',
   'firstCreated': '2022-04-19T10:30:32.377Z',
   'isAlert': False,
   'language': 'L:en',
   'reportCode': '',
   'sourceCode': 'NS:INDIAE',
   'sourceName': 'Indian Express',
   'storyId': 'urn:newsml:reuters.com:20220419:nNRAkdynaz:1',
   'text': 'Apple iPad Air 5th Gen review: Best keeps on getting better',
   'versionCreated': '2022-04-19T10:30:32.377Z'},
  {'displayDirection': 'LeftToRight',
   'documentType': 'Story',
   'firstCreated': '2022-04-19T09:39:28.771Z',
   'isAlert': False,
   'language': 'L:en',
   'reportCode': '',
   'sourceCode': 'NS:INDIAE',
   'sourceName': 'Indian Express',
   'storyId': 'urn:newsml:reuters.com:20220419:nNRAkdwsv2:1',
   'text': "Apple's iPhone 14 series could come in only two sizes instead of three",
   'versionCreated': '2022-04-19T09:39:28.771Z'},
  {'displayDirection': 'LeftToRight',
   'documentType': 'Story',
   'firstCreated': '2022-04-19T08:27:02.711Z',
   'i

#### 1.1.1 Get headlines from English news

In [71]:
response = news.headlines.Definition("AAPL.O and L:EN").get_data()
response.data.df

Unnamed: 0,versionCreated,text,storyId,sourceCode
2022-04-19 10:30:32.377,2022-04-19 10:30:32.377000+00:00,Apple iPad Air 5th Gen review: Best keeps on g...,urn:newsml:reuters.com:20220419:nNRAkdynaz:1,NS:INDIAE
2022-04-19 09:39:28.771,2022-04-19 09:39:28.771000+00:00,Apple's iPhone 14 series could come in only tw...,urn:newsml:reuters.com:20220419:nNRAkdwsv2:1,NS:INDIAE
2022-04-19 05:10:15.358,2022-04-19 07:08:24.307000+00:00,Refinitiv Newscasts - How To Manage a Crypto I...,urn:newsml:reuters.com:20220419:nRTV6MvsfY:9,NS:REALV
2022-04-19 05:58:09.523,2022-04-19 05:58:09.523000+00:00,"iPhone SE 2022 review: Ageing, but going stron...",urn:newsml:reuters.com:20220419:nNRAkdrk5o:1,NS:BUSSTA
2022-04-19 01:24:46.371,2022-04-19 01:24:46.371000+00:00,United States : Apple helps suppliers rapidly ...,urn:newsml:reuters.com:20220419:nNRAkdnrhv:1,NS:ECLPCM
2022-04-18 20:27:47.000,2022-04-18 20:27:47+00:00,US STOCKS-Wall St ends lower as investors awai...,urn:newsml:reuters.com:20220418:nL2N2WG1RM:6,NS:RTRS
2022-04-18 20:12:00.084,2022-04-18 20:12:00.084000+00:00,Blackmagic Design Announces New HyperDeck Shut...,urn:newsml:reuters.com:20220418:nBw1fQ63ka:1,NS:BSW
2022-04-18 20:01:59.000,2022-04-18 20:01:59+00:00,US STOCKS-Wall St ends topsy-turvy day lower a...,urn:newsml:reuters.com:20220418:nL2N2WG1EO:6,NS:RTRS
2022-04-18 18:45:21.272,2022-04-18 18:45:21.272000+00:00,iPhone 14 may include satellite connectivity f...,urn:newsml:reuters.com:20220418:nNRAkdgewg:1,NS:ASNEWS
2022-04-18 18:20:06.043,2022-04-18 18:41:34.444000+00:00,SHAREHOLDER ALERT: Pomerantz Law Firm Investig...,urn:newsml:newsroom:20220418:nVMNnZK8Wa:0,NS:CMNW


#### 1.1.2 Get headlines within a range of dates

In [74]:
response = news.headlines.Definition(
    query = "AAPL.O and L:EN",
    date_from = '2022-04-10T12:00:00',
    date_to = '2022-04-14T12:00:00',
    count = 15
).get_data()
response.data.df

Unnamed: 0,versionCreated,text,storyId,sourceCode
2022-04-14 05:30:15.000,2022-04-14 05:32:28+00:00,"TSMC's Q1 profit up 45%, beats market estimates",urn:newsml:reuters.com:20220414:nP8N2VY02I:3,NS:RTRS
2022-04-14 02:17:49.491,2022-04-14 02:17:49.491000+00:00,United States : Apple introduces new version o...,urn:newsml:reuters.com:20220414:nNRAkbxl3k:1,NS:ECLPCM
2022-04-13 22:06:09.955,2022-04-14 00:13:02.311000+00:00,Refinitiv Newscasts - Growth stocks rally as e...,urn:newsml:reuters.com:20220413:nRTV9QQGsQ:15,NS:RTRS
2022-04-13 22:53:52.780,2022-04-13 22:53:52.780000+00:00,Apple (AAPL) Outpaces Stock Market Gains: What...,urn:newsml:reuters.com:20220413:nNRAkbw24x:1,NS:ZACKSC
2022-04-13 20:01:51.000,2022-04-13 20:01:51+00:00,MEDIA-Apple MacBook shipments delayed as China...,urn:newsml:reuters.com:20220413:nL3N2WB3CB:1,NS:RTRS
2022-04-11 11:02:08.940,2022-04-13 19:40:34.537000+00:00,"Refinitiv Newscasts - Michael Kong, CEO of Fan...",urn:newsml:reuters.com:20220411:nRTV36Vb13:3,NS:RNTPC
2022-04-13 18:26:10.221,2022-04-13 18:26:10.221000+00:00,"Apple (AAPL), Tom Hanks' Playtone Ink Multi-Ye...",urn:newsml:reuters.com:20220413:nNRAkbpj9w:1,NS:ZACKSC
2022-04-13 18:23:42.967,2022-04-13 18:23:42.967000+00:00,The iPad mini is available at its lowest ever ...,urn:newsml:reuters.com:20220413:nNRAkbr3gw:1,NS:INDEPE
2022-04-13 17:53:49.401,2022-04-13 17:53:49.401000+00:00,iPhone maker Pegatron suspends operations at t...,urn:newsml:reuters.com:20220413:nNRAkbp04j:1,NS:ASNEWS
2022-04-11 17:49:22.286,2022-04-13 17:28:33.048000+00:00,Refinitiv Newscasts - Warner Bros Discovery sh...,urn:newsml:reuters.com:20220411:nRTVn1Hn8:11,NS:RTRS
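
The `date_from` and `date_to` strings used above follow the `'%Y-%m-%dT%H:%M:%S'` format given in the help text. As a minimal sketch, they can be generated with the standard library instead of being typed by hand. The helper name `date_window` is hypothetical and not part of the Refinitiv library.

```python
from datetime import datetime, timedelta

# String format documented in the help text for date_from / date_to.
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"

def date_window(end, days):
    """Return (date_from, date_to) strings covering `days` days up to `end`."""
    start = end - timedelta(days=days)
    return start.strftime(DATE_FORMAT), end.strftime(DATE_FORMAT)

# Reproduce the four-day window used in the query above.
date_from, date_to = date_window(datetime(2022, 4, 14, 12, 0, 0), days=4)
print(date_from, date_to)
```

The two strings can then be passed as the `date_from` and `date_to` arguments of `news.headlines.Definition`.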


### 1.2 Get News Story

In [76]:
story = news.story.Definition("urn:newsml:reuters.com:20220414:nNRAkbxl3k:1").get_data()

print(story.data.story.title, '\n')
print(story.data.story.content)

United States : Apple introduces new version of iMovie featuring Storyboards and Magic Movie 

Apple today introduced a new version of iMovie with features that make it easier than ever to create beautiful edited videos on iPhone and iPad. Storyboards helps aspiring content creators and moviemakers learn to edit and improve their video storytelling skills with pre-made templates for popular types of videos shared on social, with colleagues, or with classmates ? videos like DIYs, cooking tutorials, product reviews, science experiments, and more. Storyboards makes it easy to get started with flexible shot lists and step-by-step guidance on which clips to capture for each video type.
For those who want to create a video even faster, Magic Movie instantly creates a polished video from the clips and photos a user selects, automatically adding transitions, effects, and music to the edit. Both new features include a range of styles to help personalize the final look and feel of a video, inclu

## Section 2: Text Sentiment Classification with NLP

<img src="images/sentiment_class.jpg" width="1000">

### 2.1 Key Principles behind the NLP

In [None]:
!pip install transformers

#### 2.1.1 Tokenization - Representing words in a way that computers can process

Tokenization is the first step of an NLP pipeline: the text is split into words or subwords, which are then converted to ids through a look-up table. Although this seems straightforward, there are multiple ways of splitting sentences into words or subwords, and each has its own advantages and disadvantages. Hugging Face provides a good introductory guide to tokenization, which can be found [here](https://huggingface.co/docs/transformers/master/tokenizer_summary).

<img src="images/tokenization.jpg" width="700">
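
To make the split-then-look-up idea concrete, here is a toy sketch of greedy longest-match subword splitting in the style of WordPiece. The vocabulary `VOCAB` and the functions `wordpiece` and `encode` are hypothetical illustrations, not the real BERT tokenizer, which uses a much larger vocabulary and extra normalization.

```python
# Hypothetical toy vocabulary; a real BERT vocabulary is far larger.
VOCAB = {"[UNK]": 0, "[CLS]": 1, "[SEP]": 2, "apple": 3, "new": 4,
         "version": 5, "of": 6, "im": 7, "##ov": 8, "##ie": 9}

def wordpiece(word, vocab):
    """Greedy longest-match-first split of one lowercase word into subwords."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no subword matched, so the word is unknown
        pieces.append(piece)
        start = end
    return pieces

def encode(text, vocab):
    """Split text into subword tokens and map them to ids via the look-up table."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens, [vocab[t] for t in tokens]

tokens, ids = encode("New version of iMovie", VOCAB)
print(tokens)
print(ids)
```

A word missing from the vocabulary, such as "iMovie", is broken into smaller known pieces rather than being discarded, which is the main advantage of subword tokenization.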

In [77]:
from transformers import BertTokenizer, BertForSequenceClassification
import torch

In [78]:
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

In [79]:
# Take the second headline and slice off the "United States : " prefix
headline_text = response.data.df['text'].iloc[1][16:54]
print(headline_text)

Apple introduces new version of iMovie


In [80]:
encoded_input = tokenizer(headline_text)
encoded_input

{'input_ids': [101, 6207, 13999, 2047, 2544, 1997, 10047, 4492, 2666, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

In [55]:
tokenizer.decode(encoded_input["input_ids"])

'[CLS] apple introduces new version of imovie [SEP]'

#### Padding and Truncation

Batched inputs are often of different lengths, so they can’t be converted to fixed-size tensors directly. Padding and truncation are strategies for dealing with this problem, creating rectangular tensors from batches of varying lengths. **Padding** adds a special padding token so that shorter sequences have the same length as either the longest sequence in the batch or the maximum length accepted by the model. **Truncation** works in the other direction by cutting long sequences short.

In [83]:
sentence  = [
    'Apple introduces new version of iMovie',
    'Refinitiv Newscasts - Stocks slide as rising bond yields hit megacaps'
]

encoded_input = tokenizer(sentence, return_tensors="pt", padding = True)
encoded_input

{'input_ids': tensor([[  101,  6207, 13999,  2047,  2544,  1997, 10047,  4492,  2666,   102,
             0,     0,     0,     0,     0,     0,     0],
        [  101, 25416,  5498, 29068, 24482,  1011, 15768,  7358,  2004,  4803,
          5416, 16189,  2718, 13164, 17695,  2015,   102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
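
The padded tensors above can be reproduced by hand. This is a simplified sketch rather than the tokenizer's actual implementation: the function name `pad_and_truncate` is hypothetical, and the naive slice used for truncation drops the closing `[SEP]` id (102), which the real tokenizer keeps.

```python
PAD_ID = 0  # padding id, as seen in the input_ids tensor above

def pad_and_truncate(batch, max_length=None):
    """Truncate to max_length (if given), then pad to the longest sequence."""
    if max_length is not None:
        batch = [ids[:max_length] for ids in batch]
    target = max(len(ids) for ids in batch)
    input_ids = [ids + [PAD_ID] * (target - len(ids)) for ids in batch]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(ids) + [0] * (target - len(ids)) for ids in batch]
    return input_ids, attention_mask

# Token ids copied from the tokenizer output above.
short = [101, 6207, 13999, 2047, 2544, 1997, 10047, 4492, 2666, 102]
long = [101, 25416, 5498, 29068, 24482, 1011, 15768, 7358, 2004, 4803,
        5416, 16189, 2718, 13164, 17695, 2015, 102]

input_ids, attention_mask = pad_and_truncate([short, long])
print(input_ids[0])
print(attention_mask[0])

# With a length cap, the long sequence is cut and the short one still padded.
capped_ids, capped_mask = pad_and_truncate([short, long], max_length=12)
print([len(ids) for ids in capped_ids])
```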

#### 2.1.2 Word Embeddings

Word embeddings are vector representations of words: words or phrases from the vocabulary are mapped to vectors of real numbers. The vector encodes the meaning of the word such that words that are closer in the vector space are expected to be similar in meaning. Technically, such embeddings can be learned with a neural network that has an input layer, a hidden layer, and an output layer, as in word2vec. An illustrative example is provided [in this blog post](https://towardsdatascience.com/creating-word-embeddings-coding-the-word2vec-algorithm-in-python-using-deep-learning-b337d0ba17a8).

<img src="images/embedding.jpg" width="900">
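
"Closer in the vector space" is usually measured with cosine similarity. The sketch below uses made-up 3-dimensional vectors (the dictionary `EMBEDDINGS` is purely hypothetical; learned embeddings have hundreds of dimensions) to show how related words score higher than unrelated ones.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for illustration only.
EMBEDDINGS = {
    "profit": [0.9, 0.1, 0.3],
    "gain":   [0.8, 0.2, 0.4],
    "loss":   [-0.7, 0.1, 0.3],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_gain = cosine_similarity(EMBEDDINGS["profit"], EMBEDDINGS["gain"])
sim_loss = cosine_similarity(EMBEDDINGS["profit"], EMBEDDINGS["loss"])
print(sim_gain > sim_loss)
```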

For sentence classification, we’re only interested in BERT’s output for the [CLS] token, so we use only that slice of the cube and discard everything else.

<img src="images/CLS.png" width="900">
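
The slicing step can be sketched with plain nested lists. The numbers below are made up and the hidden size is shrunk to 4 for readability. With a PyTorch tensor of shape (batch, tokens, hidden), the same selection is usually written as `hidden_states[:, 0, :]`.

```python
# Toy "cube" of model outputs: 2 sentences x 3 token positions x 4 hidden units.
# Values are invented; BERT-base produces 768 hidden units per token.
hidden_states = [
    [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]],
    [[1.3, 1.4, 1.5, 1.6], [1.7, 1.8, 1.9, 2.0], [2.1, 2.2, 2.3, 2.4]],
]

# Position 0 of every sequence holds the [CLS] token, so keep only that vector.
cls_vectors = [sentence[0] for sentence in hidden_states]
print(cls_vectors)
```

The result is one fixed-size vector per sentence, which is the input expected by the classifier in the next step.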

#### 2.1.3 Classification

After we have the output of BERT, we use it as the dataset to train a logistic regression model. The 768 columns are the features, and the labels come from the initial labeled dataset. You can learn more about how logistic regression works from [this tutorial](https://www.datacamp.com/community/tutorials/understanding-logistic-regression-python).

<img src="images/classification.png" width="700"> <img src="images/logistic.png" width="700">
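
As a rough illustration of this step, the sketch below trains a logistic regression by gradient descent in plain Python. The 2-dimensional features are hypothetical stand-ins for the 768-dimensional [CLS] vectors, the labels are invented, and the function names are illustrative. In practice a library implementation would normally be used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            # Prediction error for this sample drives the gradient.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias

def predict(x, weights, bias):
    """Return class 1 if the predicted probability is at least 0.5, else 0."""
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)

# Hypothetical features (one row per sentence) and labels (1 = positive).
X = [[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]]
y = [1, 1, 0, 0]

weights, bias = train_logistic_regression(X, y)
print([predict(x, weights, bias) for x in X])
```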

### 2.2 Transformers: Text classification using FinBert

The Hugging Face [transformers](https://huggingface.co/docs/transformers/index) package is a Python library that provides numerous pre-trained models for a variety of NLP tasks. One such pre-trained model is FinBert, which is introduced in greater detail in this section.

[FinBert](https://huggingface.co/ProsusAI/finbert) is a pre-trained NLP model for analyzing the sentiment of financial text. It is built by further training the BERT language model in the finance domain, using the [Reuters TRC2](https://trec.nist.gov/data/reuters/reuters.html) financial corpus. After the model is adapted to the domain-specific language, it is fine-tuned with labeled data for the sentiment classification task.

The [Financial PhraseBank dataset](https://www.researchgate.net/publication/251231364_FinancialPhraseBank-v10) by Malo et al. (2014) has been used to train the classification task. The dataset, consisting of 4845 instances, was carefully labeled by 16 experts and master's students with finance backgrounds. Along with the labels, it reports inter-annotator agreement levels for each sentence.

According to the [FinBert GitHub account](https://github.com/ProsusAI/finBERT), in order to use the pre-trained FinBert model, one should:
* Create a directory for the model.
* Download the model (pytorch_model.bin) and put it into the created directory.
* Put a copy of config.json in that same directory.
* Call the model with .from_pretrained(model directory name)

I have already created a directory called finbert_model and stored the required files in it. To load the model, we just need to run the code below. The BERT tokenizer loaded in Section 2.1.1 is reused for the FinBert model.


<img src="images/finbert.png" width="700">

In [84]:
# Load the FinBert weights and config from the local model directory
model = BertForSequenceClassification.from_pretrained('finbert_model', num_labels=3)
# Index order must match the label ids defined in the model's config.json
label_list = ['positive', 'negative', 'neutral']

In [85]:
dataset = response.data.df[['text', 'storyId']]
dataset

Unnamed: 0,text,storyId
2022-04-14 05:30:15.000,"TSMC's Q1 profit up 45%, beats market estimates",urn:newsml:reuters.com:20220414:nP8N2VY02I:3
2022-04-14 02:17:49.491,United States : Apple introduces new version o...,urn:newsml:reuters.com:20220414:nNRAkbxl3k:1
2022-04-13 22:06:09.955,Refinitiv Newscasts - Growth stocks rally as e...,urn:newsml:reuters.com:20220413:nRTV9QQGsQ:15
2022-04-13 22:53:52.780,Apple (AAPL) Outpaces Stock Market Gains: What...,urn:newsml:reuters.com:20220413:nNRAkbw24x:1
2022-04-13 20:01:51.000,MEDIA-Apple MacBook shipments delayed as China...,urn:newsml:reuters.com:20220413:nL3N2WB3CB:1
2022-04-11 11:02:08.940,"Refinitiv Newscasts - Michael Kong, CEO of Fan...",urn:newsml:reuters.com:20220411:nRTV36Vb13:3
2022-04-13 18:26:10.221,"Apple (AAPL), Tom Hanks' Playtone Ink Multi-Ye...",urn:newsml:reuters.com:20220413:nNRAkbpj9w:1
2022-04-13 18:23:42.967,The iPad mini is available at its lowest ever ...,urn:newsml:reuters.com:20220413:nNRAkbr3gw:1
2022-04-13 17:53:49.401,iPhone maker Pegatron suspends operations at t...,urn:newsml:reuters.com:20220413:nNRAkbp04j:1
2022-04-11 17:49:22.286,Refinitiv Newscasts - Warner Bros Discovery sh...,urn:newsml:reuters.com:20220411:nRTVn1Hn8:11


In [86]:
# Pick the fourth headline by position and fetch its full story
storyId = dataset['storyId'].iloc[3]
story = news.story.Definition(storyId).get_data()
story = story.data.story.content

print(story)

Apr 13, 2022
Apple (AAPL) closed the most recent trading day at $170.40, moving +1.63% from the previous trading session. This change outpaced the S&P 500's 1.12% gain on the day. At the same time, the Dow added 1.01%, and the tech-heavy Nasdaq lost 0.08%.
Coming into today, shares of the maker of iPhones, iPads and other products had gained 8.1% in the past month. In that same time, the Computer and Technology sector gained 2.22%, while the S&P 500 gained 4.63%.
Investors will be hoping for strength from Apple as it approaches its next earnings release, which is expected to be April 28, 2022. The company is expected to report EPS of $1.43, up 2.14% from the prior-year quarter. Meanwhile, the Zacks Consensus Estimate for revenue is projecting net sales of $94.29 billion, up 5.25% from the year-ago period.
Looking at the full year, our Zacks Consensus Estimates suggest analysts are expecting earnings of $6.16 per share and revenue of $397.57 billion. These totals would mark changes of +

In [87]:
tokenized = tokenizer(story, return_tensors="pt", truncation = True, max_length = 512)
tokenized['input_ids']

tensor([[  101, 19804,  2410,  1010, 16798,  2475,  6207,  1006,  9779, 24759,
          1007,  2701,  1996,  2087,  3522,  6202,  2154,  2012,  1002, 10894,
          1012,  2871,  1010,  3048,  1009,  1015,  1012,  6191,  1003,  2013,
          1996,  3025,  6202,  5219,  1012,  2023,  2689,  2041, 15327,  2094,
          1996,  1055,  1004,  1052,  3156,  1005,  1055,  1015,  1012,  2260,
          1003,  5114,  2006,  1996,  2154,  1012,  2012,  1996,  2168,  2051,
          1010,  1996, 23268,  2794,  1015,  1012,  5890,  1003,  1010,  1998,
          1996,  6627,  1011,  3082, 17235,  2850,  4160,  2439,  1014,  1012,
          5511,  1003,  1012,  2746,  2046,  2651,  1010,  6661,  1997,  1996,
          9338,  1997, 18059,  2015,  1010, 25249,  2015,  1998,  2060,  3688,
          2018,  4227,  1022,  1012,  1015,  1003,  1999,  1996,  2627,  3204,
          1012,  1999,  2008,  2168,  2051,  1010,  1996,  3274,  1998,  2974,
          4753,  4227,  1016,  1012,  2570,  1003,  

In [88]:
outputs = model(**tokenized)
outputs

SequenceClassifierOutput(loss=None, logits=tensor([[ 1.6101, -1.4971, -0.3028]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)

In [89]:
sentF = label_list[torch.argmax(outputs[0])]
sentF

'positive'
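
The logits returned by the model are unnormalized scores. As a small sketch, a softmax turns them into probabilities, and the largest one gives the predicted label. The logit values are copied from the output above, and the `softmax` helper is written by hand here rather than taken from a library.

```python
import math

label_list = ['positive', 'negative', 'neutral']
logits = [1.6101, -1.4971, -0.3028]  # values copied from the model output above

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax(logits)
# Index of the highest probability, equivalent to an argmax over the logits.
best = max(range(len(probabilities)), key=probabilities.__getitem__)
print(label_list[best], [round(p, 3) for p in probabilities])
```

Because softmax preserves the ordering of the scores, taking the argmax of the raw logits, as the cell above does, yields the same label.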

In [92]:
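prediction = []

for text in dataset['text']:
    # Truncate to BERT's 512-token limit so long inputs do not raise an error
    tokenized = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**tokenized)
    # outputs[0] holds the logits; the largest one gives the predicted class
    sentF = label_list[torch.argmax(outputs[0])]
    prediction.append(sentF)
# Append the predicted labels as a new column
dataset.insert(loc=len(dataset.columns), column='prediction', value=prediction)
dataset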

Unnamed: 0,text,storyId,prediction
2022-04-14 05:30:15.000,"TSMC's Q1 profit up 45%, beats market estimates",urn:newsml:reuters.com:20220414:nP8N2VY02I:3,positive
2022-04-14 02:17:49.491,United States : Apple introduces new version o...,urn:newsml:reuters.com:20220414:nNRAkbxl3k:1,positive
2022-04-13 22:06:09.955,Refinitiv Newscasts - Growth stocks rally as e...,urn:newsml:reuters.com:20220413:nRTV9QQGsQ:15,positive
2022-04-13 22:53:52.780,Apple (AAPL) Outpaces Stock Market Gains: What...,urn:newsml:reuters.com:20220413:nNRAkbw24x:1,positive
2022-04-13 20:01:51.000,MEDIA-Apple MacBook shipments delayed as China...,urn:newsml:reuters.com:20220413:nL3N2WB3CB:1,negative
2022-04-11 11:02:08.940,"Refinitiv Newscasts - Michael Kong, CEO of Fan...",urn:newsml:reuters.com:20220411:nRTV36Vb13:3,neutral
2022-04-13 18:26:10.221,"Apple (AAPL), Tom Hanks' Playtone Ink Multi-Ye...",urn:newsml:reuters.com:20220413:nNRAkbpj9w:1,neutral
2022-04-13 18:23:42.967,The iPad mini is available at its lowest ever ...,urn:newsml:reuters.com:20220413:nNRAkbr3gw:1,negative
2022-04-13 17:53:49.401,iPhone maker Pegatron suspends operations at t...,urn:newsml:reuters.com:20220413:nNRAkbp04j:1,negative
2022-04-11 17:49:22.286,Refinitiv Newscasts - Warner Bros Discovery sh...,urn:newsml:reuters.com:20220411:nRTVn1Hn8:11,negative
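
Once every headline has a label, a quick summary helps to read the result. The sketch below counts the labels from the table above. The "net sentiment" score is a hypothetical aggregate chosen for illustration, not something produced by FinBert.

```python
from collections import Counter

# Labels copied from the 'prediction' column of the table above.
predictions = ['positive', 'positive', 'positive', 'positive', 'negative',
               'neutral', 'neutral', 'negative', 'negative', 'negative']

counts = Counter(predictions)
# Hypothetical net score: share of positive minus share of negative headlines.
net_sentiment = (counts['positive'] - counts['negative']) / len(predictions)
print(counts, net_sentiment)
```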


## Section 3: NLP text classification usecase

* [Introduction to News Sentiment Analysis with Eikon Data APIs - a Python example](https://developers.refinitiv.com/en/article-catalog/article/introduction-news-sentiment-analysis-eikon-data-apis-python-example)
* [Predicting M&A Targets Using ML: Unlocking the potential of NLP based variables](https://developers.refinitiv.com/en/article-catalog/article/predicting-MnA-targets-using-ML-Unlocking-the-potential-of-NLP-variables)