## Step 2: Performing sentiment analysis on news headlines

**Objectives**
- 2.1. Importing a sentiment model from Hugging Face
- 2.2. Writing functions to calculate average sentiment for each day
- 2.3. Getting the news headlines and outputting the sentiment scores in JSON format

In [29]:
from google.colab import files
a = files.upload()
b = files.upload()
# inputJsonFile = '/content/headlines.json'
# outputJsonFile = '/content/daily_scores.json'

KeyboardInterrupt: 

In [15]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import pandas as pd
import json

### 2.1. Importing a sentiment model from Hugging Face

In [16]:
modelName = "nlptown/bert-base-multilingual-uncased-sentiment"
tokenizer = AutoTokenizer.from_pretrained(modelName)
model = AutoModelForSequenceClassification.from_pretrained(modelName)

### 2.2. Writing functions to calculate average sentiment for each day

In [17]:
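def calculateDailySentiment(headlines):
    # Collect the headline text for one day
    texts = [headline['heading'] for headline in headlines]
    inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=512, return_attention_mask=True)
    # Inference only, so skip gradient tracking to save memory
    with torch.no_grad():
        outputs = model(**inputs)
    #endwith
    logits = outputs.logits
    # Softmax over the class dimension gives one probability distribution per headline
    scores = logits.softmax(dim=1)
    # Average the distributions across all headlines of the day
    averageScore = scores.mean(dim=0).tolist()
    return averageScore
#enddef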

def analyzeAndSaveSentiment(inputFile, outputFile):
    with open(inputFile, 'r') as file:
        data = json.load(file)
    #endwith

    result = {}

    for date, headlines in data.items():
        averageScore = calculateDailySentiment(headlines)
        print(f"{date} > {averageScore}")
        result[date] = averageScore
    #endfor

    # Use a separate name for the file handle so it does not shadow the outputFile path
    with open(outputFile, 'w') as file:
        json.dump(result, file, indent=2)
    #endwith
#enddef
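`calculateDailySentiment` tokenizes and scores all of a day's headlines in a single batch, which may use a lot of memory on days with many headlines. The following is a minimal sketch of one way to average the scores in smaller batches instead. The names `iterBatches`, `averageInBatches`, and the `scoreBatch` callable are hypothetical and not part of the notebook; in practice `scoreBatch` would wrap the tokenizer and model calls and return one probability list per headline.

```python
from typing import Callable, Iterator, List, Sequence


def iterBatches(items: Sequence[str], batchSize: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batchSize` elements."""
    if batchSize < 1:
        raise ValueError("batchSize must be at least 1")
    for start in range(0, len(items), batchSize):
        yield items[start:start + batchSize]


def averageInBatches(
    texts: Sequence[str],
    scoreBatch: Callable[[Sequence[str]], List[List[float]]],
    batchSize: int = 32,
) -> List[float]:
    """Average per-text probability vectors, scoring one batch at a time.

    `scoreBatch` is a hypothetical callable that returns one list of class
    probabilities for each text in the batch it receives.
    """
    totals: List[float] = []
    count = 0
    for batch in iterBatches(texts, batchSize):
        for probabilities in scoreBatch(batch):
            if not totals:
                totals = [0.0] * len(probabilities)
            for index, p in enumerate(probabilities):
                totals[index] += p
            count += 1
    if count == 0:
        return []  # no headlines for this day
    return [total / count for total in totals]
```

Accumulating running sums and dividing by the total count gives the same result as averaging everything at once, regardless of how the headlines are split into batches.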


### 2.3. Getting the news headlines and outputting the sentiment scores in JSON format
- The score for each day is saved in [daily_scores.json](./data/news2023/daily_scores.json).
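Judging from how `analyzeAndSaveSentiment` reads its input, the input file is expected to be a JSON object that maps each date to a list of objects with a `heading` key. The sketch below illustrates that shape with made-up sample data (the real `headlines.json` is not shown here) and a hypothetical `validateHeadlines` helper that checks a file's structure before it is scored.

```python
# Hypothetical sample data that mirrors the structure the functions above expect.
sampleData = {
    "2015-01-01": [
        {"heading": "Example headline one"},
        {"heading": "Example headline two"},
    ],
    "2015-01-02": [
        {"heading": "Another example headline"},
    ],
}


def validateHeadlines(data):
    """Return a list of problems found in a date -> headlines mapping."""
    if not isinstance(data, dict):
        return ["top level must be an object keyed by date"]
    problems = []
    for date, headlines in data.items():
        if not isinstance(headlines, list) or not headlines:
            problems.append(f"{date}: expected a non-empty list of headlines")
            continue
        for index, headline in enumerate(headlines):
            if not isinstance(headline, dict) or not isinstance(headline.get("heading"), str):
                problems.append(f"{date}[{index}]: missing string 'heading'")
    return problems


print(validateHeadlines(sampleData))
```

An empty list means the structure matches; a day with no headlines is reported because the tokenizer cannot be called on an empty batch.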

In [30]:
inputJsonFile = '/content/headlines.json'
outputJsonFile = '/content/daily_scores.json'

analyzeAndSaveSentiment(inputJsonFile, outputJsonFile)

2015-01-01 > [0.3070025146007538, 0.1701173186302185, 0.1548246294260025, 0.17130403220653534, 0.19675149023532867]
2015-01-02 > [0.28250327706336975, 0.15611006319522858, 0.18065737187862396, 0.19504719972610474, 0.18568213284015656]
2015-01-03 > [0.32533952593803406, 0.14950305223464966, 0.15599913895130157, 0.18876515328884125, 0.18039309978485107]
2015-01-04 > [0.32683515548706055, 0.1765918731689453, 0.17063122987747192, 0.1716979742050171, 0.15424376726150513]
2015-01-05 > [0.3772907257080078, 0.14218366146087646, 0.14909176528453827, 0.17509512603282928, 0.15633873641490936]
2015-01-06 > [0.28608688712120056, 0.14581941068172455, 0.1645268201828003, 0.19867727160453796, 0.20488962531089783]
2015-01-07 > [0.27082404494285583, 0.15743446350097656, 0.17185550928115845, 0.1849762201309204, 0.21490982174873352]
2015-01-08 > [0.34370893239974976, 0.15521514415740967, 0.14898769557476044, 0.1641692817211151, 0.18791887164115906]
2015-01-09 > [0.3578956723213196, 0.16952630877494812, 0.

KeyboardInterrupt: 
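Each output row above is a five-element probability distribution, one value per class of the model, which predicts a rating from 1 to 5 stars. One possible way to collapse a row into a single number is a probability-weighted star rating. This is a sketch and not part of the original notebook: `expectedStars` is a hypothetical helper, and it assumes index 0 corresponds to 1 star and index 4 to 5 stars, which can be checked against `model.config.id2label`.

```python
from typing import Sequence


def expectedStars(probabilities: Sequence[float]) -> float:
    """Collapse a 5-class distribution into a probability-weighted star rating.

    Assumption: index 0 is the 1-star class and index 4 is the 5-star class.
    """
    if len(probabilities) != 5:
        raise ValueError("expected exactly 5 class probabilities")
    return sum((index + 1) * p for index, p in enumerate(probabilities))


# First row of the output above (2015-01-01)
day = [0.3070025146007538, 0.1701173186302185, 0.1548246294260025,
       0.17130403220653534, 0.19675149023532867]
print(round(expectedStars(day), 2))
```

A value near 1 indicates mostly negative headlines, a value near 5 mostly positive ones, and a value near 3 either neutral headlines or an even mix.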