## Step 2: Performing sentiment analysis on news headlines

**Objectives**
- 2.1. Importing a sentiment model from Hugging Face
- 2.2. Writing functions to calculate average sentiment for each day
- 2.3. Getting the news headlines and outputting the sentiment scores in JSON format

In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

import json


### 2.1. Importing a sentiment model from Hugging Face

In [2]:
modelName = "nlptown/bert-base-multilingual-uncased-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(modelName)
tokenizer = AutoTokenizer.from_pretrained(modelName)
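
The model is a five-class classifier. Per its model card, the classes correspond to ratings of 1 to 5 stars. The sketch below shows one way to read a single probability vector, assuming the label names and order in the hypothetical `STAR_LABELS` list. The authoritative mapping is `model.config.id2label`, and `dominantLabel` is an illustrative helper, not part of the notebook.

```python
# Assumed label order, lowest to highest rating; verify with model.config.id2label.
STAR_LABELS = ["1 star", "2 stars", "3 stars", "4 stars", "5 stars"]

def dominantLabel(probabilities):
    """Return the label whose probability is highest."""
    if len(probabilities) != len(STAR_LABELS):
        raise ValueError("expected one probability per label")
    bestIndex = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return STAR_LABELS[bestIndex]
#enddef

exampleScores = [0.33, 0.18, 0.18, 0.16, 0.15]  # illustrative values only
print(dominantLabel(exampleScores))
```

Index 0 is therefore the most negative class and index 4 the most positive, which is how the five-element lists in this step should be read.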

### 2.2. Writing functions to calculate average sentiment for each day

In [5]:
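def calculateDailySentiment(headlines):
    # Collect the headline strings for one day.
    texts = [headline['heading'] for headline in headlines]
    # Tokenize the whole day as a single padded batch.
    inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=512, return_attention_mask=True)
    # Inference only: skip gradient tracking to save memory.
    with torch.no_grad():
        outputs = model(**inputs)
    #endwith
    logits = outputs.logits
    # Per-headline class probabilities, then the mean over all headlines.
    scores = logits.softmax(dim=1)
    averageScore = scores.mean(dim=0).tolist()
    return averageScore
#enddef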

def analyzeAndSaveSentiment(inputFile, outputFile):
    with open(inputFile, 'r', encoding="utf-8", errors="ignore") as file:
        data = json.load(file)
    #endwith

    result = {}

    for date, headlines in data.items():
        averageScore = calculateDailySentiment(headlines)
        print(f"{date} > {averageScore}")
        result[date] = averageScore
    #endfor

    # Use a separate name for the file handle so the path argument is not shadowed.
    with open(outputFile, 'w') as file:
        json.dump(result, file, indent=2)
    #endwith
#enddef
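
To make the order of operations in `calculateDailySentiment` concrete, here is a minimal pure-Python sketch of the same arithmetic: softmax each headline's logits, then average the resulting probabilities column-wise. The logits are invented for illustration, and `softmax` and `averageProbabilities` are hypothetical helpers rather than the notebook's own code.

```python
import math

def softmax(logits):
    """Convert one row of logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]  # shift for numerical stability
    total = sum(exps)
    return [value / total for value in exps]
#enddef

def averageProbabilities(logitRows):
    """Softmax each row, then average column-wise across all rows."""
    rows = [softmax(row) for row in logitRows]
    return [sum(column) / len(rows) for column in zip(*rows)]
#enddef

# Hypothetical logits for two headlines, five classes each.
dayLogits = [
    [2.0, 1.0, 0.5, 0.0, -1.0],
    [-1.0, 0.0, 0.5, 1.0, 2.0],
]
dayAverage = averageProbabilities(dayLogits)
print(dayAverage)
```

Because each row sums to 1 after the softmax, the daily average is also a probability distribution over the five classes.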


### 2.3. Getting the news headlines and outputting the sentiment scores in JSON format
- The average score for each day is saved as a five-element list of class probabilities in [daily_scores.json](./data/news2023/daily_scores.json).
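
If a single number per day is easier to work with, each five-element list can be collapsed into a probability-weighted star rating. The sketch below is one possible post-processing step, not something the notebook itself does. It assumes that index `i` stands for `i + 1` stars, and `expectedStars` is a hypothetical helper. The sample values are the first two days from this run, rounded to four decimals.

```python
def expectedStars(probabilities):
    """Probability-weighted mean rating, assuming index i means i + 1 stars."""
    total = sum(probabilities)  # renormalize in case of rounding
    return sum((i + 1) * p for i, p in enumerate(probabilities)) / total
#enddef

# Same shape as daily_scores.json: date -> list of five class probabilities.
dailyScores = {
    "2023-01-01": [0.3304, 0.1840, 0.1750, 0.1582, 0.1524],
    "2023-01-02": [0.2840, 0.1844, 0.1683, 0.1753, 0.1879],
}

dailyStars = {date: expectedStars(scores) for date, scores in dailyScores.items()}
for date, stars in dailyStars.items():
    print(f"{date} > {stars:.2f}")
#endfor
```

A value near 1 indicates predominantly negative headlines and a value near 5 predominantly positive ones.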

In [6]:
inputJsonFile = './data/news2023/headlines.json'
outputJsonFile = './data/news2023/daily_scores.json'

analyzeAndSaveSentiment(inputJsonFile, outputJsonFile)

2023-01-01 > [0.3303655982017517, 0.18404191732406616, 0.17504505813121796, 0.1581822633743286, 0.15236514806747437]
2023-01-02 > [0.28398597240448, 0.18440347909927368, 0.16833671927452087, 0.17533433437347412, 0.18793949484825134]
2023-01-03 > [0.30308911204338074, 0.1791391521692276, 0.17202498018741608, 0.17286118865013123, 0.17288553714752197]
2023-01-04 > [0.35545614361763, 0.20434696972370148, 0.17998100817203522, 0.1434616893529892, 0.11675418168306351]
2023-01-05 > [0.4087914824485779, 0.19528710842132568, 0.14346462488174438, 0.12112268805503845, 0.13133417069911957]
2023-01-06 > [0.3567364513874054, 0.16727925837039948, 0.14585870504379272, 0.15034796297550201, 0.1797776073217392]
2023-01-07 > [0.35848310589790344, 0.1848071962594986, 0.17866139113903046, 0.15008945763111115, 0.12795886397361755]
2023-01-08 > [0.38150161504745483, 0.19246016442775726, 0.16483478248119354, 0.13628986477851868, 0.12491355836391449]
2023-01-09 > [0.2732820212841034, 0.17190752923488617, 0.17149