<a href="https://colab.research.google.com/github/Bolitis3/ml_finance_imperial/blob/main/Programming_Sessions/Programming_Session_7/Programing_Session.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📊 Programming Session 8: Financial News Sentiment Analysis

In this lab, you will analyze real-time financial news using FinBERT and a small language model to extract and interpret market sentiment.

### 🔧 Q1: Install Required Packages
Use `pip` to install the following packages:
- `yfinance`
- `transformers`
- `sentencepiece`

⬇️ *Write your code below:*

In [1]:
# Q1
!pip install yfinance
!pip install transformers
!pip install sentencepiece




### 📥 Q2: Retrieve News Headlines
Choose a stock ticker (e.g., `AAPL`, `TSLA`, `MSFT`) and use `yfinance` to retrieve the 5 most recent news headlines.

⬇️ *Write your code below:*

In [4]:
# Q2
import yfinance as yf
ticker = yf.Ticker('TSLA')
news = ticker.news
# Sort newest first; ISO-8601 timestamps sort correctly as plain strings
news_sorted = sorted(news, key=lambda x: x.get('content', {}).get('pubDate', ' '), reverse=True)
news_sorted

[{'id': '0498a914-adb8-3193-84fb-88f64ccbff77',
  'content': {'id': '0498a914-adb8-3193-84fb-88f64ccbff77',
   'contentType': 'STORY',
   'title': 'Heard in the Street Monday Recap: Done Deal',
   'description': '',
   'summary': 'President Trump reached a trade deal with the European Union over the weekend. The deal imposes a 15% baseline tariff on products imported from the bloc, including cars. The EU agreed to buy $750 billion of U.',
   'pubDate': '2025-07-29T07:20:20Z',
   'displayTime': '2025-07-29T07:20:20Z',
   'isHosted': False,
   'bypassModal': False,
   'previewUrl': 'https://finance.yahoo.com/m/0498a914-adb8-3193-84fb-88f64ccbff77/heard-in-the-street-monday.html',
   'thumbnail': {'originalUrl': 'https://media.zenfs.com/en/wsj.com/7b2c285183be07ca3c4e8242b25a59a8',
    'originalWidth': 1200,
    'originalHeight': 630,
    'caption': '',
    'resolutions': [{'url': 'https://s.yimg.com/uu/api/res/1.2/Fwf5HL4GsEGPbDfR3MJBPg--~B/aD02MzA7dz0xMjAwO2FwcGlkPXl0YWNoeW9u/https://me
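
The items above nest their fields under `content`, but the shape of `ticker.news` is not guaranteed to be stable across `yfinance` releases. A minimal sketch of a defensive extractor, assuming only two layouts (fields nested under `content`, or `title` at the top level); `extract_headlines` and `sample_news` are hypothetical names for illustration, not part of `yfinance`:

```python
def extract_headlines(news_items, limit=5):
    """Return up to `limit` (title, pubDate) pairs, newest first."""
    rows = []
    for item in news_items:
        # Assumed layouts: nested under 'content', or flat at the top level.
        content = item.get("content") or item
        title = content.get("title")
        if title:  # skip items without a usable title
            rows.append((title, content.get("pubDate", "")))
    # ISO-8601 timestamps sort chronologically as strings; undated items go last.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up sample items, shaped like the output above.
sample_news = [
    {"content": {"title": "Older story", "pubDate": "2025-07-28T23:01:32Z"}},
    {"content": {"title": "Newer story", "pubDate": "2025-07-29T07:20:20Z"}},
    {"title": "Flat layout story"},
    {"content": {"pubDate": "2025-07-29T08:00:00Z"}},  # no title: skipped
]
headlines = extract_headlines(sample_news)
print(headlines)
```

Skipping untitled items here keeps `None` out of the list that is later passed to the classifier.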

"'Mag 7' turned 'Mid 7': Is 2025 the end of tech dominance?"

### 📄 Q3: Display Headlines
Print each of the 5 headlines in a numbered list.

⬇️ *Write your code below:*

In [6]:
# Q3

for i, article in enumerate(news_sorted[:5]):
  article = article['content']
  print(f"{i+1}. {article.get('title')} - {article.get('pubDate')}")



1. Heard in the Street Monday Recap: Done Deal - 2025-07-29T07:20:20Z
2. Samsung Electronics chief heads to Washington after $16.5 billion Tesla chip deal - 2025-07-29T07:10:18Z
3. Samsung Electronics shares extend gains after Tesla deal, but challenges remain - 2025-07-29T01:10:11Z
4. Elon Musk Confirms Tesla As the Mystery Big-Tech That Signed $16.5 Billion Chip Contract With Samsung: 'I Will Walk The Line Personally' To Boost Progress (UPDATED) - 2025-07-28T23:31:02Z
5. I Asked ChatGPT What Elon Musk’s ‘America Party’ Means for My Taxes, Here’s What it Said - 2025-07-28T23:01:32Z
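
The raw `pubDate` strings are UTC timestamps in ISO-8601 form. A small sketch of a more readable numbered list, assuming every timestamp matches the `YYYY-MM-DDTHH:MM:SSZ` pattern seen above; `format_pub_date` is a hypothetical helper:

```python
from datetime import datetime, timezone


def format_pub_date(pub_date):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string and render it as 'YYYY-MM-DD HH:MM UTC'."""
    parsed = datetime.strptime(pub_date, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%d %H:%M UTC")


# Two (title, pubDate) pairs copied from the output above.
rows = [
    ("Heard in the Street Monday Recap: Done Deal", "2025-07-29T07:20:20Z"),
    ("Samsung Electronics shares extend gains after Tesla deal, but challenges remain",
     "2025-07-29T01:10:11Z"),
]
# enumerate(start=1) avoids the manual i+1 when numbering.
numbered = [
    f"{n}. {title} ({format_pub_date(date)})"
    for n, (title, date) in enumerate(rows, start=1)
]
print("\n".join(numbered))
```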


### 🤖 Q4: Load FinBERT Model
Load the `yiyanghkust/finbert-tone` model and tokenizer using Hugging Face `transformers`, and create a sentiment pipeline.

⬇️ *Write your code below:*

In [8]:
# Q4
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

# Keep the checkpoint name separate from the loaded model object
model_name = "yiyanghkust/finbert-tone"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

sentiment_pipeline = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)



Device set to use cpu
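
For a single-label classifier such as this one, the `score` the pipeline reports is, as far as I know, the softmax probability of the winning class. The sketch below reproduces that step in plain Python. The logits are made up, and the label order is an assumption; the authoritative mapping is `model.config.id2label`.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities; subtracting the max avoids overflow."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["Neutral", "Positive", "Negative"]  # assumed order, check id2label
logits = [4.1, -0.7, -2.3]                    # hypothetical model output

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)
result = {"label": labels[best], "score": probs[best]}
print(result)
```

Read this way, a score near 0.99 says the model puts almost all probability mass on one class; it does not say how strongly positive or negative the headline is.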


### 🧠 Q5: Apply FinBERT to Headlines
Use FinBERT to classify each headline and print results in the format:
`<headline>` → `<label>` (score = X.XX)

⬇️ *Write your code below:*

In [16]:
# Q5

import math

# Collect every headline title (not only the first five)
Headlines = [article['content'].get('title') for article in news_sorted]

Classifier=[]
for article in Headlines:
  sentiment = sentiment_pipeline(article)
  note = f"<{article}> --> <{sentiment[0].get('label')}> (score = {math.floor(sentiment[0].get('score')*100)/100})"
  Classifier.append(note)
  print(note)
  print()


<Heard in the Street Monday Recap: Done Deal> --> <Neutral> (score = 0.99)

<Samsung Electronics chief heads to Washington after $16.5 billion Tesla chip deal> --> <Neutral> (score = 0.99)

<Samsung Electronics shares extend gains after Tesla deal, but challenges remain> --> <Positive> (score = 0.99)

<Elon Musk Confirms Tesla As the Mystery Big-Tech That Signed $16.5 Billion Chip Contract With Samsung: 'I Will Walk The Line Personally' To Boost Progress (UPDATED)> --> <Positive> (score = 0.99)

<I Asked ChatGPT What Elon Musk’s ‘America Party’ Means for My Taxes, Here’s What it Said> --> <Neutral> (score = 0.99)

<Waymo to launch autonomous ride-hailing in Dallas next year> --> <Neutral> (score = 0.99)

<Why Tesla Stock Jumped Today> --> <Positive> (score = 0.8)

<Podcast: Stock Indexes Closed Mixed After U.S.-EU Trade Deal> --> <Neutral> (score = 0.99)

<Equities End Mixed as Markets Look Past EU Trade Deal; Fed Decision in Focus> --> <Neutral> (score = 0.76)

<Tesla, Samsung, energy
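
The requested format is `X.XX`, yet one score above prints as `0.8`: `math.floor(x*100)/100` yields a float, which drops trailing zeros, and it can also be off by one hundredth because of binary rounding (`0.29*100` evaluates to `28.999999999999996`). A sketch of an alternative that truncates with `decimal` and always prints two digits; `truncate_score` and `format_result` are hypothetical helpers, and the sample values are illustrative:

```python
from decimal import Decimal, ROUND_DOWN


def truncate_score(score):
    """Truncate (not round) a probability to two decimals, keeping trailing zeros."""
    return str(Decimal(str(score)).quantize(Decimal("0.01"), rounding=ROUND_DOWN))


def format_result(headline, label, score):
    """Render one classification in the '<headline> → <label> (score = X.XX)' style."""
    return f"{headline} → {label} (score = {truncate_score(score)})"


line = format_result("Why Tesla Stock Jumped Today", "Positive", 0.8)
print(line)
```

If rounding is acceptable instead of truncation, the format spec `f"{score:.2f}"` is the shorter option.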

### 📊 Q6: Count Sentiment Types
Count and print how many headlines are Positive, Neutral, and Negative.

⬇️ *Write your code below:*

In [18]:
# Q6
Neg=0
Neutral = 0
Pos = 0


for article in Headlines:
  sentiment = sentiment_pipeline(article)
  sentiment = sentiment[0].get('label')
  if sentiment == 'Negative':
    Neg +=1
  elif sentiment == 'Neutral':
    Neutral +=1
  else:
    Pos +=1

print(f"Negative : {Neg} \n Neutral : {Neutral} \n Positive : {Pos}")

Negative : 0 
 Neutral : 7 
 Positive : 3
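
The loop above runs the model a second time and counts any label that is not `Negative` or `Neutral` as positive. A sketch of an alternative that counts stored results with `collections.Counter` and fails loudly on an unexpected label; `count_labels` is a hypothetical helper, and `results` holds only the nine labels that are fully visible in the Q5 output:

```python
from collections import Counter

EXPECTED = ("Positive", "Neutral", "Negative")


def count_labels(results, expected=EXPECTED):
    """Count labels, reporting zero for absent classes and rejecting unknown ones."""
    counts = Counter(r["label"] for r in results)
    unexpected = set(counts) - set(expected)
    if unexpected:
        raise ValueError(f"Unexpected labels: {sorted(unexpected)}")
    return {label: counts.get(label, 0) for label in expected}


# Labels of the nine fully visible Q5 results, in order.
visible = ["Neutral", "Neutral", "Positive", "Positive", "Neutral",
           "Neutral", "Positive", "Neutral", "Neutral"]
results = [{"label": label} for label in visible]

totals = count_labels(results)
for label, n in totals.items():
    print(f"{label}: {n}")
```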


### 📝 Q7: Create LLM Prompt
Construct a prompt including all headlines and their sentiments. End with a question like:
`What is the overall sentiment of this stock?`

⬇️ *Write your code below:*

In [28]:
# Q7

prompt = "Here is some headlines and their sentiments\n\n"

Classifier_prompted = "\n".join(Classifier)

prompt = prompt + Classifier_prompted

prompt = prompt + "\n\n\nWhat is the overall sentiment of this stock?"

print(prompt)

Here is some headlines and their sentiments

<Heard in the Street Monday Recap: Done Deal> --> <Neutral> (score = 0.99)
<Samsung Electronics chief heads to Washington after $16.5 billion Tesla chip deal> --> <Neutral> (score = 0.99)
<Samsung Electronics shares extend gains after Tesla deal, but challenges remain> --> <Positive> (score = 0.99)
<Elon Musk Confirms Tesla As the Mystery Big-Tech That Signed $16.5 Billion Chip Contract With Samsung: 'I Will Walk The Line Personally' To Boost Progress (UPDATED)> --> <Positive> (score = 0.99)
<I Asked ChatGPT What Elon Musk’s ‘America Party’ Means for My Taxes, Here’s What it Said> --> <Neutral> (score = 0.99)
<Waymo to launch autonomous ride-hailing in Dallas next year> --> <Neutral> (score = 0.99)
<Why Tesla Stock Jumped Today> --> <Positive> (score = 0.8)
<Podcast: Stock Indexes Closed Mixed After U.S.-EU Trade Deal> --> <Neutral> (score = 0.99)
<Equities End Mixed as Markets Look Past EU Trade Deal; Fed Decision in Focus> --> <Neutral> (s
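
Building the prompt from structured rows rather than from pre-formatted strings makes it easier to change the wording or cap the number of headlines later. A minimal sketch, assuming each row is a `(headline, label, score)` tuple; `build_prompt` and its `limit` parameter are hypothetical, and the rows are copied from the output above:

```python
def build_prompt(rows, question="What is the overall sentiment of this stock?", limit=None):
    """Assemble a prompt: intro line, one line per headline, then the question."""
    selected = rows if limit is None else rows[:limit]
    lines = [
        f"{headline} --> {label} (score = {score:.2f})"
        for headline, label, score in selected
    ]
    return (
        "Here are some headlines and their sentiments:\n\n"
        + "\n".join(lines)
        + "\n\n"
        + question
    )


rows = [
    ("Heard in the Street Monday Recap: Done Deal", "Neutral", 0.99),
    ("Why Tesla Stock Jumped Today", "Positive", 0.80),
]
prompt_text = build_prompt(rows)
print(prompt_text)
```

Capping the headline count with `limit` may help with a small model whose usable input length is short; that is an assumption worth checking against the tokenizer's reported maximum length.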

### 💬 Q8: Run LLM to Summarize
Use `google/flan-t5-small` to summarize the sentiment using your custom prompt.

⬇️ *Write your code below:*

In [39]:
# Q8
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Keep the checkpoint name separate from the loaded model object
model_name = 'google/flan-t5-small'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer(prompt, return_tensors = "pt")
outputs = model.generate(**inputs)

summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
summary

'positive'

### 🔎 Q9: Run the LLM on Each Headline
Run the LLM on each classified headline separately, using a new custom prompt.

⬇️ *Write your code below:*

In [47]:
# Q9
for note in Classifier:
  # One prompt per classified headline
  prompt1 = note + " What does this sentiment imply?\n\n"
  print(prompt1)
  # `inputs` rather than `input`, to avoid shadowing the built-in
  inputs = tokenizer(prompt1, return_tensors="pt")
  output = model.generate(**inputs)
  sum_up = tokenizer.decode(output[0], skip_special_tokens=True)
  print(sum_up)
  print()

<Heard in the Street Monday Recap: Done Deal> --> <Neutral> (score = 0.99) What does this sentiment imply?


Positive

<Samsung Electronics chief heads to Washington after $16.5 billion Tesla chip deal> --> <Neutral> (score = 0.99) What does this sentiment imply?


negative

<Samsung Electronics shares extend gains after Tesla deal, but challenges remain> --> <Positive> (score = 0.99) What does this sentiment imply?


positive

<Elon Musk Confirms Tesla As the Mystery Big-Tech That Signed $16.5 Billion Chip Contract With Samsung: 'I Will Walk The Line Personally' To Boost Progress (UPDATED)> --> <Positive> (score = 0.99) What does this sentiment imply?


positive

<I Asked ChatGPT What Elon Musk’s ‘America Party’ Means for My Taxes, Here’s What it Said> --> <Neutral> (score = 0.99) What does this sentiment imply?


I Asked ChatGPT What Elon Musk’s ‘America Party’ Means

<Waymo to launch autonomous ride-hailing in Dallas next year> --> <Neutral> (score = 0.99) What does this sentiment i
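
One way to quantify how the per-headline LLM answers relate to FinBERT's labels is a simple agreement count. The sketch below uses the five complete (FinBERT label, LLM output) pairs visible above; `normalize` is a hypothetical helper that lower-cases an answer and treats anything outside the three sentiment words as invalid.

```python
VALID = {"positive", "neutral", "negative"}


def normalize(answer):
    """Lower-case an answer; return None if it is not a sentiment label."""
    cleaned = answer.strip().lower()
    return cleaned if cleaned in VALID else None


# (FinBERT label, LLM output) for the five complete results shown above.
pairs = [
    ("Neutral", "Positive"),
    ("Neutral", "negative"),
    ("Positive", "positive"),
    ("Positive", "positive"),
    ("Neutral", "I Asked ChatGPT What Elon Musk’s ‘America Party’ Means"),
]

agree = sum(normalize(finbert) == normalize(llm) for finbert, llm in pairs)
invalid = sum(normalize(llm) is None for _, llm in pairs)
print(f"Agreement: {agree}/{len(pairs)}, non-label answers: {invalid}")
```

On these five pairs the two models agree only where FinBERT says Positive, and one LLM answer merely echoes part of the headline.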

### 💼 Q10: Compare the Per-Headline Outputs with the Summary from the Previous Answer

⬇️ *Write your answer below:*

In [None]:
# Q10