<a href="https://colab.research.google.com/github/VIPinKumar07/Quantitative-Finance/blob/main/Code/Stock_News_Sentiment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 📈 Introduction
Sentiment analysis of news headlines is one way to gauge how the market currently views a company. The Python code below fetches recent headlines for a set of stock tickers (GS, JPM, MS, AAPL, NVDA, and GOOG) and scores the sentiment of each headline.

## Understanding the Code

The code uses Python's urllib to download each ticker's quote page from the Finviz website, BeautifulSoup to extract the news table from the page, NLTK's VADER sentiment analyzer to score each headline, and Pandas to collect the results. It then averages the compound sentiment score per ticker, which gives a rough ranking of the chosen stocks by headline sentiment.

## Python Code

In [1]:
# Importing necessary modules and libraries
import pandas as pd
from bs4 import BeautifulSoup
from urllib.request import urlopen, Request
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import nltk
from datetime import datetime

In [2]:
# Stocks to be analysed: Goldman Sachs (GS), JP Morgan (JPM), Morgan Stanley (MS), Apple (AAPL), Nvidia (NVDA), Google (GOOG)
num_news = 10
stocks = ['GS','JPM','MS','AAPL','NVDA','GOOG']

# Get news data from finviz website
url_finviz = 'https://finviz.com/quote.ashx?t='
news_tables = {}  # Maps each ticker to its parsed news table

In [3]:
for stock in stocks:
    url = url_finviz + stock
    request = Request(url=url, headers={'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'})
    resp = urlopen(request)
    html = BeautifulSoup(resp, features="lxml")
    news_table = html.find(id='news-table')
    news_tables[stock] = news_table
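
The loop above assumes that every request succeeds. A minimal sketch of a more defensive fetch, using only the standard library, might look like the following. `build_request` and `fetch_html` are hypothetical helper names, and the shortened user-agent string and 10-second timeout are arbitrary choices rather than part of the original notebook.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

URL_FINVIZ = 'https://finviz.com/quote.ashx?t='  # same base URL as in the notebook


def build_request(ticker):
    """Build a request for one ticker's quote page with a browser-like user agent."""
    # 'Mozilla/5.0' is a shortened, illustrative user-agent string
    return Request(url=URL_FINVIZ + ticker, headers={'User-Agent': 'Mozilla/5.0'})


def fetch_html(ticker, timeout=10):
    """Return the raw page bytes, or None if an HTTP or URL error is raised."""
    try:
        with urlopen(build_request(ticker), timeout=timeout) as resp:
            return resp.read()
    except (HTTPError, URLError) as err:
        print('Could not fetch {}: {}'.format(ticker, err))
        return None
```

A `None` result can then be skipped before calling BeautifulSoup, so that one failing ticker does not abort the whole loop.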

In [4]:
# Display recent news headlines for each stock
for stock in stocks:
    news_table = news_tables.get(stock)
    if news_table is None:  # Ticker not fetched, or page had no news table
        continue

    print('\n')
    print('Recent news Headlines for {}: '.format(stock))
    print('----------------------------------')

    # Show only the first num_news rows
    for table_row in news_table.find_all('tr')[:num_news]:
        if table_row.a is None:  # Skip rows without a headline link
            continue
        news_text = table_row.a.text
        td_text = table_row.td.text.strip()
        print(news_text, '(', td_text, ')')



Recent news Headlines for GS: 
----------------------------------
The Companies Conducting Layoffs in 2023: Heres the List ( Today 11:30AM )
Goldman hopes to double private credit business as it retreats from consumer banking ( Dec-11-23 04:29PM )
Final Trades: Cisco, Goldman Sachs and KLA Corp. ( 02:25PM )
Goldman Sachs Group (GS) Rose on Successful Execution ( 04:29AM )
The risk of recession is quite low, says Goldman Sachs' Jan Hatzius ( Dec-08-23 12:16PM )
2024 will be the year of bonds, says Goldman Sachs' Ashish Shah ( 09:10AM )
Senator Warner: Big bank CEOs have an argument when it comes to the Feds proposed capital rules ( Dec-07-23 01:56PM )
Sen. Warner 'frustrated' with banks ignoring liquidity risks ( 12:32PM )
Big Banks Need More Out of Wall Street ( 11:28AM )
How effective is Congress being in its banking regulations ( Dec-06-23 05:03PM )


Recent news Headlines for JPM: 
----------------------------------
JPMorgan Winds Down Robo-Advisor, Cites Profitability Challenges 

In [5]:
# Extract and parse news data
parsed_news = []
for stock in stocks:
    news_table = news_tables[stock]

    for row in news_table.find_all('tr'):
        if row.a is None:  # Skip rows without a headline link
            continue
        text = row.a.text
        date_scrape = row.td.text.split()

        if len(date_scrape) == 1:
            # Time-only cell: no date was scraped, so the row is labelled
            # 'Today', even when it follows a row with an older date
            time = date_scrape[0]
            date = 'Today'
        else:
            date = date_scrape[0]
            time = date_scrape[1]

        parsed_news.append([stock, date, time, text])
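
The printed headlines suggest that Finviz shows the date only on the first headline of each day and a bare time on the rows that follow, so a time-only row most likely belongs to the preceding date rather than to today. Below is a minimal sketch of carrying the date forward, assuming that layout and the `Dec-11-23` date format seen in the output. `resolve_timestamps` is a hypothetical helper, not part of the original notebook, and the value passed as `today` is illustrative only.

```python
from datetime import date, datetime


def resolve_timestamps(cells, today=None):
    """Turn scraped date/time cells into (date, time_string) pairs.

    Assumes a cell is either 'Today 11:30AM', 'Dec-11-23 04:29PM',
    or a bare time such as '02:25PM' that inherits the previous row's date.
    """
    today = today or date.today()
    current = None  # date of the most recent row that carried one
    resolved = []
    for cell in cells:
        parts = cell.split()
        if len(parts) == 2:
            label, time_text = parts
            if label == 'Today':
                current = today
            else:
                current = datetime.strptime(label, '%b-%d-%y').date()
        else:
            time_text = parts[0]  # bare time: reuse the carried-forward date
        resolved.append((current, time_text))
    return resolved


# Cells copied from the GS output; the 'today' value is an assumption
cells = ['Today 11:30AM', 'Dec-11-23 04:29PM', '02:25PM', '04:29AM']
rows = resolve_timestamps(cells, today=date(2023, 12, 12))
for row in rows:
    print(row)
```

If the first cell is a bare time, its date stays `None`, which makes the gap visible instead of silently assigning today's date.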

In [6]:
# Create dataframe of news headlines (sentiment scores are added in a later cell)
columns = ['Stock', 'Date', 'Time', 'Headline']
news = pd.DataFrame(parsed_news, columns=columns)
print(news.shape)
news.head(10)

(600, 4)


|   | Stock | Date | Time | Headline |
|---|-------|------|------|----------|
| 0 | GS | Today | 11:30AM | The Companies Conducting Layoffs in 2023: Here... |
| 1 | GS | Dec-11-23 | 04:29PM | Goldman hopes to double private credit busines... |
| 2 | GS | Today | 02:25PM | Final Trades: Cisco, Goldman Sachs and KLA Corp. |
| 3 | GS | Today | 04:29AM | Goldman Sachs Group (GS) Rose on Successful Ex... |
| 4 | GS | Dec-08-23 | 12:16PM | The risk of recession is quite low, says Goldm... |
| 5 | GS | Today | 09:10AM | 2024 will be the year of bonds, says Goldman S... |
| 6 | GS | Dec-07-23 | 01:56PM | Senator Warner: Big bank CEOs have an argument... |
| 7 | GS | Today | 12:32PM | Sen. Warner 'frustrated' with banks ignoring l... |
| 8 | GS | Today | 11:28AM | Big Banks Need More Out of Wall Street |
| 9 | GS | Dec-06-23 | 05:03PM | How effective is Congress being in its banking... |


In [7]:
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

In [8]:
# Perform sentiment analysis
analyzer = SentimentIntensityAnalyzer()

scores = news['Headline'].apply(analyzer.polarity_scores).tolist()
scores_df = pd.DataFrame(scores)

# Join sentiment scores to news dataframe
news = news.join(scores_df)
news.head()

|   | Stock | Date | Time | Headline | neg | neu | pos | compound |
|---|-------|------|------|----------|-----|-----|-----|----------|
| 0 | GS | Today | 11:30AM | The Companies Conducting Layoffs in 2023: Here... | 0.0 | 1.0 | 0.0 | 0.0 |
| 1 | GS | Dec-11-23 | 04:29PM | Goldman hopes to double private credit busines... | 0.0 | 0.671 | 0.329 | 0.6597 |
| 2 | GS | Today | 02:25PM | Final Trades: Cisco, Goldman Sachs and KLA Corp. | 0.0 | 1.0 | 0.0 | 0.0 |
| 3 | GS | Today | 04:29AM | Goldman Sachs Group (GS) Rose on Successful Ex... | 0.0 | 0.648 | 0.352 | 0.5859 |
| 4 | GS | Dec-08-23 | 12:16PM | The risk of recession is quite low, says Goldm... | 0.448 | 0.552 | 0.0 | -0.7425 |
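
The compound column is VADER's overall score, normalized to the range -1 to 1. A common convention, suggested in the VADER documentation, treats scores of 0.05 or above as positive and -0.05 or below as negative. Here is a minimal sketch of that labeling. `label_sentiment` is a hypothetical helper, and the threshold is a convention rather than something the notebook itself applies.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score to a coarse label using the conventional cutoffs."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'


# Compound scores taken from the first five rows of the table above
for score in [0.0, 0.6597, 0.0, 0.5859, -0.7425]:
    print(score, label_sentiment(score))
```

Labels like these make it easier to see why a headline such as "The risk of recession is quite low" scores negative: VADER works word by word from its lexicon and does not capture that "low" risk is good news.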


In [9]:
# Convert date to datetime object
news['Date'] = news['Date'].apply(lambda x: datetime.now().date() if x == 'Today' else x)  # Substitute "Today" with current date
news['Date'] = pd.to_datetime(news['Date'])
#news = news.drop(columns=['Headline'])

# Group news by ticker and calculate mean sentiment for each
unique_stocks = news['Stock'].unique().tolist()
mean_sentiments = []
for stock in stocks:
    dataframe = news[news['Stock'] == stock]
    mean = round(dataframe['compound'].mean(), 2)
    mean_sentiments.append(mean)

# Create and display dataframe of tickers with mean sentiment scores
sentiments_df = pd.DataFrame(list(zip(stocks, mean_sentiments)), columns=['Stock', 'Mean Sentiment'])
sentiments_df = sentiments_df.set_index('Stock')
sentiments_df = sentiments_df.sort_values('Mean Sentiment', ascending=False)
print('\n')
sentiments_df.head()





| Stock | Mean Sentiment |
|-------|----------------|
| NVDA | 0.26 |
| MS | 0.12 |
| GS | 0.1 |
| JPM | 0.1 |
| GOOG | 0.1 |
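
The per-ticker loop can also be written as a single pandas `groupby`. The following is a minimal sketch on a toy dataframe whose compound values are invented for illustration. It is meant as an equivalent formulation of the averaging step, not as a change to the results above.

```python
import pandas as pd

# Toy rows standing in for the `news` dataframe; the compound values are made up
toy_news = pd.DataFrame({
    'Stock': ['GS', 'GS', 'NVDA', 'NVDA'],
    'compound': [0.5, -0.3, 0.4, 0.2],
})

# Mean compound score per ticker, rounded and sorted from most to least positive
toy_sentiments = (
    toy_news.groupby('Stock')['compound']
    .mean()
    .round(2)
    .sort_values(ascending=False)
    .rename('Mean Sentiment')
    .to_frame()
)
print(toy_sentiments)
```

Because `groupby` derives the tickers from the data, it avoids keeping a separate list of tickers in sync with the dataframe.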
