# **Sentiment Analysis for stock news and price trend indicator**

## **Input the stock ticker for which you want sentiment analysis**


In [None]:
# Input the ticker

tickers=['f']

## Steps for the sentiment analysis and price movement comparison

- ***BeautifulSoup Library***
    - BeautifulSoup is used to scrape the news table from 'https://finviz.com/quote.ashx?t=' followed by the ticker symbol.
    - Inspecting the web page reveals the table id (`news-table`), which is then used to extract each `tr` (table row).
    - The news headlines are then converted into a DataFrame, which is used for the sentiment analysis.
- ***NLTK VADER Library***
    - NLTK VADER computes the neutral, positive, negative and compound sentiment scores for each headline.
    - The compound scores are then averaged by date to get a mean-score DataFrame.
- ***Yfinance Library***
    - Yfinance is used to obtain the closing prices for the ticker and for the S&P 500, NASDAQ and DOW 30 indices.
    - A 2-week period is used for the price charts so that the price movements can be compared with the sentiment scores.
    - Keep in mind that the sentiment score of t0 is expected to be reflected in the price movement of t1.
- ***Dash Plotly Express Library***
    - Plotly Express is used to create the charts for the sentiment scores and price movements.
    - A Dash app serves the charts on a web page. The user interacts with a chart in the browser, and Dash calls the Python code and displays the updated chart.
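
The t0/t1 relationship mentioned in the list can be made concrete by pairing each day's mean sentiment score with the following trading day's price change. The sketch below is only an illustration under stated assumptions, not the notebook's method: `pair_with_next_day` is a hypothetical helper, and both the scores and the closing prices are made-up values.

```python
from datetime import date

def pair_with_next_day(scores, closes):
    """Pair each day's sentiment score with the next trading day's return.

    scores: dict mapping date -> mean compound score (assumed input)
    closes: dict mapping date -> closing price (assumed input)
    """
    days = sorted(closes)
    pairs = []
    for today, next_day in zip(days, days[1:]):
        if today not in scores:
            continue  # no news on this day, so there is nothing to compare
        next_return = closes[next_day] / closes[today] - 1
        pairs.append((today, scores[today], next_return))
    return pairs

# Made-up values, for illustration only
closes = {date(2021, 7, 28): 10.0, date(2021, 7, 29): 11.0, date(2021, 7, 30): 9.9}
scores = {date(2021, 7, 28): 0.2, date(2021, 7, 29): -0.1}

for day, score, ret in pair_with_next_day(scores, closes):
    print(day, score, round(ret, 4))
```

Each tuple holds the score at t0 and the return realised at t1, so the two can be plotted or correlated directly.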
    

In [None]:
# Imports for scraping, sentiment analysis and charting
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup
import pandas as pd
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.express as px
from jupyter_dash import JupyterDash

# NLTK VADER for sentiment analysis
from nltk.sentiment.vader import SentimentIntensityAnalyzer
# tickers=['tsla']
finwiz_url = 'https://finviz.com/quote.ashx?t='

# Ticker news

news_tables = {}
# tickers = ticker2
for ticker in tickers:
    url = finwiz_url + ticker
    req = Request(url=url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20100101 Firefox/20.0'})
    response = urlopen(req)
    # Parse the response; the result is named 'soup' so that it does not
    # shadow the 'html' module imported from Dash above
    soup = BeautifulSoup(response, 'html.parser')
    # Find 'news-table' in the soup and load it into 'news_table'
    news_table = soup.find(id='news-table')
    # Add the table to our dictionary
    news_tables[ticker] = news_table

parsed_news = []

# Iterate through the news
for file_name, news_table in news_tables.items():
    # Iterate through all tr tags in 'news_table'
    for x in news_table.find_all('tr'):
        # Skip rows that have no headline link
        if x.a is None:
            continue
        # Read the headline text from the a tag only
        text = x.a.get_text()
        # Split the text in the td tag into a list
        date_scrape = x.td.text.split()

        # If 'date_scrape' has a single element, it is the time and the
        # date carries over from the previous row
        if len(date_scrape) == 1:
            time = date_scrape[0]
        # Otherwise the 1st element is the date and the 2nd is the time
        else:
            date = date_scrape[0]
            time = date_scrape[1]
        # The dictionary key is the ticker; keep only the part before the 1st '_', if any
        ticker = file_name.split('_')[0]

        # Append ticker, date, time and headline as a list to the 'parsed_news' list
        parsed_news.append([ticker, date, time, text])
        
# Instantiate the sentiment intensity analyzer
vader = SentimentIntensityAnalyzer()

# Set column names
columns = ['ticker', 'date', 'time', 'headline']

# Convert the parsed_news list into a DataFrame called 'parsed_and_scored_news'
parsed_and_scored_news = pd.DataFrame(parsed_news, columns=columns)

# Iterate through the headlines and get the polarity scores using vader
scores = parsed_and_scored_news['headline'].apply(vader.polarity_scores).tolist()

# Convert the 'scores' list of dicts into a DataFrame
scores_df = pd.DataFrame(scores)

# Join the DataFrames of the news and the list of dicts
parsed_and_scored_news = parsed_and_scored_news.join(scores_df, rsuffix='_right')

# Convert the date column from string to datetime
parsed_and_scored_news['date'] = pd.to_datetime(parsed_and_scored_news.date).dt.date

# parsed_and_scored_news.head(50)

# Group by ticker and date and calculate the mean of the numeric score columns
mean_scores = parsed_and_scored_news.groupby(['ticker','date']).mean(numeric_only=True)
# Unstack the innermost index level ('date') into the columns
mean_scores1 = mean_scores.unstack()
# Get the cross-section of compound in the 'columns' axis, then put dates on the rows
mean_scores11 = mean_scores1.xs('compound', axis="columns").transpose()
# Create the bar chart of the compound sentiment scores
sentifig=px.bar(mean_scores11,barmode='group',title= 'Compound sentiment scores of news articles')
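
The parsing loop above assumes that the first cell of a row holds either a date and a time or only a time, in which case the date of the previous row still applies. A minimal standalone sketch of that carry-forward logic follows. `fill_dates` is a hypothetical helper, and the cell strings are made-up samples in an assumed format, not scraped data.

```python
def fill_dates(cells):
    """Return (date, time) pairs, carrying the last seen date forward.

    cells: list of strings such as 'Aug-03-21 01:54PM' or '09:36AM'
    (sample formats assumed for illustration).
    """
    rows = []
    current_date = None
    for cell in cells:
        parts = cell.split()
        if len(parts) == 1:
            time_str = parts[0]  # time only: reuse the previous date
        else:
            current_date, time_str = parts[0], parts[1]
        if current_date is None:
            raise ValueError("the first row must contain a date")
        rows.append((current_date, time_str))
    return rows

sample = ['Aug-03-21 01:54PM', '09:36AM', 'Aug-02-21 03:06PM', '01:43PM']
print(fill_dates(sample))
```

Raising an error when the very first row lacks a date makes the assumption explicit instead of failing later with an undefined variable.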

In [19]:
mean_scores11

| date | ticker f |
|---|---|
| 2021-07-21 | -0.142975 |
| 2021-07-22 | -0.146733 |
| 2021-07-23 | 0.008 |
| 2021-07-24 | -0.379486 |
| 2021-07-26 | 0.224657 |
| 2021-07-27 | 0.085 |
| 2021-07-28 | 0.219206 |
| 2021-07-29 | 0.137904 |
| 2021-07-30 | 0.218417 |
| 2021-08-02 | 0.144675 |


In [20]:
parsed_and_scored_news.head(25)

| | ticker | date | time | headline | neg | neu | pos | compound |
|---|---|---|---|---|---|---|---|---|
| 0 | f | 2021-08-03 | 01:54PM | GM Option Traders Predict Earnings to be Runni... | 0.167 | 0.833 | 0.0 | -0.2023 |
| 1 | f | 2021-08-03 | 09:36AM | 3 Auto Stocks Likely to Zoom Past Q2 Earnings ... | 0.0 | 1.0 | 0.0 | 0.0 |
| 2 | f | 2021-08-03 | 09:00AM | Hau Thai-Tang to Discuss Ford+, Winning in Ele... | 0.0 | 0.815 | 0.185 | 0.5267 |
| 3 | f | 2021-08-03 | 07:10AM | Ford Driving Toward 2H Upside | 0.0 | 1.0 | 0.0 | 0.0 |
| 4 | f | 2021-08-03 | 12:46AM | Ford Motor Company (NYSE:F) Remains Undervalue... | 0.0 | 0.584 | 0.416 | 0.6908 |
| 5 | f | 2021-08-02 | 03:06PM | GM Stock A Buy? General Motors Tests Key Level... | 0.0 | 0.827 | 0.173 | 0.3182 |
| 6 | f | 2021-08-02 | 01:43PM | Jack Roush Jr. on booming performance car and ... | 0.0 | 1.0 | 0.0 | 0.0 |
| 7 | f | 2021-08-02 | 01:19PM | This Bill Gates-backed battery maker is on tra... | 0.0 | 1.0 | 0.0 | 0.0 |
| 8 | f | 2021-08-02 | 01:01PM | These Are The Best EV Stocks To Buy And Watch Now | 0.0 | 0.704 | 0.296 | 0.6369 |
| 9 | f | 2021-08-02 | 10:59AM | Is Ford Stock A Buy Now After Earnings? Automa... | 0.0 | 1.0 | 0.0 | 0.0 |
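
To show how per-headline compound scores turn into one score per date, the sketch below averages the compound values of the ten rows displayed above using only the standard library. Because it covers just those ten rows, its results will not match the full-day means computed from all scraped headlines.

```python
from collections import defaultdict
from statistics import mean

# (date, compound) pairs copied from the ten rows shown above
rows = [
    ('2021-08-03', -0.2023), ('2021-08-03', 0.0), ('2021-08-03', 0.5267),
    ('2021-08-03', 0.0), ('2021-08-03', 0.6908),
    ('2021-08-02', 0.3182), ('2021-08-02', 0.0), ('2021-08-02', 0.0),
    ('2021-08-02', 0.6369), ('2021-08-02', 0.0),
]

# Collect the compound scores of each date, then average them
by_date = defaultdict(list)
for day, compound in rows:
    by_date[day].append(compound)

daily_mean = {day: mean(values) for day, values in by_date.items()}
for day in sorted(daily_mean):
    print(day, round(daily_mean[day], 5))
```

This is the same aggregation that a group-by on the date column followed by a mean performs, written out step by step.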


In [15]:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import pandas as pd
import plotly.express as px

from jupyter_dash import JupyterDash

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

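# Code for creating the price charts: one chart for the stock and one chart for the indices

import plotly.express as px
import plotly.graph_objects as go
import yfinance as yf

# Stock price chart for the input ticker
# Note: period='2wk' is kept from the original run; if your yfinance version
# does not accept it, passing explicit start/end dates is an alternative.

stock = yf.download(tickers[0],period='2wk' ,interval = "1d").reset_index()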
figprice = px.line(stock, x='Date', y="Close")
# stock2 = yf.download(tickers[1],period='2wk' ,interval = "1d").reset_index()
# zzzz = px.line(stock2, x='Date', y="Close")



# Index Chart for S&P 500, NASDAQ, DOW30 

index1 = yf.download("^GSPC ^IXIC ^DJI",period='2wk' ,interval = "1d")
closeidx=index1['Close'].reset_index()
closeidx['SP500_change']=closeidx['^GSPC'].pct_change()
closeidx['NASDAQ_change']=closeidx['^IXIC'].pct_change()
closeidx['DOW30_change']=closeidx['^DJI'].pct_change()
# closeidx
figindex = px.line(closeidx, x='Date',y=['SP500_change','NASDAQ_change','DOW30_change'])




app.layout = html.Div(children=[
    # All elements from the top of the page
    html.Div([
        html.Div([
            html.H1(children='Sentiments Chart'),

            html.Div(children='''
                Sentiment analysis for stocks
            '''),

            dcc.Graph(
                id='graph1',
                figure=sentifig
            ),  
        ], className='six columns'),
        html.Div([
            html.H1(children='Price Chart '),

            html.Div(children='''
                Price chart for stocks
            '''),

            dcc.Graph(
                id='graph2',
                figure=figprice
            ),  
        ], className='six columns'),
    ], className='row'),
    # New Div for all elements in the new 'row' of the page
    html.Div([
        html.H1(children='Index Chart'),

        html.Div(children='''
            Chart shows the daily change of each index over the last 2 weeks
        '''),

        dcc.Graph(
            id='graph3',
            figure=figindex
        ),  
    ], className='row'),
])

if __name__ == '__main__':
    app.run_server(debug=False, port = 8294)

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  3 of 3 completed
Dash is running on http://127.0.0.1:8294/

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off


 * Running on http://127.0.0.1:8294/ (Press CTRL+C to quit)
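
The index chart in the layout above plots the output of `pct_change()`, which is the fractional change from one row to the next, current / previous - 1, so a value of 0.02 means a 2% rise. A plain-Python sketch of the same arithmetic follows. The closing values are made up, and the first entry is `None` here where pandas would give NaN.

```python
def pct_change(values):
    """Fractional change between consecutive values; the first entry is None."""
    changes = [None]
    for prev, curr in zip(values, values[1:]):
        changes.append(curr / prev - 1)
    return changes

closes = [100.0, 102.0, 99.96]  # made-up closing values
print(pct_change(closes))
```

Multiplying each result by 100 gives the change in percent if the chart should display percentages rather than fractions.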