# Financial Sentiment Analysis (Single)

In this program, I run a sentiment analysis of a single company using financial news articles.

The company I am targeting is Nvidia (NVDA).  

I then use that sentiment data to try to predict the movement of the company's stock price.  

The packages that I am using are:  
`os`, `dotenv`, `datetime`, `newsapi`, `pandas`, `nltk`, `re`, `string`, `yfinance`

## Fetching News Articles

The first step is to fetch the news articles.  

I am using `NewsAPI` to get articles quickly and easily. Then, I use `pandas` to put the articles into a dataframe, where I can collect and read the data more easily.  

**Filtering articles:**  
Keep only articles that still exist  
- `NewsAPI` sometimes returns articles that have been removed; their title is `[Removed]`  

**Extracting the data:**  
Extract only the necessary data from the articles
- Title
- Content
- Publication date (`publishedAt`)

All other fields can be discarded.  

Both of these steps are part of data cleaning, which is the next step in text preprocessing.

In [1]:
import os
from dotenv import load_dotenv

In [2]:
# get path to the environment file
env_path = '../../config/.env'
load_dotenv(env_path)

True

In [3]:
# import the datetime and timedelta classes from the datetime module
from datetime import datetime, timedelta

In [4]:
# import newsapi package
from newsapi import NewsApiClient

In [5]:
# init newsapi
newsapi = NewsApiClient(api_key=os.getenv('NEWS_API_KEY'))
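
`os.getenv` returns `None` when a variable is missing, so a misconfigured `.env` file may only surface later, when a request fails. A minimal sketch of an explicit check, assuming a hypothetical helper named `require_env` that is not part of the original notebook:

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for illustration; `environ`
    defaults to os.environ and can be replaced with a dict for testing.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check the .env file"
        )
    return value


# Example with an in-memory mapping instead of the real environment
api_key = require_env("NEWS_API_KEY", {"NEWS_API_KEY": "dummy-key"})
```

In the notebook, the result of `require_env('NEWS_API_KEY')` could be passed to `NewsApiClient` in place of the bare `os.getenv` call.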

In [6]:
company = "Nvidia"
days_back = 29
end_date = datetime.now()
start_date = end_date - timedelta(days=days_back)

In [7]:
# fetch articles that mention Nvidia (a single request returns only the first page of results)
all_articles = newsapi.get_everything(q=company,
                                      from_param=start_date.strftime('%Y-%m-%d'),
                                      to=end_date.strftime('%Y-%m-%d'),
                                      language='en')
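
The response is a dictionary with the top-level keys `status`, `totalResults`, and `articles`. Since one request returns only one page, it can help to compare the number of fetched articles with `totalResults`. A small sketch, assuming a hypothetical helper `summarize_response` and a hand-made sample dictionary rather than real API output:

```python
def summarize_response(response):
    """Summarize a NewsAPI-style response dict.

    Assumes the top-level keys 'status', 'totalResults', and 'articles';
    `summarize_response` itself is a hypothetical helper.
    """
    articles = response.get("articles", [])
    total = response.get("totalResults", len(articles))
    return {
        "ok": response.get("status") == "ok",
        "fetched": len(articles),
        "total": total,
        # True when the API reports more matches than this page contains
        "truncated": len(articles) < total,
    }


# Hand-made sample for illustration, not real API output
sample = {
    "status": "ok",
    "totalResults": 250,
    "articles": [{"title": "A"}, {"title": "B"}],
}
summary = summarize_response(sample)
```

If `truncated` is true, the remaining matches would need further requests, for example through the client's `page` parameter.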

In [8]:
import pandas as pd
pd.__version__

'2.2.3'

In [9]:
# place all_articles into a dataframe
all_articles_df = pd.DataFrame(all_articles['articles'])
all_articles_df

Unnamed: 0,source,author,title,description,url,urlToImage,publishedAt,content
0,"{'id': None, 'name': 'Yahoo Entertainment'}",Lawrence Bonk,DOJ subpoenas NVIDIA as part of antitrust prob...,The DOJ has sent subpoenas to NVIDIA and other...,https://consent.yahoo.com/v2/collectConsent?se...,,2024-09-04T15:34:35Z,"If you click 'Accept all', we and our partners..."
1,"{'id': None, 'name': 'Gizmodo.com'}",Kyle Barr,The Leaked Nvidia RTX 5090 Has So Many Cores I...,Get ready to watch the lights on your block di...,https://gizmodo.com/the-leaked-nvidia-rtx-5090...,https://gizmodo.com/app/uploads/2024/09/Nvidia...,2024-09-27T13:35:22Z,The GeForce RTX 4090 is already so big that an...
2,"{'id': None, 'name': 'Yahoo Entertainment'}",Jeremy Gan,ByteDance will reportedly use Huawei chips to ...,"As first reported by Reuters, ByteDance, the C...",https://consent.yahoo.com/v2/collectConsent?se...,,2024-09-30T15:48:46Z,"If you click 'Accept all', we and our partners..."
3,"{'id': 'business-insider', 'name': 'Business I...",Emma Cosgrove,Nvidia might actually lose in this key part of...,"As AI matures, Nvidia, Groq, and Cerebras focu...",https://www.businessinsider.com/nvidia-may-los...,https://i.insider.com/66d0c408392a3bda9f2349e3...,2024-09-01T13:00:02Z,Justin Sullivan/Getty\r\n<ul><li>Inference mad...
4,"{'id': 'business-insider', 'name': 'Business I...",Eugene Kim,This chart shows one potential advantage AWS's...,"AI chip investments by Amazon, Google, and Mic...",https://www.businessinsider.com/aws-ai-chips-w...,https://i.insider.com/6622c44b23b29110d3011ce1...,2024-09-26T09:00:02Z,Noah Berger/Getty Images\r\n<ul><li>Big tech c...
...,...,...,...,...,...,...,...,...
95,"{'id': None, 'name': 'Yahoo Entertainment'}","Sean Williams, The Motley Fool","Billionaires Warren Buffett, David Tepper, and...",Some of Wall Street's most successful value-se...,https://finance.yahoo.com/news/billionaires-wa...,https://s.yimg.com/cv/apiv2/social/images/yaho...,2024-09-21T09:06:00Z,"For the better part of two years, the bulls ha..."
96,"{'id': None, 'name': 'Theregister.com'}",Thomas Claburn,OpenAI allegedly wants TSMC 1.6nm for in-house...,"Another job for Broadcom, then\nOpenAI's first...",https://www.theregister.com/2024/09/04/openai_...,https://regmedia.co.uk/2021/01/12/shutterstock...,2024-09-04T02:29:19Z,OpenAI's first custom-designed silicon chips a...
97,"{'id': None, 'name': 'Theregister.com'}",Liam Proven,Double Debian update: 11.11 and 12.7 arrive at...,But Bullseye's days are numbered and it's time...,https://www.theregister.com/2024/09/04/double_...,https://regmedia.co.uk/2021/08/16/shutterstock...,2024-09-04T11:28:06Z,"The latest update to Debian ""Bookworm"" arrives..."
98,"{'id': 'business-insider', 'name': 'Business I...",Dan DeFrancesco,The tech industry is ready for robot taxis. Bu...,"Driverless cars are gaining momentum, with Tes...",https://www.businessinsider.com/waymo-robot-ta...,https://i.insider.com/66a90e9a1a227600e632ca38...,2024-09-04T12:48:22Z,Waymo's fully autonomous Jaguar I-PACEBlue Pla...


In [10]:
# filter articles function
# keeps only valid articles
# valid meaning: the article's title is not '[Removed]'
def filter_removed_articles(articles):
    return [article for article in articles if article.get('title') != '[Removed]']

In [11]:
# filter the all_articles
valid_articles = filter_removed_articles(all_articles['articles'])

In [12]:
valid_articles_df = pd.DataFrame(valid_articles)
valid_articles_df

Unnamed: 0,source,author,title,description,url,urlToImage,publishedAt,content
0,"{'id': None, 'name': 'Yahoo Entertainment'}",Lawrence Bonk,DOJ subpoenas NVIDIA as part of antitrust prob...,The DOJ has sent subpoenas to NVIDIA and other...,https://consent.yahoo.com/v2/collectConsent?se...,,2024-09-04T15:34:35Z,"If you click 'Accept all', we and our partners..."
1,"{'id': None, 'name': 'Gizmodo.com'}",Kyle Barr,The Leaked Nvidia RTX 5090 Has So Many Cores I...,Get ready to watch the lights on your block di...,https://gizmodo.com/the-leaked-nvidia-rtx-5090...,https://gizmodo.com/app/uploads/2024/09/Nvidia...,2024-09-27T13:35:22Z,The GeForce RTX 4090 is already so big that an...
2,"{'id': None, 'name': 'Yahoo Entertainment'}",Jeremy Gan,ByteDance will reportedly use Huawei chips to ...,"As first reported by Reuters, ByteDance, the C...",https://consent.yahoo.com/v2/collectConsent?se...,,2024-09-30T15:48:46Z,"If you click 'Accept all', we and our partners..."
3,"{'id': 'business-insider', 'name': 'Business I...",Emma Cosgrove,Nvidia might actually lose in this key part of...,"As AI matures, Nvidia, Groq, and Cerebras focu...",https://www.businessinsider.com/nvidia-may-los...,https://i.insider.com/66d0c408392a3bda9f2349e3...,2024-09-01T13:00:02Z,Justin Sullivan/Getty\r\n<ul><li>Inference mad...
4,"{'id': 'business-insider', 'name': 'Business I...",Eugene Kim,This chart shows one potential advantage AWS's...,"AI chip investments by Amazon, Google, and Mic...",https://www.businessinsider.com/aws-ai-chips-w...,https://i.insider.com/6622c44b23b29110d3011ce1...,2024-09-26T09:00:02Z,Noah Berger/Getty Images\r\n<ul><li>Big tech c...
...,...,...,...,...,...,...,...,...
94,"{'id': None, 'name': 'Yahoo Entertainment'}","Sean Williams, The Motley Fool","Billionaires Warren Buffett, David Tepper, and...",Some of Wall Street's most successful value-se...,https://finance.yahoo.com/news/billionaires-wa...,https://s.yimg.com/cv/apiv2/social/images/yaho...,2024-09-21T09:06:00Z,"For the better part of two years, the bulls ha..."
95,"{'id': None, 'name': 'Theregister.com'}",Thomas Claburn,OpenAI allegedly wants TSMC 1.6nm for in-house...,"Another job for Broadcom, then\nOpenAI's first...",https://www.theregister.com/2024/09/04/openai_...,https://regmedia.co.uk/2021/01/12/shutterstock...,2024-09-04T02:29:19Z,OpenAI's first custom-designed silicon chips a...
96,"{'id': None, 'name': 'Theregister.com'}",Liam Proven,Double Debian update: 11.11 and 12.7 arrive at...,But Bullseye's days are numbered and it's time...,https://www.theregister.com/2024/09/04/double_...,https://regmedia.co.uk/2021/08/16/shutterstock...,2024-09-04T11:28:06Z,"The latest update to Debian ""Bookworm"" arrives..."
97,"{'id': 'business-insider', 'name': 'Business I...",Dan DeFrancesco,The tech industry is ready for robot taxis. Bu...,"Driverless cars are gaining momentum, with Tes...",https://www.businessinsider.com/waymo-robot-ta...,https://i.insider.com/66a90e9a1a227600e632ca38...,2024-09-04T12:48:22Z,Waymo's fully autonomous Jaguar I-PACEBlue Pla...
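
Several rows in the output above still carry no usable text: the Yahoo entries contain a cookie-consent notice ("If you click 'Accept all'...") instead of article content. A stricter filter might look like the sketch below. The `[Removed]` check comes from the notebook; the consent-page check and the name `is_usable` are assumptions added for illustration.

```python
def is_usable(article):
    """Return True if the article appears to contain real, readable content.

    Hypothetical stricter filter. The '[Removed]' title check mirrors the
    notebook; the consent-notice check is an assumption based on the
    placeholder text seen in the fetched data.
    """
    title = article.get("title")
    content = article.get("content") or ""
    if not title or title == "[Removed]":
        return False
    if content.startswith("If you click 'Accept all'"):
        return False
    # Reject articles whose content is empty or whitespace only
    return bool(content.strip())


# Hand-made sample records for illustration
sample_articles = [
    {"title": "[Removed]", "content": "[Removed]"},
    {"title": "Consent page", "content": "If you click 'Accept all', we and our partners..."},
    {"title": "No content", "content": None},
    {"title": "Kept", "content": "The GeForce RTX 4090 is already so big..."},
]
usable = [a for a in sample_articles if is_usable(a)]
```

Whether to drop the consent-page rows depends on the analysis: their titles are still real headlines, so a title-only sentiment pass could keep them.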


In [13]:
# extract article essentials function
# extract only the title, content, and publishedAt fields from the articles
def extract_article_essentials(articles):
    return [{'title': article['title'], 'content': article['content'], 'publishedAt': article['publishedAt']} for article in articles]

In [14]:
extracted_articles = extract_article_essentials(valid_articles)

In [15]:
extracted_articles_df = pd.DataFrame(extracted_articles)
extracted_articles_df

Unnamed: 0,title,content,publishedAt
0,DOJ subpoenas NVIDIA as part of antitrust prob...,"If you click 'Accept all', we and our partners...",2024-09-04T15:34:35Z
1,The Leaked Nvidia RTX 5090 Has So Many Cores I...,The GeForce RTX 4090 is already so big that an...,2024-09-27T13:35:22Z
2,ByteDance will reportedly use Huawei chips to ...,"If you click 'Accept all', we and our partners...",2024-09-30T15:48:46Z
3,Nvidia might actually lose in this key part of...,Justin Sullivan/Getty\r\n<ul><li>Inference mad...,2024-09-01T13:00:02Z
4,This chart shows one potential advantage AWS's...,Noah Berger/Getty Images\r\n<ul><li>Big tech c...,2024-09-26T09:00:02Z
...,...,...,...
94,"Billionaires Warren Buffett, David Tepper, and...","For the better part of two years, the bulls ha...",2024-09-21T09:06:00Z
95,OpenAI allegedly wants TSMC 1.6nm for in-house...,OpenAI's first custom-designed silicon chips a...,2024-09-04T02:29:19Z
96,Double Debian update: 11.11 and 12.7 arrive at...,"The latest update to Debian ""Bookworm"" arrives...",2024-09-04T11:28:06Z
97,The tech industry is ready for robot taxis. Bu...,Waymo's fully autonomous Jaguar I-PACEBlue Pla...,2024-09-04T12:48:22Z
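
The `publishedAt` values are ISO 8601 strings in UTC. To line articles up with trading days later on, they will probably need to be parsed into datetimes. A minimal sketch using only the standard library; the helper name `extract_with_timestamp` is hypothetical, and `.get` is used so a missing title or content yields `None` rather than a `KeyError`:

```python
from datetime import datetime, timezone


def extract_with_timestamp(article):
    """Keep title and content, and parse publishedAt into an aware datetime.

    Hypothetical variant of the extraction step; assumes publishedAt has
    the form 'YYYY-MM-DDTHH:MM:SSZ' as in the output above.
    """
    published = datetime.strptime(
        article["publishedAt"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "title": article.get("title"),
        "content": article.get("content"),
        "publishedAt": published,
    }


# Hand-made sample record using a timestamp format seen in the data
row = extract_with_timestamp(
    {"title": "Example", "content": "Text", "publishedAt": "2024-09-04T15:34:35Z"}
)
published_day = row["publishedAt"].date().isoformat()
```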


In [16]:
# write tab-separated values so the file contents match the .tsv extension
extracted_articles_df.to_csv(os.path.join('../DataFrames', 'extracted_articles_df.tsv'), sep='\t', index=False)
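
The `content` column contains embedded line breaks (`\r\n`), so whatever writes the file has to quote those fields for the rows to survive a round trip. The sketch below illustrates this with the standard `csv` module instead of `pandas`; `write_tsv`, `read_tsv`, and the temporary output path are hypothetical and only for demonstration.

```python
import csv
import os
import tempfile


def write_tsv(rows, path, fieldnames):
    """Write dict rows to a tab-separated file, creating the folder if needed."""
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    # newline="" lets the csv module control line endings and quoting
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)


def read_tsv(path):
    """Read a tab-separated file back into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


# Hand-made sample row with an embedded line break in the content
rows = [
    {
        "title": "Example",
        "content": "Line one\r\nLine two",
        "publishedAt": "2024-09-04T15:34:35Z",
    }
]
out_path = os.path.join(tempfile.mkdtemp(), "DataFrames", "example.tsv")
write_tsv(rows, out_path, ["title", "content", "publishedAt"])
round_trip = read_tsv(out_path)
```

Reading the file back and comparing it with the original rows is a quick check that no article was split across lines.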