In this notebook we explore how frequently news articles about Apple (AAPL) were published between 2019/03 and 2024/03.

First, we look at the daily and monthly article counts. Then we explore some days where the article count was very low or very high. For the day with the highest article count, we take a brief look at the movement of the stock price around that date.

Key observations: 
1. While most days seem to have at least one article reporting on stock movements, these articles vary in tone depending on other market information around that day.
2. There may be instances where the combined sentiment score for the day hides a lot of important detail. For example, some days could have a majority of neutral or positive articles along with one or two significantly negative ones. In such a case the combined score may seem relatively neutral, but the presence of negative articles (when on most days this ticker may only get neutral or positive news) may still skew readers' perception in a negative direction.

In [64]:
import pandas as pd
import numpy as np
import datetime
from seaborn import set_style
import plotly as py

set_style("whitegrid")

In [65]:
news = pd.read_csv('../data/news_data_2019_2024_complete.csv')

In [66]:
#restricting to specific tickers
Apl = news.loc[news["Ticker"] == 'AAPL'].reset_index()
Msft = news.loc[news["Ticker"] == 'MSFT'].reset_index()
Amz = news.loc[news["Ticker"] == 'AMZN'].reset_index()
Ggl = news.loc[news["Ticker"] == 'GOOGL'].reset_index()
Nvd = news.loc[news["Ticker"] == 'NVDA'].reset_index()

In [67]:
Apl.head()

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
0,1,1,2019-03-15 10:47:26+00:00,2019-03-15,10:47:26,AAPL,Technology,The Motley Fool,0.0,0.748,0.252,0.5248,Don't Underestimate Apple's iPhone Business,The segment is an invaluable asset to Apple's ...,https://www.fool.com/investing/2019/03/14/dont...
1,11,11,2019-03-15 19:16:44+00:00,2019-03-15,19:16:44,AAPL,Technology,CNBC,0.0,0.651,0.349,0.9493,Apple's new streaming service seeks Oscar glory,Apple is reportedly hiring strategists to help...,https://www.cnbc.com/2019/03/15/apples-new-str...
2,15,15,2019-03-15 19:40:00+00:00,2019-03-15,19:40:00,AAPL,Technology,The Motley Fool,0.0,1.0,0.0,0.0,Apple Just Scored the $1 Billion It Was Asking...,But it won't be able to collect quite yet.,https://www.fool.com/investing/2019/03/15/appl...
3,22,22,2019-03-18 14:54:26+00:00,2019-03-18,14:54:26,AAPL,Technology,CNBC,0.0,0.912,0.088,0.4019,Apple unveils new iPad Air and iPad mini ahead...,The upgraded models offer keyboard support for...,https://www.cnbc.com/2019/03/18/apple-unveils-...
4,23,23,2019-03-18 15:29:38+00:00,2019-03-18,15:29:38,AAPL,Technology,Seeking Alpha,0.0,0.812,0.188,0.7717,Buy Apple For Great Total Return And Future 10...,This article is about Apple and why it's a buy...,https://seekingalpha.com/article/4249322-buy-a...


In [70]:
Apl_c = Apl.groupby('Date')['Ticker'].count().reset_index()

In [71]:
Apl_counts = Apl_c.rename(columns={"Ticker": "Article_Count" })


In [72]:
Apl_counts.head()

Unnamed: 0,Date,Article_Count
0,2019-03-15,3
1,2019-03-18,8
2,2019-03-19,10
3,2019-03-20,8
4,2019-03-21,12


Let's look at a histogram of "article count per day"

In [73]:
import plotly.express as px
# distribution of daily article counts: x = articles per day, y = number of days
fig = px.histogram(Apl_counts, x="Article_Count",  labels={"Article_Count": "Article Count Per Day"})
fig.update_layout(yaxis_title="Days")
fig.show()

Let's look at Article Counts per Month

In [74]:
#Create a column for month
Apl_counts = Apl_counts.assign(month = Apl_counts.Date.str[0:7])

In [75]:
Apl_counts.tail()

Unnamed: 0,Date,Article_Count,month
1661,2024-03-11,7,2024-03
1662,2024-03-12,10,2024-03
1663,2024-03-13,13,2024-03
1664,2024-03-14,7,2024-03
1665,2024-03-15,12,2024-03


In [76]:
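# create another dataframe aggregated by month
# note: .count() counts the rows of Apl_counts in each month, i.e. the number of
# dates with at least one article; use .sum() for the total number of articles
Apl_counts_month = Apl_counts.groupby('month')['Article_Count'].count().reset_index()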

In [77]:
Apl_counts_month.head()

Unnamed: 0,month,Article_Count
0,2019-03,15
1,2019-04,27
2,2019-05,29
3,2019-06,27
4,2019-07,26
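
A note on the aggregation used here: `.count()` tallies the rows of `Apl_counts` in each month, which is the number of dates with at least one article (hence values such as 27 or 29). If the total number of articles per month is what we want, summing the daily counts is one way to get it. Below is a minimal sketch on a toy frame; the name `toy_counts` and its numbers are made up for illustration.

```python
import pandas as pd

# Toy stand-in for Apl_counts; the values are made up for illustration.
toy_counts = pd.DataFrame({
    "Date": ["2019-03-15", "2019-03-18", "2019-04-01"],
    "Article_Count": [3, 8, 5],
})
toy_counts = toy_counts.assign(month=toy_counts["Date"].str[0:7])

by_month = toy_counts.groupby("month")["Article_Count"]
days_with_articles = by_month.count()  # rows (dates) per month
articles_per_month = by_month.sum()    # total articles per month

print(days_with_articles.to_dict())
print(articles_per_month.to_dict())
```

On the real data, the same `.sum()` call on `Apl_counts` should give monthly totals that add up to the number of rows in `Apl`.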


In [78]:
fig = px.histogram(Apl_counts_month, x="Article_Count", labels={"Article_Count" : "Article Count per Month"})
fig.update_layout(yaxis_title="Months")
fig.show()

Let's look at this from a different angle: a simple graph that plots article counts against date.

In [79]:
import plotly.graph_objects as go

fig = go.Figure([go.Scatter(x=Apl_counts['Date'], y=Apl_counts['Article_Count'])])
fig.show()


It seems there are lots of days where the article count is really low. Let's check whether `Apl_counts` contains any days with 0 articles about Apple.

In [56]:
#number of days with 0 articles
Apl_counts.loc[Apl_counts["Article_Count"] == 0].count()

Date             0
Article_Count    0
month            0
dtype: int64

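This result is expected by construction: grouping by `Date` only produces rows for dates that have at least one article, so days with no coverage never show up in `Apl_counts`. The head of `Apl_counts` above already skips the weekend of 2019-03-16 and 2019-03-17. Still, Apple is covered on 1,666 distinct dates, so it gets regular buzz. Either that, or there are just obligatory daily stock movement reports that we should perhaps filter out. (I suppose we'll get more context when we explore other tickers.)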
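
To actually count days without any article, the daily counts need to be laid out on a full calendar first. The following is a minimal sketch on a toy frame with made-up counts, assuming `Date` holds ISO-formatted strings as in `Apl_counts`; the names `toy_counts`, `daily_full` and `zero_days` are hypothetical.

```python
import pandas as pd

# Toy stand-in for Apl_counts (made-up values); 03-16 and 03-17 have no rows.
toy_counts = pd.DataFrame({
    "Date": ["2019-03-15", "2019-03-18", "2019-03-19"],
    "Article_Count": [3, 8, 10],
})

# Index the counts by real timestamps instead of strings.
daily = toy_counts.set_index(pd.to_datetime(toy_counts["Date"]))["Article_Count"]

# Build every calendar day between the first and last date, filling gaps with 0.
all_days = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
daily_full = daily.reindex(all_days, fill_value=0)

zero_days = int((daily_full == 0).sum())
print(zero_days)
```

Using `freq="B"` instead of `freq="D"` would restrict the calendar to weekdays, which may be the more relevant baseline when comparing against trading days.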

Let's see what the minimum article count for a day is.

In [57]:
Apl_counts["Article_Count"].min()

1

Alright, then let's see how many days we have only one article.

In [58]:
Apl_counts.loc[Apl_counts["Article_Count"] == 1].count()

Date             206
Article_Count    206
month            206
dtype: int64

In [89]:
#double checking the total article count
#Apl_counts_month["Article_Count"].sum()

Let's look at a date with just one article

In [81]:
Apl.loc[Apl["Date"] == "2020-08-29"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
4363,17593,17593,2020-08-29 15:31:54+00:00,2020-08-29,15:31:54,AAPL,Technology,Zacks Investment Research,0.0,1.0,0.0,0.0,Why Is Apple (AAPL) Up 29.8% Since Last Earnin...,Apple (AAPL) reported earnings 30 days ago. Wh...,https://www.zacks.com/stock/news/1049587/why-i...


As presumed, it is a report on stock movement (and interestingly it has a neutral sentiment score). Let's look at another one.

In [82]:
Apl.loc[Apl["Date"] == "2020-10-21"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
4794,19692,19692,2020-10-21 17:21:25+00:00,2020-10-21,17:21:25,AAPL,Technology,InvestorPlace,0.057,0.775,0.168,0.7845,Greater Than Expected iPhone 12 Demand Makes A...,"AAPL stock could take off again, if preliminar...",https://investorplace.com/2020/10/greater-than...


In [92]:
#not sure how to print the entire headline
df = Apl.loc[Apl["Date"] == "2020-10-21"]
print(df.get("Headline"))

4794    Greater Than Expected iPhone 12 Demand Makes A...
Name: Headline, dtype: object
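
On printing the entire headline: pandas truncates long strings only when it renders a Series or DataFrame, so one option is to print the underlying strings directly, and another is to lift the column-width limit for a single display. A small sketch with a made-up headline (the frame `toy` is hypothetical):

```python
import pandas as pd

# Toy frame with a made-up long headline, standing in for the filtered Apl rows.
toy = pd.DataFrame({
    "Headline": [
        "A deliberately long, made-up headline that pandas would normally cut off with an ellipsis"
    ],
})

# Option 1: iterate over the raw Python strings, which are never truncated.
for headline in toy["Headline"]:
    print(headline)

# Option 2: lift the column-width limit only inside this block.
with pd.option_context("display.max_colwidth", None):
    print(toy["Headline"])
```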


Huh, this one also reports on stock movement, but it contains additional commentary on the products as well. So if we were planning to throw out articles that simply report stock movement, the situation is tangled: an article like this is very much something we should keep. Interestingly, unlike the previous article, this one does not have a neutral sentiment score.

Maybe we should look at a few others.

In [83]:
Apl.loc[Apl["Date"] == "2023-10-28"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
11966,58033,58033,2023-10-28 12:38:00+00:00,2023-10-28,12:38:00,AAPL,Technology,The Motley Fool,0.043,0.834,0.123,0.4497,Can Apple Stock Double in 5 Years? Here's What...,Apple shares have tripled in the past five yea...,https://www.fool.com/investing/2023/10/28/can-...


This one seems "mostly" neutral

In [84]:
Apl.loc[Apl["Date"] == "2022-07-03"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
8828,40704,40704,2022-07-03 11:30:04+00:00,2022-07-03,11:30:04,AAPL,Technology,InvestorPlace,0.0,0.77,0.23,0.861,The Next Apple Product Launch Paves the Way fo...,The best way to play a new Apple product launc...,https://investorplace.com/hypergrowthinvesting...


The neutral score is high, but there is no negativity at all, which pushes the compound score up.

It seems that these "one per day" obligatory stock movement reports can't be thrown out altogether. The way they are written could still affect the next day's price movements.

These aren't matter-of-fact reports. The writer's optimism or pessimism, shaped by other market factors, may influence the writing and hence the reader's attitude.

Before we move onto other tickers, let's look at the date with the highest article count as well.

In [85]:
Apl_counts["Article_Count"].max()

63

In [87]:
Apl_counts.loc[Apl_counts["Article_Count"] == 63]

Unnamed: 0,Date,Article_Count,month
161,2019-09-10,63,2019-09


In [88]:
Apl.loc[Apl["Date"] == "2019-09-10"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
1308,4917,4917,2019-09-10 00:27:00+00:00,2019-09-10,00:27:00,AAPL,Technology,CNBC,0.000,1.000,0.000,0.0000,Apple's new iPhones might not be able to charg...,The revised prediction indicates that Apple mi...,https://www.cnbc.com/2019/09/09/apple-2019-iph...
1309,4918,4918,2019-09-10 00:45:00+00:00,2019-09-10,00:45:00,AAPL,Technology,The Motley Fool,0.000,1.000,0.000,0.0000,Apple's Big Event: What to Watch,Here's what investors will be watching during ...,https://www.fool.com/investing/2019/09/09/appl...
1310,4922,4922,2019-09-10 12:00:00+00:00,2019-09-10,12:00:00,AAPL,Technology,The Motley Fool,0.112,0.736,0.152,0.0953,Why Apple's In-Screen Fingerprint Reader Could...,Tech companies are collecting more information...,https://www.fool.com/investing/2019/09/10/why-...
1311,4923,4923,2019-09-10 12:51:04+00:00,2019-09-10,12:51:04,AAPL,Technology,Reuters,0.000,0.942,0.058,0.3182,Apple to reveal streaming service prices while...,Apple Inc is set on Tuesday to announce prici...,https://www.reuters.com/article/us-apple-iphon...
1312,4928,4928,2019-09-10 13:40:00+00:00,2019-09-10,13:40:00,AAPL,Technology,Zacks Investment Research,0.000,0.837,0.163,0.6808,Can Smartphone Market Rebound in 2020? AAPL & ...,"Per IDC, the global smartphone market witnessi...",https://www.zacks.com/stock/news/511071/can-sm...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1366,5009,5009,2019-09-10 21:25:00+00:00,2019-09-10,21:25:00,AAPL,Technology,The Motley Fool,0.087,0.913,0.000,-0.2960,Apple Bolsters Its Fast-Growing Wearables Busi...,The tech giant has no intention of slowing dow...,https://www.fool.com/investing/2019/09/10/appl...
1367,5010,5010,2019-09-10 21:30:43+00:00,2019-09-10,21:30:43,AAPL,Technology,The Guardian,0.033,0.808,0.158,0.6690,iPhone 11: Apple's most ambitious bid yet to c...,iPhone 11 Pro is being touted as ‘the most adv...,https://www.theguardian.com/technology/2019/se...
1368,5011,5011,2019-09-10 21:40:10+00:00,2019-09-10,21:40:10,AAPL,Technology,Market Watch,0.000,0.945,0.055,0.3384,MarketWatch First Take: Apple iPhone event rev...,The most surprising new feature Apple Inc. deb...,https://www.marketwatch.com/story/apple-iphone...
1369,5012,5012,2019-09-10 21:59:00+00:00,2019-09-10,21:59:00,AAPL,Technology,CNBC,0.000,1.000,0.000,0.0000,Apple unveils the iPhone 11 – three experts on...,Apple unveiled its latest smartphone and a ran...,https://www.cnbc.com/2019/09/10/apple-unveils-...



On this day, it turns out, there was a big Apple event promoting the iPhone 11 Pro: https://www.youtube.com/watch?v=-rAeqN-Q7x4

This makes me curious about the stock price over the following days.

In [104]:
# forward slashes avoid invalid escape sequences such as "\S" and work on every OS
Apl_stock = pd.read_csv('../Stock_data/Apple_stock.csv')

In [109]:
# keep the rows from 2019-09-01 to 2019-09-20 (inclusive), around the event date
Apl_stock_19_09_10 = Apl_stock.loc[Apl_stock["Date"].between("2019-09-01", "2019-09-20")]

In [111]:
#closing prices against date
fig = go.Figure([go.Scatter(x=Apl_stock_19_09_10['Date'], y=Apl_stock_19_09_10['Close'])])
fig.show()


We can see the closing price rising around the date, but it slowly starts dropping after that. I wonder if the next day's news would suggest some reasons why.
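
To put numbers on the movement rather than reading it off the chart, one option is the day-over-day percentage change of the close. This is only a sketch: the prices below are made up (they are not real AAPL closes), the frame `toy_stock` and the column `Close_pct` are hypothetical, and the `Date` and `Close` columns are assumed to look like the ones used above.

```python
import pandas as pd

# Made-up closing prices for illustration only; these are NOT real AAPL data.
toy_stock = pd.DataFrame({
    "Date": ["2019-09-09", "2019-09-10", "2019-09-11", "2019-09-12"],
    "Close": [100.0, 102.0, 105.06, 103.0],
})

# Select a window of ISO-formatted date strings (both ends inclusive).
window = toy_stock.loc[toy_stock["Date"].between("2019-09-09", "2019-09-12")].copy()

# Day-over-day change of the close, in percent; the first row has no predecessor.
window["Close_pct"] = window["Close"].pct_change() * 100

print(window.round(2))
```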

Let's take a look at the news articles for the following day.

In [112]:
Apl.loc[Apl["Date"] == "2019-09-11"]

Unnamed: 0.1,index,Unnamed: 0,Date & Time,Date,Time,Ticker,Sector,Source,sentiment_neg,sentiment_neu,sentiment_pos,sentiment_tot,Headline,Text,URL
1371,5015,5015,2019-09-11 02:26:00+00:00,2019-09-11,02:26:00,AAPL,Technology,The Motley Fool,0.0,0.928,0.072,0.3384,The Biggest iPhone News on Tuesday Wasn't the ...,One of the most surprising narratives from the...,https://www.fool.com/investing/2019/09/10/the-...
1372,5017,5017,2019-09-11 05:38:44+00:00,2019-09-11,05:38:44,AAPL,Technology,Reuters,0.067,0.865,0.068,0.2732,Tepid reaction for Apple's new iPhones in Chin...,A lower price tag and new features may not be ...,https://www.reuters.com/article/us-apple-iphon...
1373,5018,5018,2019-09-11 08:30:45+00:00,2019-09-11,08:30:45,AAPL,Technology,The Guardian,0.0,1.0,0.0,0.0,Apple to launch most expensive iPhone ever in ...,Brexit-battered pound means iPhones 11 Pro and...,https://www.theguardian.com/technology/2019/se...
1374,5019,5019,2019-09-11 08:47:00+00:00,2019-09-11,08:47:00,AAPL,Technology,CNBC,0.215,0.785,0.0,-0.8658,"After being forced to cut prices in China, App...",Apple's new iPhone 11 range commands a premium...,https://www.cnbc.com/2019/09/11/apple-iphone-1...
1375,5020,5020,2019-09-11 09:32:23+00:00,2019-09-11,09:32:23,AAPL,Technology,Reuters,0.174,0.826,0.0,-0.7469,"Apple's new, lower priced iPhone draws tepid r...","Apple Inc's new, lower priced iPhone that com...",https://www.reuters.com/article/us-apple-iphon...
1376,5022,5022,2019-09-11 11:00:41+00:00,2019-09-11,11:00:41,AAPL,Technology,24/7 Wall Street,0.0,1.0,0.0,0.0,Can Apple Afford to Sell Video for $4.99?,Apple Inc. (NASDAQ: AAPL) has announced its ne...,https://247wallst.com/media/2019/09/11/can-app...
1377,5026,5026,2019-09-11 12:18:00+00:00,2019-09-11,12:18:00,AAPL,Technology,CNBC,0.0,0.963,0.037,0.1603,Apple didn't release its rumored smartwatch sl...,"Apple introduced new iPhones, a new iPad and i...",https://www.cnbc.com/2019/09/11/no-new-apple-w...
1378,5028,5028,2019-09-11 13:03:00+00:00,2019-09-11,13:03:00,AAPL,Technology,Zacks Investment Research,0.072,0.718,0.211,0.4404,Can Competitive Pricing Boost Apple's Revenues?,The tech behemoth has lowered the price of its...,https://www.zacks.com/stock/news/512890/can-co...
1379,5029,5029,2019-09-11 13:19:00+00:00,2019-09-11,13:19:00,AAPL,Technology,CNBC,0.0,1.0,0.0,0.0,Why Apple made the unusual move to sell its st...,Apple TV+ is cheaper than any other major stre...,https://www.cnbc.com/2019/09/11/the-apple-tv-p...
1380,5030,5030,2019-09-11 13:22:41+00:00,2019-09-11,13:22:41,AAPL,Technology,InvestorPlace,0.137,0.863,0.0,-0.6597,Owners of Apple Stock Should Be Afraid,"With AAPL showing many signs of fear, the owne...",https://investorplace.com/2019/09/owners-of-ap...


While there are many articles on this day with positive sentiment scores, it is worth noting that there are a few articles with negative sentiment scores as well. It seems the new iPhone came with a lower price.

Given how people take in news (positively or negatively skewed takes tend to stick out compared to more neutral ones), I wonder how much of the picture we are missing by combining a given day's sentiment into one total score. (Our combined daily scores were mostly between 0 and 0.3, yet individual articles with strongly positive or negative sentiment could have had a significant impact on their own.) Perhaps two more pieces of information we could retain per day per ticker are:
1. The highest compound sentiment score an article had about the ticker that day
2. The lowest compound sentiment score an article had about the ticker that day
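
The two extra features above could be computed alongside the combined score in a single groupby. The sketch below uses a toy frame with made-up scores and assumes the daily combined score is something like the mean of `sentiment_tot` (this notebook does not show how the scores were combined); the output column names are hypothetical.

```python
import pandas as pd

# Toy stand-in for the Apl frame; the scores are made up for illustration.
toy = pd.DataFrame({
    "Date": ["2019-09-11"] * 4 + ["2019-09-12"] * 2,
    "sentiment_tot": [0.34, 0.0, -0.87, 0.44, 0.10, 0.20],
})

# One row per day: the combined score plus the most positive and most negative article.
daily_sentiment = (
    toy.groupby("Date")["sentiment_tot"]
       .agg(mean_score="mean", max_score="max", min_score="min", n_articles="count")
       .reset_index()
)

print(daily_sentiment)
```

In the toy data the first day averages out close to zero even though it contains a strongly negative article, which is exactly the kind of detail `min_score` would preserve.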


More things we could explore in this vein:
1. Looking at other tickers in this manner
2. Comparing VADER scores to FinVADER scores