# Data Parsing & Time Series

## Michael Mallon - UCD

#### This is an analysis of Google's performance in the stock market over the 5 years from 2013 to 2017

Import the required libraries for later use

In [None]:
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline 
from datetime import datetime
import matplotlib.dates as mdates
import numpy as np
import seaborn as sns
import matplotlib.gridspec as gridspec
import calendar

# Data Collection

Creating a universal url to use for each pd.read_html

In [None]:
url = 'http://mlg.ucd.ie/modules/COMP30760/stocks/goog.html'

Reading in each individual year rather than the entire 5 years for ease of later use

In the graph above it's very clear that both the end and the start of the year seem to be a more unstable time at Google, with its returns fluctuating up and down. The closer the value is to 0, the more stable it is. In the final graph the highest point of return is shown towering over the others.

#### Separating all the years for analysis, now that all the columns I wish to add are included

In [None]:
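# .loc with a year string selects every row of that year from the DatetimeIndex
Google2013 = Google.loc['2013']
Google2014 = Google.loc['2014']
Google2015 = Google.loc['2015']
Google2016 = Google.loc['2016']
Google2017 = Google.loc['2017']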

### Quarterly Returns

Creating a new dataframe for each year based on the daily opening value

In [None]:
OpenQ2017 = Google2017.Open.resample('D').last().ffill()
Quarters2017 = pd.concat([OpenQ2017], axis=1)

OpenQ2016 = Google2016.Open.resample('D').last().ffill()
Quarters2016 = pd.concat([OpenQ2016], axis=1)
                                   
OpenQ2015 = Google2015.Open.resample('D').last().ffill()
Quarters2015 = pd.concat([OpenQ2015], axis=1)

OpenQ2014 = Google2014.Open.resample('D').last().ffill()
Quarters2014 = pd.concat([OpenQ2014], axis=1)

OpenQ2013 = Google2013.Open.resample('D').last().ffill()
Quarters2013 = pd.concat([OpenQ2013], axis=1)

In [None]:
Quarters2017 = Quarters2017.reset_index()
Quarters2016 = Quarters2016.reset_index()
Quarters2015 = Quarters2015.reset_index()
Quarters2014 = Quarters2014.reset_index()
Quarters2013 = Quarters2013.reset_index()

This graph shows when Google's third-quarter report was released to the public (red line), according to 'https://www.nasdaq.com/earnings/report/googl'.
An immediate increase can be seen in Google's 2017 share price after this date, suggesting that more people began to invest then. In the case of 2016, the stock price dropped significantly.

This graph also shows when Google's first-quarter report was released to the public (green line), according to 'https://www.nasdaq.com/earnings/report/googl'.
Once again an increase can be seen in Google's 2017 share price after this date, with more people investing.
In 2016 it seems Google's stock held its value and neither dropped nor rose.

In 2015, in both cases, Google's stock seemed to stay stable and only fluctuated a little.

In [None]:
fig, ax = plt.subplots(figsize=(24,12))

Quarters2017['Open'].plot(ax=ax,label='2017')
plt.axvline(x=298,color='red',label='Q3')    # axvline expects a single x position
plt.axvline(x=116,color='green',label='Q1')

Quarters2016['Open'].plot(ax=ax,label='2016')
Quarters2015['Open'].plot(ax=ax,label='2015')
plt.legend(loc='best')
plt.title('First and Third Quarter Analysis (2015-2017) - GOOG',fontsize=25)

### In what months does Google perform the best?

Creating a column Month and abbreviating it.

In [None]:
Google = Google.reset_index()
Google['Month'] = pd.to_datetime(Google['Date'], format='%m/%d/%y').dt.month
Google['Month'] = Google['Month'].apply(lambda x: calendar.month_abbr[x])

Finding all unique month values

In [None]:
Google.Month.unique()

Making a new dataframe for every month, getting the mean of growth in each month and then merging them into the one dataframe.

In [None]:
Jan = Google[Google.Month.str.contains("Jan") == True]
Feb = Google[Google.Month.str.contains("Feb") == True]
Mar = Google[Google.Month.str.contains("Mar") == True]
Apr = Google[Google.Month.str.contains("Apr") == True]
May = Google[Google.Month.str.contains("May") == True]
Jun = Google[Google.Month.str.contains("Jun") == True]
Jul = Google[Google.Month.str.contains("Jul") == True]
Aug = Google[Google.Month.str.contains("Aug") == True]
Sep = Google[Google.Month.str.contains("Sep") == True]
Oct = Google[Google.Month.str.contains("Oct") == True]
Nov = Google[Google.Month.str.contains("Nov") == True]
Dec = Google[Google.Month.str.contains("Dec") == True]

Jan = Jan['Growth'].mean()
Feb = Feb['Growth'].mean()
Mar = Mar['Growth'].mean()
Apr = Apr['Growth'].mean()
May = May['Growth'].mean()
Jun = Jun['Growth'].mean()
Jul = Jul['Growth'].mean()
Aug = Aug['Growth'].mean()
Sep = Sep['Growth'].mean()
Oct = Oct['Growth'].mean()
Nov = Nov['Growth'].mean()
Dec = Dec['Growth'].mean()

columns =  ['Mean_Growth_Per_Month']
Month = pd.DataFrame(columns=columns)
Month.loc[1] = [Jan]
Month.loc[2] = [Feb]
Month.loc[3] = [Mar]
Month.loc[4] = [Apr]
Month.loc[5] = [May]
Month.loc[6] = [Jun]
Month.loc[7] = [Jul]
Month.loc[8] = [Aug]
Month.loc[9] = [Sep]
Month.loc[10] = [Oct]
Month.loc[11] = [Nov]
Month.loc[12] = [Dec]
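The per-month frames above can also be produced in one step with `groupby`. The following is only a minimal sketch on made-up numbers, not the notebook's data: the frame `demo` and its values are assumptions for illustration, while the `Month`, `Growth` and `Mean_Growth_Per_Month` names follow the cells above.

```python
import calendar
import pandas as pd

# Hypothetical stand-in for the Google frame: a few dated rows with a Growth column.
demo = pd.DataFrame({
    'Date': pd.to_datetime(['2017-01-03', '2017-01-04', '2017-02-01', '2017-02-02']),
    'Growth': [1.0, 3.0, -2.0, 4.0],
})
demo['Month'] = demo['Date'].dt.month.apply(lambda m: calendar.month_abbr[m])

# One mean per month, sorted from the best month to the worst.
month_means = (demo.groupby('Month')['Growth']
                   .mean()
                   .sort_values(ascending=False)
                   .to_frame('Mean_Growth_Per_Month'))
print(month_means)
```

Grouping keeps the month label and its mean together, so there is no need to re-apply the abbreviations afterwards.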

Resetting the index and reapplying the month abbreviations

Then setting the month as the index and sorting the table in descending order

In [None]:
Month = Month.reset_index()
Month['index'] = Month['index'].apply(lambda x: calendar.month_abbr[x])
Month = Month.set_index('index')
Month = Month.sort_values(['Mean_Growth_Per_Month'], ascending=False)

This graph shows Google's best and worst performing months on average. October, July and May are its best performing months, whereas June, August and March are its worst.

In [None]:
# Download and parse the page once, then pick out one table per year
tables = pd.read_html(url)
Google2013 = tables[0]
Google2014 = tables[1]
Google2015 = tables[2]
Google2016 = tables[3]
Google2017 = tables[4]

Creating a list of all of the years and concatenating the tables to create one full table

In [None]:
Google.head(3)

In [None]:
Google['AdjustedClose'] = Google['Open']
Google.AdjustedClose = Google.AdjustedClose.shift(-1)

Adjusted close now added and works well
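To make the effect of `shift(-1)` concrete, here is a small sketch on invented prices (the frame `demo` and its values are assumptions, not taken from the data set):

```python
import pandas as pd

# Hypothetical opening prices on three consecutive trading days.
demo = pd.DataFrame(
    {'Open': [10.0, 11.0, 12.5]},
    index=pd.to_datetime(['2017-01-03', '2017-01-04', '2017-01-05']),
)

# shift(-1) pulls each row's value from the row below, so every day's
# AdjustedClose equals the next day's Open; the last row has no successor.
demo['AdjustedClose'] = demo['Open'].shift(-1)
print(demo)
```

The final row ends up with a missing value, which is worth keeping in mind when null rows are dropped later.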

In [None]:
Google.head(20)

# Analysis & Interpretation

### Next task is to be able to visualise and analyse this information at daily, monthly and annual frequencies

Using DataFrame.resample on each individual column and concatenating them into one overall table.

###### Daily

In [None]:
OpenD = Google.Open.resample('D').last()
CloseD = Google.AdjustedClose.resample('D').last()
HighD = Google.High.resample('D').max()
LowD = Google.Low.resample('D').min()
Daily = pd.concat([OpenD, CloseD, HighD, LowD], axis=1)
Daily.head(7)

I have noticed that there are no values entered for the weekends (stock markets are closed). This will cause gaps when visualising. The best way to resolve this, I feel, is to fill these gaps with each Friday's information using ffill().

In [None]:
OpenD = Google.Open.resample('D').last().ffill()
CloseD = Google.AdjustedClose.resample('D').last().ffill()
HighD = Google.High.resample('D').max().ffill()
LowD = Google.Low.resample('D').min().ffill()
Daily = pd.concat([OpenD, CloseD, HighD, LowD], axis=1)
Daily.head(7)
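As a small illustration of what `ffill()` does to the weekend gaps, the sketch below uses two invented prices (the series `prices` is an assumption, not part of the data set):

```python
import pandas as pd

# Hypothetical prices for a Friday and the following Monday (no weekend rows).
prices = pd.Series(
    [100.0, 103.0],
    index=pd.to_datetime(['2017-01-06', '2017-01-09']),  # Friday, Monday
    name='Open',
)

daily = prices.resample('D').last()   # Saturday and Sunday appear as NaN
filled = daily.ffill()                # weekend rows repeat Friday's value
print(filled)
```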

Using subplots as it's hard to distinguish between each line otherwise.

In [None]:
GoogleVolatility['Expected Change'] = GoogleVolatility['Volatility'] * Google['AdjustedClose'].shift()

To calculate the actual change, subtract the previous day's adjusted close from each day's adjusted close.

In [None]:
GoogleVolatility['Actual Change'] = Google['AdjustedClose'] - Google['AdjustedClose'].shift()

Graphing the expected change against the actual change gives a reasonably accurate prediction, but it can be subject to lag: because it uses a 10-day rolling window, its predictions could be late.

In [None]:
fig, ax = plt.subplots(figsize=(15,8))
GoogleVolatility.plot(ax=ax,y=['Actual Change','Expected Change'],color =['lightgreen','magenta'],figsize=(10,10),lw=2)

## Extra Work - NASDAQ Comparison

I had a bit of extra time with this assignment's extension, so I was interested to compare how GOOG performed against the NASDAQ Composite (an index of stocks listed on the American NASDAQ exchange).

I researched to see if I could find any information on the market between the years of 2013 and 2017. While searching I came across 'investing.com'. This site allowed me to download a CSV file of the NASDAQ from 2013-2017, which I have included in my submission.

In [None]:
Google2 = Google

Reading in the data and getting a first look at what I have.

In [None]:
Nasdaq = pd.read_csv("NASDAQ Composite Historical Data.csv")
Nasdaq.head(7)

Changing the file's dates to datetime, then converting the column to an object.

In [None]:
Nasdaq['Dates'] = pd.to_datetime(Nasdaq['Date'])
Nasdaq['Dates'] = Nasdaq['Dates'].astype(object)
Nasdaq.head(7)

Dropping all irrelevant columns from Nasdaq

In [None]:
Nasdaq = Nasdaq.drop(['Vol.','Price','High','Low','Change %','Date'], axis=1)

Making the column Dates an object

In [None]:
Google2 = Google2.reset_index()
Google2['Dates'] = Google2['Date']
Google2['Dates'] = Google2['Dates'].astype(object)

Dropping all irrelevant columns from Google

No need to use subplots as the results are more definitive. I feel this is the best representation of the company's performance over the past 5 years; the information is much clearer and easier to read.

In [None]:
fig, ax = plt.subplots(figsize=(10,10))
Annual.plot(ax=ax,grid = True)

## Further Analysis

What is the best day of trade over the past 5 years?
To do this I am going to find the sum of the open for every day of the week and compare them.

In [None]:
Monday = Google[Google.Day.str.contains("Monday") == True]
Tuesday = Google[Google.Day.str.contains("Tuesday") == True]
Wednesday = Google[Google.Day.str.contains("Wednesday") == True]
Thursday = Google[Google.Day.str.contains("Thursday") == True]
Friday = Google[Google.Day.str.contains("Friday") == True]

Monday    = Monday['Open'].sum()
Tuesday   = Tuesday['Open'].sum()
Wednesday = Wednesday['Open'].sum()
Thursday  = Thursday['Open'].sum()
Friday    = Friday['Open'].sum()

columns =  ['Day Total']
Week = pd.DataFrame(columns=columns)
Week.loc[1] = [Monday]
Week.loc[2] = [Tuesday]
Week.loc[3] = [Wednesday]
Week.loc[4] = [Thursday]
Week.loc[5] = [Friday]
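A sum of `Open` per weekday grows with the number of trading days that fall on that weekday, so it can help to look at the count and the mean next to the sum. The sketch below is one possible cross-check on invented data (the frame `demo` and its flat price of 100 are assumptions, chosen so that only the day counts differ):

```python
import pandas as pd

# Hypothetical sample: two weeks of trading where the first Monday (2 January)
# is left out, as on a market holiday.
dates = pd.to_datetime([
    '2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06',                # Tue-Fri
    '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12', '2017-01-13',  # Mon-Fri
])
demo = pd.DataFrame({'Open': [100.0] * len(dates)}, index=dates)
demo['Day'] = demo.index.day_name()

# Sum, count and mean of Open per weekday.
week = demo.groupby('Day')['Open'].agg(['sum', 'count', 'mean'])
print(week)
```

In this toy sample Monday has the smallest sum purely because it has one trading day fewer, while its mean is identical to every other day.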

Monday seems to be a quiet day for trading, with Friday and Thursday following. Tuesday and Wednesday seem to be the main days of trade.

Two possible reasons for Monday being the worst day for trading are:
1. Monday is the first trading day after two whole days of the stock market being closed. In turn, there is a lot of time for news to come out which can affect a company's stock price.
2. A bad trading day could be caused by the psychological effect Mondays have on people, making them more negative and less willing to risk trades.

In [None]:
explode = (0.15, 0, 0, 0,0)
labels = ["Monday","Tuesday","Wednesday","Thursday","Friday"]
Week.plot(subplots=True,kind='pie',figsize=(9,9),fontsize=14,explode=explode,autopct='%1.1f%%',shadow=True,labels=labels,labeldistance=1.4)
plt.legend(loc='right', bbox_to_anchor=(1.45, .55))
plt.title('Best Days Of Trade - GOOG',fontsize=10)

## Stock Return

One way of analysing a stock is by looking at its stock return, which is how it performs over a set period of time. To calculate this I will set the value at the start of a period to 1; when the stock return fluctuates above or below this figure I will have a good idea of how it is performing.
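The normalisation described here can be sketched on a handful of invented prices (the series `close` and its values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical closing prices over one period.
close = pd.Series(
    [50.0, 55.0, 45.0, 60.0],
    index=pd.to_datetime(['2017-01-31', '2017-02-28', '2017-03-31', '2017-04-30']),
    name='AdjustedClose',
)

# Divide every value by the first one so the period starts at exactly 1.
stock_return = close / close.iloc[0]
print(stock_return)
```

Values above 1 mean the stock is worth more than at the start of the period, values below 1 mean it is worth less.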

In [None]:
# Keep only the AdjustedClose column (position 1) of each table
DailyReturn = Daily.drop(Daily.columns[[0,2,3]], axis=1)
MonthlyReturn = Monthly.drop(Monthly.columns[[0,2,3]], axis=1)
AnnualReturn = Annual.drop(Annual.columns[[0,2,3]], axis=1)
DailyReturn.head(5)

Separating each year

In [None]:
# .loc with a year string selects every row of that year from the DatetimeIndex
DailyReturn2017 = DailyReturn.loc['2017']
MonthlyReturn2017 = MonthlyReturn.loc['2017']

DailyReturn2016 = DailyReturn.loc['2016']
MonthlyReturn2016 = MonthlyReturn.loc['2016']

DailyReturn2015 = DailyReturn.loc['2015']
MonthlyReturn2015 = MonthlyReturn.loc['2015']

DailyReturn2014 = DailyReturn.loc['2014']
MonthlyReturn2014 = MonthlyReturn.loc['2014']

DailyReturn2013 = DailyReturn.loc['2013']
MonthlyReturn2013 = MonthlyReturn.loc['2013']

Calculating stock return for every year

In [None]:
# x.iloc[0] is the first value of the period; x[0] is not a reliable positional lookup on a date index
Daily_Return2017 = DailyReturn2017.apply(lambda x: x / x.iloc[0])
Monthly_Return2017 = MonthlyReturn2017.apply(lambda x: x / x.iloc[0])

Daily_Return2016 = DailyReturn2016.apply(lambda x: x / x.iloc[0])
Monthly_Return2016 = MonthlyReturn2016.apply(lambda x: x / x.iloc[0])

Daily_Return2015 = DailyReturn2015.apply(lambda x: x / x.iloc[0])
Monthly_Return2015 = MonthlyReturn2015.apply(lambda x: x / x.iloc[0])

Daily_Return2014 = DailyReturn2014.apply(lambda x: x / x.iloc[0])
Monthly_Return2014 = MonthlyReturn2014.apply(lambda x: x / x.iloc[0])

Daily_Return2013 = DailyReturn2013.apply(lambda x: x / x.iloc[0])
Monthly_Return2013 = MonthlyReturn2013.apply(lambda x: x / x.iloc[0])

Annual_Return = AnnualReturn.apply(lambda x: x / x.iloc[0])
Daily_Return = DailyReturn.apply(lambda x: x / x.iloc[0])
Monthly_Return = MonthlyReturn.apply(lambda x: x / x.iloc[0])

I first tried plotting the time series dataframes directly, but it doesn't look great

In [None]:
fig, ax = plt.subplots()

Monthly_Return2017.plot(ax=ax,label='2017',legend=False)
Monthly_Return2016.plot(ax=ax,label='2016',legend=False)
Monthly_Return2015.plot(ax=ax,label='2015',legend=False)
Monthly_Return2014.plot(ax=ax,label='2014',legend=False)
Monthly_Return2013.plot(ax=ax,label='2013',legend=False).axhline(y = 1, color = "black", lw = 2)

Resetting the index of every year and plotting them against each other giving a better representation

In [None]:
Years  = [Google2013,Google2014,Google2015,Google2016,Google2017]
Google = pd.concat(Years)
Google.head(7)

# Data Parsing & Cleaning

Preparing the columns Day, Month and Year to be changed to a datetime.
Merging the three columns, joined with a '-' to comply with the datetime syntax, and dropping the three original columns.

In [None]:
Google.Year = Google.Year.astype(str)
Google.Month = Google.Month.astype(str)
Google.Day = Google.Day.astype(str)
Google['Date'] = Google[['Year', 'Month','Day']].apply(lambda x: '-'.join(x), axis=1)
Google.drop(['Year', 'Month','Day'], axis=1, inplace=True)

In [None]:
Google.head(5)

Changing the new column Date to a time series

In [None]:
Google['Date'] = pd.to_datetime(Google['Date'])
Google.head(5)

Added a new column day

In [None]:
# day_name() replaces the weekday_name attribute, which was removed in pandas 1.0
Google['Day'] = pd.to_datetime(Google['Date'], format='%m/%d/%y').dt.day_name()

Need to move day to first column.

In [None]:
Daily_Return2017 = Daily_Return2017.reset_index()
Daily_Return2016 = Daily_Return2016.reset_index()
Daily_Return2015 = Daily_Return2015.reset_index()
Daily_Return2014 = Daily_Return2014.reset_index()
Daily_Return2013 = Daily_Return2013.reset_index()

In [None]:
fig, ax = plt.subplots(figsize=(5,5))

Daily_Return2017['AdjustedClose'].plot(ax=ax,label='2017').axhline(y = 1, color = "black", lw = 3)
Daily_Return2016['AdjustedClose'].plot(ax=ax,label='2016')
Daily_Return2015['AdjustedClose'].plot(ax=ax,label='2015')
Daily_Return2014['AdjustedClose'].plot(ax=ax,label='2014')
Daily_Return2013['AdjustedClose'].plot(ax=ax,label='2013')
plt.legend()

Daily looks too hard to draw information from so I will use Monthly

In [None]:
Google2 = Google2.drop(['Date','High','Low','Close','AdjustedClose','Growth','Range','Day','Month','Return Change','index'], axis=1)

Checking both Nasdaq and Google are in the same format

In [None]:
Google2.head(5)

In [None]:
Nasdaq.head(5)

Merging Google and Nasdaq on the common column Dates

In [None]:
Common = pd.merge(Google2, Nasdaq, on=['Dates'])
Common.columns = ['Open-Google','Date','Open-NASDAQ']
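The column order used in the rename above follows from how `pd.merge` treats clashing names. A minimal sketch on invented frames (the names `google`, `nasdaq` and their values are assumptions):

```python
import pandas as pd

# Hypothetical frames; only 2017-01-04 appears in both.
google = pd.DataFrame({'Open': [1.0, 2.0],
                       'Dates': pd.to_datetime(['2017-01-03', '2017-01-04'])})
nasdaq = pd.DataFrame({'Open': [10.0, 20.0],
                       'Dates': pd.to_datetime(['2017-01-04', '2017-01-05'])})

# The default inner join keeps only dates present in both frames;
# the clashing Open columns receive the suffixes _x and _y.
common = pd.merge(google, nasdaq, on='Dates')
print(common)
```

Dates that exist in only one of the two frames are dropped by the inner join, so the merged table can be shorter than either input.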

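Changing Date back to a datetime. It stays a regular column for now and is set as the index once the values have been cleaned.

In [None]:
# A bare Common.set_index('Date') returns a new frame and leaves Common unchanged, so it is omitted here
Common['Date'] = pd.to_datetime(Common['Date'])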

In [None]:
Common.head(5)

I noticed Open-NASDAQ contained commas when in thousands so I just removed them completely.
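Two ways of handling the thousands separators are sketched below on an invented two-row extract (the CSV text and its values are assumptions, not rows from the downloaded file):

```python
import io
import pandas as pd

# Hypothetical two-row extract with comma thousands separators in Open.
raw = io.StringIO('Date,Open\n2017-12-28,"6,900.50"\n2017-12-29,"7,000.25"\n')

# Option 1: strip the separators after loading, then convert to float.
frame = pd.read_csv(raw)
frame['Open'] = frame['Open'].str.replace(',', '', regex=False).astype(float)

# Option 2: let read_csv handle the thousands separator while parsing.
raw.seek(0)
frame2 = pd.read_csv(raw, thousands=',')
print(frame['Open'].tolist(), frame2['Open'].tolist())
```

Both routes give the same floats; the second avoids a separate cleaning step if the file can be read again.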

In [None]:
GoogleRange.head(7)

This shows when Alphabet's stock fluctuated the most negatively in the stock market in the last 5 years

In [None]:
GoogleRange.tail(7)

Using this line of best fit graph you can see how, as the growth figure gets bigger or smaller, so does the range. Using this you could predict the range when you have the growth and vice versa.

When the growth exceeds ±5 it's much harder to predict the other, as growth figures this large don't happen as often.

In [None]:
fig = plt.figure(figsize=(26, 6))
fig.suptitle("Line Of Best Fit (Range/Growth) - GOOG", fontsize=14)
gs = gridspec.GridSpec(100,100)
ax = fig.add_subplot(gs[:,40:130])
# Take both columns from the same frame so each point pairs one day's growth with that day's range
sns.regplot(x='Growth',y='Range',data=GoogleRange,marker="+",scatter_kws={"color": "blue"}, line_kws={"color":"red"},ax=ax)

Below I have graphed Google's highest and worst ever growth, which both happened within the same period of two months. As the company's stock had increased so much in July, the reversal in its growth was actually not that bad an impact, because it had already grown so much.

A great rise can be seen after Google's worst growth; this shows the company's stability in the stock market.

In [None]:
fig, axes = plt.subplots(nrows=2, ncols=1)
fig.subplots_adjust(hspace=0.4, wspace=0.4)
GoogleSpikes.plot(ax=axes[0],y='Growth',subplots=False,figsize=(10,10),color='green',label='2015-07-10 - 2015-07-31',alpha=.75); axes[0].set_title("Google's Highest Ever Growth")
axes[0].set_xlim('2015-07-10', '2015-07-31')
axes[0].set_ylim(-13, 14)
GoogleSpikes.plot(ax=axes[1],y='Growth',subplots=False,figsize=(10,10),color='orange',label='2015-08-11 - 2015-08-31',alpha=1); axes[1].set_title("Google's Worst Ever Growth")
axes[1].set_xlim('2015-08-11', '2015-08-31')
axes[1].set_ylim(-13, 14)
plt.legend(loc='best')

Looking from a wider perspective, as below, this increase and decrease still amounted to a huge rise in Google's stock price.

In [None]:
fig, ax = plt.subplots(figsize=(20, 5))
Google['Open'].plot(ax=ax,label='2015',color='magenta')
ax.set_xlim('2015-07', '2015-09')
ax.set_ylim(500, 700)

In [None]:
fig, axes = plt.subplots(nrows=3, ncols=1)
fig.suptitle("Return Per Year (2015-2017) - GOOG", fontsize=16)
fig.subplots_adjust(hspace=0.4, wspace=0.4)

Google.plot(ax=axes[0],y= 'Return Change',figsize=(20,10), color='red',legend=True, linestyle='--', marker='o',label='2017')
axes[0].set_xlim('2017-01','2017-12')
axes[0].set_ylim(-.14, 0.15)

Google.plot(ax=axes[1],y= 'Return Change',figsize=(20,10), color='blue',legend=True, linestyle='--', marker='o',label='2016')
axes[1].set_xlim('2016-01','2016-12')
axes[1].set_ylim(-.14, 0.15)

Google.plot(ax=axes[2],y= 'Return Change',figsize=(20,12), color='green',legend=True, linestyle='--', marker='o',label='2015')
axes[2].set_xlim('2015-01','2015-12')
axes[2].set_ylim(-.14, 0.15)

# Mark the high point on the 2015 panel; axvline expects a single x position
axes[2].axvline(x=pd.Timestamp('2015-07-16'),color='red',label='High Point')

In [None]:
fig, ax = plt.subplots(figsize=(20,8))
Month.plot(ax=ax,subplots=True,fontsize=14,kind='bar',color = 'blue',alpha=0.7,legend=False)
plt.title('Average Growth Per Month - GOOG',fontsize=20)
plt.show()

## Moving Average

In [None]:
GoogleRolling = Google

This graph shows the rolling average of Google at both 15-day and 50-day intervals. This could be used to predict when it is a good time to invest in Google.

In [None]:
GoogleRolling["15day"] = np.round(GoogleRolling["Close"].rolling(window = 15, center = False).mean(), 2)
GoogleRolling["50day"] = np.round(GoogleRolling["Close"].rolling(window = 50, center = False).mean(), 2)
GoogleRolling.plot(y=['15day','50day','Close'],color =['red','blue','grey'],figsize=(15,8),lw=1.5)
plt.title('15-day and 50-day Moving Average (2013-2017) - GOOG',fontsize=20)

In [None]:
Google = Google.drop(['15day','50day'], axis=1)

## Volatility

Volatility is the degree of variation of a trading price series over time, as measured by the standard deviation.

After researching online on websites like 'http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:standard_deviation_volatility', I found you could calculate volatility and use it to predict change.

In [None]:
GoogleVolatility = Google
GoogleVolatility = GoogleVolatility.drop(['Open','High','Low','Close','Growth','Range','Day','AdjustedClose','Month'], axis=1)
GoogleVolatility.columns = ['Date', 'Change']
GoogleVolatility['Volatility'] = GoogleVolatility.Change.rolling(10).std().shift()
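As a rough sketch of the rolling standard deviation used here, the example below works on invented prices (the series `close` is an assumption, and a 3-day window is used instead of the notebook's 10 days only to keep the example short):

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices; the notebook uses a 10-day window, 3 is used here for brevity.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 104.0, 108.0])

log_return = np.log(close) - np.log(close.shift(1))  # daily log return
volatility = log_return.rolling(3).std()             # rolling sample standard deviation
print(volatility)
```

The first values are missing because a full window of returns is needed before a standard deviation can be computed.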

This graph shows the volatility of Google using a ten-day rolling standard deviation.

In [None]:
Monthly_Return2017 = Monthly_Return2017.reset_index()
Monthly_Return2016 = Monthly_Return2016.reset_index()
Monthly_Return2015 = Monthly_Return2015.reset_index()
Monthly_Return2014 = Monthly_Return2014.reset_index()
Monthly_Return2013 = Monthly_Return2013.reset_index()

In [None]:
fig, ax = plt.subplots(figsize=(15,8))
fig.suptitle("Stock Return (2013-2017) - GOOG", fontsize=16)

Monthly_Return2017['AdjustedClose'].plot(ax=ax,label='2017').axhline(y = 1, color = "black", lw = 2)
Monthly_Return2016['AdjustedClose'].plot(ax=ax,label='2016')
Monthly_Return2015['AdjustedClose'].plot(ax=ax,label='2015')
Monthly_Return2014['AdjustedClose'].plot(ax=ax,label='2014')
Monthly_Return2013['AdjustedClose'].plot(ax=ax,label='2013')
plt.legend()

In [None]:
Annual_Return.plot(grid = True,color='red',figsize=(15,5)).axhline(y = 1, color = "black", lw = 2)

In [None]:
Daily_Return.plot(grid = True,color='black',figsize=(5,5),alpha=0.7).axhline(y = 1, color = "red", lw = 2)
plt.legend(bbox_to_anchor=(1, 1), shadow=True)

In [None]:
Google.head(7)

## Growth, Return and Range

The first step I believe necessary to get more in-depth answers is to create a column Growth (%), which will show how much the stock of a company has grown or shrunk in that time period. I found two ways to calculate this: one using returns and the other comparing closes.


In [None]:
Google['Growth'] = np.where(Google['Open'] < 0, Google['Open'], (((Google['AdjustedClose']-Google['AdjustedClose'].shift(1))/Google['AdjustedClose'])*100))

In [None]:
Google['Return Change'] = Daily_Return.apply(lambda x: np.log(x) - np.log(x.shift(1))) # shift moves dates back by 1.
Google.head(5)
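The two approaches can be compared side by side on a few invented prices. This is only a sketch (the series `close` is an assumption), and it uses the conventional percentage change, which divides by the previous day's close; the `Growth` cell above divides by the current day's value instead, so its figures differ slightly.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices on four consecutive trading days.
close = pd.Series([100.0, 110.0, 99.0, 99.0])

# Percentage change relative to the previous day's close.
pct_growth = close.pct_change() * 100

# Log return: difference of the natural logs of consecutive closes.
log_return = np.log(close) - np.log(close.shift(1))

print(pd.DataFrame({'pct_growth': pct_growth, 'log_return': log_return}))
```

For small moves the two measures are close to each other (once the percentage is divided by 100); they drift apart as the moves get larger.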

Dropping all rows with null values in case they skew the results.

Once again using resample to get Annual reports

In [None]:
OpenA = Google.Open.resample('A').first()
CloseA = Google.AdjustedClose.resample('A').last()
HighA = Google.High.resample('A').max()
LowA = Google.Low.resample('A').min()
Annual = pd.concat([OpenA, CloseA, HighA, LowA], axis=1)
Annual.head(5)


In [None]:
Google = Google.dropna()
Google.head(5)

Ordering the growth percentage in descending order, showing the biggest days first

In [None]:
GoogleSpikes = Google.sort_values(['Growth'], ascending=False)

This shows Alphabet's best days on the stock market in the last 5 years

In [None]:
GoogleSpikes.head(7)


Looking into Google's best day of trade in the last 5 years, '2015-07-16', some very interesting information can be found.
On the website 'http://money.cnn.com/2015/07/16/technology/google-earnings-q2/index.html' there is an article about how Google's stock rose astronomically after Google announced its second-quarter earnings and sales. This rise seemed to be attributed to a large increase (reportedly a doubling) in phone usage, with Google profiting from ads.

This shows Alphabet's worst days on the stock market in the last 5 years

In [None]:
GoogleSpikes.tail(7)

Looking into Google's worst days of trade in the last 5 years, the two dates '2015-08-21' and '2013-04-19', some very interesting information can be found.

_'2015-08-21'_ - It seems there are two reasons for this huge drop in stock price. One is that only a month before, Google's stock had increased by around 11%. The other is that the stock markets in general took a huge fall around this period, according to the article at 'https://www.nytimes.com/2015/08/22/business/dealbook/global-markets-fall-for-second-day.html'

_'2013-04-19'_ - On the website 'http://money.cnn.com/2014/04/16/technology/google-earnings/index.html' there is an article about how Google's stock dropped dangerously after Google announced its first-quarter earnings and sales and didn't hit its projected target.

Another column I feel could be useful is Range, which gives the difference between the open and close for each time period.

In [None]:
Google['Range'] = np.where(Google['Open'] < 0, Google['Open'], (Google['Close']-Google['Open']))

Ordering the range in descending order, showing the biggest days first

In [None]:
GoogleRange = Google.sort_values(['Range'], ascending=False)

This shows when Alphabet's stock fluctuated the most positively in the stock market in the last 5 years

In [None]:
Google.describe()

I noticed that the close value of one day does not always equal the open of the next day, so I have created a column AdjustedClose which holds the next day's open as the corrected closing value.

In [None]:
Daily.plot(subplots=True,grid = True,figsize=(10,10),layout=(2,2))

###### Monthly

Using resample to get only months.

In [None]:
OpenM = Google.Open.resample('M').first()
CloseM = Google.AdjustedClose.resample('M').last()
HighM = Google.High.resample('M').max()
LowM = Google.Low.resample('M').min()
Monthly = pd.concat([OpenM, CloseM, HighM, LowM], axis=1)
Monthly.head(7)

Much easier to read: the overall performance can now be clearly seen.
It seems Google's stock price has almost trebled in the last 5 years.

In [None]:
cols = Google.columns.tolist()
cols = cols[-2:] + cols[:-2]
Google = Google[cols]

Setting date as the index

In [None]:
Google = Google.set_index('Date')
Google.head(5)

First look at the dataframe: not much can be seen except for what looks to be the stock rising at a constant rate

In [None]:
Google.plot(grid = True,figsize=(10,10))

In [None]:
GoogleVolatility['Volatility'].plot(figsize=(5,5))

One way of predicting change would be to multiply Google's volatility at any stage by its AdjustedClose the next day.


In [None]:
Monthly.plot(subplots=True,grid = True,figsize=(10,10),layout=(2,2))

###### Annual

In [None]:
Common['Open-NASDAQ'] = Common['Open-NASDAQ'].str.replace(',','').astype(np.float64)

In [None]:
Common.set_index('Date',drop=True,inplace=True)
Common.head(5)

#### Plotting NASDAQ against GOOG using two different y-axes to make them comparable

Red = Google

Green = NASDAQ
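One way to draw two series on separate y-axes is matplotlib's `twinx()`. The sketch below is an assumption about how such a plot could be built, not the notebook's original cell; the frame `common` and its values are invented stand-ins for the merged table, and the colours follow the key above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values standing in for the merged Common frame.
common = pd.DataFrame(
    {'Open-Google': [1.0, 2.0, 3.0], 'Open-NASDAQ': [10.0, 20.0, 15.0]},
    index=pd.to_datetime(['2017-01-03', '2017-01-04', '2017-01-05']),
)

fig, ax_goog = plt.subplots(figsize=(15, 8))
ax_nasdaq = ax_goog.twinx()  # second y-axis sharing the same x-axis

ax_goog.plot(common.index, common['Open-Google'], color='red', label='Google')
ax_nasdaq.plot(common.index, common['Open-NASDAQ'], color='green', label='NASDAQ')
ax_goog.set_ylabel('GOOG open')
ax_nasdaq.set_ylabel('NASDAQ open')
```

Because each series gets its own scale, the shapes of the two lines can be compared even though their values differ by an order of magnitude.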