In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt


# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

(1)
Read the data.  
Change your index to the date column.  
Sort the df by indices. 
Drop the old date column.  
Rename any column you want to make things easier later.  
Display the df.

In [None]:
df = pd.read_csv('/kaggle/input/apple-aapl-historical-stock-data/HistoricalQuotes.csv')
df.Date = pd.to_datetime(df.Date, format='%m/%d/%Y')
df = df.set_index('Date')
df = df.sort_index()
df

In [None]:
df.columns
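
The column listing is worth a close look, because the next step refers to names such as `' Open'` with a leading space. A minimal sketch of how such headers could be normalized once with `str.strip`, using a made-up one-row frame (`toy` and its values are invented, not taken from the CSV):

```python
import pandas as pd

# Toy frame standing in for the real data; the values are invented.
toy = pd.DataFrame({'Date': ['01/02/2014'], ' Close/Last': ['$10.00'], ' Volume': [100]})

# Strip surrounding whitespace from every header in one pass.
toy.columns = toy.columns.str.strip()
print(list(toy.columns))
```

With clean headers, later cells could refer to `'Open'` or `'Close/Last'` directly instead of the space-prefixed names.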

(2) Get rid of the $ sign and make each column a number.  
We may not use each column for this particular task, but it would be useful in a future ML exercise for prediction.  
Display the df.

In [None]:
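# regex=False treats '$' as a literal character rather than the end-of-string anchor;
# astype(float) then turns the cleaned strings into numbers, as the task asks.
df['Open'] = df[' Open'].str.replace('$', '', regex=False).astype(float)
df['High'] = df[' High'].str.replace('$', '', regex=False).astype(float)
df['Low'] = df[' Low'].str.replace('$', '', regex=False).astype(float)
df['Close'] = df[' Close/Last'].str.replace('$', '', regex=False).astype(float)
df['Volume'] = df[' Volume']
df = df[['Close','Volume','Open','High','Low']]
df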
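
The cleaning step can be checked in isolation. A small sketch on an invented Series (`prices` is hypothetical), showing that stripping a literal `$` and casting gives a numeric dtype:

```python
import pandas as pd

prices = pd.Series(['$10.50', '$9.25', '$11.00'])  # invented values

# regex=False makes '$' a plain character, not a regular-expression anchor.
cleaned = prices.str.replace('$', '', regex=False).astype(float)
print(cleaned.dtype)
```

Passing `regex=False` explicitly keeps the behavior the same across pandas versions, whose default for this parameter has changed over time.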

(3) Add a new column representing the daily increase or decrease in the closing value of the stock.  
For instance, on the second date, the stock went down by 0.0200 (a change of -0.0200).  
On the third date, the stock went up by 0.0686.  
Display the df.

In [None]:
df['Change'] = df.Close.astype('float32').diff()
df
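
As a quick illustration of what `diff` produces, here is a sketch on three invented closing values (not the real AAPL prices). The first row has no previous day, so its change is NaN:

```python
import pandas as pd

close = pd.Series([10.00, 9.98, 10.05])  # invented closing values

# Each entry is today's close minus the previous row's close.
change = close.diff()
print(change)
```

A negative entry means the close fell relative to the previous row, and a positive entry means it rose.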

(4) Add a new column with the rolling mean, using a 20-day window.  
Also add a column for the rolling standard deviation with the same 20-day window.  
Display the df.

In [None]:
df['RollingMean'] = df.Close.astype('float32').rolling(20).mean()
df['RollingDeviation'] = df.Close.astype('float32').rolling(20).std()
df

(5) Add two new columns, one for the upper and one for the lower Bollinger bands.  
The typical Bollinger bands are the rolling mean plus two standard deviations for the upper band  
and the rolling mean minus two standard deviations for the lower band.  
Remove all the NaN rows now.  
Display the df.

In [None]:
# Compute the bands first, then drop the warm-up rows that have no 20-day history.
df['Upper'] = df.RollingMean + 2 * df.RollingDeviation
df['Lower'] = df.RollingMean - 2 * df.RollingDeviation
df = df.dropna()
df
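
Steps (4) and (5) can be bundled into one helper. The following is a sketch under assumptions, not the notebook's own method: `bollinger` is a hypothetical function name, and the input is a synthetic random walk rather than the AAPL data.

```python
import numpy as np
import pandas as pd

def bollinger(close, window=20, k=2.0):
    """Return the rolling mean and the bands k standard deviations around it."""
    mean = close.rolling(window).mean()
    std = close.rolling(window).std()  # pandas default is the sample std (ddof=1)
    return pd.DataFrame({
        'RollingMean': mean,
        'Upper': mean + k * std,
        'Lower': mean - k * std,
    })

# Synthetic random-walk prices, used only to exercise the helper.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())

bands = bollinger(close).dropna()
print(len(bands))
```

The first `window - 1` rows have no complete window, which is why they come out as NaN and are dropped.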

(6) Create a smaller dataframe using only data over a 6 month period from Jan 1, 2014 to July 1, 2014.  
Use matplotlib to plot the following: Closing Value, Rolling Mean, Upper Bollinger, Lower Bollinger.   
Use a different color for each plot.  
Use the function fill_between to color the region between the Upper/Lower bands green, with an alpha of 0.1 so that it is see-through.  
Show your plot.

In [None]:
# .copy() makes df2 an independent frame, so later column assignments do not warn.
df2 = df.loc[(df.index >= '2014-01-01') & (df.index <= '2014-07-01')].copy()
df2

In [None]:
# Cast the price columns to float32; reassigning df2 avoids chained-assignment warnings.
price_cols = ['Close', 'Open', 'High', 'Low']
df2 = df2.astype({col: 'float32' for col in price_cols})
df2.dtypes

In [None]:
plt.plot(df2.Close, label='Close')
plt.plot(df2.RollingMean, label='Rolling Mean')
plt.plot(df2.Upper, label='Upper Band')
plt.plot(df2.Lower, label='Lower Band')
# Shade the region between the bands in translucent green, as the task asks.
plt.fill_between(df2.index, y1=df2.Lower, y2=df2.Upper, color='green', alpha=0.1)
plt.title('AAPL Close with Bollinger Bands')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.show()

(7) One theory of stocks is to buy when the current value is below the lower Bollinger band  
and sell when the current value is above the upper Bollinger band.  
This does get a bit trickier because it is potentially better to buy or sell after the moment  
the stock rebounds inside the Bollinger bands, that way you do not buy or sell a stock that is in  
a gigantic freefall or a huge spike, but we will ignore that.  
Create a new column called Buy that is true each time the current closing value is lower than the lower band  
and another column for Sell that is true each time the current closing value is higher than the upper band.  
Display your df.

In [None]:
df2['Buy'] = (df2.Close < df2.Lower)
df2['Sell'] = (df2.Close > df2.Upper)
df2

(8) Display all of the rows where Buy is true.

In [None]:
df2.loc[df2.Buy]

(9) Display all of the rows where Sell is true.

In [None]:
df2.loc[df2.Sell]
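
The signal logic from steps (7) to (9) can be seen on a tiny example. This sketch uses an invented three-row frame with fixed bands (`toy` is hypothetical), so each row lands in a known region:

```python
import pandas as pd

# Invented closes with constant bands: below, inside, and above the band.
toy = pd.DataFrame({
    'Close': [9.0, 10.0, 12.0],
    'Lower': [9.5, 9.5, 9.5],
    'Upper': [11.0, 11.0, 11.0],
})

toy['Buy'] = toy.Close < toy.Lower    # close has dropped under the lower band
toy['Sell'] = toy.Close > toy.Upper   # close has risen over the upper band

buys = toy[toy.Buy]     # a boolean column can be used directly as a mask
sells = toy[toy.Sell]
print(buys.index.tolist(), sells.index.tolist())
```

Because `Buy` and `Sell` are already boolean, comparing them to `True` is unnecessary.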

(10) Display the final closing value of the last day.

In [None]:
df2.tail(1).Close

(11) You can calculate this by hand or you can think through the code to automate this task.  
Automating the task is ideal, but for a data set this small, working out the logic  
may take longer than doing it by hand.  
  
How much money would you have gained or lost in the following two situations:

a) You bought 500 shares of Apple on January 1st of 2014 and sold them on July 1st of 2014.

b) You bought 500 shares of Apple on the first day your 'Buy' column told you to. 
You sold the 500 shares on the next date after that when your 'Sell' column told you to. 
You keep repeating this process, either buying 500 shares or selling the 500 shares you own. 
In this exercise, you may never hold more than 500 shares or fewer than 0.  
If you still hold 500 shares on July 1st, you sell them at that day's closing price.  
(Not realistic, just an exercise.)

In [None]:
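# The market is closed on Jan 1, so buy at the last close on or before that date.
buy_price = float(df.loc[df.index <= '2014-01-01'].Close.iloc[-1])
# Sell at the first close on or after July 1.
sell_price = float(df.loc[df.index >= '2014-07-01'].Close.iloc[0])
500 * (sell_price - buy_price)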

a) You would have made $6687.15.

b) There are only a few times when there is an "optimal opportunity to buy", so you don't make as much. You buy early, sell two months later, and never buy again. You would make only $290.
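
The rules in (11b) can also be automated. The sketch below is one possible reading of those rules, not the method used to get the figures above: `simulate` is a hypothetical helper, and the frame it runs on is invented rather than taken from the AAPL data.

```python
import pandas as pd

def simulate(frame, shares=500):
    """Total profit from alternating Buy/Sell signals.

    Holds either 0 or `shares` shares; any position still open on the
    last row is closed at that row's closing price.
    """
    profit = 0.0
    entry = None  # purchase price while holding, otherwise None
    closes = frame['Close'].astype(float)
    for close, buy, sell in zip(closes, frame['Buy'], frame['Sell']):
        if entry is None and buy:
            entry = close                        # open a position
        elif entry is not None and sell:
            profit += shares * (close - entry)   # close the position
            entry = None
    if entry is not None:
        profit += shares * (closes.iloc[-1] - entry)
    return profit

# Invented data: buy at 9, sell at 12, buy at 8, forced sale at 9.5.
toy = pd.DataFrame({
    'Close': [10.0, 9.0, 11.0, 12.0, 8.0, 9.5],
    'Buy':   [False, True, False, False, True, False],
    'Sell':  [False, False, False, True, False, False],
})
print(simulate(toy))  # 500 * 3.0 + 500 * 1.5 = 2250.0
```

On the notebook's data this would be called as `simulate(df2)`, once the `Buy` and `Sell` columns exist.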