# Capstone Project 2
**Predicting stock prices with a neural network**

## Introduction

The objective of this project is to predict movement in Microsoft stock using a deep learning model. To capture as much relevant information as possible, the model will include more than Microsoft's own historical prices: three of Microsoft's major competitors (Apple, Amazon, and Google), technical indicators for Microsoft, and market indices for NASDAQ and NYSE.

## Data Wrangling
We use the Alpha Vantage API to pull price data and technical indicators for the model.

### Import Packages

In [1]:
import pandas as pd
import requests

In [2]:
api_key = ''

### Define function to pull stock data

In [3]:
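def get_stock_data(symbol, start_date='2010-01-01', end_date='2018-12-31'):
    """
    function to pull daily price data for a stock using the alpha vantage API

    symbol is a string representing a stock symbol, e.g. 'AAPL'

    start_date is the start date of the time series, defaults to '2010-01-01'

    end_date is the end date of the time series, defaults to '2018-12-31'

    start_date & end_date must be strings in 'YYYY-MM-DD' format; both bounds are exclusive
    """

    # Build the request URL; api_key is the module-level key defined above
    url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=' + symbol + '&outputsize=full&apikey=' + api_key

    r = requests.get(url)

    # Decode the JSON response body
    data = r.json()

    # Dates are the keys of the time series, so transpose to get one row per date
    df = pd.DataFrame(data['Time Series (Daily)']).T

    # ISO date strings compare correctly as plain strings
    df = df[(df.index > start_date) & (df.index < end_date)]

    return df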
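
The frame returned above keeps the API's column labels (`1. open`, `4. close`, and so on) and lists the newest date first, as the outputs further down show. The sketch below is one possible clean-up step, not part of the original notebook: `tidy_daily_frame` is a hypothetical helper, and it assumes the cell values arrive as strings, which is how the JSON payload appears to deliver them. The sample rows reuse two MSFT rows shown later in this notebook.

```python
import pandas as pd

def tidy_daily_frame(raw):
    """Return a copy with short column names, numeric values, and dates in ascending order."""
    tidy = raw.copy()
    # Strip the numeric prefixes, e.g. '4. close' -> 'close'
    tidy.columns = [name.split('. ', 1)[-1] for name in tidy.columns]
    # Parse the string index into timestamps and the string cells into numbers
    tidy.index = pd.to_datetime(tidy.index)
    tidy = tidy.apply(pd.to_numeric)
    # Oldest row first
    return tidy.sort_index()

# Hand-built sample in the same shape as the API result (values assumed to be strings)
sample = pd.DataFrame(
    {
        '2018-12-28': {'1. open': '102.09', '4. close': '100.39', '5. volume': '38169312'},
        '2018-12-27': {'1. open': '99.3', '4. close': '101.18', '5. volume': '49498509'},
    }
).T

tidy_sample = tidy_daily_frame(sample)
print(tidy_sample)
```

Sorting oldest-first is a design choice that suits rolling calculations and time-ordered train/test splits; the raw frames can be left untouched and tidied only when needed.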

In [24]:
def get_ma(symbol, period, start_date='2010-01-01', end_date='2018-12-31'):
    """
    function to pull simple moving average data for closing prices of a stock using the alpha vantage API

    symbol is a string representing a stock symbol, e.g. 'AAPL'

    period is the number of days per moving average, i.e. 7-day moving average -> period = 7

    start_date is the start date of the time series, defaults to '2010-01-01'

    end_date is the end date of the time series, defaults to '2018-12-31'

    start_date & end_date must be strings in 'YYYY-MM-DD' format; both bounds are exclusive
    """

    # Build the request URL; api_key is the module-level key defined above
    url = 'https://www.alphavantage.co/query?function=SMA&symbol=' + symbol + '&interval=daily&time_period=' + str(period) + '&series_type=close&apikey=' + api_key

    r = requests.get(url)

    # Decode the JSON response body
    data = r.json()

    # Dates are the keys of the series, so transpose to get one row per date
    df = pd.DataFrame(data['Technical Analysis: SMA']).T

    # ISO date strings compare correctly as plain strings
    df = df[(df.index > start_date) & (df.index < end_date)]

    return df

In [17]:
symbol = 'MSFT'

url = 'https://www.alphavantage.co/query?function=SMA&symbol=' + symbol + '&interval=daily&time_period=7&series_type=close&apikey=' + api_key
    
r = requests.get(url)
    
df = r.json()

In [31]:
def get_macd(symbol, start_date='2010-01-01', end_date='2018-12-31'):
    """
    function to pull the moving average convergence/divergence data for closing prices of a stock using the alpha vantage API

    symbol is a string representing a stock symbol, e.g. 'AAPL'

    start_date is the start date of the time series, defaults to '2010-01-01'

    end_date is the end date of the time series, defaults to '2018-12-31'

    start_date & end_date must be strings in 'YYYY-MM-DD' format; both bounds are exclusive
    """

    # Build the request URL; api_key is the module-level key defined above
    url = 'https://www.alphavantage.co/query?function=MACD&symbol=' + symbol + '&interval=daily&series_type=close&apikey=' + api_key

    r = requests.get(url)

    # Decode the JSON response body
    data = r.json()

    # Dates are the keys of the series, so transpose to get one row per date
    df = pd.DataFrame(data['Technical Analysis: MACD']).T

    # ISO date strings compare correctly as plain strings
    df = df[(df.index > start_date) & (df.index < end_date)]

    return df

In [37]:
def get_bbands(symbol, start_date='2010-01-01', end_date='2018-12-31'):
    """
    function to pull bollinger bands data for closing prices of a stock with a 21 day period using the alpha vantage API

    symbol is a string representing a stock symbol, e.g. 'AAPL'

    start_date is the start date of the time series, defaults to '2010-01-01'

    end_date is the end date of the time series, defaults to '2018-12-31'

    start_date & end_date must be strings in 'YYYY-MM-DD' format; both bounds are exclusive
    """

    # Build the request URL; api_key is the module-level key defined above
    url = 'https://www.alphavantage.co/query?function=BBANDS&symbol=' + symbol + '&interval=daily&time_period=21&series_type=close&apikey=' + api_key

    r = requests.get(url)

    # Decode the JSON response body
    data = r.json()

    # Dates are the keys of the series, so transpose to get one row per date
    df = pd.DataFrame(data['Technical Analysis: BBANDS']).T

    # ISO date strings compare correctly as plain strings
    df = df[(df.index > start_date) & (df.index < end_date)]

    return df
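
All four functions end with the same two steps: pick one key out of the decoded JSON and filter the rows by date. If the response does not contain that key (for example when the request is rejected), the lookup fails with a bare `KeyError`. The sketch below shows one way to share those steps and surface a clearer error. `payload_to_frame` is a hypothetical helper, not part of the original notebook, and the `'Note'` key in the failing example is only a stand-in for whatever the service might return. The sample values are the 7-day SMA rows shown later in this notebook.

```python
import pandas as pd

def payload_to_frame(payload, key, start_date='2010-01-01', end_date='2018-12-31'):
    """Turn a decoded JSON payload into a date-filtered DataFrame, or raise a clear error."""
    if key not in payload:
        # Show what the service sent back instead of failing with a bare KeyError
        raise ValueError(f"Expected key {key!r} in response, got: {payload}")
    df = pd.DataFrame(payload[key]).T
    # ISO 'YYYY-MM-DD' strings sort lexicographically, so string comparison works;
    # both bounds are exclusive, matching the functions above
    return df[(df.index > start_date) & (df.index < end_date)]

# Hand-built payload in the shape the SMA endpoint is parsed with above
sample_payload = {
    'Technical Analysis: SMA': {
        '2018-12-28': {'SMA': '99.9557'},
        '2018-12-27': {'SMA': '100.4671'},
    }
}

# end_date is exclusive, so only the 2018-12-27 row survives this filter
sma_df = payload_to_frame(sample_payload, 'Technical Analysis: SMA', end_date='2018-12-28')
print(sma_df)

# A payload without the expected key raises a descriptive ValueError
try:
    payload_to_frame({'Note': 'hypothetical message'}, 'Technical Analysis: SMA')
except ValueError as err:
    print(err)
```

Because the bounds are exclusive, a row dated exactly `end_date` is dropped; pass a later `end_date` if the final day should be kept.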

### Create datasets

In [4]:
apple_df = get_stock_data('AAPL')

In [5]:
apple_df.head()

|  | 1. open | 2. high | 3. low | 4. close | 5. volume |
|---|---|---|---|---|---|
| 2018-12-28 | 157.5 | 158.52 | 154.55 | 156.23 | 42291424 |
| 2018-12-27 | 155.84 | 156.77 | 150.07 | 156.15 | 53117065 |
| 2018-12-26 | 148.3 | 157.23 | 146.72 | 157.17 | 58582544 |
| 2018-12-24 | 148.15 | 151.55 | 146.59 | 146.83 | 37169232 |
| 2018-12-21 | 156.86 | 158.16 | 149.63 | 150.73 | 95744384 |


In [6]:
google_df = get_stock_data('GOOGL')

In [7]:
google_df.head()

|  | 1. open | 2. high | 3. low | 4. close | 5. volume |
|---|---|---|---|---|---|
| 2018-12-28 | 1059.5 | 1064.23 | 1042.0 | 1046.68 | 1718352 |
| 2018-12-27 | 1026.2 | 1053.34 | 1007.0 | 1052.9 | 2299806 |
| 2018-12-26 | 997.99 | 1048.45 | 992.645 | 1047.85 | 2315862 |
| 2018-12-24 | 984.32 | 1012.12 | 977.6599 | 984.67 | 1817955 |
| 2018-12-21 | 1032.04 | 1037.67 | 981.19 | 991.25 | 5232490 |


In [8]:
msft_df = get_stock_data('MSFT')

In [9]:
msft_df.head()

|  | 1. open | 2. high | 3. low | 4. close | 5. volume |
|---|---|---|---|---|---|
| 2018-12-28 | 102.09 | 102.41 | 99.52 | 100.39 | 38169312 |
| 2018-12-27 | 99.3 | 101.19 | 96.4 | 101.18 | 49498509 |
| 2018-12-26 | 95.14 | 100.69 | 93.96 | 100.56 | 51634793 |
| 2018-12-24 | 97.68 | 97.97 | 93.98 | 94.13 | 43935192 |
| 2018-12-21 | 101.63 | 103.0 | 97.46 | 98.23 | 111242070 |


In [10]:
amzn_df = get_stock_data('AMZN')

In [11]:
amzn_df.tail()

|  | 1. open | 2. high | 3. low | 4. close | 5. volume |
|---|---|---|---|---|---|
| 2010-01-08 | 130.56 | 133.68 | 129.03 | 133.52 | 9830500 |
| 2010-01-07 | 132.01 | 132.32 | 128.8 | 130.0 | 11030200 |
| 2010-01-06 | 134.6 | 134.73 | 131.65 | 132.25 | 7178800 |
| 2010-01-05 | 133.43 | 135.479 | 131.81 | 134.69 | 8851900 |
| 2010-01-04 | 136.25 | 136.61 | 133.14 | 133.9 | 7599900 |


### Market indicators
We will also include composite index data to supplement the raw stock data, since broad market movement may help improve the model's predictive power. This data will be downloaded from Yahoo Finance in CSV format and added to the raw data folder.
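
A minimal sketch of how such a downloaded file might be loaded with pandas follows. It is not taken from the original notebook: `load_index_csv` is a hypothetical helper, the column layout (`Date`, `Open`, `High`, `Low`, `Close`, `Adj Close`, `Volume`) is an assumption about the export format, and the sample rows are made-up placeholders rather than real index values.

```python
import io
import pandas as pd

def load_index_csv(path_or_buffer, prefix):
    """Read an index CSV with a 'Date' column and prefix its columns with the index name."""
    # Assumed layout: a 'Date' column plus price/volume columns
    df = pd.read_csv(path_or_buffer, index_col='Date', parse_dates=True)
    # Prefix the columns so they stay distinguishable after joining with stock data
    return df.add_prefix(prefix + '_')

# Stand-in for a downloaded file; these rows are placeholders, not real quotes
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2010-01-04,1.0,2.0,0.5,1.5,1.5,100\n"
    "2010-01-05,1.5,2.5,1.0,2.0,2.0,200\n"
)

nasdaq_df = load_index_csv(sample_csv, 'nasdaq')
print(nasdaq_df)
```

In practice `path_or_buffer` would be the path to a file in the raw data folder; the in-memory buffer is used here only so the example runs on its own.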

### Technical indicators
In addition to the composite index data, we will supplement the model with some common technical indicators: 7-day and 21-day simple moving averages, moving average convergence/divergence (MACD), and Bollinger Bands. These will only be constructed for our target stock, Microsoft.
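
For intuition about what these indicators measure, the sketch below computes them locally from a series of closing prices with pandas. This notebook pulls the indicators from the API rather than computing them, so treat this as an illustration only. The 12/26/9 MACD spans and the band width of two standard deviations are the conventional settings and are assumptions here; the API's exact parameters and smoothing initialization are not confirmed, so its values may differ slightly. The input series is a synthetic ramp, not real price data.

```python
import pandas as pd

def sma(close, period):
    """Simple moving average over `period` observations."""
    return close.rolling(window=period).mean()

def macd(close, fast=12, slow=26, signal=9):
    """MACD line, signal line, and histogram from exponential moving averages."""
    fast_ema = close.ewm(span=fast, adjust=False).mean()
    slow_ema = close.ewm(span=slow, adjust=False).mean()
    macd_line = fast_ema - slow_ema
    signal_line = macd_line.ewm(span=signal, adjust=False).mean()
    return pd.DataFrame({
        'MACD': macd_line,
        'MACD_Signal': signal_line,
        'MACD_Hist': macd_line - signal_line,
    })

def bollinger_bands(close, period=21, num_std=2):
    """Middle band is the SMA; upper and lower bands sit num_std deviations away."""
    middle = close.rolling(window=period).mean()
    # Population standard deviation (ddof=0) is an assumption, not a confirmed API setting
    spread = num_std * close.rolling(window=period).std(ddof=0)
    return pd.DataFrame({
        'Real Lower Band': middle - spread,
        'Real Middle Band': middle,
        'Real Upper Band': middle + spread,
    })

# Synthetic closing prices 1.0, 2.0, ..., 30.0 (oldest first), for illustration only
close = pd.Series(range(1, 31), dtype=float)

ma7 = sma(close, 7)
macd_df = macd(close)
bands = bollinger_bands(close)
print(ma7.iloc[6])  # mean of 1..7
```

Note that these functions expect prices ordered oldest first, whereas the API frames in this notebook are newest first, so a frame would need to be sorted by date before being passed in.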

In [25]:
msft_ma7 = get_ma('MSFT', period = 7)

In [26]:
msft_ma7.head()

|  | SMA |
|---|---|
| 2018-12-28 | 99.9557 |
| 2018-12-27 | 100.4671 |
| 2018-12-26 | 100.7114 |
| 2018-12-24 | 101.4929 |
| 2018-12-21 | 103.6814 |


In [27]:
msft_ma21 = get_ma('MSFT', period = 21)

In [28]:
msft_ma21.tail()

|  | SMA |
|---|---|
| 2010-01-08 | 30.5082 |
| 2010-01-07 | 30.4563 |
| 2010-01-06 | 30.4248 |
| 2010-01-05 | 30.3871 |
| 2010-01-04 | 30.3333 |


In [33]:
msft_macd = get_macd('MSFT')

In [35]:
msft_macd.tail()

|  | MACD | MACD_Hist | MACD_Signal |
|---|---|---|---|
| 2010-01-08 | 0.3346 | -0.085 | 0.4196 |
| 2010-01-07 | 0.3645 | -0.0764 | 0.4409 |
| 2010-01-06 | 0.4185 | -0.0414 | 0.46 |
| 2010-01-05 | 0.4491 | -0.0213 | 0.4703 |
| 2010-01-04 | 0.4621 | -0.0135 | 0.4757 |


In [38]:
msft_bbands = get_bbands('MSFT')

In [39]:
msft_bbands.tail()

|  | Real Lower Band | Real Middle Band | Real Upper Band |
|---|---|---|---|
| 2010-01-08 | 29.5024 | 30.5082 | 31.514 |
| 2010-01-07 | 29.3773 | 30.4563 | 31.5353 |
| 2010-01-06 | 29.3091 | 30.4248 | 31.5404 |
| 2010-01-05 | 29.2673 | 30.3871 | 31.507 |
| 2010-01-04 | 29.2202 | 30.3333 | 31.4465 |


### Push raw data to csv files
Now that we have our raw data, let's write each dataset to its own CSV file for further analysis.

In [41]:
apple_df.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/apple.csv')
google_df.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/google.csv')
msft_df.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/msft.csv')
amzn_df.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/amzn.csv')
msft_ma7.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/msft_ma7.csv')
msft_ma21.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/msft_ma21.csv')
msft_macd.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/msft_macd.csv')
msft_bbands.to_csv('/Users/jessemailhot/Documents/GitHub/springboard/Capstone 2/raw data/msft_bbands.csv')
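
The eight calls above repeat the same absolute path. As a sketch of one alternative (not the original notebook's approach), the frames can be collected in a dictionary and written in a loop with `pathlib`. `save_frames` is a hypothetical helper, and the demo writes to a temporary directory rather than the author's folder; the sample frame reuses two of the 7-day SMA rows shown earlier.

```python
from pathlib import Path
import tempfile

import pandas as pd

def save_frames(frames, out_dir):
    """Write each DataFrame in `frames` to '<out_dir>/<name>.csv' and return the paths."""
    out_dir = Path(out_dir)
    # Create the target folder if it does not exist yet
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, frame in frames.items():
        path = out_dir / f'{name}.csv'
        frame.to_csv(path)
        paths.append(path)
    return paths

# Small stand-in for the notebook's frames
demo_frames = {
    'msft_ma7': pd.DataFrame(
        {'SMA': ['99.9557', '100.4671']},
        index=['2018-12-28', '2018-12-27'],
    )
}

with tempfile.TemporaryDirectory() as tmp:
    written = save_frames(demo_frames, tmp)
    # Read the file back to confirm the round trip
    round_trip = pd.read_csv(written[0], index_col=0)

print(round_trip)
```

With the real data, `frames` would map names such as `'apple'` or `'msft_macd'` to the corresponding DataFrames, and `out_dir` would point at the raw data folder.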