# Download Historical Stock Data

Use this notebook to download daily trading data for a group of stocks from Yahoo Finance. The trading data is stored in a designated sub-directory (default `./data/stocks/`) as individual `.csv` files, one for each stock. Subsequent notebooks can read and consolidate the stock price data.

Run the cells in the notebook once to create data sets for use by other notebooks, or to refresh a previously stored set of data. Running the download cell overwrites any existing data sets.
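
As a rough illustration of how a later notebook might read and consolidate the stored files, the sketch below collects one price column from every `.csv` file in the data directory into a single data frame. The function name `load_adjusted_close` is hypothetical, and the code assumes each file has the date in its first column and an `Adj Close` column, which is the layout the Yahoo reader in `pandas_datareader` normally produces. Adjust the column name if your files differ.

```python
import glob
import os

import pandas as pd


def load_adjusted_close(data_dir=os.path.join("data", "stocks")):
    """Combine the 'Adj Close' column of every .csv file in data_dir.

    Hypothetical helper: assumes the first column of each file holds the
    date and that a column named 'Adj Close' is present.
    """
    series = {}
    for filename in sorted(glob.glob(os.path.join(data_dir, "*.csv"))):
        # the stock symbol is the file name without its extension
        symbol = os.path.splitext(os.path.basename(filename))[0]
        df = pd.read_csv(filename, index_col=0, parse_dates=True)
        series[symbol] = df["Adj Close"]
    # one column per symbol, rows aligned on the date index
    return pd.DataFrame(series)


prices = load_adjusted_close()
print(prices.tail())
```

If the data directory is empty or missing, the function returns an empty data frame rather than raising an error.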

## Installing and Testing `pandas_datareader`

The notebook uses the `pandas_datareader` module to read data from Yahoo Finance. Web interfaces for financial services are notoriously fickle and subject to change, which is a particular issue on Google Colaboratory. The following cell tests whether `pandas_datareader` is installed and functional. If the test download fails, the cell attempts to upgrade the package and restarts the Python kernel. If you encounter repeated errors, please report this as an issue for this notebook.

In [3]:
import sys
import os

# attempt to import; if the module is not found, install it and import again
try:
    import pandas_datareader as pdr

except ImportError:
    !pip install -q pandas_datareader
    import pandas_datareader as pdr

# test download; if it fails, upgrade and restart the kernel
try:
    goog = pdr.DataReader("GOOG", "yahoo")
    print("pandas_datareader is installed and appears to be working correctly.")
except Exception:
    !pip install pandas_datareader --upgrade
    # kill the current process so the kernel restarts with the upgraded package
    os.kill(os.getpid(), 9)
    

pandas_datareader is installed and appears to be working correctly.


## Stocks to Download

Edit the following cell to change the list of stock symbols to download from Yahoo Finance, to change the historical period (`n_years`), or to change the data directory.

In [1]:
import os

# list of stock symbols
assets = ['AXP', 'AAPL', 'AMGN', 'BA', 'CAT', 'CRM', 'CSCO', 'CVX', 'DIS', 'DOW',
          'GS', 'HD', 'IBM', 'INTC', 'JNJ', 'JPM', 'KO', 'MCD', 'MMM', 'MRK',
          'MSFT', 'NKE', 'PG', 'TRV', 'UNH', 'V', 'VZ', 'WBA', 'WMT', 'XOM']

# number of years
n_years = 3.0

# create data directory
data_dir = os.path.join('data', 'stocks')
os.makedirs(data_dir, exist_ok=True)
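
To see how `n_years` translates into a date range, the sketch below repeats the arithmetic used by the download cell later in this notebook, which subtracts `round(n_years*365)` days from the end date. It uses a fixed, hypothetical end date instead of today's date so that the result is reproducible. Because a year is counted as 365 days, leap days shift the start date slightly.

```python
import datetime

n_years = 3.0

# hypothetical fixed end date (the notebook uses today's date instead)
end_date = datetime.date(2021, 6, 30)

# same arithmetic as the download cell: one year is counted as 365 days
n_days = round(n_years * 365)
start_date = end_date - datetime.timedelta(n_days)

print(n_days)       # 1095
print(start_date)   # 2018-07-01
```

The start date lands on July 1 rather than June 30 because the three-year span contains February 29, 2020.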


## Downloads

Run the following cell to download the historical stock data.

In [2]:
import pandas as pd
import datetime

# historical period
end_date = datetime.datetime.today().date()
start_date = end_date - datetime.timedelta(round(n_years*365))

# get daily price data from Yahoo Finance and save it as a .csv file in `path`
def get_stock_data(s, path=data_dir):
    print(f"Downloading {s:6s}", end="")
    try:
        data = pdr.DataReader(s, "yahoo", start_date, end_date)
    except Exception:
        print(" download failed")
        return
    try:
        # use the `path` argument rather than the global data_dir
        filename = os.path.join(path, s + '.csv')
        data.to_csv(filename)
        print(f" saved to {filename}")
    except Exception:
        print(" save failed")
    
for s in assets:
    get_stock_data(s)
    

Downloading AXP    saved to data/stocks/AXP.csv
Downloading AAPL   saved to data/stocks/AAPL.csv
Downloading AMGN   saved to data/stocks/AMGN.csv
Downloading BA     saved to data/stocks/BA.csv
Downloading CAT    saved to data/stocks/CAT.csv
Downloading CRM    saved to data/stocks/CRM.csv
Downloading CSCO   saved to data/stocks/CSCO.csv
Downloading CVX    saved to data/stocks/CVX.csv
Downloading DIS    saved to data/stocks/DIS.csv
Downloading DOW    saved to data/stocks/DOW.csv
Downloading GS     saved to data/stocks/GS.csv
Downloading HD     saved to data/stocks/HD.csv
Downloading IBM    saved to data/stocks/IBM.csv
Downloading INTC   saved to data/stocks/INTC.csv
Downloading JNJ    saved to data/stocks/JNJ.csv
Downloading JPM    saved to data/stocks/JPM.csv
Downloading KO     saved to data/stocks/KO.csv
Downloading MCD    saved to data/stocks/MCD.csv
Downloading MMM    saved to data/stocks/MMM.csv
Downloading MRK    saved to data/stocks/MRK.csv
Downloading MSFT   saved to data/stocks/