# Yahoo Finance Stock Data with Python

Data is the new oil. However, just like oil, data has to be extracted from somewhere before we can do anything with it. One easily accessible repository of stock data is <a href="https://finance.yahoo.com" target="_blank">Yahoo Finance</a>. While it is possible to download .csv files for individual stocks directly, the <a href="https://pypi.org/project/yfinance/" target="_blank">yfinance</a> library makes it more efficient to load the data straight into Python.

The following Python code provides a wrapper function which uses yfinance to download stock data from Yahoo Finance and returns it as a pandas DataFrame.

In [1]:
import pandas as pd
import yfinance

def download_stock_data(Stocks, start_date, end_date):
    """
    Wrapper function for yfinance.download()

    Inputs
    ------
    Stocks: list of str
        ticker symbols to download from Yahoo Finance, e.g. ["AAPL"]
    start_date: datetime
        first date to download
    end_date: datetime
        end of the date range (yfinance treats this date as exclusive)

    Returns
    -------
    stocks: pd.DataFrame
        pd.DataFrame of stock data
    """
    # Columns of the combined table, in the order they are returned.
    want = ["Stock", "Date", "Open", "High", "Low", "Close", "Adj Close", "Volume"]
    frames = []
    for ticker in Stocks:
        print("Downloading:", ticker)
        # auto_adjust=False keeps the separate "Adj Close" column
        # (some yfinance versions adjust prices by default and drop it)
        stock = yfinance.download(ticker, start=start_date, end=end_date, auto_adjust=False)
        # move the "Date" index into an ordinary column
        stock = stock.reset_index()
        # record which ticker each row belongs to
        stock["Stock"] = ticker
        frames.append(stock)
    if not frames:
        # nothing was requested: return an empty table with the expected columns
        return pd.DataFrame(columns=want)
    # concatenate once, renumbering the rows from 0, then re-arrange the columns
    stocks = pd.concat(frames, ignore_index=True)
    stocks = stocks[want]
    return stocks
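
Depending on the installed yfinance version, the table returned by `yfinance.download()` may carry two-level column labels (field name plus ticker) even when a single ticker is requested. This is an assumption about newer releases and worth checking against your own installation. If it applies, the column selection in the wrapper will not find plain names such as `"Close"`. The sketch below shows one possible way to flatten such labels before `reset_index()` is called. `flatten_price_columns` is a hypothetical helper, not part of yfinance, and the stand-in table is built by hand from two rows of the AAPL output shown in this article.

```python
import pandas as pd

def flatten_price_columns(frame):
    """Return a copy of `frame` whose columns are plain field names.

    Hypothetical helper (not part of yfinance). Assumes that, when the
    columns are a MultiIndex, the first level holds the field names.
    """
    flat = frame.copy()
    if isinstance(flat.columns, pd.MultiIndex):
        # keep only the first level ("Open", "Close", ...), dropping the ticker level
        flat.columns = flat.columns.get_level_values(0)
    return flat

# Hand-built stand-in for a downloaded table with two-level columns.
columns = pd.MultiIndex.from_product([["Open", "Close"], ["AAPL"]])
raw = pd.DataFrame(
    [[121.010002, 122.720001], [122.019997, 123.080002]],
    index=pd.to_datetime(["2020-12-01", "2020-12-02"]),
    columns=columns,
)
raw.index.name = "Date"

flat = flatten_price_columns(raw).reset_index()
flat["Stock"] = "AAPL"
print(list(flat.columns))
```

Tables that already have single-level columns pass through the helper unchanged, so it could be applied regardless of the yfinance version in use.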

The wrapper function `download_stock_data()` takes a list of ticker symbols to download, together with a start date and an end date. The example below passes both dates as datetime objects created with `pd.to_datetime()`; `yfinance.download()` also accepts date strings in `YYYY-MM-DD` format. Note that yfinance treats the end date as exclusive.
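
As a small sketch of the date handling, the snippet below shows what `pd.to_datetime()` returns for a date string. It uses only pandas and the standard library, and makes no network requests.

```python
import datetime as dt
import pandas as pd

# pd.to_datetime() also parses days that are not zero-padded ("2020-12-1")
start_date = pd.to_datetime("2020-12-1")
end_date = pd.to_datetime("2020-12-30")

# The result is a pandas Timestamp, which is a subclass of datetime.datetime
print(type(start_date).__name__)
print(isinstance(start_date, dt.datetime))

# Converting back to a YYYY-MM-DD string
print(start_date.strftime("%Y-%m-%d"))

# Number of calendar days between the two dates
print((end_date - start_date).days)
```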

In [2]:
start_date = pd.to_datetime("2020-12-1") # Start date: 2020 Dec 01
end_date = pd.to_datetime("2020-12-30") # End date: 2020 Dec 30
Stocks = ["AAPL"] # Download Apple stock data

stocks = download_stock_data(Stocks, start_date, end_date)
display(stocks)

Downloading: AAPL
[*********************100%***********************]  1 of 1 completed


| | Stock | Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|---|---|
| 0 | AAPL | 2020-12-01 | 121.010002 | 123.470001 | 120.010002 | 122.720001 | 122.146095 | 127728200 |
| 1 | AAPL | 2020-12-02 | 122.019997 | 123.370003 | 120.889999 | 123.080002 | 122.504417 | 89004200 |
| 2 | AAPL | 2020-12-03 | 123.519997 | 123.779999 | 122.209999 | 122.940002 | 122.365074 | 78967600 |
| 3 | AAPL | 2020-12-04 | 122.599998 | 122.860001 | 121.519997 | 122.25 | 121.678299 | 78260400 |
| 4 | AAPL | 2020-12-07 | 122.309998 | 124.57 | 122.25 | 123.75 | 123.17128 | 86712000 |
| 5 | AAPL | 2020-12-08 | 124.370003 | 124.980003 | 123.089996 | 124.379997 | 123.79834 | 82225500 |
| 6 | AAPL | 2020-12-09 | 124.529999 | 125.949997 | 121.0 | 121.779999 | 121.210495 | 115089200 |
| 7 | AAPL | 2020-12-10 | 120.5 | 123.870003 | 120.150002 | 123.239998 | 122.663666 | 81312200 |
| 8 | AAPL | 2020-12-11 | 122.43 | 122.760002 | 120.550003 | 122.410004 | 121.837555 | 86939800 |
| 9 | AAPL | 2020-12-14 | 122.599998 | 123.349998 | 121.540001 | 121.779999 | 121.210495 | 79184500 |
