# BA886 - Capstone 2021
# Sample Python Code for Downloading Yahoo! Finance Data for U.S. Stocks
- Professor Peter Wysocki
- Perpetual Investors Projects
- February 2021





**Description:**
- The following code cells contain example code that uses the *yfinance* package to download stock market data for U.S. companies
- Please see the short video posted on the BA886 site on QuestromTools (under Media Gallery), which explains this code
- You may run this "ipynb" notebook on Google Colab or in Jupyter Notebook

**Important: Documentation on the "yfinance" package can be found here:**

- https://aroussi.com/post/python-yahoo-finance

- https://pypi.org/project/yfinance/

- https://www.datasciencelearner.com/yahoo-finance-api-python/


In [1]:
# Install yfinance package using pip (use pip3 for installing a package for Python3)
!pip3 install yfinance

Collecting yfinance
  Downloading yfinance-0.1.55.tar.gz (23 kB)
Collecting multitasking>=0.0.7
  Downloading multitasking-0.0.9.tar.gz (8.1 kB)
Building wheels for collected packages: yfinance, multitasking
  Building wheel for yfinance (setup.py) ... done
  Created wheel for yfinance: filename=yfinance-0.1.55-py2.py3-none-any.whl size=22618 sha256=eb798d35392b73966d92148f0560066631fb1110c79bda654ef8592b02637541
  Stored in directory: /Users/maralinetorres/Library/Caches/pip/wheels/b4/c3/39/9c01ae2b4726f37024bba5592bec868b47a2fab5a786e8979a
  Building wheel for multitasking (setup.py) ... done
  Created wheel for multitasking: filename=multitasking-0.0.9-py3-none-any.whl size=8367 sha256=ba625200ac06b769bd633590038678a50727f691f53f21a7a586e9ed95c2e6de
  Stored in directory: /Users/maralinetorres/Library/Caches/pip/wheels/57/6d/a3/a39b839cc75274d2acfb1c58bfead2f726c6577fe8c4723f13
Successfully built yfinance multitasking
Installing collected packages: multitaski

In [2]:
# We will use the following packages: "pandas" (for dataframes) and yfinance (to get data)
# So import these packages (reference as "pd" and "yf" for the code).
import pandas as pd
import yfinance as yf

In [3]:
# The project calls for stock market data on 10 European companies.
# This example uses only 3 U.S. companies (you have more to do).
# IMPORTANT: See the instructions from Prof. Wysocki to learn how to get the correct tickers.

# This string holds the 3 tickers separated by spaces
tickers = 'MSFT AAPL IBM'

# We will create a pandas dataframe named 'df_yahoo'.
# We will use the yf.download (yfinance) function to get the data from the Yahoo! Finance website.
# The call below uses the following parameters:
#   -> tickers: The tickers to download (a space-separated string)
#   -> period: How much history to download. In this case, 10 years ('10y') back from today
#   -> interval: Frequency of the data. In this case, monthly data ('1mo')
#   -> group_by: 'ticker' groups the columns by ticker

df_yahoo = yf.download(tickers, period="10y", interval="1mo", group_by = 'ticker')

[*********************100%***********************]  3 of 3 completed
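
With `group_by = 'ticker'`, the returned dataframe has two-level column labels: the ticker on the first level and the price field on the second. The sketch below illustrates that layout without calling Yahoo! Finance. The name `toy` is a hypothetical stand-in for `df_yahoo`, built by hand from a few of the Close and Volume values shown in the output of this notebook.

```python
import pandas as pd

# Hypothetical stand-in for df_yahoo (no download needed):
# column level 0 is the ticker, column level 1 is the price field.
dates = pd.to_datetime(["2011-03-01", "2011-04-01", "2011-05-01"])
columns = pd.MultiIndex.from_product([["AAPL", "IBM"], ["Close", "Volume"]])
toy = pd.DataFrame(
    [
        [12.446786, 11306460000.0, 163.070007, 129714300.0],
        [12.504643, 9253829000.0, 170.580002, 100769500.0],
        [12.422500, 6912060000.0, 168.929993, 110804200.0],
    ],
    index=dates,
    columns=columns,
)

# Selecting on the first level returns every field for one ticker
aapl = toy["AAPL"]

# A cross-section on the second level returns one field for every ticker
closes = toy.xs("Close", axis=1, level=1)

print(aapl)
print(closes)
```

Selecting by ticker is convenient for per-company files, while the cross-section is handy when comparing one field, such as closing prices, across companies.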


In [4]:
# List the first 5 rows of the dataframe

df_yahoo.head()

,AAPL,AAPL,AAPL,AAPL,AAPL,AAPL,IBM,IBM,IBM,IBM,IBM,IBM,MSFT,MSFT,MSFT,MSFT,MSFT,MSFT
,Open,High,Low,Close,Adj Close,Volume,Open,High,Low,Close,Adj Close,Volume,Open,High,Low,Close,Adj Close,Volume
Date,,,,,,,,,,,,,,,,,,
2011-02-15,,,,,,,,,,,,,,,,,,
2011-03-01,12.695357,12.916786,11.652143,12.446786,10.721204,11306460000.0,163.149994,167.720001,151.710007,163.070007,116.558899,129714300.0,26.6,26.780001,24.68,25.389999,20.306141,1310885000.0
2011-04-01,12.539643,12.683214,11.434286,12.504643,10.771044,9253829000.0,163.699997,173.0,162.190002,170.580002,121.926849,100769500.0,25.530001,26.870001,24.719999,25.92,20.730015,1313845000.0
2011-05-01,12.490714,12.565357,11.765,12.4225,10.700285,6912060000.0,172.110001,173.539993,165.899994,168.929993,120.747505,110804200.0,25.940001,26.25,24.030001,25.01,20.002222,1364063000.0
2011-05-06,,,,,,,,,,,,,,,,,,
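
The output above contains rows (2011-02-15 and 2011-05-06) where every column is empty. One possible cleanup, sketched below, is to drop rows in which all values are missing. The name `toy` is a hypothetical stand-in holding the AAPL Close and Volume values from the table above; why Yahoo! Finance returns these rows is not stated in this notebook, so treat the choice to drop them as an assumption to check against your own data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in reproducing the AAPL Close/Volume rows shown above
toy = pd.DataFrame(
    {
        "Close": [np.nan, 12.446786, 12.504643, 12.4225, np.nan],
        "Volume": [np.nan, 11306460000.0, 9253829000.0, 6912060000.0, np.nan],
    },
    index=pd.to_datetime(
        ["2011-02-15", "2011-03-01", "2011-04-01", "2011-05-01", "2011-05-06"]
    ),
)
toy.index.name = "Date"

# how="all" removes a row only when every column in it is missing,
# so rows with partial data are kept
clean = toy.dropna(how="all")

print(clean)
```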


In [5]:
# If we just wish to list the data for the stock ticker AAPL, we would do the following.
# .copy() gives us an independent dataframe, so adding a column to it later
# does not trigger the pandas SettingWithCopyWarning.

dd = df_yahoo['AAPL'].copy()

In [8]:
# Add a 'Ticker' column so each row stays identifiable after export
tickname='AAPL'
dd['Ticker'] = tickname
dd


,Open,High,Low,Close,Adj Close,Volume,Ticker
Date,,,,,,,
2011-02-15,,,,,,,AAPL
2011-03-01,12.695357,12.916786,11.652143,12.446786,10.721204,1.130646e+10,AAPL
2011-04-01,12.539643,12.683214,11.434286,12.504643,10.771044,9.253829e+09,AAPL
2011-05-01,12.490714,12.565357,11.765000,12.422500,10.700285,6.912060e+09,AAPL
2011-05-06,,,,,,,AAPL
...,...,...,...,...,...,...,...
2021-01-01,133.520004,145.089996,126.379997,131.960007,131.763107,2.239426e+09,AAPL
2021-02-01,133.750000,137.880005,130.929993,135.389999,135.187988,6.600613e+08,AAPL
2021-02-05,,,,,,,AAPL
2021-02-09,,,,,,,AAPL


In [9]:
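# Build the output file name from the ticker (here 'AAPL_yfinance.csv')
name_CSV_file = tickname + '_yfinance.csv'

# Save the dataframe as a CSV file in the current working directory
dd.to_csv(name_CSV_file)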
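
Since the project involves more than one company, the same export can be repeated for every ticker in a loop. The following is a minimal sketch, not the required approach: `toy` is a hypothetical stand-in for `df_yahoo` built from values shown earlier in this notebook, and `out_dir` is a temporary folder chosen only for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for df_yahoo (columns grouped by ticker)
dates = pd.to_datetime(["2011-03-01", "2011-04-01"])
columns = pd.MultiIndex.from_product([["AAPL", "IBM"], ["Close", "Volume"]])
toy = pd.DataFrame(
    [
        [12.446786, 11306460000.0, 163.070007, 129714300.0],
        [12.504643, 9253829000.0, 170.580002, 100769500.0],
    ],
    index=dates,
    columns=columns,
)
toy.index.name = "Date"

# Temporary output folder (an assumption; use your own project folder instead)
out_dir = tempfile.mkdtemp()

written = []
for tick in toy.columns.get_level_values(0).unique():
    one = toy[tick].copy()      # independent copy for this ticker
    one["Ticker"] = tick        # label the rows with the ticker
    path = os.path.join(out_dir, tick + "_yfinance.csv")
    one.to_csv(path)            # one CSV file per ticker
    written.append(path)

print(written)
```

Each file follows the same `<ticker>_yfinance.csv` naming used above, so the per-ticker files can be read back or combined later.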