# Tutorial 16 - `pandas_datareader`

The purpose of this short tutorial is to introduce the `pandas_datareader` package. It is a convenient way to download data from a variety of online sources such as Yahoo Finance, Quandl, and the Federal Reserve (FRED).

This functionality used to be part of the `pandas.io` submodule of `pandas` but now lives in a separate package.

### Loading Packages

Let's begin by loading the packages that we will need.

In [1]:
import pandas as pd
import pandas_datareader as pdr

### Yahoo Finance

The function for retrieving data from Yahoo is `pdr.get_data_yahoo()`.

The following code retrieves `AAPL` data from January 2019 through November 2019.

In [5]:
df_aapl = pdr.get_data_yahoo('AAPL', start='2019-01-01', end='2019-12-01')
df_aapl.head()

Date,High,Low,Open,Close,Volume,Adj Close
2019-01-02,158.850006,154.229996,154.889999,157.919998,37039700.0,155.582367
2019-01-03,145.720001,142.0,143.979996,142.190002,91312200.0,140.08522
2019-01-04,148.550003,143.800003,144.529999,148.259995,58607100.0,146.065353
2019-01-07,148.830002,145.899994,148.699997,147.929993,54777800.0,145.740265
2019-01-08,151.820007,148.520004,149.559998,150.75,41025300.0,148.518509
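
Because the returned frame is indexed by date, it can be sliced by date strings, and daily returns follow directly from the `Adj Close` column. The sketch below is only an illustration: it rebuilds the five rows shown above by hand so that it runs without a network connection, and the names `df_sample` and `daily_return` are made up for this example.

```python
import pandas as pd

# Rebuild the first rows of the output above by hand (no download needed).
df_sample = pd.DataFrame(
    {
        "Close": [157.919998, 142.190002, 148.259995, 147.929993, 150.75],
        "Adj Close": [155.582367, 140.08522, 146.065353, 145.740265, 148.518509],
    },
    index=pd.to_datetime(
        ["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07", "2019-01-08"]
    ),
)
df_sample.index.name = "Date"

# Daily simple return: each adjusted close relative to the previous one.
# The first row has no previous value, so its return is NaN.
df_sample["daily_return"] = df_sample["Adj Close"].pct_change()

# With a DatetimeIndex, rows can be selected by date string (both ends inclusive).
print(df_sample.loc["2019-01-03":"2019-01-07"])
```

The same two lines at the end should work on a frame returned by `pdr.get_data_yahoo()`, assuming it has the column layout shown above.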


The following code retrieves VIX data from 2014-2018.

In [57]:
df_vix = pdr.get_data_yahoo('^VIX', start='2014-01-01', end='2019-01-01')
df_vix.head()

Date,High,Low,Open,Close,Volume,Adj Close
2014-01-02,14.59,14.0,14.32,14.23,0,14.23
2014-01-03,14.22,13.57,14.06,13.76,0,13.76
2014-01-06,14.0,13.22,13.41,13.55,0,13.55
2014-01-07,13.28,12.16,12.38,12.92,0,12.92
2014-01-08,13.24,12.86,13.04,12.87,0,12.87


**Code Challenge:** Grab the 2018 prices for `XLF`.

### Federal Reserve (FRED)

The function for retrieving data from FRED is `pdr.get_data_fred()`.

This code grabs the VXN index (a VIX-like calculation for the Nasdaq 100) from 2019-01-01 through 2019-12-12.

In [11]:
df_vxn = pdr.get_data_fred('VXNCLS', start='2019-01-01', end='2019-12-12')
df_vxn

DATE,VXNCLS
2019-01-01,
2019-01-02,30.09
2019-01-03,32.18
2019-01-04,28.57
2019-01-07,28.53
...,...
2019-12-06,16.44
2019-12-09,18.15
2019-12-10,18.02
2019-12-11,17.41
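
The first row of the output has no value, most likely because 2019-01-01 was a market holiday. A minimal sketch of counting and dropping such rows follows; it rebuilds the first rows shown above by hand so it runs offline, and `df_sample`, `n_missing`, and `df_clean` are illustrative names.

```python
import pandas as pd

# Rebuild the first rows of the FRED output above, including the empty value.
df_sample = pd.DataFrame(
    {"VXNCLS": [float("nan"), 30.09, 32.18, 28.57, 28.53]},
    index=pd.to_datetime(
        ["2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07"]
    ),
)
df_sample.index.name = "DATE"

# Count the missing observations, then drop them.
n_missing = int(df_sample["VXNCLS"].isna().sum())
df_clean = df_sample.dropna()

print(n_missing, len(df_clean))
```

Whether to drop or forward-fill missing dates depends on the analysis; dropping is simply the most conservative choice for a sketch.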


### Further Reading

At the moment, there isn't much good documentation for `pandas_datareader`.

Here is a link to the [official docs](https://pydata.github.io/pandas-datareader/stable/), which aren't the best.

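### Alpha Vantage

The `DataReader()` function in `pandas_datareader.data` can also pull intraday data from Alpha Vantage. This source requires an API key, which is read from the `ALPHAVANTAGE_API_KEY` environment variable.

In [52]: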
import os
from datetime import datetime
import pandas_datareader.data as web


In [53]:
os.environ['ALPHAVANTAGE_API_KEY'] = 'YOUR_API_KEY'  # replace with your own Alpha Vantage key
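
Writing a key directly into a notebook makes it easy to leak when the notebook is shared. One alternative, sketched here under the assumption that the key was set in the shell before Jupyter was started, is to read it from the environment and fail early if it is missing. The helper name `get_api_key` is hypothetical and not part of `pandas_datareader`.

```python
import os


def get_api_key(var_name="ALPHAVANTAGE_API_KEY"):
    """Return the key stored in the environment variable, or raise if it is unset."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before requesting data."
        )
    return key
```

Calling `get_api_key()` at the top of the notebook gives a clear error message up front instead of a failed request later.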


In [62]:
# Intraday bars for Citigroup (ticker "C") from Alpha Vantage
f = web.DataReader("C", "av-intraday", start=datetime(2019, 10, 1),
                   end=datetime(2019, 12, 15))


In [63]:
f

,open,high,low,close,volume
2019-12-09 09:31:00,75.240,75.2400,75.2400,75.2400,28426
2019-12-09 09:32:00,75.283,75.3521,75.1514,75.2330,243026
2019-12-09 09:33:00,75.240,75.2800,75.1700,75.2400,63740
2019-12-09 09:34:00,75.250,75.3700,75.2500,75.3326,46113
2019-12-09 09:35:00,75.350,75.4100,75.3000,75.3600,31751
...,...,...,...,...,...
2019-12-13 15:56:00,76.370,76.3750,76.3300,76.3300,100899
2019-12-13 15:57:00,76.330,76.3600,76.3150,76.3500,73932
2019-12-13 15:58:00,76.350,76.3900,76.3500,76.3600,76291
2019-12-13 15:59:00,76.370,76.4500,76.3650,76.4002,100983
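
One-minute bars like these are often aggregated to a coarser interval. The sketch below resamples to five-minute bars using the first five rows of the output above, rebuilt by hand so it runs offline. It assumes the index is a `DatetimeIndex`; if the frame comes back with string timestamps, convert them with `pd.to_datetime()` first. The names `bars` and `five_min` are illustrative.

```python
import pandas as pd

# Rebuild the first five one-minute bars from the output above.
bars = pd.DataFrame(
    {
        "open": [75.240, 75.283, 75.240, 75.250, 75.350],
        "high": [75.2400, 75.3521, 75.2800, 75.3700, 75.4100],
        "low": [75.2400, 75.1514, 75.1700, 75.2500, 75.3000],
        "close": [75.2400, 75.2330, 75.2400, 75.3326, 75.3600],
        "volume": [28426, 243026, 63740, 46113, 31751],
    },
    index=pd.to_datetime(
        [
            "2019-12-09 09:31:00",
            "2019-12-09 09:32:00",
            "2019-12-09 09:33:00",
            "2019-12-09 09:34:00",
            "2019-12-09 09:35:00",
        ]
    ),
)

# Aggregate each five-minute window: first open, highest high,
# lowest low, last close, and total volume.
five_min = bars.resample("5min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)

print(five_min)
```

Each window is labeled by its start time, so the bars from 09:31 through 09:34 land in the 09:30 row and the 09:35 bar starts the next one.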
