## Web APIs for data
We have been loading data from files using `read_csv()` and `read_excel()`. A second way to get data into Python/pandas is to download it directly from a web server through an *application programming interface*, or API.

The [Wikipedia page](https://en.wikipedia.org/wiki/Web_API) isn't that insightful, but an API is a way to directly query a web server and (in our case) ask for data. An API provides several advantages:
1. You only download the data you need
2. You do not need to distribute data files with your code
3. You have access to the 'freshest data'
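
To make 'querying a web server' a bit more concrete: a request to a web API is often just a URL with the details of what you want tacked on the end. The sketch below only builds such a URL; it does not contact any server. The address `api.example.com` and the parameter names are made up for illustration, and real APIs each define their own.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.com/v1/series"

def build_query_url(series_id, start_date):
    """Return the URL that asks the (made-up) server for one series from a start date."""
    params = {"series_id": series_id, "start": start_date, "format": "json"}
    return BASE_URL + "?" + urlencode(params)

url = build_query_url("GDPCA", "1970-01-01")
print(url)
```

Packages like the one we use below build requests like this for us, so we rarely have to write the URL by hand.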

There are downsides to using APIs, too.

1. You need to be online to retrieve the data
2. The group hosting the data may 'revise' the data, making it difficult to replicate your results
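
One common way to soften both downsides is to save a snapshot of the data the first time you download it and reuse the file afterwards. Here is a minimal sketch of that idea. The helper `fetch_or_load()` and the stand-in `fake_fetch()` are hypothetical names, and the numbers are made up; in practice `fetch` would be whatever function calls the API.

```python
import os
import tempfile
import pandas as pd

def fetch_or_load(fetch, path):
    """Load the snapshot at `path` if it exists; otherwise call `fetch()` and save the result."""
    if os.path.exists(path):
        return pd.read_csv(path, index_col=0, parse_dates=True)
    df = fetch()
    df.to_csv(path)
    return df

def fake_fetch():
    # Stand-in for a real API call; the numbers are made up.
    dates = pd.to_datetime(["2020-01-01", "2021-01-01"])
    return pd.DataFrame({"value": [1.0, 2.0]}, index=dates)

snapshot = os.path.join(tempfile.mkdtemp(), "snapshot.csv")
first = fetch_or_load(fake_fetch, snapshot)    # calls fake_fetch and writes the file
second = fetch_or_load(fake_fetch, snapshot)   # reads the file; no "download" needed
```

The trade-off is that a saved snapshot is no longer the 'freshest data', so delete the file when you want to pull new numbers.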

On the whole, I find APIs very convenient and useful. Let's dig in.

### The packages
The package `pandas_datareader` collects functions that interact with the APIs of several popular data sources. These include
* Google finance
* Morningstar
* St. Louis Fed's Fred (one of my favorites)
* The World Bank
* Eurostat
* Quandl


### Installing packages with pip
We use the Anaconda distribution, which bundles Python with many other useful packages. `pandas_datareader`, however, is not one of them.

We will install the package using 'pip', the Python package manager.
1. Open a command window (open the start menu and type: 'cmd'). 
2. Run the command `pip install --user pandas_datareader` and hit enter

That should do it. It might take a minute, and fill the command window with text, but in the end it should have installed. You will probably see a message about updating pip. We can safely ignore it.
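
If you want to confirm the install worked before moving on, one option is to ask Python whether it can find the package. This is a small sketch using the standard library; the helper name `is_installed()` is my own.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can find an importable package with this name."""
    return importlib.util.find_spec(package_name) is not None

print(is_installed("pandas_datareader"))
```

If this prints `False`, the most likely cause is that pip installed the package into a different Python than the one running your notebook.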

Now that the package is installed, we can import it into our program like usual. 

In [None]:
import pandas as pd                       # pandas, shortened to pd

# If you receive an error while importing pandas_datareader, try uncommenting the line below.
# This is/was a problem with an older version of pandas_datareader.
# pd.core.common.is_list_like = pd.api.types.is_list_like

from pandas_datareader import data, wb    # we are grabbing the data and wb functions from the package
import matplotlib.pyplot as plt           # for plotting
import datetime as dt                     # for time and date

### FRED
The FRED database is hosted by the St. Louis FRB. It houses lots of economic and financial data. It is US-centric but has some international data, too.

To use the FRED API you need to know the variable codes. The easiest way to find them is to search on the [FRED website](https://fred.stlouisfed.org/).

The pandas_datareader documentation for FRED is [here](https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#fred).

In [None]:
codes = ['GDPCA', 'LFWA64TTUSA647N']  # these codes are for real US gdp and the working age population
                                      # the first code seems intuitive. the second does not
    
# We have the codes. Now go get the data. The DataReader() function returns a DataFrame.
# Create a datetime object for the start date. If you do not specify an end date, it returns
# data up to the most recent date.
start = dt.datetime(1970, 1, 1)
fred = data.DataReader(codes, 'fred', start)

fred.head()


In [None]:
fred.columns = ['gdp', 'wap']

# Let's plot real gdp per working age person
fred['gdp_wap'] = fred['gdp']*1000000000/fred['wap']  # gdp data is in billions

fred.head()

In [None]:

fig, ax = plt.subplots(figsize=(10,5))
ax.plot(fred.index, fred['gdp_wap'], color='red')

ax.set_ylabel('2012 dollars')
ax.set_title('U.S. real GDP per working-age person')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.show()



### Stock prices with Google Finance

The [documentation](https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#google-finance) now features this warning: 

>Google’s API has become less reliable during 2017. While the google datareader often works as expected, it is not uncommon to experience a range of errors when attempting to read data, especially in bulk.

Another potential downside of APIs: They can change, breaking old code.
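
One defensive habit, if you depend on a source that might disappear, is to try a list of sources in order and use the first one that responds. The sketch below shows the pattern with made-up stand-in functions rather than real data sources; `first_working()` is a hypothetical helper, not part of `pandas_datareader`.

```python
def first_working(fetchers):
    """Try each (name, fetch) pair in order; return (name, result) from the first that succeeds."""
    errors = {}
    for name, fetch in fetchers:
        try:
            return name, fetch()
        except Exception as err:   # broad on purpose: we do not know how a source will fail
            errors[name] = repr(err)
    raise RuntimeError(f"all sources failed: {errors}")

# Made-up stand-ins for real data sources.
def broken_source():
    raise ConnectionError("service retired")

def working_source():
    return [101.2, 102.5, 99.8]

source, prices = first_working([("old", broken_source), ("new", working_source)])
print(source, prices)
```

Keep in mind that two sources may not report exactly the same numbers, so it is worth recording which one actually supplied the data.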

### Stock prices with IEX

According to the [docs](https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#morningstar):
>The Investors Exchange (IEX) provides a wide range of data through an API. Historical stock prices are available for up to 5 years.

In [None]:
start = dt.datetime(2017, 1, 1)
end = dt.datetime(2018, 10, 1)

sym = 'HOG'
iex = data.DataReader(sym, 'iex', start, end)

In [None]:
iex.head()

In [None]:
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(iex.index, iex['close'], color='blue')

ax.set_ylabel('closing price')
ax.set_title('Harley Davidson stock prices')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.show()

In [None]:
# Ahhh! Not a good looking figure.

# We need to set the index to a datetime object so mpl can get the axis right...
iex.index = pd.to_datetime(iex.index)
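
To see what that conversion does, here is a toy version with made-up prices. It assumes, as the messy axis suggests, that the dates arrived as plain strings. Before the conversion the index is a generic `Index` of labels; afterwards it is a `DatetimeIndex`, which matplotlib knows how to space and label as dates.

```python
import pandas as pd

# Made-up prices with the dates stored as plain strings.
toy = pd.DataFrame({"close": [50.0, 51.5, 49.0]},
                   index=["2017-01-03", "2017-01-04", "2017-01-05"])
before = type(toy.index).__name__

toy.index = pd.to_datetime(toy.index)
after = type(toy.index).__name__

print(before, "->", after)
```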

In [None]:
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(iex.index, iex['close'], color='blue')

ax.set_ylabel('closing price')
ax.set_title('Harley Davidson stock prices')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

plt.show()

## Practice: APIs

Take a few minutes and try the following. Feel free to chat with those around you if you get stuck. The TA and I are here, too.

How has inflation in the United States evolved over the last 60 years? Let's investigate.

1. Go to the FRED website and find the code for the 'Consumer price index for all urban consumers: All items less food and energy'
2. Use the api to get the data from 1960 to the most recent. 

3. Create a variable in your DataFrame that holds the growth rate of the CPI (the inflation rate). Compute it in percentage terms.

4. Plot it. What patterns do you see? 

5. Challenging. We computed the month-to-month inflation rate above. This is not the inflation rate we usually care about. Can you compute and plot the year-over-year inflation rate? For example, the inflation rate for 1962-05-01 would be the CPI in 1962-05-01 divided by the CPI in 1961-05-01, minus one. \[Hint: Check the documentation for `pct_change()`.\]

6. Annotate the decrease in inflation around 1983 as 'Volcker disinflation'
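
If the hint about `pct_change()` is not enough, here is a sketch of how its `periods` argument behaves on a toy monthly series. The numbers are made up (they are not CPI data) and the column names are my own; the point is only the difference between comparing with the previous row and comparing with the row 12 months back.

```python
import pandas as pd

# Made-up monthly index values, not CPI data: 100, 101, ..., 123.
dates = pd.date_range("1961-05-01", periods=24, freq="MS")
toy = pd.DataFrame({"level": [100.0 + i for i in range(24)]}, index=dates)

toy["mom"] = toy["level"].pct_change() * 100             # change from the previous month
toy["yoy"] = toy["level"].pct_change(periods=12) * 100   # change from 12 months earlier

print(toy.loc[pd.Timestamp("1962-05-01")])
```

Notice that the first 12 rows of the year-over-year column are missing, because there is no observation 12 months earlier to compare with.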