# Pandas Datareader Module

The `pandas_datareader` module is designed to interact with some of the world's most popular finance data APIs and import their data into an easily digestible pandas DataFrame.

Each API is accessed through its own function or data source name, and each generally requires a different set of arguments from the programmer.

See the [pandas_datareader documentation](https://pandas-datareader.readthedocs.io/en/latest/index.html).

In [None]:
%pip install pandas-datareader

In [None]:
from pandas_datareader import wb
from datetime import datetime

start = datetime(2005, 1, 1)
end = datetime(2008, 1, 1)
indicator_id = 'NY.GDP.PCAP.KD'

gdp_per_capita = wb.download(indicator=indicator_id, start=start, end=end, country=['US', 'CA', 'MX'])

print(gdp_per_capita)

## Getting NASDAQ Symbols

The NASDAQ stock exchange identifies each of its stocks using a unique ticker symbol, such as **AAPL** for Apple.

The `pandas_datareader` module provides several functions for importing data from NASDAQ's API through the `nasdaq_trader` sub-module.

In [None]:
from pandas_datareader.nasdaq_trader import get_nasdaq_symbols

# Depending on the installed pandas version, the call below may raise:
# TypeError: read_csv() takes 1 positional argument but 2 positional arguments
# (and 3 keyword-only arguments) were given
symbols = get_nasdaq_symbols()

print(symbols)

## Filtering Data by Date

One data source with records dating back several decades is FRED (Federal Reserve Economic Data), maintained by the Federal Reserve Bank of St. Louis. In `pandas_datareader`, its data source name is `"fred"`.

In [None]:
import pandas_datareader.data as web
from datetime import datetime

start = datetime(2019, 1, 1)
end = datetime(2019, 2, 1)

# S&P 500 daily index values from FRED
sap_data = web.DataReader("SP500", "fred", start, end)

print(sap_data)

## API Keys

An API key is a unique string used to identify and authenticate entities requesting data.

003026bbc133714df1834b8638bb496e-8f4b3d9a-e931

Like the example above, API keys are generally long, randomly generated strings issued by the API provider.

Some APIs don’t require a key to access data, but in general, most do.

You can obtain a key by signing up with the website or organization hosting the API.

A good rule is to treat your API keys like you would a password. You don’t want to share them with anyone, and in the case of software development, you don’t want to check them into source control systems like GitHub.
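One common way to keep a key out of source control is to read it from an environment variable at runtime. The sketch below is only an illustration: the variable name `EXAMPLE_API_KEY` and the helper `load_api_key` are hypothetical and not part of `pandas_datareader`.

```python
import os


def load_api_key(var_name="EXAMPLE_API_KEY"):
    """Return the key stored in an environment variable, or raise if it is unset.

    `EXAMPLE_API_KEY` is a hypothetical name chosen for this example.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running this code.")
    return key


# Demo only: in real use, set the variable in your shell or notebook
# environment, never in code that gets committed.
os.environ["EXAMPLE_API_KEY"] = "demo-key-not-a-real-secret"

api_key = load_api_key()
print(f"Loaded a key with {len(api_key)} characters")
```

Because the key lives outside the notebook, the file can be shared or committed without exposing it.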

One of the risks of using public APIs is that you’re relying on an external service to work as expected at all times, and external services sometimes fail.

When an API is intermittently offline or not working, we call it **flaky**.

You can’t control whether an API acts flaky, but here are a few tips to help ensure it doesn’t prevent you from building something great.

1. Test your code - Testing as often as possible will ensure your code works from day to day and will help to identify any APIs that are consistently acting flaky.

2. Keep up to date with the pandas-datareader documentation - Because we’re accessing these finance APIs through pandas-datareader, that’s a good place to look if an API starts acting unexpectedly.

3. Actively monitor the pandas-datareader project on GitHub - Sometimes the bug is in the pandas-datareader project rather than in the API it’s calling. The GitHub page for the project is a good place to ask questions and stay up to date on the latest issues identified in the project.
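Beyond the tips above, code can also be made more tolerant of intermittent failures by retrying a request a few times before giving up. The following is a minimal sketch, not a feature of `pandas_datareader`: the helper `fetch_with_retries` and the simulated `flaky_fetch` function are hypothetical names introduced for illustration, and the demo uses a fake request so that it runs without a network connection.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay_seconds=1.0):
    """Call fetch() until it succeeds or the attempts run out.

    Hypothetical helper: waits a little longer after each failed attempt.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)
    raise RuntimeError(f"All {attempts} attempts failed") from last_error


# Simulated flaky request: fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated outage")
    return "data"


result = fetch_with_retries(flaky_fetch, attempts=5, delay_seconds=0)
print(result, "after", calls["count"], "calls")
```

In a notebook like this one, the same helper could wrap a real request, for example `fetch_with_retries(lambda: web.DataReader("GDP", "fred", start, end))`.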

## Using the Shift Operation

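Shift can be called on a single column or on an entire DataFrame, in which case all columns are shifted. Shifting can happen in both positive and negative directions, and is useful when dealing with financial data. Below is code that uses the shift operation to determine the change in GDP from one period to the next. FRED reports the `GDP` series quarterly, so `shift(1)` compares each quarter with the previous one.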

In [None]:
import pandas_datareader.data as web
from datetime import datetime

start = datetime(2008, 1, 1)
end = datetime(2018, 1, 1)

gdp = web.DataReader("GDP", "fred", start, end)

print(gdp.head(10))

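# Subtract the previous row's value from each row; the first row has no
# predecessor, so its growth is NaN
gdp["growth"] = gdp["GDP"] - gdp["GDP"].shift(1)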

print(gdp.head(10))
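To see what shifting does without calling an external API, here is a small sketch on made-up prices. The values and column names are invented for illustration; `pct_change()` is included as the built-in pandas way to express the same period-over-period comparison as a fraction.

```python
import pandas as pd

# Made-up daily prices, for illustration only
prices = pd.DataFrame(
    {"price": [100.0, 102.0, 101.0, 105.0]},
    index=pd.date_range("2019-01-01", periods=4, freq="D"),
)

# A positive shift moves values down: each row sees the previous row's value
prices["previous"] = prices["price"].shift(1)

# A negative shift moves values up: each row sees the next row's value
prices["next"] = prices["price"].shift(-1)

# Absolute change from the previous row, and the same change as a fraction
prices["change"] = prices["price"] - prices["previous"]
prices["pct_change"] = prices["price"].pct_change()

print(prices)
```

The first row of `previous` and the last row of `next` are `NaN`, because there is no neighbouring row to take a value from.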

## Variance

Variance measures how far a set of numbers is spread out from its average. In finance, it is commonly used as a measure of the volatility of an investment.

The variance of each column in a DataFrame can be computed with `dataframe.var()`.
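As a small sketch, the example below compares two made-up return series; the numbers and column names are invented for illustration. Note that pandas computes the sample variance by default (`ddof=1`); passing `ddof=0` gives the population variance.

```python
import pandas as pd

# Made-up periodic returns for two hypothetical investments
returns = pd.DataFrame(
    {
        "steady": [0.01, 0.02, 0.01, 0.02],
        "volatile": [0.10, -0.08, 0.12, -0.06],
    }
)

# Sample variance of each column (pandas default, ddof=1)
variances = returns.var()

# Population variance of each column (ddof=0)
population_variances = returns.var(ddof=0)

print(variances)
print(population_variances)
```

The `volatile` column has the larger variance, which matches the intuition that its returns swing much further from their average.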

## Covariance

Covariance, in a financial context, describes the relationship between the returns on two different investments over a period of time, and can be used to help balance a portfolio.

Calling `cov()` on a DataFrame produces a matrix containing the covariance for every pair of columns in the DataFrame.
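A minimal sketch with made-up returns for three hypothetical investments shows the shape of the result; the data and column names are invented for illustration.

```python
import pandas as pd

# Made-up periodic returns: "a" and "b" tend to move together,
# while "c" tends to move in the opposite direction
returns = pd.DataFrame(
    {
        "a": [0.01, 0.03, -0.02, 0.04],
        "b": [0.02, 0.05, -0.03, 0.06],
        "c": [-0.01, -0.02, 0.03, -0.04],
    }
)

# Square matrix indexed by column name on both axes
cov_matrix = returns.cov()

print(cov_matrix)
```

The matrix is symmetric, and each diagonal entry is that column's variance. A positive off-diagonal value (here `a` with `b`) means the two series tend to move in the same direction, while a negative value (here `a` with `c`) means they tend to move in opposite directions, which is the property used when balancing a portfolio.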