Every day, trillions of bytes of financial data are sent over the internet, whether it’s the price of a stock, an e-commerce transaction, or even information about the GDP of a country.

All of this data, when organized and managed properly, can be used to build some amazing and insightful software applications.

This article uses two Python modules:

Pandas - used to organize and format complex data into table structures called DataFrames.

Pandas-datareader - used to access public financial data from the internet and import it into Python as a DataFrame.

## Importing CSV Data Manually

The easiest way to import financial data into Python is to get it from a file that is stored locally on your computer.

To import data from a CSV file like this, the pandas module has a useful function: read_csv.

import pandas as pd

data = pd.read_csv("data.csv")

![p](https://i.imgur.com/m3RoClf.jpg)
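As a minimal sketch of read_csv in action, the example below reads CSV text from an in-memory buffer instead of a file on disk, so it runs without a data.csv present. The symbol, column names, and prices are made up for illustration. parse_dates and index_col are optional read_csv parameters, used here to turn the date column into the DataFrame's index.

```python
import io

import pandas as pd

# Made-up sample data standing in for the contents of a local CSV file.
csv_text = """date,symbol,close
2019-01-02,ABC,10.0
2019-01-03,ABC,10.5
2019-01-04,ABC,11.0
"""

# read_csv accepts a file path or any file-like object.
# parse_dates converts the "date" column to datetimes,
# and index_col makes that column the row index.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

print(data.head())
print(data.dtypes)
```

With a real file, the first argument would simply be the path, as in pd.read_csv("data.csv").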

## Importing Data Using Datareader

Many financial institutions, stock markets, and world banks provide large amounts of the data they store to the public.

Most of this data is well organized, live updated, and accessible through the use of an application programming interface (API), which gives programming languages like Python a way to download and import it.

### Pandas-Datareader Module

The pandas-datareader module is designed specifically to interact with some of the world’s most popular finance data APIs, and import their data into an easily digestible pandas DataFrame.

Each finance API is accessed using a different function exposed by pandas-datareader. Generally, each API requires a different set of arguments that the programmer must provide.

![i](https://i.imgur.com/lKFo7Pl.jpg)

## Getting NASDAQ Symbols

The NASDAQ stock exchange identifies each of its stocks using a unique symbol:

Apple - AAPL

Google - GOOGL

Tesla - TSLA

It also provides a useful API for accessing the symbols that are currently trading on it.

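Pandas-datareader provides several functions for importing data from NASDAQ’s API through its nasdaq_trader sub-module.

from pandas_datareader.nasdaq_trader import func

In the code above, func is a placeholder for whichever function we want to import from the nasdaq_trader sub-module. Note that the module name in the import uses an underscore (pandas_datareader), because a hyphen is not valid in a Python module name.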

In order to import the list of stock symbols, we’ll want to use nasdaq_trader’s get_nasdaq_symbols function.

from pandas_datareader.nasdaq_trader import get_nasdaq_symbols

symbols = get_nasdaq_symbols()

When called, it will go off to NASDAQ’s API, and import the list of symbols trading at that moment.

The benefit of using pandas-datareader is that all of the logic for interacting with NASDAQ’s API, or any other API, is encapsulated in easy-to-use sub-modules and functions like the ones above.

![p](https://i.imgur.com/3vNgLgJ.jpg)
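Once the symbols are in a DataFrame, ordinary pandas indexing applies. The sketch below does not call NASDAQ's API. It builds a small hypothetical stand-in for the returned DataFrame so it can run offline. The index name "Symbol" and the columns "Security Name" and "ETF" are assumptions based on the pandas-datareader documentation and may differ between versions, and the two rows are invented.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by get_nasdaq_symbols().
# Index and column names are assumptions; the rows are illustrative only.
symbols = pd.DataFrame(
    {
        "Security Name": ["Example Corp - Common Stock", "Example Index Fund"],
        "ETF": [False, True],
    },
    index=pd.Index(["EXMP", "EXIF"], name="Symbol"),
)

# Look up a single symbol by its index label.
print(symbols.loc["EXMP", "Security Name"])

# Keep only the rows that are not exchange-traded funds.
stocks_only = symbols[~symbols["ETF"]]
print(list(stocks_only.index))
```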

## Filtering Data by Date

Many of the APIs pandas-datareader connects with allow us to filter the data we get back by time.

Financial institutions tend to keep track of data dating back several decades, and when we’re importing that data, it’s useful to be able to specify exactly when we want it to be from.

One API that does just that is FRED (Federal Reserve Economic Data), which is maintained by the Federal Reserve Bank of St. Louis. We can access it by first importing the pandas_datareader.data sub-module and then calling its DataReader function:

import pandas_datareader.data as web

web.DataReader('MORTGAGE30US', 'fred', start_date, end_date)

The DataReader function takes 4 arguments:

'MORTGAGE30US' - An identifier provided by the API specifying the data we want back, in this case 30 year mortgage data in the US

'fred' - The name of the API we want to access

start_date, end_date - The date range we want the data to be from

The start and end dates are special data types called datetimes, which can be created using the Python datetime module.

from datetime import datetime

start_date = datetime(2018, 7, 8) # year, month, day

end_date = datetime(2019, 4, 13)

![p](https://i.imgur.com/RfY19pH.jpg)
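The same pair of datetimes can also be used to narrow down data that is already loaded. The sketch below is an offline illustration, not a call to FRED: it applies the date range to a small DataFrame of made-up values with a date index, using pandas label-based slicing.

```python
from datetime import datetime

import pandas as pd

start_date = datetime(2018, 7, 8)  # year, month, day
end_date = datetime(2019, 4, 13)

# Made-up values on a date index, standing in for downloaded data.
rates = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.to_datetime(
        ["2018-06-01", "2018-08-01", "2018-12-01", "2019-03-01", "2019-05-01"]
    ),
)

# Label-based slicing on a sorted date index includes both endpoints.
in_range = rates.loc[start_date:end_date]
print(in_range)
```

Only the three rows dated between the start and end dates remain. The rows from June 2018 and May 2019 fall outside the range and are dropped.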

## API Keys

Many finance APIs require us to pass along extra information when requesting data. One common requirement is an API key.

An API key is a unique string used to identify and authenticate entities requesting data.

API keys are generally long, randomly generated strings provided by the API.

Some APIs, like the ones we’ve looked at so far, don’t require a key to access data, but in general, most do.

You can obtain a key by signing up with the website or organization hosting the API.

A good rule is to treat your API keys like you would a password. You don’t want to share them with anyone, and in the case of software development, you don’t want to check them into source control systems like GitHub.

### Using API Keys

In some cases you’ll pass the API key directly into the pandas-datareader function you’re using to access the API.

Other times you’ll be required to set the API key as a more secure operating system (os) environment variable, as with the Quandl API below:

import os

os.environ["QUANDL_API_KEY"] = "demo"

df = web.DataReader('AAPL.US', 'quandl', start_date, end_date)
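To keep a real key out of source code entirely, one common pattern is to set the environment variable outside the program and only read it in Python. The sketch below assumes a hypothetical helper named get_api_key and a hypothetical variable named EXAMPLE_API_KEY. Neither is part of pandas-datareader.

```python
import os


def get_api_key(variable_name):
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(variable_name)
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {variable_name} environment variable first.")
    return key


# For demonstration only: a real key would be set outside the program,
# for example in the shell, so it never appears in source control.
os.environ.setdefault("EXAMPLE_API_KEY", "demo")
print(get_api_key("EXAMPLE_API_KEY"))
```

Raising an error when the variable is missing makes a configuration problem visible immediately, rather than as a confusing authentication failure later.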

***Note***: It’s never a good idea to enter an API key on a website you don’t own. If you want to run this code with an actual API key do so on your local machine. Writing code with an API key on any online code editor including Codecademy is a bad practice.



## Flaky APIs and Changing Contracts

One of the risks of using public APIs is that you’re relying on an external service to work as expected at all times, and it often doesn’t.

When an API is intermittently offline or not working, we call it flaky.

We can’t control whether an API acts flaky, but here are a few tips to help ensure it doesn’t prevent us from building something great.

1. Test your code - Testing as often as possible will ensure your code works from day to day and will help to identify any APIs that are consistently acting flaky.

2. Keep up to date with the datareader documentation - Because we’re accessing these finance APIs through pandas-datareader, that’s a good place to look if an API starts acting unexpectedly.

3. Actively monitor the pandas-datareader project on GitHub - Sometimes the bug is in the pandas-datareader project itself rather than in the APIs it’s calling. The GitHub page for the project is a good place to ask questions and stay up to date on the latest issues identified in the project.
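Beyond the tips above, code can also be written to tolerate a flaky API by retrying a failed request a few times before giving up. This is a swapped-in technique rather than something the tips prescribe. The sketch below assumes a hypothetical helper, call_with_retries, and a fake flaky_request function that stands in for a real API call.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0):
    """Call func(), retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:
            if attempt == attempts:
                raise  # Out of attempts: let the caller see the failure.
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({error}); retrying in {delay}s")
            time.sleep(delay)


# Hypothetical flaky function: fails twice, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("service unavailable")
    return "data"


result = call_with_retries(flaky_request, attempts=3, base_delay=0.01)
print(result)
```

In real use, func would wrap the actual download call, and the delay would be long enough to give the service time to recover.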

## Using the Shift Operation

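Once we’ve imported a DataFrame full of finance data, there are some pretty cool ways we can manipulate it. One of them is the shift operation, which moves the values in a column up or down by a given number of rows and fills the vacated rows with NaN.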

Shift can be called on a single column, or on the entire DataFrame where all columns will be shifted.

You can also shift by more than one row, and in either direction.

dataframe.shift(1) # shifts all rows down by 1

dataframe['name'].shift(-5) # shifts all rows in the name column up 5

dataframe['price'].shift(3) # shifts all rows in the price column down 3

Shift is particularly useful when dealing with financial data. For example, it can be used to help calculate the percentage of growth between one row and the next, or find the difference in stock prices over a series of days.

![p](https://i.imgur.com/pr0KInt.jpg)

![p](https://i.imgur.com/WIVNOz9.jpg)
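The percentage-growth use case can be sketched as follows. The prices are made up, and the column names previous, change, and pct_growth are chosen only for this illustration.

```python
import pandas as pd

# Made-up closing prices for four consecutive days.
prices = pd.DataFrame({"price": [100.0, 110.0, 99.0, 108.9]})

# shift(1) moves each value down one row, so every row sees the previous price.
prices["previous"] = prices["price"].shift(1)

# Day-over-day change and percentage growth. The first row has no
# previous value, so its results are NaN.
prices["change"] = prices["price"] - prices["previous"]
prices["pct_growth"] = prices["change"] / prices["previous"] * 100

print(prices)
```

Because the shifted column lines up each price with the one before it, the row-to-row comparison becomes ordinary column arithmetic with no explicit loop.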

## Calculating Basic Financial Statistics

Two useful calculations that can be made on financial data are variance and covariance.

### Variance

Variance measures how far a set of numbers is spread out from its average. In finance, this is used to determine the volatility of investments.

dataframe['stocks'].var() # 106427

dataframe['bonds'].var() # 2272

In the variance calculations above, stocks have a larger value than bonds.

That’s because the stock prices are more spread out than bonds, indicating that stocks are a more volatile investment.
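To make the calculation concrete, here is a small sketch using made-up prices (not the data behind the figures above). It also writes the variance out by hand to show what var() computes. By default, pandas returns the sample variance, which divides by n - 1.

```python
import pandas as pd

# Made-up prices, chosen so the arithmetic is easy to follow.
dataframe = pd.DataFrame(
    {
        "stocks": [100.0, 120.0, 90.0, 130.0],
        "bonds": [50.0, 51.0, 49.0, 50.0],
    }
)

# pandas computes the sample variance by default (divides by n - 1).
stock_variance = dataframe["stocks"].var()
bond_variance = dataframe["bonds"].var()

# The same calculation for stocks, written out by hand.
mean = dataframe["stocks"].mean()
manual = ((dataframe["stocks"] - mean) ** 2).sum() / (len(dataframe) - 1)

print(stock_variance, manual)
print(bond_variance)
```

In this made-up data the stock prices swing by tens of units while the bond prices move by at most one, so the stock variance comes out far larger.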

### Covariance

Covariance, in a financial context, describes the relationship between the returns on two different investments over a period of time, and can be used to help balance a portfolio.

Calling cov() on our stocks/bonds DataFrame produces a matrix that holds the covariance value for each pair of columns in the DataFrame.

In our example data, when stock prices go up, bonds go down. We can use the covariance function to see this numerically: two columns that move in opposite directions have a negative covariance.

![p](https://i.imgur.com/vzk28Y8.jpg)
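As a minimal sketch, the example below calls cov() on a DataFrame of made-up prices in which bonds fall whenever stocks rise. For simplicity it uses prices directly rather than returns, and the numbers are invented for illustration.

```python
import pandas as pd

# Made-up prices: bonds move down whenever stocks move up.
dataframe = pd.DataFrame(
    {
        "stocks": [100.0, 120.0, 90.0, 130.0],
        "bonds": [52.0, 49.0, 53.0, 48.0],
    }
)

# cov() returns a square matrix with one row and one column per DataFrame column.
covariance_matrix = dataframe.cov()
print(covariance_matrix)

# The off-diagonal entry is the covariance between the two columns.
stocks_vs_bonds = covariance_matrix.loc["stocks", "bonds"]
print(stocks_vs_bonds)
```

The diagonal of the matrix holds each column's variance, and the negative off-diagonal value reflects the opposite movement of the two columns.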

## Review

1. Python is able to import financial data from CSV files as well as public financial APIs.
2. The pandas read_csv function can be used to import data from a CSV file into a pandas DataFrame.
3. Pandas-datareader makes it easy to import data from public financial APIs.
4. Python’s datetime module can be used to create datetime objects, which are often used to specify time ranges for financial data.
5. API keys are unique identifiers required for some APIs in order to access data.
6. Sometimes APIs can be flaky. To mitigate the damage this might cause it’s best to test your code often and keep up to date with the pandas-datareader documentation and GitHub page.
7. The shift function can be used on the rows in a DataFrame column to shift them up or down.
8. Pandas provides common statistical functions like var and cov to make it easy to calculate variance and covariance on a dataset.

![p](https://i.imgur.com/W62htg1.jpg)

![p](https://i.imgur.com/WpghjaT.jpg)

# Quiz and Its Solution

![p](https://i.imgur.com/z0j2kog.jpg)

![p](https://i.imgur.com/wzAewe5.jpg)

![p](https://i.imgur.com/vclUgdE.jpg)

![p](https://i.imgur.com/cOHwtfv.jpg)

![p](https://i.imgur.com/OY8GHbv.jpg)

![p](https://i.imgur.com/LhJZvRo.jpg)

![p](https://i.imgur.com/B2Ai5Rh.jpg)

![p](https://i.imgur.com/RamElnZ.jpg)

![p](https://i.imgur.com/xXwWkPf.jpg)