# Jagwinder Singh

## Complete the notebook
(version 10/27/2020)

## Resampling

### Percent Change - Single Starting Point & Resampling

1) Enter all the imports needed, plus the code to read the Quandl API key from a text file and store it in a variable:

In [None]:
import quandl
import pandas as pd
import pickle
import matplotlib.pyplot as plt
from matplotlib import style
style.use('default')

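# Read the key and strip any trailing newline; the with-block closes the file
with open('quandlapikey.txt', 'r') as key_file:
    api_key = key_file.read().strip()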
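
A minimal sketch of the key-reading step on its own, using a temporary file in place of `quandlapikey.txt` so it runs without the real key. The helper name `read_api_key` and the value `demo-key-123` are hypothetical and only serve the illustration.

```python
import os
import tempfile


def read_api_key(path):
    """Return the key stored in a text file, without surrounding whitespace."""
    with open(path, 'r') as key_file:
        return key_file.read().strip()


# Write a throwaway key file (hypothetical value) to demonstrate the read.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('demo-key-123\n')
    demo_path = tmp.name

demo_key = read_api_key(demo_path)
os.remove(demo_path)  # clean up the temporary file
print(repr(demo_key))
```

Stripping whitespace matters because a key file saved from an editor usually ends with a newline, which would otherwise be sent along with the key.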

2) Bulk download data:
* Write the code to read the state postal codes table from the website's HTML
* Write the code to create an empty Pandas DataFrame
* Write a loop over the U.S. postal codes that bulk downloads the 'Housing Price Index' for each state from Quandl's API server, converts each series to percent change from a single starting point as it downloads, and stores the result in the empty Pandas DataFrame

In [None]:
fifty_states = pd.read_html('https://www.infoplease.com/us/postal-information/state-abbreviations-and-state-postal-codes')
fifty_states = fifty_states[0]['Postal Code']

main_df = pd.DataFrame()

for abbv in fifty_states:
    query = "FMAC/HPI_"+str(abbv)
    df = quandl.get(query, authtoken=api_key)
    df = df[['NSA Value']].rename(columns={'NSA Value': abbv})
    # Percent change relative to the first observation (selected by position)
    df[abbv] = (df[abbv] - df[abbv].iloc[0]) / df[abbv].iloc[0] * 100.0
        
    if main_df.empty: 
        main_df = df 
    else:
        main_df = main_df.join(df) 

main_df.to_pickle('fifty_states_pc_ssp.pickle')

HPI_data = pd.read_pickle('fifty_states_pc_ssp.pickle')
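
The calculation inside the loop can be tried without the Quandl API. The sketch below uses made-up index values for two placeholder columns, `XX` and `YY` (not real state data), and a hypothetical helper `pct_from_start`, then joins the two columns on their shared date index the same way the loop builds `main_df`.

```python
import pandas as pd


def pct_from_start(series):
    """Percent change of every value relative to the first value."""
    start = series.iloc[0]
    return (series - start) / start * 100.0


# Made-up index levels on four month-end dates (illustration only).
dates = pd.to_datetime(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30'])
demo = pd.DataFrame({'XX': [50.0, 55.0, 60.0, 45.0]}, index=dates)
other = pd.DataFrame({'YY': [80.0, 88.0, 84.0, 100.0]}, index=dates)

demo['XX'] = pct_from_start(demo['XX'])     # roughly 0, 10, 20, -10
other['YY'] = pct_from_start(other['YY'])   # roughly 0, 10, 5, 25

# join() aligns the two frames on the date index, one column per series.
combined = demo.join(other)
print(combined)
```

Because every series starts at 0 percent, columns with very different raw index levels become directly comparable on one chart.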

3) Print the head:

In [None]:
print(HPI_data.head())

4) Print the data description of the DataFrame:

In [None]:
print(HPI_data.describe())

### Resampling

5) Print the head of the 'FL' column only:

In [None]:
print(HPI_data['FL'].head())

6) Graph column 'FL' only and show legend:

In [None]:
HPI_data['FL'].plot()
plt.legend()
plt.show()

7) Resample the 'FL' column quarterly, taking the mean of each quarter, and store the result (a Pandas Series) in a new variable called 'FL_Quarterly':

In [None]:
FL_Quarterly = HPI_data['FL'].resample('Q').mean() 

8) Print the head of the 'FL_Quarterly':

In [None]:
print(FL_Quarterly.head())

9) Resample the 'FL' column annually, taking the mean of each year, and store the result (a Pandas Series) in a new variable called 'FL_Annually':

In [None]:
FL_Annually = HPI_data['FL'].resample('A').mean() 

10) Print the head of the 'FL_Annually' only:

In [None]:
print(FL_Annually.head())
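
To see what the quarterly and annual means compute, the sketch below groups a made-up monthly series by calendar quarter and by year. This is a swapped-in technique rather than the notebook's `resample` call: it should produce the same means, but it labels each group by period or year instead of by a period-end date. The values 1 through 24 are illustrative only.

```python
import pandas as pd

# Two years of made-up monthly values: 1.0, 2.0, ..., 24.0.
months = pd.date_range('2000-01-01', periods=24, freq='MS')
monthly = pd.Series([float(n) for n in range(1, 25)], index=months, name='XX')

# Mean of the three months in each calendar quarter.
quarterly = monthly.groupby(monthly.index.to_period('Q')).mean()

# Mean of the twelve months in each calendar year.
annual = monthly.groupby(monthly.index.year).mean()

print(quarterly.head())
print(annual)
```

Each quarterly value averages three monthly points and each annual value averages twelve, which is why the resampled lines look smoother and have fewer points than the original series.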

11) Graphically compare the 'FL' data, the 'FL_Quarterly' data and 'FL_Annually' data and show the legend:

In [None]:
fig = plt.figure()
ax1 = plt.subplot2grid((1,1), (0,0))

HPI_data['FL'].plot(ax=ax1)
FL_Quarterly.plot(color='g',ax=ax1, label='FL_Quarterly')
FL_Annually.plot(color='r',ax=ax1, label='FL_Annually')
plt.legend()
plt.show()

### Percent Change - Point to Point & Resampling

12) Bulk download data:
* Write the code to read the state postal codes table from the website's HTML
* Write the code to create an empty Pandas DataFrame
* Write a loop over the U.S. postal codes that bulk downloads the 'Housing Price Index' for each state from Quandl's API server, calculates the point-to-point percent change, and stores the result in the empty Pandas DataFrame

In [None]:
fifty_states = pd.read_html('https://www.infoplease.com/us/postal-information/state-abbreviations-and-state-postal-codes')
fifty_states = fifty_states[0]['Postal Code']

main_df = pd.DataFrame()

for abbv in fifty_states:
    query = "FMAC/HPI_"+str(abbv)
    df = quandl.get(query, authtoken=api_key) 
    df = df[['NSA Value']].rename(columns={'NSA Value': abbv})
    # Change between consecutive observations, as a fraction (0.01 means 1%)
    df = df.pct_change()

    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)

main_df.to_pickle('fifty_states_pc_p2p_get.pickle')

HPI_data = pd.read_pickle('fifty_states_pc_p2p_get.pickle')
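
A small offline sketch of the point-to-point calculation, using made-up index values in a placeholder column `XX`. It shows two things worth knowing about `pct_change()`: the first row is `NaN` because it has no previous value, and the results are fractions, so they need multiplying by 100 to read as percentages.

```python
import pandas as pd

# Made-up index levels on four month-end dates (illustration only).
dates = pd.to_datetime(['2000-01-31', '2000-02-29', '2000-03-31', '2000-04-30'])
levels = pd.DataFrame({'XX': [50.0, 55.0, 66.0, 59.4]}, index=dates)

# Change from each observation to the next, as a fraction of the earlier one.
p2p = levels.pct_change()        # NaN, then roughly 0.10, 0.20, -0.10

# Optional: express the same changes as percentages.
p2p_percent = p2p * 100.0

print(p2p)
print(p2p_percent)
```

Unlike the single-starting-point version, these values do not accumulate, so the series hovers around zero and shows the period-to-period rate of change rather than the total change since the start.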

13) Print the head:

In [None]:
print(HPI_data.head())

14) Print the data description of the DataFrame:

In [None]:
print(HPI_data.describe())

### Resampling

15) Print the head of the 'FL' column only:

In [None]:
print(HPI_data['FL'].head())

16) Resample the 'FL' column annually, taking the mean of each year, and store the result (a Pandas Series) in a new variable called 'FL_Annually':

In [None]:
FL_Annually = HPI_data['FL'].resample('A').mean() 

17) Print the head of the 'FL_Annually':

In [None]:
print(FL_Annually.head())

18) Graphically compare the 'FL' column and the 'FL_Annually' data, and show the legend:

In [None]:
fig = plt.figure()
ax1 = plt.subplot2grid((1,1), (0,0))

HPI_data['FL'].plot(ax=ax1)
FL_Annually.plot(color='r',ax=ax1, label='FL_Annually')
plt.legend()
plt.show()