<a href="https://colab.research.google.com/github/Extremumone/Colab_things/blob/main/Extract_Economic_US_FED_Data_using_Python_Mikityuk.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Extract US Federal Reserve Economic Data (FRED)

## Overview
| Detail Tag            | Information                                                                                        |
|-----------------------|----------------------------------------------------------------------------------------------------|
| Originally Created By | Ariel Herrera arielherrera@analyticsariel.com                                                      |
| External References   | Pandas Datareader & Federal Reserve Economic Data (FRED) |
| Input Datasets        | FRED API key                                                                                       |
| Output Datasets       | Series values for time range |
| Input Data Source     | String |
| Output Data Source    | CSV |

## History
| Date         | Developed By  | Reason                                                |
|--------------|---------------|-------------------------------------------------------|
| 26th Sep 2021 | Ariel Herrera | Create notebook. |
| nn Nov 2021  | Tatyana Mikityuk | Change to New York unemployment data (NYUR). |

## Getting Started
1. Copy this notebook -> File -> Save a Copy in Drive
2. Request [FRED API Key](https://fred.stlouisfed.org/docs/api/api_key.html)

## Useful Resources
- [Google Colab Cheat Sheet](https://towardsdatascience.com/cheat-sheet-for-google-colab-63853778c093)
- [NLP Resource](https://stackabuse.com/python-for-nlp-parts-of-speech-tagging-and-named-entity-recognition/)
- [Pandas Datareader & Federal Reserve Economic Data (FRED)](https://medium.com/swlh/pandas-datareader-federal-reserve-economic-data-fred-a360c5795013)
- [Predicting The Housing Market Is Easier Than You Think](https://medium.com/swlh/predicting-the-housing-market-is-easier-than-you-think-45239a366dc1)
- [Plotly Express](https://plotly.com/python/basic-charts/)


## <font color="blue">Install Packages</font>

In [None]:
!pip install pandas-datareader -q

## <font color="blue">Imports</font>

In [None]:
from google.colab import drive, files # specific to Google Colab
import pandas_datareader as pdr # access fred
import pandas as pd
import requests # data from api
import plotly.express as px # visualize
from datetime import datetime

## <font color="blue">Functions</font>

In [None]:
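def get_fred_series_data(api_key, series):
  """Query the GeoFRED series-data endpoint for `series` and return the raw response."""
  url = "https://api.stlouisfed.org/geofred/series/data"
  # let requests build and URL-encode the query string
  params = {"series_id": series, "api_key": api_key, "file_type": "json"}
  response = requests.get(url, params=params)
  return response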
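As a minimal sketch of what the request URL looks like, the query string can be assembled with the standard library and inspected without calling the API. The helper name `build_geofred_url` and the placeholder key are assumptions made for illustration; only the endpoint and the three parameters come from the function above.

```python
from urllib.parse import urlencode

# Endpoint taken from get_fred_series_data above
GEOFRED_SERIES_DATA_URL = "https://api.stlouisfed.org/geofred/series/data"

def build_geofred_url(series, api_key):
    """Return the request URL with URL-encoded query parameters (hypothetical helper)."""
    query = urlencode({"series_id": series, "api_key": api_key, "file_type": "json"})
    return f"{GEOFRED_SERIES_DATA_URL}?{query}"

# 'YOUR_FRED_API_KEY' is a placeholder, not a real key
print(build_geofred_url("NYUR", "YOUR_FRED_API_KEY"))
```

Encoding the parameters this way keeps the URL valid even if a value contains characters such as `&` or spaces.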

In [None]:
def transform_series_response(response):
  # parse the JSON once; 'data' maps a date to a list of per-region records
  data = response.json()['meta']['data']
  latest_date = list(data.keys())[0] # first date key in the payload
  return pd.DataFrame(data[latest_date])
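The transformation can be tried offline with a stand-in response. This is a sketch under assumptions: the payload shape is inferred from the function above and from the printed columns later in the notebook, the date key is made up, and `FakeResponse` is a hypothetical helper rather than part of `requests`.

```python
import pandas as pd

class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only for illustration."""
    def __init__(self, payload):
        self._payload = payload

    def json(self):
        return self._payload

# Assumed payload shape: meta -> data -> {date: [one record per region]}
sample_payload = {
    "meta": {
        "data": {
            "2021-10-01": [  # illustrative date key
                {"region": "Alabama", "code": "01", "value": 3.1, "series_id": "ALUR"},
                {"region": "Alaska", "code": "02", "value": 6.0, "series_id": "AKUR"},
            ]
        }
    }
}

def transform_series_response_sketch(response):
    data_by_date = response.json()["meta"]["data"]
    first_date = next(iter(data_by_date))  # first date key in the payload
    return pd.DataFrame(data_by_date[first_date])

df_sample = transform_series_response_sketch(FakeResponse(sample_payload))
print(df_sample)
```

Each record becomes one row, so the `series_id` column can later be turned into the list of state series.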

In [None]:
def get_fred_data(param_list, start_date, end_date):
  df = pdr.DataReader(param_list, 'fred', start_date, end_date)
  return df.reset_index()

## <font color="blue">Locals & Constants</font>

In [None]:
# get keys
fred_api_key = 'YOUR_FRED_API_KEY' # replace this with your own key
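One way to avoid pasting the key into the notebook is to read it from an environment variable. The following is a sketch, not part of the original notebook; the function name `load_fred_api_key` and the variable name `FRED_API_KEY` are assumptions.

```python
import os

def load_fred_api_key(env_var="FRED_API_KEY"):
    """Read the API key from an environment variable (hypothetical helper)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your FRED API key.")
    return key
```

With the variable set, `fred_api_key = load_fred_api_key()` can stand in for the hard-coded assignment.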

## <font color="blue">Data</font>

In [None]:
series = 'NYUR' # https://fred.stlouisfed.org/series/NYUR

In [None]:
# get data for series
df = get_fred_data(param_list=[series],
                   start_date='2018-12-31',
                   end_date=datetime.today().strftime('%Y-%m-%d'))
df.head()

|   | DATE       | NYUR |
|---|------------|------|
| 0 | 2019-01-01 | 4.0  |
| 1 | 2019-02-01 | 3.9  |
| 2 | 2019-03-01 | 3.9  |
| 3 | 2019-04-01 | 3.8  |
| 4 | 2019-05-01 | 3.8  |


In [None]:
# plot
fig = px.line(df, x="DATE", y="NYUR", title='Unemployment Rate in New York')
fig.show()

In [None]:
# get the matching series ID for every state (same measure as `series`)
response = get_fred_series_data(fred_api_key, series)
# transform response into a dataframe
df_all_series_ids = transform_series_response(response)
print(df_all_series_ids)

                  region code value series_id
0                Alabama   01   3.1      ALUR
1                 Alaska   02   6.0      AKUR
2                Arizona   04   4.7      AZUR
3               Arkansas   05   3.4      ARUR
4             California   06   6.9      CAUR
5               Colorado   08   5.1      COUR
6            Connecticut   09   6.0      CTUR
7               Delaware   10   5.1      DEUR
8   District of Columbia   11   6.0      DCUR
9                Florida   12   4.5      FLUR
10               Georgia   13   2.8      GAUR
11                Hawaii   15   6.0      HIUR
12                 Idaho   16   2.6      IDUR
13              Illinois   17   5.7      ILUR
14               Indiana   18   3.0      INUR
15                  Iowa   19   3.7      IAUR
16                Kansas   20   3.6      KSUR
17              Kentucky   21   4.1      KYUR
18             Louisiana   22   5.1      LAUR
19                 Maine   23   4.8      MEUR
20              Maryland   24   5.

In [None]:
# get all series to a list
series_list = df_all_series_ids['series_id'].tolist()
print('Length of series list:', len(series_list))
series_list[:5] # show first five in list

Length of series list: 51


['ALUR', 'AKUR', 'AZUR', 'ARUR', 'CAUR']

In [None]:
# set range for time
start_date = '2019-01-01'
end_date = datetime.today().strftime('%Y-%m-%d') # today

# get series data
df_rate_all_series = get_fred_data(param_list=series_list, # all series to get data for
                                   start_date=start_date, # start date
                                   end_date=end_date) # get latest date
df_rate_all_series.head()

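|   | DATE | ALUR | AKUR | AZUR | ARUR | CAUR | COUR | CTUR | DEUR | DCUR | FLUR | GAUR | HIUR | IDUR | ILUR | INUR | IAUR | KSUR | KYUR | LAUR | MEUR | MDUR | MAUR | MIUR | MNUR | MSUR | MOUR | MTUR | NEUR | NVUR | NHUR | NJUR | NMUR | NYUR | NCUR | NDUR | OHUR | OKUR | ORUR | PAUR | RIUR | SCUR | SDUR | TNUR | TXUR | UTUR | VTUR | VAUR | WAUR | WVUR | WIUR | WYUR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2019-01-01 | 3.6 | 5.8 | 5.0 | 3.6 | 4.3 | 3.0 | 3.7 | 3.5 | 5.8 | 3.5 | 3.9 | 2.7 | 2.9 | 4.5 | 3.5 | 2.7 | 3.3 | 4.2 | 4.7 | 3.0 | 3.7 | 3.2 | 4.2 | 3.2 | 5.4 | 3.2 | 3.7 | 3.0 | 4.2 | 2.6 | 3.6 | 5.2 | 4.0 | 4.0 | 2.3 | 4.3 | 3.2 | 4.2 | 4.3 | 3.8 | 3.2 | 3.0 | 3.4 | 3.7 | 2.8 | 2.2 | 2.9 | 4.5 | 5.0 | 3.1 | 3.7 |
| 1 | 2019-02-01 | 3.5 | 5.7 | 5.0 | 3.6 | 4.3 | 2.9 | 3.7 | 3.5 | 5.8 | 3.4 | 3.8 | 2.7 | 2.9 | 4.5 | 3.4 | 2.7 | 3.3 | 4.2 | 4.6 | 2.8 | 3.6 | 3.2 | 4.2 | 3.2 | 5.4 | 3.2 | 3.6 | 3.0 | 4.1 | 2.6 | 3.5 | 5.1 | 3.9 | 4.0 | 2.3 | 4.2 | 3.2 | 4.1 | 4.3 | 3.7 | 3.2 | 3.0 | 3.4 | 3.6 | 2.7 | 2.2 | 2.9 | 4.5 | 5.0 | 3.1 | 3.6 |
| 2 | 2019-03-01 | 3.3 | 5.6 | 5.0 | 3.5 | 4.2 | 2.8 | 3.6 | 3.4 | 5.7 | 3.4 | 3.7 | 2.6 | 2.8 | 4.4 | 3.3 | 2.7 | 3.2 | 4.1 | 4.4 | 2.7 | 3.6 | 3.1 | 4.2 | 3.2 | 5.4 | 3.2 | 3.6 | 3.0 | 4.1 | 2.6 | 3.4 | 5.1 | 3.9 | 4.0 | 2.3 | 4.1 | 3.1 | 4.0 | 4.3 | 3.6 | 3.2 | 3.0 | 3.3 | 3.5 | 2.7 | 2.2 | 2.8 | 4.4 | 4.9 | 3.2 | 3.5 |
| 3 | 2019-04-01 | 3.2 | 5.5 | 4.9 | 3.4 | 4.1 | 2.7 | 3.6 | 3.4 | 5.6 | 3.3 | 3.6 | 2.6 | 2.8 | 4.2 | 3.3 | 2.7 | 3.1 | 4.1 | 4.4 | 2.6 | 3.5 | 3.1 | 4.2 | 3.2 | 5.4 | 3.1 | 3.6 | 3.0 | 4.0 | 2.6 | 3.3 | 5.1 | 3.8 | 3.9 | 2.3 | 4.0 | 3.1 | 3.9 | 4.3 | 3.5 | 3.1 | 3.0 | 3.3 | 3.5 | 2.6 | 2.2 | 2.8 | 4.3 | 4.8 | 3.2 | 3.5 |
| 4 | 2019-05-01 | 3.1 | 5.4 | 4.9 | 3.4 | 4.1 | 2.6 | 3.5 | 3.4 | 5.4 | 3.3 | 3.6 | 2.5 | 2.8 | 4.1 | 3.2 | 2.7 | 3.1 | 4.1 | 4.3 | 2.6 | 3.5 | 3.0 | 4.2 | 3.1 | 5.4 | 3.1 | 3.6 | 3.0 | 3.9 | 2.6 | 3.2 | 5.0 | 3.8 | 3.9 | 2.3 | 4.0 | 3.0 | 3.8 | 4.3 | 3.5 | 2.9 | 3.0 | 3.3 | 3.4 | 2.6 | 2.2 | 2.7 | 4.2 | 4.8 | 3.3 | 3.5 |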


In [None]:
# transform columns to single column
df_melt = pd.melt(df_rate_all_series, id_vars=['DATE'], value_vars=series_list, var_name='STATE', value_name='RATE')
df_melt.head()

|   | DATE       | STATE | RATE |
|---|------------|-------|------|
| 0 | 2019-01-01 | ALUR  | 3.6  |
| 1 | 2019-02-01 | ALUR  | 3.5  |
| 2 | 2019-03-01 | ALUR  | 3.3  |
| 3 | 2019-04-01 | ALUR  | 3.2  |
| 4 | 2019-05-01 | ALUR  | 3.1  |
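The wide-to-long reshaping is easier to follow on a tiny frame. The sketch below reuses the first two rows of the `ALUR` and `AKUR` columns shown earlier; the names `df_wide` and `df_long` are illustrative only.

```python
import pandas as pd

# Two dates and two state series, copied from the wide output above
df_wide = pd.DataFrame({
    "DATE": pd.to_datetime(["2019-01-01", "2019-02-01"]),
    "ALUR": [3.6, 3.5],
    "AKUR": [5.8, 5.7],
})

# Each series column is stacked into (DATE, STATE, RATE) rows
df_long = pd.melt(df_wide,
                  id_vars=["DATE"],
                  value_vars=["ALUR", "AKUR"],
                  var_name="STATE",
                  value_name="RATE")
print(df_long)
```

The result has one row per date and series, which is the shape `px.line` expects when `color` is used to split the lines.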


In [None]:
# keep only the two-letter state abbreviation (e.g. 'ALUR' -> 'AL')
df_plot = df_melt.copy() # copy df
df_plot['STATE'] = df_plot['STATE'].str[:2]
df_plot.head()

|   | DATE       | STATE | RATE |
|---|------------|-------|------|
| 0 | 2019-01-01 | AL    | 3.6  |
| 1 | 2019-02-01 | AL    | 3.5  |
| 2 | 2019-03-01 | AL    | 3.3  |
| 3 | 2019-04-01 | AL    | 3.2  |
| 4 | 2019-05-01 | AL    | 3.1  |
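Slicing the first two characters works here because every series ID in the list is a two-letter state code followed by `UR`. A small sketch of that assumption, comparing the slice with stripping the `UR` suffix (the frame `df_ids` is illustrative):

```python
import pandas as pd

df_ids = pd.DataFrame({"STATE": ["ALUR", "AKUR", "NYUR"]})

# Vectorised slice of the first two characters
by_slice = df_ids["STATE"].str[:2]
# Alternative: drop a trailing 'UR' suffix
by_suffix = df_ids["STATE"].str.replace(r"UR$", "", regex=True)

print(by_slice.tolist())
print(by_slice.equals(by_suffix))
```

If a series ID ever deviated from this pattern, the two results would differ, which makes the comparison a cheap sanity check.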


In [None]:
# plot
fig = px.line(df_plot, 
              x="DATE", # horizontal axis
              y="RATE", # vertical axis
              color='STATE', # split column
              title='Unemployment Rate by State in the US')
fig.show()

In [None]:
# download file
file_name = f'{series}_{start_date}-{end_date}.csv'
df_plot.to_csv(file_name, index=False)
files.download(file_name)
print('Download {0}'.format(file_name))

Download NYUR_2019-01-01-2021-12-18.csv
