# Modeling US Dollar Inflation

This project notebook is a playground for creating and testing different kinds of models to forecast US Dollar Inflation.  The vast majority of the data I use is extracted from the St. Louis Federal Reserve's FRED API, and some of the stock market data and data on gold prices are from the Wall Street Journal.  This project uses the same data that a different, but related project of mine has already extracted, so the extraction code is copied at the bottom to be re-used for this notebook specifically.

Inflation is incredibly difficult to model, as there are numerous variables simultaneously placing upward and downward pressures on the value of the US Dollar.  My ultimate goals are 1) to create precise models using existing techniques, and 2) to create two of my own inflation models by combining techniques or by implementing an idea I have to create a model based on vectors.  I will test the performance of all models, and highlight those that perform the best.  

I expect that neural network regressions will have the greatest success in modeling inflation due to their ability to account for the complexity surrounding the conflicting pressures mentioned above.

#### Models created below:
- Model 1: Basic Neural Network Regression using SKlearn.  Grade: F.  Used as a baseline.
- Model 2: Basic Neural Network Regression using Keras; fed raw values.  Grade: C-.
- Model 3: Neural Network Regression using Keras; fed scaled values. Grade: B.
- Model 4: Deep Neural Network Regression using Keras; fed scaled values.  Grade: A-
- Model 5: Custom model based on variants of the Quantity Theory of Money (QTM).  Grade: TBD.  Under long-term construction.  Model will ultimately be an ensemble model, bringing together regression models and neural networks that model different components of the QTM.

#### In no particular order, the models that I want to build but have not yet done:
- Various types of linear and non-linear regression; scaling, splitting, and manipulating the data in different ways
- MARS: Multivariate Adaptive Regression Splines
- Support Vector Machine; could be useful given presence of outliers in many predictors
- Creation of models by economic sector using a mix of techniques, and combining them to create an ensemble model

In [1]:
# Import Dependencies and API Key

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import requests
import os
import json
from tensorflow import keras
# Import API key. Git ignore to be used in the future to hide FRED API key, but isn't utilized yet
from api_keys import fred_key

In [20]:
from tensorflow import keras

In [23]:
from sklearn.linear_model import LinearRegression

## Model 1: Basic Neural Network using SK-Learn

#### Grade: F

These models are flawed, somewhat purposely so.  I wanted a baseline for model accuracy using a simple model, given how difficult inflation is to model.  The 8 input variables were manually chosen by me instead of using any feature selection methods.  Additionally, I did not scale the data from the main series.

In [373]:
m2_df = m1m2_df[['m2','m2_change','m2_pct_change']].reset_index()
government_expenditures_df = government_quarterly_df[['government_expenditures','government_expenditures_change','government_expenditures_pct_change']].reset_index()
net_exports_df = foreign_trade_month_quarter_df[['net_exports','net_exports_change','net_exports_pct_change']].reset_index()
inflation_expectation_df = cpi_monthly_df[['inflation_expectation','inflation_expectation_change','inflation_expectation_pct_change']].reset_index()
gdp_df = gdp_quarterly_df[['gdp','gdp_change','gdp_pct_change']].reset_index()
reserve_balances_df = banks_week_month_df[['reserve_balances','reserve_balances_change','reserve_balances_pct_change']].reset_index()
unemployment_df = consumer_monthly_df[['unemployment','unemployment_change','unemployment_pct_change']].reset_index()
real_output_hour_df = consumers_quarterly_df[['real_output_hour','real_output_hour_change','real_output_hour_pct_change']].reset_index()
ppi_all_commodities_df = ppi_monthly_df[['ppi_all_commodities','ppi_all_commodities_change','ppi_all_commodities_pct_change']].reset_index()
cpi_df = cpi_monthly_df[['cpi','cpi_change','cpi_pct_change']].reset_index()
basic_neural_predictor_df_list = [government_expenditures_df,net_exports_df,inflation_expectation_df,gdp_df,reserve_balances_df,\
                                 unemployment_df,real_output_hour_df,ppi_all_commodities_df,cpi_df]
non_df_list = ['government_expenditures','net_exports','inflation_expectation','gdp','reserve_balances','unemployment','real_output_hour',\
              'ppi_all_commodities','cpi']

In [267]:
# m2_change_array = m2_df['m2_change'].values.reshape(-1,1)
# m2_df['m2_scaled_change']= StandardScaler().fit_transform(m2_change_array)
# m2_change_array = m2_df['m2_pct_change'].values.reshape(-1,1)
# m2_df['m2_scaled_pct_change']= StandardScaler().fit_transform(m2_change_array)
# scaled_change_df = m2_df
# # basic_neural_all_df = m2_df
# for i in range(len(basic_neural_predictor_df_list)):
#     change_array = basic_neural_predictor_df_list[i].iloc[:,2].values.reshape(-1,1)
#     pct_change_array = basic_neural_predictor_df_list[i].iloc[:,3].values.reshape(-1,1)
#     basic_neural_predictor_df_list[i][f'{non_df_list[i]}_scaled_change'] = StandardScaler().fit_transform(change_array)
#     basic_neural_predictor_df_list[i][f'{non_df_list[i]}_scaled_pct_change'] = StandardScaler().fit_transform(pct_change_array)
#     scaled_change_df = scaled_change_df.merge(basic_neural_predictor_df_list[i], how="inner", on="Date")
# scaled_change_df
    
# m1_change_array = pre_pandemic_m1m2_df['m1_change'].values.reshape(-1,1)
# pre_pandemic_m1m2_df['m1_change_scaled']= StandardScaler().fit_transform(m1_change_array)
# pre_pandemic_m1m2_df

In [268]:
# basic_neural_all_df = basic_neural_all_df.set_index("Date")
# scaled_change_df = scaled_change_df.loc[scaled_change_df['Date']<='2021-01-01',:]
# scaled_change_df = scaled_change_df.loc[scaled_change_df['Date']>='1971-09-01',:]
# scaled_change_df = scaled_change_df.reset_index()
# scaled_change_df

In [136]:
# def clean_dataset(df):
#     assert isinstance(df, pd.DataFrame)
#     df.dropna(inplace=True)
#     indices_to_keep = ~df.isin([np.nan, np.inf, -np.inf]).any(1)
#     return df[indices_to_keep].astype(np.float64)

In [158]:
# X = scaled_change_df[['m2_scaled_change','government_expenditures_scaled_change','net_exports_scaled_change',\
#                          'gdp_scaled_change','reserve_balances_scaled_change','unemployment_scaled_change','real_output_hour_scaled_change',\
#                          'ppi_all_commodities_scaled_change']]
# y = scaled_change_df[['cpi_scaled_change']]

In [254]:
Z = scaled_change_df[['m2','government_expenditures','net_exports',\
                         'gdp','reserve_balances','unemployment','real_output_hour',\
                         'ppi_all_commodities']]
q = scaled_change_df[['cpi']]

In [270]:
from sklearn.neural_network import MLPRegressor
# basic_neural = MLPRegressor(hidden_layer_sizes=(256))

In [269]:
# basic_neural_model = basic_neural.fit(X,y)

In [188]:
## m2, government_expenditures, net_exports, gdp, reserve_balances, unemployment, real_output_hour, ppi_all_commodities
Z_predict = [[17039.1,10910.4,-538.879,19477.4,2953.6,14.8,110.639,185.5]]#256.192
Z_predict2 = [[20122.7,9245.72,-881.689,22741,3887.3,6.1,112.669,217.9]]#266.832
# X_predict=[[15.93,]]
# X_predict2 = [[2.897]]

In [202]:
basic_neural_no_scaling_main_series = MLPRegressor(hidden_layer_sizes=(256))
no_scaling_main_series_model = basic_neural_no_scaling_main_series.fit(Z,q)

  return f(**kwargs)


In [203]:
main_prediction = no_scaling_main_series_model.predict(Z_predict2)
main_prediction

array([396.20055939])

## Model 2: Neural Network Regression using Keras

#### Grade: C-

I fed this model raw input values from each of the eight chosen inputs.  The dates range from 9/1/1971 to 1/1/2020 to reflect the post-Bretton Woods era to the beginning of the COVID-19 pandemic.  I wanted to see if a basic, un-tuned neural network could predict CPI after the massive increases in the monetary base, reserve balances, and the all-time low in net exports.  The model predicted a CPI of 271.97 and 277.64 for 4/1/2020 and 1/1/2021, when the actual values were 256.19 and 266.83, respectively.  Though far from the actual CPI, the model itself is not necessarily that bad given the unprecedented changes in the input variables.  Since the model is a basic regression neural network, it does not factor in time series lag, so it associates the large increases/decreases in certain variables with an immediate response in the target variable.  Another factor influencing performance is that the model was fed the raw values for all variables instead of scaled values.

In [228]:
from keras import Sequential
model = Sequential()

In [229]:
from keras.layers import Dense
model.add(Dense(100, activation='relu',input_dim=8))
model.add(Dense(units=1))

In [230]:
model.compile(optimizer='adam',loss='mean_squared_error',metrics='mean_squared_error')

In [255]:
Z = np.array(Z).astype("float32")

In [256]:
q = np.array(q).astype("float32")

In [265]:
model.fit(Z,q,epochs=100,shuffle=True,verbose=2)

Epoch 1/100
7/7 - 0s - loss: 0.0376 - mean_squared_error: 0.0376
Epoch 2/100
7/7 - 0s - loss: 0.0125 - mean_squared_error: 0.0125
Epoch 3/100
7/7 - 0s - loss: 0.0067 - mean_squared_error: 0.0067
Epoch 4/100
7/7 - 0s - loss: 0.0044 - mean_squared_error: 0.0044
Epoch 5/100
7/7 - 0s - loss: 0.0020 - mean_squared_error: 0.0020
Epoch 6/100
7/7 - 0s - loss: 0.0017 - mean_squared_error: 0.0017
Epoch 7/100
7/7 - 0s - loss: 0.0013 - mean_squared_error: 0.0013
Epoch 8/100
7/7 - 0s - loss: 0.0011 - mean_squared_error: 0.0011
Epoch 9/100
7/7 - 0s - loss: 9.5192e-04 - mean_squared_error: 9.5192e-04
Epoch 10/100
7/7 - 0s - loss: 8.2736e-04 - mean_squared_error: 8.2736e-04
Epoch 11/100
7/7 - 0s - loss: 6.3802e-04 - mean_squared_error: 6.3802e-04
Epoch 12/100
7/7 - 0s - loss: 5.0939e-04 - mean_squared_error: 5.0939e-04
Epoch 13/100
7/7 - 0s - loss: 4.1504e-04 - mean_squared_error: 4.1504e-04
Epoch 14/100
7/7 - 0s - loss: 3.3594e-04 - mean_squared_error: 3.3594e-04
Epoch 15/100
7/7 - 0s - loss: 2.8813e

<keras.callbacks.History at 0x1539510f220>

In [259]:
model.predict(Z_predict)

array([[271.96713]], dtype=float32)

In [260]:
model.predict(Z_predict2)

array([[277.63806]], dtype=float32)

In [435]:
model.save('Models')

INFO:tensorflow:Assets written to: Models\assets


## Model 3: Neural Network Regression using Keras - scaled input and output values

#### Grade: B

After training the model on SK-lean's MinMaxScalar scaled values for all predictors, the model predicted CPI for 4/1/2020 to be 250.3, when the actual value is 256.192.  The previous model predicted 271.97.  The scaled data led to a big improvement in accuracy.  For the second test, the model predicted CPI for 4/1/2021 to be 262.52, when the actual value is 266.832.  The previous model predicted 277.64.  This is also a huge improvement with scaled data.  These improvements are impressive given the massive changes in trends of the input variables starting after 2/1/2020.

In [277]:
predictor_df = m2_df
for each_df in basic_neural_predictor_df_list:
    predictor_df = predictor_df.merge(each_df, how='inner', on='Date')
predictor_df = predictor_df[['Date','m2','government_expenditures','net_exports','gdp','reserve_balances','unemployment','real_output_hour',\
              'ppi_all_commodities','cpi']]
predictor_df

Unnamed: 0,Date,m2,government_expenditures,net_exports,gdp,reserve_balances,unemployment,real_output_hour,ppi_all_commodities,cpi
0,1959-01-01,286.6,,0.519,510.33,18.9,6,32.375,31.7,29.01
1,1959-04-01,290.1,,-0.768,522.653,18.7,5.2,32.686,31.8,28.98
2,1959-07-01,295.2,,1.211,525.034,18.7,5.1,32.73,31.7,29.15
3,1959-10-01,296.5,,0.627,528.6,18.6,5.7,32.627,31.6,29.35
4,1960-01-01,298.2,144.233,2.858,542.648,18.8,5.2,33.389,31.6,29.37
...,...,...,...,...,...,...,...,...,...,...
245,2020-04-01,17039.1,10910.4,-538.876,19477.4,2953.6,14.8,110.639,185.5,256.192
246,2020-07-01,18316.6,9706.16,-725.723,21138.6,2718.5,10.2,111.895,193,258.604
247,2020-10-01,18747.9,8471.92,-798.431,21477.6,2876.6,6.9,110.92,196.5,260.462
248,2021-01-01,19393.1,10790.8,-872.54,22038.2,3153.8,6.3,112.096,204.8,262.231


In [280]:
# predictor_df = predictor_df.loc[predictor_df['Date']<='2021-01-01',:]
predictor_df = predictor_df.loc[predictor_df['Date']>='1971-09-01',:]
predictor_df = predictor_df.set_index('Date')
predictor_df

Unnamed: 0_level_0,m2,government_expenditures,net_exports,gdp,reserve_balances,unemployment,real_output_hour,ppi_all_commodities,cpi
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
1971-10-01,698.4,383.306,-1.92,1190.3,30.8,5.8,45.079,38.3,40.9
1972-01-01,717.7,399.428,-3.534,1230.61,32.9,5.8,45.785,38.8,41.2
1972-04-01,738.4,403.929,-4.258,1266.37,32.6,5.7,46.69,39.3,41.5
1972-07-01,759.5,404.908,-2.638,1290.57,33.1,5.6,46.961,40,41.8
1972-10-01,786.9,419.285,-3.061,1328.9,33.8,5.6,47.363,40.1,42.2
...,...,...,...,...,...,...,...,...,...
2020-04-01,17039.1,10910.4,-538.876,19477.4,2953.6,14.8,110.639,185.5,256.192
2020-07-01,18316.6,9706.16,-725.723,21138.6,2718.5,10.2,111.895,193,258.604
2020-10-01,18747.9,8471.92,-798.431,21477.6,2876.6,6.9,110.92,196.5,260.462
2021-01-01,19393.1,10790.8,-872.54,22038.2,3153.8,6.3,112.096,204.8,262.231


In [300]:
### Scale the data, filter to training and target variables, filter to training and test dates, convert to arrays
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaler.fit(predictor_df)
predictor_scaled_array = scaler.transform(predictor_df)
predictor_scaled_df = pd.DataFrame(predictor_scaled_array)
scaled_predictors_df = predictor_scaled_df.iloc[:194,:8]
scaled_target_df = predictor_scaled_df.iloc[:194,8]
scaled_test_predictor1_df = predictor_scaled_df.iloc[194,:8]
scaled_test_predictor2_df = predictor_scaled_df.iloc[198,:8]
scaled_test_target1_df = predictor_scaled_df.iloc[194,8]
scaled_test_target2_df = predictor_scaled_df.iloc[198,8]
scaled_predictor_array = np.array(scaled_predictors_df).astype("float32")
scaled_target_array = np.array(scaled_target_df).astype("float32")
scaled_test_array1 = np.array(scaled_test_predictor1_df).astype("float32")
scaled_test_array2 = np.array(scaled_test_predictor2_df).astype("float32")

In [303]:
scaled_test_array1 = scaled_test_array1.reshape(1,-1)
scaled_test_array2 = scaled_test_array2.reshape(1,-1)
scaled_test_array2.shape

(1, 8)

In [293]:
from keras import Sequential
from keras.layers import Dense
model2 = Sequential()
model2.add(Dense(100, activation='relu',input_dim=8))
model2.add(Dense(units=1))
model2.compile(optimizer='adam',loss='mean_squared_error',metrics='mean_squared_error')

In [294]:
model2.fit(scaled_predictor_array,scaled_target_array,epochs=100,shuffle=True,verbose=2)

Epoch 1/100
7/7 - 0s - loss: 0.2133 - mean_squared_error: 0.2133
Epoch 2/100
7/7 - 0s - loss: 0.0846 - mean_squared_error: 0.0846
Epoch 3/100
7/7 - 0s - loss: 0.0297 - mean_squared_error: 0.0297
Epoch 4/100
7/7 - 0s - loss: 0.0170 - mean_squared_error: 0.0170
Epoch 5/100
7/7 - 0s - loss: 0.0105 - mean_squared_error: 0.0105
Epoch 6/100
7/7 - 0s - loss: 0.0039 - mean_squared_error: 0.0039
Epoch 7/100
7/7 - 0s - loss: 0.0017 - mean_squared_error: 0.0017
Epoch 8/100
7/7 - 0s - loss: 0.0020 - mean_squared_error: 0.0020
Epoch 9/100
7/7 - 0s - loss: 0.0016 - mean_squared_error: 0.0016
Epoch 10/100
7/7 - 0s - loss: 0.0013 - mean_squared_error: 0.0013
Epoch 11/100
7/7 - 0s - loss: 9.9487e-04 - mean_squared_error: 9.9487e-04
Epoch 12/100
7/7 - 0s - loss: 8.0751e-04 - mean_squared_error: 8.0751e-04
Epoch 13/100
7/7 - 0s - loss: 7.5458e-04 - mean_squared_error: 7.5458e-04
Epoch 14/100
7/7 - 0s - loss: 6.6322e-04 - mean_squared_error: 6.6322e-04
Epoch 15/100
7/7 - 0s - loss: 6.1543e-04 - mean_squar

<keras.callbacks.History at 0x153962a44f0>

In [323]:
prediction1 = model2.predict(scaled_test_array1)
predicted_array1 = np.insert(scaled_test_array1,8,prediction1[0])
predicted1_array = predicted_array1.reshape(1,-1)
predictions1 = scaler.inverse_transform(predicted1_array)
print('The predicted CPI according to this model is: ',predictions1[0][8])

The predicted CPI according to this model is:  250.34177


In [324]:
prediction2 = model2.predict(scaled_test_array2)
predicted_array2 = np.insert(scaled_test_array2,8,prediction2[0])
predicted2_array = predicted_array2.reshape(1,-1)
predictions2 = scaler.inverse_transform(predicted2_array)
print('The predicted CPI according to this model is: ',predictions2[0][8])

The predicted CPI according to this model is:  262.51837


In [327]:
loss_per_epoch = model2.history.history['loss']
plt.plot(range(len(loss_per_epoch)),loss_per_epoch)

KeyError: 'loss'

In [436]:
model2.save('Models')

INFO:tensorflow:Assets written to: Models\assets


## Model 4: Deep Neural Network using Keras

### Grade: A-

This model performs the best yet, having predicted CPI for the most recent quarter for which all data has been released (4/1/2021) with perfect accuracy--predicted value: 266.83; actual value: 266.83.  The reason it gets an A- is that it predicted a much higher CPI for 4/1/2020 than the actual value, which I attribute to the bizarre conditions surrounding the start of the COVID-19 pandemic.

The neural network uses 11 input variables instead of the 8 used in the prior model.  It also uses 2 hidden layers instead of just one.

In [415]:
real_gdp_df = gdp_quarterly_df.reset_index()[['Date','real_gdp']]
federal_debt_df = government_quarterly_df.reset_index()[['Date','federal_debt']]
fed_funds_rate_df = banks_week_month_df.reset_index()[['Date','fed_funds_rate']]
loans_df = banks_week_month_df.reset_index()[['Date','commercial_industrial_loans','consumer_loans_com_banks']]
loans_df['combined_commercial_loans'] = loans_df['commercial_industrial_loans']+loans_df['consumer_loans_com_banks']
com_loans_df = loans_df[['Date','combined_commercial_loans']]
personal_savings_df = consumer_monthly_df.reset_index()[['Date','personal_savings']]
personal_savings_df

Unnamed: 0,Date,personal_savings
0,1947-01-01,
1,1947-02-01,
2,1947-03-01,
3,1947-04-01,
4,1947-05-01,
...,...,...
892,2021-05-01,1790.8
893,2021-06-01,1614.4
894,2021-07-01,1824.6
895,2021-08-01,1710.9


In [417]:
complex_neural_predictor_df_list = [real_gdp_df,federal_debt_df,government_expenditures_df,fed_funds_rate_df,com_loans_df,\
                                    unemployment_df,personal_savings_df,net_exports_df,cpi_df,reserve_balances_df,ppi_all_commodities_df]
predictor4_df = m2_df
for each_df in complex_neural_predictor_df_list:
    predictor4_df = predictor4_df.merge(each_df, how='inner', on='Date')
predictor4_df = predictor4_df[['Date','m2','federal_debt','government_expenditures','fed_funds_rate','combined_commercial_loans',\
                               'net_exports','real_gdp','reserve_balances','unemployment','personal_savings',\
                               'ppi_all_commodities','cpi']]
predictor4_df

Unnamed: 0,Date,m2,federal_debt,government_expenditures,fed_funds_rate,combined_commercial_loans,net_exports,real_gdp,reserve_balances,unemployment,personal_savings,ppi_all_commodities,cpi
0,1959-01-01,286.6,,,2.48,55.9412,0.519,3123.98,18.9,6,39.6,31.7,29.01
1,1959-04-01,290.1,,,2.96,57.0474,-0.768,3194.43,18.7,5.2,40,31.8,28.98
2,1959-07-01,295.2,,,3.47,60.0901,1.211,3196.68,18.7,5.1,38.7,31.7,29.15
3,1959-10-01,296.5,,,3.98,61.996,0.627,3205.79,18.6,5.7,34.1,31.6,29.35
4,1960-01-01,298.2,,144.233,3.99,63.9039,2.858,3277.85,18.8,5.2,40.6,31.6,29.37
...,...,...,...,...,...,...,...,...,...,...,...,...,...
245,2020-04-01,17039.1,2.64772e+07,10910.4,0.05,4484.49,-538.876,17258.2,2953.6,14.8,6392.5,185.5,256.192
246,2020-07-01,18316.6,2.69454e+07,9706.16,0.09,4379.85,-725.723,18560.8,2718.5,10.2,3359.4,193,258.604
247,2020-10-01,18747.9,2.77478e+07,8471.92,0.09,4208.39,-798.431,18767.8,2876.6,6.9,2370.9,196.5,260.462
248,2021-01-01,19393.1,2.81326e+07,10790.8,0.09,4092.14,-872.54,19055.7,3153.8,6.3,3798.6,204.8,262.231


In [418]:
predictor4_df = predictor4_df.loc[predictor4_df['Date']>='1971-09-01',:]
predictor4_df = predictor4_df.set_index('Date')
predictor4_df

Unnamed: 0_level_0,m2,federal_debt,government_expenditures,fed_funds_rate,combined_commercial_loans,net_exports,real_gdp,reserve_balances,unemployment,personal_savings,ppi_all_commodities,cpi
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
1971-10-01,698.4,424131,383.306,5.2,188.839,-1.92,5154.55,30.8,5.8,113.1,38.3,40.9
1972-01-01,717.7,427344,399.428,3.51,192.114,-3.534,5249.34,32.9,5.8,107.6,38.8,41.2
1972-04-01,738.4,426435,403.929,4.17,198.289,-4.258,5368.48,32.6,5.7,100.5,39.3,41.5
1972-07-01,759.5,433946,404.908,4.55,204.801,-2.638,5419.18,33.1,5.6,104.8,40,41.8
1972-10-01,786.9,448473,419.285,5.05,212.607,-3.061,5509.93,33.8,5.6,122,40.1,42.2
...,...,...,...,...,...,...,...,...,...,...,...,...
2020-04-01,17039.1,2.64772e+07,10910.4,0.05,4484.49,-538.876,17258.2,2953.6,14.8,6392.5,185.5,256.192
2020-07-01,18316.6,2.69454e+07,9706.16,0.09,4379.85,-725.723,18560.8,2718.5,10.2,3359.4,193,258.604
2020-10-01,18747.9,2.77478e+07,8471.92,0.09,4208.39,-798.431,18767.8,2876.6,6.9,2370.9,196.5,260.462
2021-01-01,19393.1,2.81326e+07,10790.8,0.09,4092.14,-872.54,19055.7,3153.8,6.3,3798.6,204.8,262.231


In [423]:
scaler2 = MinMaxScaler()
scaler2.fit(predictor4_df)
predictor4_scaled_array = scaler2.transform(predictor4_df)
predictor4_scaled_df = pd.DataFrame(predictor4_scaled_array)
scaled_predictors4_df = predictor4_scaled_df.iloc[:194,:11]
scaled_target4_df = predictor4_scaled_df.iloc[:194,11]
scaled_test4_predictor1_df = predictor4_scaled_df.iloc[194,:11]
scaled_test4_predictor2_df = predictor4_scaled_df.iloc[198,:11]
scaled_test4_target1_df = predictor4_scaled_df.iloc[194,11]
scaled_test4_target2_df = predictor4_scaled_df.iloc[198,11]
scaled_predictor4_array = np.array(scaled_predictors4_df).astype("float32")
scaled_target4_array = np.array(scaled_target4_df).astype("float32")
scaled_test4_array1 = np.array(scaled_test4_predictor1_df).astype("float32")
scaled_test4_array2 = np.array(scaled_test4_predictor2_df).astype("float32")
scaled_test4_array1 = scaled_test4_array1.reshape(1,-1)
scaled_test4_array2 = scaled_test4_array2.reshape(1,-1)

In [424]:
model4 = Sequential()
model4.add(Dense(100, activation='relu',input_dim=11))
model4.add(Dense(100, activation='relu'))
model4.add(Dense(units=1))
model4.compile(optimizer='adam',loss='mean_squared_error',metrics='mean_squared_error')

In [425]:
model4.fit(scaled_predictor4_array,scaled_target4_array,epochs=100,shuffle=True,verbose=2)

Epoch 1/100
7/7 - 0s - loss: 0.1943 - mean_squared_error: 0.1943
Epoch 2/100
7/7 - 0s - loss: 0.0350 - mean_squared_error: 0.0350
Epoch 3/100
7/7 - 0s - loss: 0.0204 - mean_squared_error: 0.0204
Epoch 4/100
7/7 - 0s - loss: 0.0060 - mean_squared_error: 0.0060
Epoch 5/100
7/7 - 0s - loss: 0.0077 - mean_squared_error: 0.0077
Epoch 6/100
7/7 - 0s - loss: 0.0023 - mean_squared_error: 0.0023
Epoch 7/100
7/7 - 0s - loss: 0.0017 - mean_squared_error: 0.0017
Epoch 8/100
7/7 - 0s - loss: 9.3450e-04 - mean_squared_error: 9.3450e-04
Epoch 9/100
7/7 - 0s - loss: 6.5680e-04 - mean_squared_error: 6.5680e-04
Epoch 10/100
7/7 - 0s - loss: 2.4483e-04 - mean_squared_error: 2.4483e-04
Epoch 11/100
7/7 - 0s - loss: 1.9398e-04 - mean_squared_error: 1.9398e-04
Epoch 12/100
7/7 - 0s - loss: 1.6053e-04 - mean_squared_error: 1.6053e-04
Epoch 13/100
7/7 - 0s - loss: 1.2056e-04 - mean_squared_error: 1.2056e-04
Epoch 14/100
7/7 - 0s - loss: 1.1376e-04 - mean_squared_error: 1.1376e-04
Epoch 15/100
7/7 - 0s - loss:

<keras.callbacks.History at 0x153976315e0>

In [431]:
prediction1_4 = model4.predict(scaled_test4_array1)
predicted_array1_4 = np.insert(scaled_test4_array1,11,prediction1_4[0])
predicted1_array_4 = predicted_array1_4.reshape(1,-1)
predictions1_4 = scaler2.inverse_transform(predicted1_array_4)
print('The predicted CPI according to this model is: ',predictions1_4[0][11])

The predicted CPI according to this model is:  284.36438


In [432]:
prediction2_4 = model4.predict(scaled_test4_array2)
predicted_array2_4 = np.insert(scaled_test4_array2,8,prediction2_4[0])
predicted2_array_4 = predicted_array2_4.reshape(1,-1)
predictions2_4 = scaler2.inverse_transform(predicted2_array_4)
print('The predicted CPI according to this model is: ',predictions2_4[0][11])

The predicted CPI according to this model is:  266.832


In [437]:
model4.save('Models')

INFO:tensorflow:Assets written to: Models\assets


## Model 5: Custom Model, Quantity Theory of Money Variation

### Grade: TBD

The Quantity Theory of Money (QTM) is a very general model relating the money supply and how fast it moves to the price level and level of output in the economy.  The equation is MV = PY, wher M is the money supply, V is the velocity of money (an average of how many times each dollar in the money supply changes hands in a given time period, usually a year), P is an index for the price level, and Y is output (essentially GDP).  So, for example, if the money supply increases, while velocity and output stay the same, the price level would have to increase to balance the equation because more dollars are competing for the same number of goods and services.  There are numerous factors that can affect each of these four variables individually--they are not each isolated to this equation, which makes them incredibly hard to model.  Given the massive increases in M1 and M2 money supplies in recent years while at the same time GDP has grown modestly, why hasn't the overall price level increased to match the increases in money supply?  

This is a question puzzling many economists.  There are many possible explanations:
 - As our trade deficit has grown over the last several decades, lots of the new money created has flowed out of the country, so it's not affecting prices domestically
 - Velocity is typically calculated as a residual.  We can measure money supply, prices and output, but it is hard to measure velocity, so it is calculated as what's left over.  Essentially, V = PY/M.  Economists used to assume that velocity was constant, yet that is not true.  I wonder how velocity is affected by age demographics (do young people spend faster than old) and wealth gaps (as wealth has accumulated among the wealthiest in our society, do the spend it more slowly than those living paycheck-to-paycheck).  If velocity slows for these reasons, that could explain prices increasing more slowly than the difference between growth in the money supply and growth in output.
 - Inflation expectations play a large role in the salaries people demand and how quickly they want to spend money.  If they expect the dollars in their bank account will be worth less one month from now than they are today, they will spend the money while goods are relatively cheaper.
 - Money supplies have increased most rapidly during times of crisis like the 2008 financial crisis and the COVID-19 pandemic, yet because the dollar is viewed as "safe" around the world, many people demand more dollars during these times of crisis, which holds up the value of the dollar.
 - Energy, especially oil, have a large influence on prices because they play a role in manufacturing, supply chain transportation, and quality of life.  The United States shale boom led to a drastic decrease in global oil prices, which has also helped to hold down prices.
 - Finally much of the money that has been created through federal stimulus and through the Federal Reserve's Quantitative Easing policies is being held as excess reserves in banks.  The increase in reserves held is astonishing.  This increase is in part due to a low demand for loans, while at the same time the Federal Reserve is paying interest on reserves, so banks are holding them to earn interest.
 
All these factors likely contribute some to holding down inflation.  At the same time as the pressures holding down in flation, as of this writing on 10/13/2021, there are global supply chain shocks and shortages spanning numerous industries that are putting upward pressure on prices as the world emerges from the COVID-19 pandemic.

My goal is to make sense of this mixture of influences on the long-term value of the dollar.  As Milton Friedman once said, "Inflation is always and everywhere a monetary phenomenon."  If that is the case, we can expect inflation for years to come.  However, Friedman made that statement at a time when the globe was not nearly as connected internationally as it is today and when the dollar was not as dominant globally as it is today. 

In [402]:
m1_df = pd.DataFrame(all_data['m1']).transpose().rename(columns={0:'Date',1:'m1',2:'m1_change',3:'m1_pct_change'})
m1_df = m1_df[['Date','m1']]
net_export_df = net_exports_df[['Date','net_exports']]
money_supply_df = m1_df.merge(net_export_df, how="inner", on="Date")
money_supply_df = money_supply_df.dropna()
m1_list = money_supply_df['m1'].tolist()
exports_list = money_supply_df['net_exports'].tolist()
m1_change_list = [0]
# money_supply_change_list = [exports_list[0]]
change_index=1
for c in range(len(m1_list)-1):
    change = m1_list[change_index]-m1_list[c]
    m1_change_list.append(change)
#     money_supply_change = m1_change_list[c] + exports_list[c]
    change_index = change_index+1
money_supply_df['m1_change']=m1_change_list
money_supply_df['money_supply_change'] = money_supply_df['m1_change']+money_supply_df['net_exports']
# money_supply_df = money_supply_df.dropna()
money_supply_df

Unnamed: 0,Date,m1,net_exports,m1_change,money_supply_change
0,1959-01-01,138.9,0.519,0.0,0.519
1,1959-04-01,139.7,-0.768,0.8,0.032
2,1959-07-01,141.7,1.211,2.0,3.211
3,1959-10-01,140.5,0.627,-1.2,-0.573
4,1960-01-01,140,2.858,-0.5,2.358
...,...,...,...,...,...
727,2020-04-01,4774.4,-538.876,755.2,216.324
730,2020-07-01,16773.8,-725.723,11999.4,11273.7
733,2020-10-01,17346.8,-798.431,573.0,-225.431
736,2021-01-01,18100.6,-872.54,753.8,-118.74


In [403]:
money_supply_list = [138.38]
money_supply_change_list = money_supply_df['money_supply_change'].tolist()
for s in range(len(money_supply_change_list)-1):
    money_supply = money_supply_list[s] + money_supply_change_list[s+1]
    money_supply_list.append(money_supply)
money_supply_df['money_supply'] = money_supply_list
money_supply_df

Unnamed: 0,Date,m1,net_exports,m1_change,money_supply_change,money_supply
0,1959-01-01,138.9,0.519,0.0,0.519,138.380
1,1959-04-01,139.7,-0.768,0.8,0.032,138.412
2,1959-07-01,141.7,1.211,2.0,3.211,141.623
3,1959-10-01,140.5,0.627,-1.2,-0.573,141.050
4,1960-01-01,140,2.858,-0.5,2.358,143.408
...,...,...,...,...,...,...
727,2020-04-01,4774.4,-538.876,755.2,216.324,-48309.653
730,2020-07-01,16773.8,-725.723,11999.4,11273.7,-37035.976
733,2020-10-01,17346.8,-798.431,573.0,-225.431,-37261.407
736,2021-01-01,18100.6,-872.54,753.8,-118.74,-37380.147


In [407]:
net_m1_change = money_supply_df['m1_change'].sum()
net_exports_leaked = money_supply_df['net_exports'].sum()
net_exports_leaked + net_m1_change + 28872

-8712.596999999972

In [405]:
money_supply_df['net_exports'].sum()

-56361.396999999975

## Extract Economic Data

The code below is copied from a related, but different project--Macroeconomic research tool.

It is used to extract data from the St. Louis Federal Reserve's FRED database and the Wall Street Journal.

This code is inlcuded, because although I may be able to load csv's stored on my local machine containing the same data, a user of this notebook would not be able to do the same, so this code can be run to make the data that the above models depend on accessible.

In [2]:
#### Dictionary containing the FRED code to pull data from the API as the keys, and the name of the corresponsing main series
#### as named in my database as the values.

data_extract_dict = {'M1SL':'m1',
                      'M2SL':'m2',
                      'NONM1':'non_m1_components_m2',
                      'M1V':'m1v',
                      'M2V':'m2v',
                      'BUSLOANS':'commercial_industrial_loans',
                      'FEDFUNDS':'fed_funds_rate',
                      'DPSACBW027SBOG':'commercial_bank_deposits',
                      'TLAACBW027SBOG':'commercial_bank_assets',
                      'TOTRESNS':'reserve_balances',
                      'TOTBKCR':'commercial_bank_credit',
                      'MORTGAGE30US':'_30yr_fixed_rate_mortgage',
                      'CONSUMER':'consumer_loans_com_banks',
                      'CASACBW027SBOG':'commercial_bank_cash_assets',
                      'POPTHM':'pop',
                      'PCEPI':'pce_index',
                      'UNRATE':'unemployment',
                      'PSAVERT':'personal_savings_rate',
                      'CES0500000003':'average_hourly_wage',
                      'PMSAVE':'personal_savings',
                      'CUUR0000SETA01':'cpi_vehicles',
                      'APU0000708111':'cpi_eggs',
                      'CPIAPPSL':'cpi_apparel_cities',
                      'CPIHOSNS':'cpi_housing_cities',
                      'PCEDGC96':'real_pce_durable_goods',
                      'CPITRNSL':'cpi_urban_transportation',
                      'PCE':'pce',
                      'CIVPART':'labor_participation_rate',
                      'PCEC96':'real_pce',
                      'PCEDG':'pce_durable_goods',
                      'JTSJOL':'job_openings_nonfarm',
                      'PCEND':'pce_nondurable_goods',
                      'DSPIC96':'real_disposable_personal_income',
                      'ECOMPCTSA':'ecommerce_pct_of_totalsales',
                      'MSPUS':'median_house_sale_price',
                      'HDTGPDUSQ163N':'house_debt_gdp_ratio',
                      'OPHNFB':'real_output_hour',
                      'RRVRUSQ156N':'rental_vacancy_rate',
                      'DRSFRMACBS':'mortgage_delinquency',
                      'TDSP':'household_debt_service_pmtpctgdp',
                      'RHORUSQ156N':'homeownership_rate',
                      'DRCCLACBS':'creditcard_delinquency_rate',
                      'WFRBST01134':'wealth_share_top1pct',
                      'GPSAVE':'gross_private_saving',
                      'QUSR628BIS':'real_residential_property_price',
                      'WFRBLB50107':'bottom_50pct_net_worth',
                      'NCBCMDPMVCE':'debt_as_pct_corporate_equities',
                      'WFRBLT01026':'wealth_total_top1pct',
                      'DRCLACBS':'consumer_loan_delinquency_rate',
                      'CPIAUCSL':'cpi',
                      'MICH':'inflation_expectation',
                      'CPILFESL':'cpi_core',
                      'CPIMEDSL':'cpi_medical',
                      'CUUR0000SA0R':'cpi_urban',
                      'CPIFABSL':'cpi_food_bev',
                      'STLFSI2':'financial_stress',
                      'WALCL':'fed_assets',
                      'TREAST':'fed_res_held_treasuries',
                      'WTREGEN':'fed_liabilities_non_reserve_deposits',
                      'RESPPANWW':'total_fed_assets',
                      'BOPGSTB':'net_trade',
                      'IMPGSC1':'real_imports',
                      'IMPGS':'imports_goods_services',
                      'INDCPIALLMINMEI':'cpi_india',
                      'IMPCH':'imports_from_china',
                      'IR':'all_commodities_import_price_index',
                      'GDP':'gdp',
                      'A939RC0Q052SBEA':'nom_gdpcap',
                      'GDPC1':'real_gdp',
                      'A939RX0Q048SBEA':'real_gdpcap',
                      'GDPDEF':'gdp_deflator',
                      'GFDEBTN':'federal_debt',
                      'GFDEGDQ188S':'debt_pct_gdp',
                      'W068RCQ027SBEA':'government_expenditures',
                      'FYGFDPUN':'federal_debt_held_by_public',
                      'FDHBFRBN':'fr_held_debt',
                      'B087RC1Q027SBEA':'government_transfer_payments',
                      'M318501Q027NBEA':'federal_surplus_deficit',
                      'B075RC1Q027SBEA':'corporate_income_tax_receipts',
                      'TTLCONS':'construction_spending',
                      'HOUST':'housing_starts',
                      'GPDIC1':'real_gross_domestic_private_investment',
                      'FYFSD':'deficit_surplus',
                      'MEHOINUSA672N':'real_median_house_income',
                      'FPCPITOTLZGUSA':'inflation_consumer_price',
                      'USEPUINDXD':'economic_uncertainty',
                      'PPIACO':'ppi_all_commodities',
                      'WPU0911':'ppi_wood_pulp',
                      'WPU101707':'ppi_metals',
                      'PCU325211325211':'ppi_plastics_resins',
                      'WPU101':'ppi_iron_steel',
                      'PWHEAMTUSDQ':'global_wheat_price',
                      'WPU10170502':'ppi_steel_wire',
                      'PCU484121484121':'ppi_freight',
                      'PALUMUSDM':'global_aluminum_price',
                      'PCU44414441':'ppi_building_materials',
                      'WPU0811':'ppi_wood_lumber',
                      'PMAIZMTUSDM':'global_corn_price',
                      'PIORECRUSDM':'global_iron_price',
                      'PRUBBUSDM':'global_rubber_price',
                      'WPU081':'ppi_lumber',
                      'PCU32733273':'ppi_cement_concrete',
                      'PCU33443344':'ppi_semiconductors_electronics',
                      'UMCSENT':'consumer_sentiment',
                      'CP':'corporate_profits_after_tax',
                      'PCESV':'pce_services',
                      'CUUR0000SEHA':'cpi_primary_rent',
                      'WSHOMCB':'fed_mbs',
                      'NETEXP':'net_exports',
                      'A019RE1A156NBEA':'net_exports_pctofgdp',
                      'GNP':'gnp',
                      'GPDI':'gross_domestic_private_investment',
                      'DCOILWTICO':'price_per_barrel',
                      'T10YIE':'_10_year_breakeven_inflation',
                      'T5YIFR':'inf_expectation_5yr',
                      'PCOPPUSDM':'copper_price',
                      'PCUOMFGOMFG':'ppi_manufacturing',
                      'DDDM01USA156NWDB':'stock_market_cap',
                      'BOGMBASE':'monetary_base'}

In [7]:
#### Defines main function for extracting and transforming data.  This function is used on series with no known calculation errors
#### The function extracts raw data from the API in JSON format, pulls dates and observed valuse from the JSON, storing them in 
#### lists, runs calculations on the stored values for the _change and _pct_change columns for each series, storing those calculated
#### values in additional lists, and appends all three lists that correspond with each series to the all_data dictionary.  Error
#### handling is included with the try/except language so that the entire function does not stop if there is an unexpected error
#### when extracting or transforming a single series.

all_data = {}
def fred_extract(series_dict):
    for key, value in series_dict.items():
        try:
            data = requests.get(f'https://api.stlouisfed.org/fred/series/observations?series_id={key}&api_key={fred_key}&file_type=json')
            series_json = data.json()
            series_json_obs=series_json['observations']
            series_dates = []
            series_values = []
            series_change_values = [0]
            series_pct_change_values = [0]
            change_index = 1
            for each_item in range(len(series_json_obs)):
                item_date=series_json_obs[each_item]['date']
                item_value=series_json_obs[each_item]['value']
                if item_value != ".":
                    series_dates.append(series_json_obs[each_item]['date'])
                    series_values.append(float(series_json_obs[each_item]['value']))
        except:
            print(f'Error extracting {key}')
        for each_value in range(len(series_values)-1):
            try:
                if (series_values[change_index] > series_values[each_value]):
                    if (series_values[each_value] > 0):
                        change = series_values[change_index]-series_values[each_value]
                        pct_change = (change/series_values[each_value])*100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                    elif (series_values[each_value] < 0):
                        change = series_values[change_index]-series_values[each_value]
                        pct_change = abs(change/series_values[each_value])*100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                    elif (series_values[each_value] == 0):
                        change = series_values[change_index]
                        pct_change = 100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                elif (series_values[change_index] < series_values[each_value]):
                    if (series_values[each_value] > 0):
                        change = series_values[change_index]-series_values[each_value]
                        pct_change = (change/series_values[each_value])*100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                    elif (series_values[each_value] < 0):
                        change = series_values[change_index]-series_values[each_value]
                        pct_change = (abs(change)/series_values[each_value])*100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                    elif (series_values[each_value] == 0):
                        change = series_values[change_index]
                        pct_change = -100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                elif (series_values[change_index] == series_values[each_value]):
                    change = 0
                    pct_change = 0
                    series_change_values.append(change)
                    series_pct_change_values.append(pct_change)
                elif (series_values[change_index] < series_values[each_value]):
                    if (series_values[each_value] == 0):
                        change = series_values[change_index]
                        pct_change = -100
                        series_change_values.append(change)
                        series_pct_change_values.append(pct_change)
                change_index = change_index + 1
                all_data[value]=[series_dates,series_values,series_change_values,series_pct_change_values]
            except:
                print(f'Error running calculations on {value}') 
            

In [356]:
add_base = {'BOGMBASE':'monetary_base'}
fred_extract(add_base)

In [26]:
#### Creates 14 Pandas DataFrames that correspond with the first main series that appear in each of my database tables.
#### The loop in the cell below creates individual DataFrames for each remaining series and merges them with these 14

banks_week_month_df = pd.DataFrame(all_data['commercial_industrial_loans']).transpose().rename(columns={0:"Date",1:"commercial_industrial_loans",2:"commercial_industrial_loans_change",3:"commercial_industrial_loans_pct_change"}).set_index(['Date'])
m1m2_df = pd.DataFrame(all_data['m1']).transpose().rename(columns={0:"Date",1:"m1",2:"m1_change",3:"m1_pct_change"}).set_index(['Date'])
consumer_monthly_df = pd.DataFrame(all_data['pop']).transpose().rename(columns={0:"Date",1:"pop",2:"pop_change",3:"pop_pct_change"}).set_index(['Date'])
consumers_quarterly_df = pd.DataFrame(all_data['ecommerce_pct_of_totalsales']).transpose().rename(columns={0:"Date",1:"ecommerce_pct_of_totalsales",2:"ecommerce_pct_of_totalsales_change",3:"ecommerce_pct_of_totalsales_pct_change"}).set_index(['Date'])
cpi_monthly_df = pd.DataFrame(all_data['cpi']).transpose().rename(columns={0:"Date",1:"cpi",2:"cpi_change",3:"cpi_pct_change"}).set_index(['Date'])
federal_reserve_weekly_df = pd.DataFrame(all_data['financial_stress']).transpose().rename(columns={0:"Date",1:"financial_stress",2:"financial_stress_change",3:"financial_stress_pct_change"}).set_index(['Date'])
foreign_trade_month_quarter_df = pd.DataFrame(all_data['net_trade']).transpose().rename(columns={0:"Date",1:"net_trade",2:"net_trade_change",3:"net_trade_pct_change"}).set_index(['Date'])
gdp_quarterly_df = pd.DataFrame(all_data['gdp']).transpose().rename(columns={0:"Date",1:"gdp",2:"gdp_change",3:"gdp_pct_change"}).set_index(['Date'])
government_quarterly_df = pd.DataFrame(all_data['federal_debt']).transpose().rename(columns={0:"Date",1:"federal_debt",2:"federal_debt_change",3:"federal_debt_pct_change"}).set_index(['Date'])
investment_month_quarter_df = pd.DataFrame(all_data['construction_spending']).transpose().rename(columns={0:"Date",1:"construction_spending",2:"construction_spending_change",3:"construction_spending_pct_change"}).set_index(['Date'])
misc_annual_df = pd.DataFrame(all_data['deficit_surplus']).transpose().rename(columns={0:"Date",1:"deficit_surplus",2:"deficit_surplus_change",3:"deficit_surplus_pct_change"}).set_index(['Date'])
misc_daily_df = pd.DataFrame(all_data['inf_expectation_5yr']).transpose().rename(columns={0:"Date",1:"inf_expectation_5yr",2:"inf_expectation_5yr_change",3:"inf_expectation_5yr_pct_change"}).set_index(['Date'])
ppi_monthly_df = pd.DataFrame(all_data['ppi_manufacturing']).transpose().rename(columns={0:"Date",1:"ppi_manufacturing",2:"ppi_manufacturing_change",3:"ppi_manufacturing_pct_change"}).set_index(['Date'])
velocity_df = pd.DataFrame(all_data['m1v']).transpose().rename(columns={0:"Date",1:"m1v",2:"m1v_change",3:"m1v_pct_change"}).set_index(['Date'])

In [27]:
#### Lists of the main series columns for each table.  Additionally, for each of these series, "_change" and "pct_change" are 
#### calculated in the extraction and transformation functions.  These lists serve to pull specific series from the all_data 
#### dictionary where all data extracted from the FRED API is stored.

m1m2_column_list = ['m2','non_m1_components_m2']
velocity_column_list = ['m2v']
banks_week_month_column_list = ['fed_funds_rate','commercial_bank_deposits','commercial_bank_assets',\
                               'reserve_balances','commercial_bank_credit','_30yr_fixed_rate_mortgage','consumer_loans_com_banks',\
                               'commercial_bank_cash_assets']
consumer_monthly_column_list = ['pce_index','unemployment','personal_savings_rate','average_hourly_wage','personal_savings',\
                               'consumer_sentiment','cpi_vehicles','cpi_eggs','cpi_apparel_cities','cpi_housing_cities',\
                               'real_pce_durable_goods','cpi_urban_transportation','pce','labor_participation_rate','real_pce',\
                               'pce_durable_goods','job_openings_nonfarm','pce_nondurable_goods','real_disposable_personal_income']
consumers_quarterly_column_list = ['median_house_sale_price','house_debt_gdp_ratio','real_output_hour',\
                                  'corporate_profits_after_tax','pce_services','rental_vacancy_rate','mortgage_delinquency',\
                                  'household_debt_service_pmtpctgdp','homeownership_rate','creditcard_delinquency_rate','wealth_share_top1pct',\
                                  'gross_private_saving','real_residential_property_price','bottom_50pct_net_worth',\
                                  'debt_as_pct_corporate_equities','wealth_total_top1pct','consumer_loan_delinquency_rate']
cpi_monthly_column_list = ['inflation_expectation','cpi_core','cpi_medical','cpi_urban','cpi_primary_rent','cpi_food_bev']
federal_reserve_weekly_column_list = ['fed_assets','fed_res_held_treasuries','fed_liabilities_non_reserve_deposits',\
                                     'fed_mbs','total_fed_assets']
foreign_trade_month_quarter_column_list = ['net_exports','net_exports_pctofgdp','real_imports','imports_goods_services',\
                                          'cpi_india','imports_from_china','all_commodities_import_price_index']
gdp_quarterly_column_list = ['nom_gdpcap','real_gdp','real_gdpcap','gdp_deflator','gnp']
government_quarterly_column_list = ['debt_pct_gdp','government_expenditures','federal_debt_held_by_public','fr_held_debt',\
                                   'government_transfer_payments','federal_surplus_deficit','corporate_income_tax_receipts']
investment_month_quarter_column_list = ['housing_starts','real_gross_domestic_private_investment',\
                                       'gross_domestic_private_investment']
misc_annual_column_list = ['stock_market_cap','real_median_house_income','inflation_consumer_price']
misc_daily_column_list = ['price_per_barrel','economic_uncertainty','_10_year_breakeven_inflation']
ppi_monthly_column_list = ['ppi_all_commodities','ppi_wood_pulp','ppi_metals','copper_price','ppi_plastics_resins',\
                          'ppi_iron_steel','global_wheat_price','ppi_steel_wire','ppi_freight','global_aluminum_price',\
                          'ppi_building_materials','ppi_wood_lumber','global_corn_price','global_iron_price','global_rubber_price',\
                          'ppi_lumber','ppi_cement_concrete','ppi_semiconductors_electronics']
#stocks_gold_daily_column_list = ['djia_close','nasdaq_close','sp500_close','gold_price']
# all_table_column_dict = {m1m2_df:m1m2_column_list,velocity_df:velocity_column_list,banks_week_month_df:banks_week_month_column_list,\
#                          consumer_monthly_df:consumer_monthly_column_list,consumers_quarterly_df:consumers_quarterly_column_list,\
#                          cpi_monthly_df:cpi_monthly_column_list,federal_reserve_weekly_df:federal_reserve_weekly_column_list,\
#                          foreign_trade_month_quarter_df:foreign_trade_month_quarter_column_list,gdp_quarterly_df:gdp_quarterly_column_list,\
#                          government_quarterly_df:government_quarterly_column_list,investment_month_quarter_df:investment_month_quarter_column_list,\
#                          misc_annual_df:misc_annual_column_list,misc_daily_df:misc_daily_column_list,ppi_monthly_df:ppi_monthly_column_list}
# all_table_column_list = [m1m2_column_list, velocity_column_list]
# all_table_df_list = [m1m2_df,velocity_df]

In [28]:
for column in m1m2_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    m1m2_df = m1m2_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in velocity_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    velocity_df = velocity_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in banks_week_month_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    banks_week_month_df = banks_week_month_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in consumer_monthly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    consumer_monthly_df = consumer_monthly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in consumers_quarterly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    consumers_quarterly_df = consumers_quarterly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in cpi_monthly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    cpi_monthly_df = cpi_monthly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in federal_reserve_weekly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    federal_reserve_weekly_df = federal_reserve_weekly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in foreign_trade_month_quarter_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    foreign_trade_month_quarter_df = foreign_trade_month_quarter_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in gdp_quarterly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    gdp_quarterly_df = gdp_quarterly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in government_quarterly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    government_quarterly_df = government_quarterly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in investment_month_quarter_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    investment_month_quarter_df = investment_month_quarter_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in misc_annual_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    misc_annual_df = misc_annual_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in misc_daily_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    misc_daily_df = misc_daily_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])
for column in ppi_monthly_column_list:
    temp_df = pd.DataFrame(all_data[f'{column}']).transpose().rename(columns={0:'Date',1:f'{column}',2:f'{column}_change',3:f'{column}_pct_change'}).set_index(['Date'])
    ppi_monthly_df = ppi_monthly_df.merge(temp_df,how='outer',on='Date').sort_values(by=['Date'])

In [29]:
#### Import data manually pulled from WSJ (to be automated with web scraping at a later date), run change and pct_change calculations
#### organize into individual DataFrames, merge DataFrames.  Gold must be cleaned separately before merging
djia_path = os.path.join("../Manual Downloads/djia.csv")
nasdaq_path = os.path.join("../Manual Downloads/nasdaq.csv")
gold_path=os.path.join("../Manual Downloads/gold.csv")
sp500_path=os.path.join("../Manual Downloads/sp500.csv")
djia_data = pd.read_csv(djia_path)
nasdaq_data = pd.read_csv(nasdaq_path)
gold_data=pd.read_csv(gold_path)
sp500_data=pd.read_csv(sp500_path)
sp500_df=sp500_data.rename(columns={' Close':'sp500_close'}).drop(columns=[' Open',' High',' Low']).set_index('Date')
sp500_df=sp500_df[::-1]
sp500_values = sp500_df['sp500_close'].tolist()
sp500_close_change = [0]
sp500_close_pct_change =[0]
change_index = 1
for i in range(len(sp500_values)-1):
    change = sp500_values[change_index] - sp500_values[i]
    pct_change = (change/sp500_values[i])*100
    sp500_close_change.append(change)
    sp500_close_pct_change.append(pct_change)
    change_index=change_index + 1
sp500_df['sp500_close_change']=sp500_close_change
sp500_df['sp500_close_pct_change']=sp500_close_pct_change
djia_df=djia_data.rename(columns={' Close':'djia_close'}).drop(columns=[' Open',' High',' Low']).set_index('Date')
djia_df=djia_df[::-1]
djia_values = djia_df['djia_close'].tolist()
djia_close_change = [0]
djia_close_pct_change =[0]
change_index = 1
for j in range(len(djia_values)-1):
    change = djia_values[change_index] - djia_values[j]
    pct_change = (change/djia_values[i])*100
    djia_close_change.append(change)
    djia_close_pct_change.append(pct_change)
    change_index=change_index + 1
djia_df['djia_close_change']=djia_close_change
djia_df['djia_close_pct_change']=djia_close_pct_change
nasdaq_df=nasdaq_data.rename(columns={' Close':'nasdaq_close'}).drop(columns=[' Open',' High',' Low']).set_index('Date')
nasdaq_df=nasdaq_df[::-1]
nasdaq_values = nasdaq_df['nasdaq_close'].tolist()
nasdaq_close_change = [0]
nasdaq_close_pct_change =[0]
change_index = 1
for q in range(len(nasdaq_values)-1):
    change = nasdaq_values[change_index] - nasdaq_values[q]
    pct_change = (change/nasdaq_values[i])*100
    nasdaq_close_change.append(change)
    nasdaq_close_pct_change.append(pct_change)
    change_index=change_index + 1
nasdaq_df['nasdaq_close_change']=nasdaq_close_change
nasdaq_df['nasdaq_close_pct_change']=nasdaq_close_pct_change
gold_df=gold_data.rename(columns={'DATE':'Date','GOLDPMGBD228NLBM':'gold_price'}).set_index('Date')
gold_df=gold_df.loc[gold_df.loc[:,'gold_price']!='.',:]
gold_values = gold_df['gold_price'].tolist()
gold_price_change = [0]
gold_price_pct_change =[0]
change_index = 1
for g in range(len(gold_values)-1):
    change = float(gold_values[change_index]) - float(gold_values[g])
    pct_change = (change/float(gold_values[g]))*100
    gold_price_change.append(change)
    gold_price_pct_change.append(pct_change)
    change_index=change_index + 1
gold_df['gold_price_change']=gold_price_change
gold_df['gold_price_pct_change']=gold_price_pct_change
stocks_gold_daily_df = pd.merge(djia_df,nasdaq_df, how='outer',on='Date')
stocks_gold_daily_df = stocks_gold_daily_df.merge(sp500_df,how='outer',on='Date')
stocks_gold_daily_df = stocks_gold_daily_df.merge(gold_df,how='outer',on='Date').sort_values(by=['Date'])


In [278]:
#### Export all data to final .CSVs for upload to PostgreSQL
# m1m2_df.to_csv('../DFs_for_DB/m1m2.csv')
# velocity_df.to_csv('../DFs_for_DB/velocity.csv')
# stocks_gold_daily_df.to_csv('../DFs_for_DB/stocks_gold_daily.csv')
# banks_week_month_df.to_csv('../DFs_for_DB/banks_week_month.csv')
# consumer_monthly_df.to_csv('../DFs_for_DB/consumer_monthly.csv')
# consumers_quarterly_df.to_csv('../DFs_for_DB/consumers_quarterly.csv')
# cpi_monthly_df.to_csv('../DFs_for_DB/cpi_monthly.csv')
# federal_reserve_weekly_df.to_csv('../DFs_for_DB/federal_reserve_weekly.csv')
# foreign_trade_month_quarter_df.to_csv('../DFs_for_DB/foreign_trade_month_quarter.csv')
# gdp_quarterly_df.to_csv('../DFs_for_DB/gdp_quarterly.csv')
# government_quarterly_df.to_csv('../DFs_for_DB/government_quarterly.csv')
# investment_month_quarter_df.to_csv('../DFs_for_DB/investment_month_quarter.csv')
# misc_annual_df.to_csv('../DFs_for_DB/misc_annual.csv')
# misc_daily_df.to_csv('../DFs_for_DB/misc_daily.csv')
# ppi_monthly_df.to_csv('../DFs_for_DB/ppi_monthly.csv')