# Problem Set 5
## By Scott Behmer. February 11, 2018.
In this problem set, the parameters of the Brock and Mirman model are estimated using the simulated method of moments (SMM). The code requires the dataset "NewMacroSeries.txt". If that file is not in the same folder as the notebook, the path in the `np.loadtxt` call will have to be changed.

The cell below imports the required packages and loads the data.

In [2]:
# Import packages and load the data
import numpy as np
import numpy.random as rnd
import numpy.linalg as lin
import scipy.stats as sts
import scipy.integrate as intgr
import scipy.optimize as opt
import matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
# plt.get_cmap is used because matplotlib.cm.get_cmap is no longer available in recent matplotlib releases
cmap1 = plt.get_cmap('summer')
# This next command is specifically for Jupyter Notebook
%matplotlib notebook

pts = np.loadtxt('NewMacroSeries.txt', delimiter = ",")

The function "drawnormal" takes a matrix of values drawn from a uniform(0, 1) distribution and outputs a matrix of normally distributed values with mean zero and standard deviation sigma. Each uniform value is plugged into the inverse cdf of the normal distribution, which outputs the corresponding normal value.
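
As a quick sanity check of the inverse-cdf idea, the sketch below pushes uniform draws through `sts.norm.ppf` and compares the sample mean and standard deviation with their targets. The seed, the sample size, and the name `sigma_demo` are illustrative assumptions and are not part of the problem set.

```python
import numpy as np
import scipy.stats as sts

# Illustrative values only (assumptions, not taken from the problem set)
rng = np.random.RandomState(0)
sigma_demo = 0.05
u = rng.uniform(0, 1, size=100000)

# Inverse-cdf transform: uniform(0, 1) draws -> N(0, sigma_demo**2) draws
draws = sts.norm.ppf(u, loc=0, scale=sigma_demo)

# The median of the uniform (0.5) maps to the mean of the normal
print(sts.norm.ppf(0.5, loc=0, scale=sigma_demo))
# Sample mean and standard deviation should be close to 0 and sigma_demo
print(draws.mean(), draws.std())
```

Because the uniform draws are held fixed while sigma changes, the simulated shocks vary smoothly with the parameter, which is what the optimizer needs.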

In [3]:
def drawnormal(unif_vals, sigma):
    norm_draws = sts.norm.ppf(unif_vals, loc=0, scale=sigma)
    return norm_draws

Six moments are used to identify the four parameters of the model:
$$mean(c_t)$$
$$mean(k_t)$$
$$mean\left(\frac{c_t}{y_t}\right)$$
$$var(y_t)$$
$$corr(c_{t}, c_{t-1})$$
$$corr(c_{t}, k_{t})$$

Seven functions are defined in the cell below:
1. The mom_data function takes in the macro dataset (a 100x5 array, corresponding to 100 time periods and the five macro variables c, k, w, r, and y, in that column order) and outputs the six moments. This function is used for both the real dataset and the simulated datasets.
2. The sim_z function takes in parameter values and a matrix of normally distributed values. It outputs the time series for z (total factor productivity).
3. The sim_k function takes in parameter values and the time series for z. It outputs the time series for k.
4. The sim_c function takes in parameter values and the time series for z, k, r, and w. It outputs the time series for c.
5. The mom_model function takes in parameter values and our matrix of uniform values. It outputs the six simulated moments. To do this, it first creates 1000 simulated datasets (each of 100 observations) by calling the drawnormal, sim_z, sim_k, and sim_c functions. Each of these simulated datasets is then fed into the mom_data function to calculate 1000 sets of moments. These are then averaged to find our six simulated moments.
6. The error function takes in real data, a matrix of uniform values, and parameter values. It calls the mom_data and mom_model functions to calculate the two moment vectors. The error vector it returns is the relative difference between them: data moments minus model moments, divided by model moments.
7. The criterion function will be minimized to determine our parameter estimates. It takes in data, simulated values, and parameter values. It outputs the weighted quadratic form of the error vector, which equals the sum of squared relative errors when the weighting matrix is the identity.
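
Written out, the steps above amount to the following. The notation is introduced here for illustration only: $m(x)$ is the vector of data moments, $\tilde{x}_s$ is the $s$-th simulated dataset, $S$ is the number of simulations, $\theta = (\alpha, \rho, \mu, \sigma)$, and the division is element by element.
$$\hat{m}(\theta) = \frac{1}{S}\sum_{s=1}^{S} m(\tilde{x}_s \mid \theta)$$
$$e(\theta) = \frac{m(x) - \hat{m}(\theta)}{\hat{m}(\theta)}$$
$$\hat{\theta} = \arg\min_{\theta} \; e(\theta)^{T} \, W \, e(\theta)$$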

In [4]:
def mom_data(data):
    moments = np.zeros(6)
    moments[0] = np.mean(data[:, 0])
    moments[1] = np.mean(data[:, 1])
    moments[2] = np.mean(data[:, 0]/data[:, 4])
    moments[3] = np.var(data[:, 4])
    moments[5] = np.corrcoef(data[:, 0], data[:, 1])[0, 1]
    lag = np.zeros(len(data)-1)
    dummy = np.zeros(len(data)-1)
    for i in range(0, len(data)-1):
        lag[i]=data[i][0]
        dummy[i]=data[i+1][0]
    moments[4] = np.corrcoef(lag, dummy)[0, 1]
    return moments

def sim_z(eps, rho, mu, sigma):
    z = np.zeros(np.shape(eps))
    z[0, :] = mu + eps[0, :]
    for i in range(1, len(eps)):
        z[i, :] = rho*z[i-1, :] + (1-rho)*mu + eps[i, :]
    return z

def sim_k(z, k1, alpha):
    k = np.zeros(np.shape(z))
    #k1 is the initial k value
    k[0, :] = k1
    for i in range(1, len(k)):
        k[i, :] = alpha*.99*np.exp(z[i, :])*(k[i-1, :]**alpha)
    return k

def sim_c(k, w, r, z, alpha):
    c = np.zeros(np.shape(k))
    for i in range(0, len(k)-1):
        c[i, :]= w[i, :] + r[i, :]*k[i, :] - k[i+1, :]
    f = len(k)-1
    c[f, :] = w[f, :]+ r[f, :]*k[f, :] - alpha*.99*np.exp(z[f, :])*(k[f, :]**alpha)
    return c
        
def mom_model(unif_values, alpha, rho, mu, sigma, k1):
    eps = drawnormal(unif_values, sigma)
    z = sim_z(eps, rho, mu, sigma)
    k = sim_k(z, k1, alpha)
    w = (1-alpha)*np.exp(z)*(k**alpha)
    r = alpha*np.exp(z)*(k**(alpha-1))
    c = sim_c(k, w, r, z, alpha)
    y = np.exp(z)*(k**alpha)
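    #Read the number of periods and simulations from the input instead of hard-coding 100 and 1000
    n_periods, n_sims = np.shape(unif_values)
    moments = np.zeros(6)
    tempdata = np.zeros((n_periods, 5))
    for i in range(0, n_sims):
        #Columns of each simulated dataset: c, k, w, r, y
        tempdata[:, 0], tempdata[:, 1], tempdata[:,2], tempdata[:,3], tempdata[:,4] = c[:,i], k[:,i], w[:,i], r[:,i], y[:,i]
        moments += mom_data(tempdata)
    #Average the moments over the simulations
    return moments/n_sims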

def error(data, unif_values, alpha, rho, mu, sigma, k1):
    datamoments = mom_data(data)
    modelmoments = mom_model(unif_values, alpha, rho, mu, sigma, k1)
    errorvec = (datamoments - modelmoments)/modelmoments
    return errorvec
    
def crit(params, *args):
    alpha, rho, mu, sigma = params
    data, unif_values, W, k1 = args
    err = error(data, unif_values, alpha, rho, mu, sigma, k1)
    criterion = np.dot(np.dot(err.T, W), err)
    #Use the below command for trouble-shooting
    #print(alpha, rho, mu, sigma, criterion)
    return criterion
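
The lagged and lead consumption series in mom_data are built with an explicit loop. As a sketch of one possible simplification, the same lag-1 correlation can be computed with array slicing. The function names and the random-walk test series below are hypothetical and used only for the comparison.

```python
import numpy as np

def lag1_corr_loop(x):
    # Same construction as in mom_data: pair x[t] with x[t+1] element by element
    lag = np.zeros(len(x) - 1)
    lead = np.zeros(len(x) - 1)
    for i in range(len(x) - 1):
        lag[i] = x[i]
        lead[i] = x[i + 1]
    return np.corrcoef(lag, lead)[0, 1]

def lag1_corr_slice(x):
    # x[:-1] drops the last observation, x[1:] drops the first
    return np.corrcoef(x[:-1], x[1:])[0, 1]

# Hypothetical persistent series with 100 observations
rng = np.random.RandomState(1)
x_demo = np.cumsum(rng.normal(size=100))

# The two versions should agree up to floating-point error
print(lag1_corr_loop(x_demo), lag1_corr_slice(x_demo))
```

Since mom_data is called once per simulation at every criterion evaluation, removing the inner loop is one place where run time might be saved.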

The cell below minimizes the criterion function and prints the estimated parameter values. Note that the bounds for $\alpha$ differ slightly from those given in the problem set. If the bounds are set to (0.01, 0.99), the optimizer eventually tries $\alpha = 0.99$, which leads to the simulated investment series blowing up to infinity. To avoid this, the upper bound is set to 0.8. Because the resulting estimate (about 0.42) is well below this upper bound, the change in bounds does not appear to exclude an important portion of the parameter space.
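
To see why values of $\alpha$ near 1 are a problem, the sketch below computes the fixed point of the sim_k recursion, k' = alpha*.99*exp(z)*k**alpha, with z held constant. The helper name and `z_demo` are illustrative assumptions. The calculation is done in logs so that it does not overflow itself.

```python
import numpy as np

BETA = 0.99  # the constant hard-coded as .99 in sim_k

def log10_steady_state_k(alpha, z):
    # Fixed point of k' = alpha*BETA*exp(z)*k**alpha with z held constant:
    # log(k_bar) = (log(alpha*BETA) + z) / (1 - alpha), converted to base 10
    return (np.log(alpha * BETA) + z) / (1 - alpha) / np.log(10)

z_demo = 9.9  # illustrative TFP level (assumption)
for alpha in (0.42, 0.8, 0.99):
    print(alpha, log10_steady_state_k(alpha, z_demo))

# log10 of the largest float64, for comparison
print(np.log10(np.finfo(np.float64).max))
```

At $\alpha = 0.99$ the fixed point lies far beyond the float64 range, which is consistent with the overflow described above. A 100-period path only moves part of the way toward the fixed point, so whether a particular simulation actually overflows depends on $\mu$ and on the shocks.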

The optimization takes about five minutes to run (making the simulated-moment code more efficient would probably shorten this), but it converges successfully to a local minimum. The resulting parameter values seem to be robust to changes in the initial conditions, and the resulting criterion value is 0.0052. Notice that the estimate of $\rho$, which represents the persistence of TFP shocks, is right up against its upper bound of 0.99, and the reported gradient with respect to $\rho$ (about -0.23) is not close to zero there. An unconstrained optimization method would therefore keep increasing $\rho$ past 0.99 rather than stop at it, which is why a bounded method is needed.
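
The effect of the bound can be illustrated on a toy problem. In the sketch below, `toy_crit` is a hypothetical one-parameter criterion whose unconstrained minimum lies outside the bounds used for $\rho$. It is not the SMM criterion. The bounded L-BFGS-B search stops at the bound with a nonzero gradient, while the unconstrained BFGS search leaves the admissible region.

```python
import numpy as np
import scipy.optimize as opt

def toy_crit(params):
    # Hypothetical criterion with its unconstrained minimum at rho = 2
    rho = params[0]
    return (rho - 2.0) ** 2

toy_bnds = ((-0.99, 0.99),)
start = np.array([0.5])

# Bounded search: the solution is pinned at the upper bound
res_bounded = opt.minimize(toy_crit, start, method='L-BFGS-B', bounds=toy_bnds)
# Unconstrained search: the solution moves outside (-0.99, 0.99)
res_free = opt.minimize(toy_crit, start, method='BFGS')

print(res_bounded.x, res_bounded.jac)
print(res_free.x)
```

A gradient that is still clearly nonzero at a bound, as with `jac` for $\rho$ in the printed results, indicates that the bound is binding and not that the criterion is flat there.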

In [8]:
#Defining Parameter Bounds
bnds = ((0.01, .8), (-.99, .99), (5, 14), (.01, 1.1))
#Defining initial parameter values
alpha_init, rho_init, mu_init, sigma_init = 0.44, 0.72, 9.45, .09
params_init = np.array([alpha_init, rho_init, mu_init, sigma_init])
#Our weighting matrix is the identity matrix
W_hat = np.eye(6)
#The simulated initial investment is the mean of the actual data's investment time series.
k1 = np.mean(pts[:,1])
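#NOTE: unif_vals is not defined in any cell shown here. The line below is an assumed
#reconstruction (100 periods x 1000 simulations of uniform(0, 1) draws). Without the
#original draws, the results may differ slightly from the output printed below.
unif_vals = sts.uniform.rvs(0, 1, size=(100, 1000))
smm_args = (pts, unif_vals, W_hat, k1)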
#Minimizing the criterion function
results = opt.minimize(crit, params_init, args=(smm_args), bounds = bnds)
#Outputting parameter values
alpha_gmm, rho_gmm, mu_gmm, sigma_gmm = results.x
print(' alpha_gmm:', alpha_gmm, ' rho_gmm:', rho_gmm, ' mu_gmm:', mu_gmm, ' sigma_gmm:', sigma_gmm)
#Outputting criterion values
print('The resulting criterion value is: ', results.fun)
results

 alpha_gmm: 0.4206700577  rho_gmm: 0.99  mu_gmm: 9.91399686613  sigma_gmm: 0.0507436686869
The resulting criterion value is:  0.0052044883753


      fun: 0.0052044883752979717
 hess_inv: <4x4 LbfgsInvHessProduct with dtype=float64>
      jac: array([ -3.48235329e-05,  -2.26879340e-01,  -1.97246733e-06,
        -8.85680418e-06])
  message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 275
      nit: 28
   status: 0
  success: True
        x: array([ 0.42067006,  0.99      ,  9.91399687,  0.05074367])