This project solves an optimal growth model using Dynamic Programming techniques, iteratively approximating the value function. It presents Reinforcement Learning algorithms that evaluate arbitrary policies over the agent's control variable and update the agent's policy function through a greedy procedure.
Additionally, this project calibrates a Neural Network that approximates the optimal path of the policy function starting from an arbitrary initial stock of human capital.
Consider an agent who allocates her time between producing the consumption good C and accumulating human capital H. The agent seeks to maximize her discounted utility, as captured by the following objective:
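The objective presumably takes the standard discounted-utility form, written here as a sketch with 𝛽 ∈ (0, 1) denoting the agent's discount factor:

$$\max_{\{L_t\}_{t \geq 0}} \; \sum_{t=0}^{\infty} \beta^{t} \, U(C_t)$$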
The production function is given in (i). Normalizing the agent's labor supply to one, the accumulation law for human capital is given by (ii), where 𝛿 is the depreciation rate of human capital and 𝐿t is the share of the labor supply devoted to producing the consumption good.
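Equations (i) and (ii) are not reproduced here. For illustration only, a Lucas-type specification consistent with this description (with hypothetical technology parameters A, B, and 𝛼, which are not taken from the original equations) would be:

$$C_t = A \,(L_t H_t)^{\alpha}$$

$$H_{t+1} = (1-\delta)\,H_t + B\,(1-L_t)\,H_t$$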
We assume a CRRA Utility function: U(C) = C^(1-𝜎)/(1-𝜎).
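A minimal implementation of this utility function (the handling of the log limit at 𝜎 = 1 is an added assumption, not stated in the project) could look like:

```python
import numpy as np

def crra_utility(c, sigma):
    """CRRA utility; reduces to log utility in the limit sigma -> 1."""
    c = np.asarray(c, dtype=float)
    if np.isclose(sigma, 1.0):
        return np.log(c)
    return c ** (1.0 - sigma) / (1.0 - sigma)
```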
The Bellman Equation of the agent is:
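A standard way to write this Bellman equation, with H_t as the state and 𝐿t as the control (and consumption pinned down by the production function once 𝐿t is chosen), is:

$$V(H_t) = \max_{L_t \in [0,1]} \; \Big\{ U(C_t) + \beta \, V(H_{t+1}) \Big\}$$

subject to the production function (i) and the accumulation law (ii).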
We iteratively approximate the value function and recover the optimal policy function that attains it:
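A minimal sketch of value function iteration under the illustrative functional forms above (the grid sizes, parameter values, and functional forms are assumptions, not the project's calibration):

```python
import numpy as np

# Hypothetical parameters, for illustration only
beta, sigma, delta = 0.95, 2.0, 0.05
A, B, alpha = 1.0, 0.2, 0.6

h_grid = np.linspace(0.5, 10.0, 200)   # grid over human capital
l_grid = np.linspace(0.01, 0.99, 99)   # grid over the labor share L_t

def crra(c):
    return np.log(c) if np.isclose(sigma, 1.0) else c ** (1 - sigma) / (1 - sigma)

V = np.zeros_like(h_grid)
for _ in range(1000):
    V_new = np.empty_like(V)
    policy = np.empty_like(V)                              # optimal L_t at each grid point
    for i, h in enumerate(h_grid):
        c = A * (l_grid * h) ** alpha                      # consumption implied by each L_t
        h_next = (1 - delta) * h + B * (1 - l_grid) * h    # next-period human capital
        v_next = np.interp(h_next, h_grid, V)              # interpolated continuation value
        values = crra(c) + beta * v_next
        j = np.argmax(values)                              # greedy choice of L_t
        V_new[i], policy[i] = values[j], l_grid[j]
    if np.max(np.abs(V_new - V)) < 1e-8:                   # stop once the value function converges
        V = V_new
        break
    V = V_new
```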
We aim to calibrate a neural network that approximates the path starting from an arbitrary initial stock of human capital and converging toward the steady state. To calibrate the neural network, we minimize the distance between the simulated and theoretical values of the following conditions:
- Euler equation
- Accumulation law of human capital
- Initial stock of human capital
We estimate the optimal path by calibrating a deep neural network, backpropagating the loss built from the residuals of the three conditions above. We observe convergence to the steady state of the model.
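A minimal PyTorch sketch of this residual-based training loop, using the illustrative functional forms from above (the network architecture, the simplified Euler condition, and all parameter values are assumptions, not the project's actual code):

```python
import torch
import torch.nn as nn

# Hypothetical parameters and horizon; the project's calibration may differ.
beta, sigma, delta, A, B, alpha = 0.95, 2.0, 0.05, 1.0, 0.2, 0.6
H0, T = 1.0, 200

# Small MLP mapping time t to (H_t, L_t); softplus keeps H_t positive,
# sigmoid keeps the labor share L_t in (0, 1).
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 2))

t = torch.arange(T, dtype=torch.float32).unsqueeze(1) / T   # normalized time grid
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def euler_residual(c_t, c_next):
    # Placeholder Euler condition U'(C_t) = beta * R * U'(C_{t+1});
    # the model's true condition also involves the return on human capital.
    R = 1.0  # hypothetical gross return term
    return c_t ** (-sigma) - beta * R * c_next ** (-sigma)

for step in range(5000):
    out = net(t)
    H = nn.functional.softplus(out[:, 0])
    L = torch.sigmoid(out[:, 1])
    C = A * (L * H) ** alpha                        # illustrative production function
    H_next_law = (1 - delta) * H + B * (1 - L) * H  # illustrative accumulation law

    loss = (
        euler_residual(C[:-1], C[1:]).pow(2).mean()   # Euler equation residual
        + (H[1:] - H_next_law[:-1]).pow(2).mean()     # accumulation-law residual
        + (H[0] - H0).pow(2)                          # initial-condition residual
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
```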