Notes:
- Discuss how we want to introduce the error term in the simplest first version of our model.
- Is there a more efficient way to record education level?

As the final product of this notebook, we wish to obtain the highest of the choice-specific value functions for the three labor market choices at each admissible state space point in each period of the model.

In [1]:
import pickle
import numpy as np
import math

As a first step, we need to ensure that all arguments we need to supply to the function are available.

We import the model specification parameters and externally defined constants here.

In [2]:
# Execute entire file and make all variables/functions/classes
# available for further use
from ipynb.fs.full.model_spec import (num_periods,
                                      num_choices,
                                      educ_max,
                                      educ_min,
                                      educ_range,
                                      mu,
                                      delta,
                                      optim_paras,
                                      num_draws_emax,
                                      seed_emax,
                                      shocks_cov)

# Import specified definitions only from given notebook
import ipynb.fs
from .defs.shared_constants import MISSING_INT, MISSING_FLOAT

from .defs.shared_auxiliary import draw_disturbances

In a final version of soepy, the pyth_create_state_space function is called before the backward_induction procedure. Here, we import the final output of pyth_create_state_space.

In [3]:
# Import the final output of pyth_create_state_space, args
file_name = "args_file.pkl"
# Open the file for reading; the with-block closes it again afterwards
with open(file_name, "rb") as file_object:
    # Load the object from the file into the variable args
    args = pickle.load(file_object)

In [4]:
# Unpack objects from args
states_all, states_number_period, mapping_states_index, max_states_period = args[0], args[1], args[2], args[3]

The individuals in our model solve their optimization problem by making a labor supply choice every period. They choose the option associated with the highest value function. The value function of each of the three alternatives is the sum of the current-period flow utility of choosing alternative j and the discounted continuation value.
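
As a minimal sketch of this decision rule, the following example compares the three alternatives for a single state and a single shock realization. All numbers, including the discount factor, are made up for illustration and are not taken from the model specification.

```python
import numpy as np

# Hypothetical numbers for one state and one shock realization
delta = 0.95                                         # discount factor (assumed value)
flow_utilities = np.array([0.0, -1.2, -0.8])         # choices N, P, F
continuation_values = np.array([-10.0, -9.5, -9.0])  # one entry per choice

# Choice-specific value functions: flow utility plus discounted continuation value
value_functions = flow_utilities + delta * continuation_values

# The individual picks the alternative with the highest value function
best_choice = int(np.argmax(value_functions))
best_value = float(value_functions[best_choice])
```

With these numbers the full-time alternative (index 2) has the highest value function.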

The flow utility includes the current period wage shock, which the individual learns at the beginning of the period and takes into account in her calculations. To obtain an estimate of the continuation value, the individual has to integrate out the distribution of the future shocks. In the model implementation, we perform this numerical integration via Monte Carlo simulation.
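
The idea behind the Monte Carlo integration can be sketched in isolation. The example below approximates the expected maximum over three alternatives by averaging the maximum across simulated shocks. For simplicity the shocks enter additively here, which differs from how they enter the wage in this notebook; the function name `emax_monte_carlo` and all numbers are hypothetical.

```python
import numpy as np


def emax_monte_carlo(systematic_values, draws):
    """Approximate E[max_j (v_j + eps_j)] by averaging the maximum over draws."""
    # draws has shape (num_draws, num_choices); broadcasting adds v_j to each row
    return np.mean(np.max(systematic_values + draws, axis=1))


rng = np.random.default_rng(123)                 # seed chosen arbitrarily
values = np.array([0.0, 0.5, 1.0])               # hypothetical systematic values
draws = rng.normal(0.0, 1.0, size=(20000, 3))    # hypothetical additive shocks

approx = emax_monte_carlo(values, draws)
```

The approximation lies above the largest systematic value, since the expected maximum is at least as large as the maximum of the expectations.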

As a next step, we generate draws from a bivariate standard normal distribution and transform them to the known error term distribution according to the corresponding parameters in the model specification.
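
One common way to carry out such a transformation is to multiply the standard normal draws by the Cholesky factor of the covariance matrix. The sketch below illustrates this under the assumption of a two-dimensional diagonal covariance matrix with made-up entries. The function `draw_disturbances_sketch` is a hypothetical stand-in, and the imported `draw_disturbances` may work differently.

```python
import numpy as np


def draw_disturbances_sketch(shape, cov, seed):
    """Draw mean-zero normal disturbances with covariance matrix cov.

    Hypothetical stand-in; the actual draw_disturbances may differ.
    """
    num_periods, num_draws = shape
    dim = cov.shape[0]
    rng = np.random.default_rng(seed)
    # Independent standard normal draws
    standard = rng.standard_normal((num_periods, num_draws, dim))
    # If cov = L L', then L z has covariance cov for z ~ N(0, I)
    chol = np.linalg.cholesky(cov)
    return standard @ chol.T


cov = np.array([[0.04, 0.0], [0.0, 0.09]])   # assumed diagonal covariance
draws = draw_disturbances_sketch((3, 50000), cov, seed=42)
```

The result has one slice of draws per period, and the sample standard deviations of the two columns should be close to 0.2 and 0.3.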

In [5]:
draws_emax = draw_disturbances((num_periods, num_draws_emax), shocks_cov, seed_emax)

In the current formulation, we assume that the wage process is subject to additive measurement error. The disturbances for the part-time and the full-time wage are normally distributed with mean zero. The specification assumes neither serial nor contemporaneous correlation across the two error terms.

Finally, we need to define additional functions that are called in the backward induction loop.

In [6]:
def calculate_wage_systematic(educ_level, exp_p, exp_f, optim_paras):
    """Calculate systematic wages, i.e. net of shock, for specified state."""
    
    # Initialize container
    wage_systematic = np.nan
    
    # Construct wage components
    gamma_s0 = np.dot(educ_level, optim_paras[0:3]) 
    gamma_s1 = np.dot(educ_level, optim_paras[3:6])
    period_exp_sum = exp_p * np.dot(educ_level, optim_paras[6:9]) + exp_f 
    depreciation = 1 - np.dot(educ_level, optim_paras[9:12])
    
    # Calculate wage in the given state
    period_exp_total = period_exp_sum * depreciation + 1
    returns_to_exp = gamma_s1 * period_exp_total
    wage_systematic = np.exp(gamma_s0)*returns_to_exp
    
    # Return function output
    return wage_systematic # This is a scalar, equal for all choices

In [7]:
def calculate_period_wages(wage_systematic, draws):
    """Calculate wages for each choice including choice specific productivity shock."""
    
    # Initialize container
    period_wages = np.tile(np.nan, num_choices)
    
    # Take the exponential of the disturbances
    exp_draws = np.exp(draws)
    
    # Calculate choice specific wages including productivity shock
    period_wages = wage_systematic * exp_draws
    
    # Return function output
    return period_wages # This is a vector, difference between choices comes from disturbance term.

Note:

In the toy model, consumption in any period is zero if the individual chooses non-employment. This is because consumption is simply the product of the period wage and the hours worked, and hours worked under non-employment are zero. The first part of the utility function, which relates to consumption, raises period consumption to the negative power mu. In the program, this would yield -inf for zero consumption. To avoid this complication, the consumption utility of non-employment is normalized to zero here.
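
The problem and the normalization can be reproduced in a few lines. The value of `mu` and the wages below are assumptions chosen for illustration; only the hours array matches the one used in this notebook.

```python
import numpy as np

mu = -0.5                               # assumed value; any negative mu behaves alike
hours = np.array([0, 18, 38])           # hours for N, P, F as in the notebook
wages = np.array([10.0, 10.0, 10.0])    # hypothetical period wages

consumption = hours * wages

# Naive formula: zero consumption raised to a negative power diverges
with np.errstate(divide="ignore"):
    naive = consumption ** mu / mu

# Normalization: leave the non-employment entry at zero
normalized = consumption.copy()
normalized[1:] = normalized[1:] ** mu / mu
```

Here `naive[0]` is -inf, while all entries of `normalized` are finite and the first entry is zero.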

In [8]:
def calculate_consumption_utilities(period_wages):
    """Calculate the first part of the period utilities related to consumption"""
    
    # Initialize container
    consumption_utilities = np.tile(np.nan, num_choices)
    
    # Define hours array, possibly move to another file
    hours = np.array([0, 18, 38])
    
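    # Calculate period consumption as hours worked times the period wage, then
    # apply the utility transformation to part-time and full-time only; the
    # non-employment entry stays at its normalized value of zero
    consumption_utilities = hours * period_wages
    consumption_utilities[1] = consumption_utilities[1] ** mu / mu
    consumption_utilities[2] = consumption_utilities[2] ** mu / mu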
    
    # Return function output
    return consumption_utilities

In [9]:
def calculate_total_utilities(consumption_utilities, optim_paras):
    """Calculate total flow utility for all choices."""
    
    # Initialize container for utilities at state space point and period
    total_utilities = np.tile(np.nan, num_choices)
    
    # Calculate U(.) for the three available choices
    U_ = np.array([math.exp(0.00),  math.exp(optim_paras[12]), math.exp(optim_paras[13])])
    
    # Calculate utilities for the available choices N, P, F
    total_utilities = consumption_utilities * U_
    
    # Return function_output
    return total_utilities

# Quick check; requires test_consumption_utilities to be defined first
# test_total_utilities = calculate_total_utilities(test_consumption_utilities, optim_paras)
# test_total_utilities

In [10]:
def calculate_utilities(educ_level, exp_p, exp_f, optim_paras, draws):
    """Calculate flow utilities for all choices given state, period, and shocks."""
    
    # Calculate wage net of period productivity shock
    wage_systematic = calculate_wage_systematic(educ_level, exp_p, exp_f, optim_paras)
    
    # Calculate period wages for the three choices including shock realizations
    period_wages = calculate_period_wages(wage_systematic, draws)
    
    # Calculate 1st part of the period flow utility related to consumption
    consumption_utilities = calculate_consumption_utilities(period_wages)
    
    # Calculate total utility by multiplying U(.) component
    utilities = calculate_total_utilities(consumption_utilities, optim_paras)
    
    # Return function output
    return utilities, consumption_utilities, period_wages, wage_systematic

In [11]:
def construct_covariates(states_all, period, k):
    """Constructs additional covariates given state space components."""
    
    # Determine education level given number of years of education
    # Would it be more efficient to do this somewhere else?

    # Unpack state space components
    educ_years = states_all[period, k, 0]

    # Extract education information
    if educ_years <= 10:
        educ_level = [1, 0, 0]
    elif 10 < educ_years <= 12:
        educ_level = [0, 1, 0]
    else:
        educ_level = [0, 0, 1]

    educ_years_idx = educ_years - educ_min
    
    # Return function output
    return educ_level, educ_years_idx

In [12]:
def calculate_continuation_values(period, educ_years_idx, exp_p, exp_f):
    """Obtain continuation values for all choices."""

    # Initialize container for continuation values
    continuation_values = np.tile(MISSING_FLOAT, num_choices)

    if period != (num_periods - 1):

        # Choice: Non-employment
        # Create index for extracting the continuation value
        future_idx = mapping_states_index[period + 1, educ_years_idx, 0, exp_p, exp_f]
        # Extract continuation value
        continuation_values[0] = periods_emax[period + 1, future_idx] 

        # Choice: Part-time
        future_idx = mapping_states_index[period + 1, educ_years_idx, 1, exp_p + 1, exp_f]
        continuation_values[1] = periods_emax[period + 1, future_idx]

        # Choice: Full-time
        future_idx = mapping_states_index[period + 1, educ_years_idx, 2, exp_p, exp_f + 1]
        continuation_values[2] = periods_emax[period + 1, future_idx]
    
    else:
        continuation_values = np.tile(0.0, num_choices)
        
    # Return function output
    return continuation_values

In [13]:
def construct_emax(
    period,
    k,
    educ_level,
    educ_years_idx,
    num_periods,
    num_draws_emax,
    draws_emax_period,
    states_all,
    mapping_states_index,
    optim_paras,
    periods_emax,
):
    """Obtain the expected maximum of the choice-specific value functions
    via Monte Carlo integration.
    """
    
    # Initialize container for sum of maximum value functions
    # over all error term draws for the period and state
    emax = 0.0
    
    # Loop over all error term draws
    # for the period and state currently reached by the parent loop
    for i in range(num_draws_emax):
        
        # Extract the error term draws corresponding to
        # period number, state, and loop iteration number, i
        corresponding_draws = draws_emax_period[i, :]
        
        # Extract relevant state space components 
        educ_years, _, exp_p, exp_f = states_all[period, k, :]
        
        # Calculate flow utility at current period, state, and draw
        flow_utilities = calculate_utilities(educ_level, exp_p, exp_f, optim_paras, corresponding_draws)[0]
        
        # Obtain continuation values for all choices
        continuation_values = calculate_continuation_values(period, educ_years_idx, exp_p, exp_f)
        
        # Calculate choice specific value functions
        value_functions = flow_utilities + delta*continuation_values
        
        # Obtain highest value function
        maximum = max(value_functions)
        
        # Add to sum over all draws
        emax += maximum
        
        # End loop
    
    # Average over the number of draws
    emax = emax / num_draws_emax
    
    # Thus, we have integrated out the error term
    
    # Output
    return emax

Before we begin the backward iteration procedure, we initialize the container for the final result.

In [14]:
# Initialize container for the final result,
# maximal value function per period and state:
periods_emax = np.tile(MISSING_FLOAT, (num_periods, max_states_period))

We can now start the backward iteration procedure.

In [15]:
# Loop over all periods
for period in range(num_periods - 1, -1, -1):
    
    # Select the random draws for Monte Carlo integration relevant for the period
    draws_emax_period = draws_emax[period, :, :]
    
    # Loop over all admissible state space points
    # for the period currently reached by the parent loop
    for k in range(states_number_period[period]):
        
        # Construct additional education information
        educ_level, educ_years_idx = construct_covariates(states_all, period, k)  
            
        # Integrate out the error term
        emax = construct_emax(
            period,
            k,
            educ_level,
            educ_years_idx,
            num_periods,
            num_draws_emax,
            draws_emax_period,
            states_all,
            mapping_states_index,
            optim_paras,
            periods_emax,
        )
        
        # Record function output
        periods_emax[period, k] = emax
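
The structure of the recursion above may be easier to see in a stripped-down setting. The following toy example is a sketch only: it has no shocks, a single state variable (accumulated experience), two choices, and made-up flow utilities and discount factor, none of which are taken from the model.

```python
import numpy as np

num_periods = 3
delta = 0.9     # assumed discount factor

# Toy state: accumulated experience 0..period; choices: stay home (0) or work (1).
# The extra row num_periods holds the terminal continuation values of zero.
emax = np.zeros((num_periods + 1, num_periods + 1))

# Start in the last period and move backwards
for period in range(num_periods - 1, -1, -1):
    for experience in range(period + 1):
        # Hypothetical flow utilities for the two choices
        flow = np.array([1.0, 0.5 + 0.5 * experience])
        # Working raises experience by one, staying home leaves it unchanged
        cont = np.array(
            [emax[period + 1, experience], emax[period + 1, experience + 1]]
        )
        emax[period, experience] = np.max(flow + delta * cont)
```

Because each period only reads values from the period after it, the loop has to run from the last period to the first, just as in the cell above.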

Export final output:

In [16]:
# Choose a file name
file_name = "periods_emax_file.pkl"

# Open the file for writing
with open(file_name,'wb') as my_file_obj:
    pickle.dump(periods_emax, my_file_obj)  