Discuss:
1. Experience formulation
2. Handling of the 1st period
3. Education initial condition specification 

In [19]:
import numpy as np

In [20]:
# Set value of MISSING_INT here. In respy it is imported from shared_constants module.
MISSING_INT = -99

As a first step, we create the variables that would, in the final version, be supplied as inputs to the function.

In [21]:
num_periods = 20
num_choices = 3 # Individuals can choose between full-time (F), part-time (P), and non-employment (N), i.e. there are 3 choices
educ_max = 25 # Maximum number of years of educ in the data, i.e., upper bound of education initial condition
educ_min = 10 # Minimum number of years of educ in the data, i.e., lower bound of education initial condition

In [22]:
# Auxiliary calculation of the education dimension
educ_range = educ_max - educ_min + 1

In [23]:
educ_range

16

Next, we create arrays which we want to populate with the state space variables.
The components of our state space are:
- the time variable: period
- the initial conditions: education
- the history of choices: choice_lagged
- the accumulated experience in full-time employment: exp_f
- the accumulated experience in part-time employment: exp_p

In [24]:
# Array for mapping the state space points to indices
shape = (num_periods, educ_range, num_choices, num_periods, num_periods)
mapping_state_index = np.tile(MISSING_INT, shape)

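# Upper bound on the number of state space points per period. There can be no more states in a period than this number.
# Since the product also runs over the period axis, the bound is loose by a factor of num_periods.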
num_states_period_upper_bound = np.prod(mapping_state_index.shape)
print(mapping_state_index.shape, num_states_period_upper_bound)

(20, 16, 3, 20, 20) 384000


Let us briefly discuss the dimensions of the mapping_state_index array. Each dimension corresponds to one state space component that we want to keep track of. The order follows the indexing used in the loop below:
- dimension 1 (num_periods) corresponds to the period number
- dimension 2 (educ_range) to the years of education
- dimension 3 (num_choices) to the lagged choice
- dimension 4 (num_periods) to years of experience in full-time
- dimension 5 (num_periods) to years of experience in part-time
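As a minimal sketch of how this indexing works, the snippet below builds a much smaller array with the same axis layout and looks up one entry. The toy dimensions and the stored index value 7 are made up for illustration; only the axis order is taken from the notebook's loop.

```python
import numpy as np

MISSING_INT = -99

# Toy dimensions (illustrative only, much smaller than in the notebook)
num_periods, educ_range, num_choices = 3, 2, 3

# Axis order: (period, educ_years, choice_lagged - 1, exp_f, exp_p)
shape = (num_periods, educ_range, num_choices, num_periods, num_periods)
mapping = np.full(shape, MISSING_INT)

# Pretend that the state (period=2, educ_years=1, lagged choice index 0,
# exp_f=1, exp_p=0) received the within-period index 7
mapping[2, 1, 0, 1, 0] = 7

print(mapping.shape)           # (3, 2, 3, 3, 3)
print(mapping[2, 1, 0, 1, 0])  # 7
print(mapping[2, 1, 0, 0, 1])  # -99, this combination was never assigned
```

Entries that still hold MISSING_INT mark combinations that were never recorded as a state.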

In [25]:
# Array to collect all state space points (states) that can be reached each period
states_all = np.tile(MISSING_INT, (num_periods, num_states_period_upper_bound, 4))

Let us briefly discuss the dimensions of the states_all array:
- the 1st dimension is determined by the number of periods in our model
- the 2nd dimension is an upper bound on the number of state space points that can be reached in any single period. *Question: Is this true? Is 100 000 simply an arbitrary large number that is guaranteed to exceed the highest number of state space points ever reachable in a period? Can one replace this number by an educated guess of the maximum number of state point combinations given no restrictions?*
- the 3rd dimension is equal to the number of remaining state space components (all except the period number) that we want to record: educ_years + educ_min, choice_lagged, exp_f, exp_p
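One possible answer to the question above, offered as a sketch rather than a settled choice: without any admissibility restrictions, a single period can hold at most one state per combination of the non-period components, so the product of the remaining dimensions is already an upper bound. The names `bound_all_axes` and `per_period_bound` are introduced here for illustration only.

```python
import numpy as np

num_periods = 20
num_choices = 3
educ_range = 16

shape = (num_periods, educ_range, num_choices, num_periods, num_periods)

# Bound used in the notebook: product over all axes, including the period axis
bound_all_axes = int(np.prod(shape))

# Tighter bound: within one period the period index is fixed, so only the
# remaining axes (education, lagged choice, exp_f, exp_p) can vary
per_period_bound = int(np.prod(shape[1:]))

print(bound_all_axes, per_period_bound)  # 384000 19200
```

With the values used here, this would shrink the second dimension of states_all by a factor of num_periods.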

In [26]:
# Array for the number of state space points / states per period
states_number_period = np.tile(MISSING_INT, num_periods)

In a final step, we loop through all admissible state space points and fill the constructed arrays with the necessary information. Two details are important here:
- Since individuals make their first choice in the first period, their lagged choice can only be recorded from the second period onwards. Therefore, the loop skips the first period.
- If we want to record only admissible state space points, we have to account for the fact that individuals in the model start making labor supply choices only after they have completed education.

Let us look into the latter in greater detail. As a reminder, we model individuals from age 16 (legally binding minimum level of education) until age 60 (typical retirement entry age), i.e., for 45 periods, where the 1st period corresponds to age 16, the second period to age 17, etc. The current specification of starting values is equivalent to the observation / simulated reality that individuals in the sample have completed between 10 and 25 years of education. In our loop we want to take into account that, in the first period at age 16, only individuals who have completed 10 years of education (assuming education starts at the age of 6 for everyone) will be making a labor market choice between full-time (F), part-time (P), and non-employment (N). The remaining individuals are still in education, so a state space point where years of education equal, e.g., 11 and a labor market choice of, e.g., part-time is observed is not admissible and should therefore not be recorded. This is ensured by the if clause "educ_years > period", which skips such combinations; here educ_years counts the years of education beyond educ_min.
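To make the education restriction concrete, the following sketch lists the education levels that pass the check in the first few periods. The helper name `admissible_educ_levels` is hypothetical and not part of the notebook or respy; it only mirrors the condition `educ_years > period` described above.

```python
educ_min = 10
educ_max = 25
educ_range = educ_max - educ_min + 1


def admissible_educ_levels(period):
    """Return the education levels (in years) that may appear in `period`.

    Mirrors the skip condition `educ_years > period`, where `educ_years`
    counts the years of education beyond `educ_min`.
    """
    return [
        educ_years + educ_min
        for educ_years in range(educ_range)
        if educ_years <= period
    ]


for period in range(3):
    print(period, admissible_educ_levels(period))
# 0 [10]
# 1 [10, 11]
# 2 [10, 11, 12]
```

From period educ_range - 1 onwards, all education levels between educ_min and educ_max pass the check.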

In [28]:
# Loop over all periods / all ages
for period in range(1, num_periods):
    
    # Start count for admissible state space points
    k = 0
        
    # Loop over all possible initial conditions for education
    for educ_years in range(educ_range):
            
        # Check if individual has already completed education
        # and will make a labor supply choice in the period
        if educ_years > period:
            continue
        
        # Loop over all admissible years of experience accumulated in full-time
        for exp_f in range(num_periods):
            
            # Loop over all admissible years of experience accumulated in part-time
            for exp_p in range(num_periods):
                
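                # Loop over the three lagged labor market choices; the restrictions
                # below treat 1 as part-time (P) and 2 as full-time (F), leaving 3 for non-employment (N)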
                for choice_lagged in [1,2,3]:
                    
                    # If individual has only worked full-time in the past,
                    # she can only have full-time (2) as lagged choice
                    if (choice_lagged != 2) and (exp_f == period - educ_years):
                        continue
                    
                    # If individual has only worked part-time in the past,
                    # she can only have part-time (1) as lagged choice
                    if (choice_lagged != 1) and (exp_p == period - educ_years):
                        continue
                        
                    # If an individual has never worked full-time,
                    # she cannot have that lagged activity
                    if (choice_lagged == 2) and (exp_f == 0):
                        continue
                        
                    # If an individual has never worked part-time,
                    # she cannot have that lagged activity
                    if (choice_lagged == 1) and (exp_p == 0):
                        continue
                
                    # Check for duplicate states
                    if (
                        mapping_state_index[
                            period,
                            educ_years,
                            choice_lagged - 1,
                            exp_f,
                            exp_p,
                        ]
                        != MISSING_INT
                    ):
                        continue
                
                    # Assign the integer count k as an indicator for the
                    # currently reached admissible state space point
                    mapping_state_index[
                        period,
                        educ_years,
                        choice_lagged - 1,
                        exp_f,
                        exp_p,
                    ] = k

                    # Record the values of the state space components
                    # for the currently reached admissible state space point
                    states_all[period, k, :] = [
                        educ_years + educ_min,
                        choice_lagged -1,
                        exp_f,
                        exp_p
                    ]

                    # Update count
                    k += 1
    
    # Record number of admissible state space points for the period currently reached in the loop 
    states_number_period[period] = k

We briefly repeat what has been recorded in the states_all array:
 - educ_years + educ_min: in this example, values from 10 to 25
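 - choice_lagged - 1: 0, 1, 2; given the coding used in the loop's restrictions (1 = part-time, 2 = full-time before subtracting 1), these correspond to P, F, and N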
 - exp_f: full-time experience that can range from 0 to 19
 - exp_p: part-time experience that can range from 0 to 19
 
 *Note: This differs from respy. In respy, the loop over experience runs one iteration longer, i.e., to num_periods + 1 instead of num_periods.*

In [30]:
# Auxiliary objects
max_states_period = max(states_number_period)

In [31]:
# Collect arguments
args = (states_all, states_number_period, mapping_state_index, max_states_period)
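Before passing these arrays on, one could check that they agree with each other: every row k recorded in states_all for a period should be found under index k in mapping_state_index. The function below is a minimal sketch of such a check, run on toy arrays. The name `mapping_matches_states` and the toy data are assumptions made for illustration.

```python
import numpy as np

MISSING_INT = -99


def mapping_matches_states(states_all, states_number_period, mapping_state_index, educ_min):
    """Return True if mapping_state_index points back to every recorded state."""
    for period, num_states in enumerate(states_number_period):
        if num_states == MISSING_INT:
            continue  # period was skipped when the arrays were filled
        for k in range(num_states):
            educ, choice, exp_f, exp_p = states_all[period, k]
            index = mapping_state_index[period, educ - educ_min, choice, exp_f, exp_p]
            if index != k:
                return False
    return True


# Toy arrays: two periods, period 0 skipped, period 1 holds two states
mapping = np.full((2, 2, 3, 2, 2), MISSING_INT)
states = np.full((2, 3, 4), MISSING_INT)
states[1, 0] = [10, 2, 0, 0]
mapping[1, 0, 2, 0, 0] = 0
states[1, 1] = [11, 1, 1, 0]
mapping[1, 1, 1, 1, 0] = 1
counts = np.array([MISSING_INT, 2])

print(mapping_matches_states(states, counts, mapping, educ_min=10))  # True
```

On the notebook's own arrays the call would be `mapping_matches_states(states_all, states_number_period, mapping_state_index, educ_min)`.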

Questions and issues open for discussion:
1. MISSING_INT for number of states in the first period:
   The first open question concerns the handling of the first period (in code notation, period == 0) in the model. For now, the loop skips the first period entirely. Why? One of the state space components is a lagged value, choice_lagged, and there is no value one can assign to it in the first period: individuals make their first choice in the first period, so there is no prior choice to record as choice_lagged at the very beginning of the model.
   In respy the problem does not seem to exist, since the choices education and home seem to be valid entries of choice_lagged in the first period. Question: Where can this be deduced from in KW97? I do not remember coming across any explanation of this in the paper.

In [32]:
states_number_period

array([  -99,  2128,  3249,  4332,  5415,  6498,  7581,  8664,  9747,
       10830, 11913, 12996, 14079, 15162, 16245, 17328, 17328, 17328,
       17328, 17328])

Ideas for checks and tests:
- Check whether the loop visits the whole range of values by inspecting the min and max of the array entries for the different state space components.

In [33]:
print(np.amax(states_all[:,:,0]), np.amax(states_all[:,:,1]), np.amax(states_all[:,:,2]), np.amax(states_all[:,:,3]))

25 2 19 19
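The maxima can be read off directly, but the minima cannot, because unfilled rows still hold MISSING_INT. The sketch below restricts the check to the filled rows of each period and, as a further idea, counts rows whose total experience exceeds the number of periods elapsed since education. That second check assumes that at most one year of experience can be accumulated per period; whether the notebook intends this restriction is an assumption, and the function name `summarize_states` is hypothetical.

```python
import numpy as np

MISSING_INT = -99


def summarize_states(states_all, states_number_period, educ_min):
    """Return column-wise min and max over filled rows, plus the number of
    rows with exp_f + exp_p larger than the periods since education."""
    filled = []
    num_violations = 0
    for period, num_states in enumerate(states_number_period):
        if num_states == MISSING_INT:
            continue  # period was skipped when the arrays were filled
        rows = states_all[period, :num_states, :]
        filled.append(rows)
        educ, exp_f, exp_p = rows[:, 0], rows[:, 2], rows[:, 3]
        # Assumption: at most one year of experience per period after education
        periods_since_educ = period - (educ - educ_min)
        num_violations += int(np.sum(exp_f + exp_p > periods_since_educ))
    filled = np.concatenate(filled)
    return filled.min(axis=0), filled.max(axis=0), num_violations


# Toy input: period 0 skipped, period 1 holds three states
states = np.full((2, 4, 4), MISSING_INT)
states[1, 0] = [10, 2, 0, 0]
states[1, 1] = [10, 1, 1, 0]
states[1, 2] = [11, 2, 3, 0]  # three years of experience in period 1
counts = np.array([MISSING_INT, 3])

col_min, col_max, num_violations = summarize_states(states, counts, educ_min=10)
print(col_min.tolist(), col_max.tolist(), num_violations)
# [10, 1, 0, 0] [11, 2, 3, 0] 1
```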


In [29]:
# Check which values the experience loops iterate over: range(num_periods) runs from 0 to num_periods - 1
for i in range(num_periods):
    print(i)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
