In [1]:
## Preamble: Package Loading
import numpy as np
import ipywidgets as ipw
from IPython.display import display
import matplotlib.pyplot as plt
from matplotlib import gridspec
import pandas as pd
import json
import kernel as kr
import psc_sumdisp as psd 
# Preamble working directory retreival

<h2>Trial Set 1: Varying the Number of Time Periods </h2> 

Here we examine the sampling distribution of $\hat{\beta}_1, \hat{\alpha}_{1}$ and $\hat{\alpha}_{2}$ as the number of time periods $T$ increases i.e. where $T \in \{30,50,70\}$, while holding the following constant (amongst others shown below).

* Number of Cross Sections: 5


* Number of Endogenous Regressors: 2


* Number of Exogenous Regressors: 2


* Total Number of Instruments: 5


* Number of Instrument Relevant to Each Cross Section: 3

<h3> Trial Set 1: Data Loading and Organization </h3> 

The following is extracts and organizes all relvant information from the results data sets whose file names are list here. 

In [2]:
inpt_filenames0 = ['pscout_6_12_1954.json' ,'pscout_6_12_1220.json' , 'pscout_6_12_1799.json']
line_nms0 = ['n=30', 'n=50' ,'n=70']

res_out0 = [psd.psc_load(inpt_filenames0[i]) for i in range(len(inpt_filenames0))]
estin_dcts0 = [res_out0[i][0] for i in range(len(inpt_filenames0))]
dgp_sum_filenames0 = [ estin_dcts0[i]['input_filename'].replace('pscdata','pscsum')
                      for i in range(len(inpt_filenames0))]
dgp_dicts0 = [psd.pscsum_load(dgp_sum_filenames0[i]) 
             for i in range(len(dgp_sum_filenames0))]
dgpin_dcts0 =  [dgp_dicts0[i][0] for i in range(len(inpt_filenames0))]
merged_dcts0 = [{**estin_dcts0[i],**dgpin_dcts0[i]} for i in range(len(inpt_filenames0))]
true_bcoeffs0 = [dgp_dicts0[i][1] for i in range(len(inpt_filenames0))]
true_acoeffs0 = [dgp_dicts0[i][2] for i in range(len(inpt_filenames0))]
bcoeff0  = [res_out0[i][1] for i in range(len(inpt_filenames0))]
acoeff0  = [res_out0[i][3] for i in range(len(inpt_filenames0))]
btables0 = [res_out0[i][2] for i in range(len(inpt_filenames0))]
atables0 = [res_out0[i][4] for i in range(len(inpt_filenames0))]

<h3> Trial Set 1: Merged DGP and Estimator Function Input Dictionary Comparison </h3> 

Here I have merged together the dictionary used to generate the underlying dataset used to generate the results (you will see the file name for this data set below) and the dictionary used to produce the estimates based on that data below. 

Below you will see a slider which can be used to summarize this merged dictionary corresponding to the position its file name appears in 'input_filenames0' above. 

In [3]:
psd.indict_dsp(merged_dcts0,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=3, min=1, style=SliderStyle(description…

<h3> Trial Set 1: True Primary Equations Coefficients Comparison </h3>

Here I interactively display the coefficent vector $\beta_1$ used to generate the data set corresponding to the position its file name appears in 'input_filenames0' above. Here they should be identical. 

In [4]:
psd.indict_dsp(true_bcoeffs0,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=3, min=1, style=SliderStyle(description…

<h3> Trial Set 1: True Secondary Equation Coefficients Comparison </h3> 

Here I interactively display the coefficent vectors $\alpha_{1jd}$ used to generate the data set (by row indicating cross section and equation) corresponding to the position its file name appears in 'input_filenames0' above. Here they should also be identical across data sets. 

**Note:** 

1.) That since in the above 'sec_pan = 1' the secondary equations are panel type so all non zero coefficients in a columns should be identical. 

2.) A zero coefficient in the following matrix means that the instrument it multiplies is not relevant to that cross section. 


In [5]:
psd.indict_dsp(true_acoeffs0,2)

VBox(children=(IntSlider(value=1, description='Equation: ', max=2, min=1, style=SliderStyle(description_width=…

<h3> Trial Set 1: Primary Function Coefficient Estimates </h3>

Here I show the sampling distribution of the elements of $\beta_1$.  

In [6]:
display(psd.cfs_dsp(bcoeff0,btables0,1,12,line_nms0))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h4> Comments on Primary Function Coefficient Estimates </h4>

1.) The sampling distribution behave in the way that we would expect a consistent estimator to behave meaning that the bias and variance of all coefficient decrease as the number of time periods increases.  

2.) The sample variance of the coefficients multiplying the endogenous regressors are much larger than those multiplying the exogenous regressors. Given the dgp this makes sense in that $Z_1$ is not correlated with error term $\varepsilon$, thus its identification is accomplished without the need for estimating $V$ meaning that the 

<h3> Trial Set 1: Secondary Function Coefficient Estimates </h3>

In [7]:
display(psd.cfs_dsp(acoeff0,atables0,2,5,line_nms0))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h2> Trial Set 2: Varying the number of crossections </h2>

<h3> Trial Set 2: Data Loading and Organization </h3> 

In [8]:
inpt_filenames1 = ['pscout_6_12_1220.json' ,'pscout_6_13_1914.json'
                   ,'pscout_6_13_1498.json','pscout_6_13_1227.json' ]
line_nms1 = ['ncr = 5','ncr = 10', 'ncr = 15', 'ncr = 20']

res_out1 = [psd.psc_load(inpt_filenames1[i]) for i in range(len(inpt_filenames1))]
estin_dcts1 = [res_out1[i][0] for i in range(len(inpt_filenames1))]
dgp_sum_filenames1 = [ estin_dcts1[i]['input_filename'].replace('pscdata','pscsum')
                      for i in range(len(inpt_filenames1))]
dgp_dicts1 = [psd.pscsum_load(dgp_sum_filenames1[i]) 
             for i in range(len(dgp_sum_filenames1))]
merged_dcts1 = [{**estin_dcts0[i],**dgpin_dcts0[i]} for i in range(len(inpt_filenames0))]
dgpin_dcts1 =  [dgp_dicts1[i][0] for i in range(len(inpt_filenames1))]
true_bcoeffs1 = [dgp_dicts1[i][1] for i in range(len(inpt_filenames1))]
true_acoeffs1 = [dgp_dicts1[i][2] for i in range(len(inpt_filenames1))]
bcoeff1  = [res_out1[i][1] for i in range(len(inpt_filenames1))]
acoeff1  = [res_out1[i][3] for i in range(len(inpt_filenames1))]
btables1 = [res_out1[i][2] for i in range(len(inpt_filenames1))]
atables1 = [res_out1[i][4] for i in range(len(inpt_filenames1))]

<h3> Trial Set 2: DGP and Estimator Function Input Dictionary Comparison </h3> 

In [9]:
psd.indict_dsp(merged_dcts1,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=3, min=1, style=SliderStyle(description…

<h3> Trial Set 2: True Primary Equations Coefficients Comparison </h3>

In [10]:
psd.indict_dsp(true_bcoeffs1,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=4, min=1, style=SliderStyle(description…

<h3> Trial Set 2: True Secondary Equation Coefficients Comparison </h3> 

In [11]:
psd.indict_dsp(true_acoeffs1,2)

VBox(children=(IntSlider(value=1, description='Equation: ', max=2, min=1, style=SliderStyle(description_width=…

<h3> Trial Set 2: Primary Function Coefficient Estimates </h3>

In [12]:
display(psd.cfs_dsp(bcoeff1,btables1,1,12,line_nms1))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h3> Trial Set 2: Secondary Function Coefficient Estimates </h3>

In [13]:
display(psd.cfs_dsp(acoeff1,atables1,2,8,line_nms1))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h2> Trial Set 3: Varying Whether the subset of know regressors is known or not </h2>

<h3> Trial Set 3: Data Loading and Organization </h3> 

In [36]:
inpt_filenames2 = ['pscout_6_12_1954.json' ,'pscout_6_19_1577.json']
line_nms2 = ['Known Sub','Unknown Sub']

res_out2 = [psd.psc_load(inpt_filenames2[i]) for i in range(len(inpt_filenames2))]
estin_dcts2 = [res_out2[i][0] for i in range(len(inpt_filenames2))]
dgp_sum_filenames2 = [ estin_dcts2[i]['input_filename'].replace('pscdata','pscsum')
                      for i in range(len(inpt_filenames2))]
dgp_dicts2 = [psd.pscsum_load(dgp_sum_filenames2[i]) 
             for i in range(len(dgp_sum_filenames2))]
dgpin_dcts2 =  [dgp_dicts2[i][0] for i in range(len(inpt_filenames2))]
merged_dcts2 = [{**estin_dcts2[i],**dgpin_dcts2[i]} for i in range(len(inpt_filenames2))]
true_bcoeffs2 = [dgp_dicts2[i][1] for i in range(len(inpt_filenames2))]
true_acoeffs2 = [dgp_dicts2[i][2] for i in range(len(inpt_filenames2))]
bcoeff2  = [res_out2[i][1] for i in range(len(inpt_filenames2))]
acoeff2  = [res_out2[i][3] for i in range(len(inpt_filenames2))]
btables2 = [res_out2[i][2] for i in range(len(inpt_filenames2))]
atables2 = [res_out2[i][4] for i in range(len(inpt_filenames2))]

<h3> Trial Set 3: DGP and Estimator Function Input Dictionary Comparison </h3> 

In [37]:
psd.indict_dsp(merged_dcts2,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=2, min=1, style=SliderStyle(description…

<h3> Trial Set 3: True Primary Equations Coefficients Comparison </h3>

In [38]:
psd.indict_dsp(true_bcoeffs2,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=2, min=1, style=SliderStyle(description…

<h3> Trial Set 3: True Secondary Equation Coefficients Comparison </h3> 

In [39]:
psd.indict_dsp(true_acoeffs2,2)

VBox(children=(IntSlider(value=1, description='Equation: ', max=2, min=1, style=SliderStyle(description_width=…

<h3> Trial Set 3: Primary Function Coefficient Estimates </h3>

In [40]:
display(psd.cfs_dsp(bcoeff2,btables2,1,12,line_nms2))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h3> Trial Set 3: Secondary Function Coefficient Estimates </h3>

In [42]:
display(psd.cfs_dsp(acoeff2,atables2,2,8,line_nms2))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h2> Trial Set 4: Varying whether the subset of know regressors is known or not t_inst = 10 </h2>

<h3> Trial Set 4: Data Loading and Organization </h3> 

In [52]:
inpt_filenames3 = ['pscout_6_19_1579.json' ,'pscout_6_19_1326.json']
line_nms3 = ['Known Sub','Unknown Sub']

res_out3 = [psd.psc_load(inpt_filenames3[i]) for i in range(len(inpt_filenames3))]
estin_dcts3 = [res_out3[i][0] for i in range(len(inpt_filenames3))]
dgp_sum_filenames3 = [ estin_dcts3[i]['input_filename'].replace('pscdata','pscsum')
                      for i in range(len(inpt_filenames3))]
dgp_dicts3 = [psd.pscsum_load(dgp_sum_filenames3[i]) 
             for i in range(len(dgp_sum_filenames3))]
dgpin_dcts3 =  [dgp_dicts3[i][0] for i in range(len(inpt_filenames3))]
merged_dcts3 = [{**estin_dcts3[i],**dgpin_dcts3[i]} for i in range(len(inpt_filenames3))]
true_bcoeffs3 = [dgp_dicts3[i][1] for i in range(len(inpt_filenames3))]
true_acoeffs3 = [dgp_dicts3[i][2] for i in range(len(inpt_filenames3))]
bcoeff3  = [res_out3[i][1] for i in range(len(inpt_filenames3))]
acoeff3  = [res_out3[i][3] for i in range(len(inpt_filenames3))]
btables3 = [res_out3[i][2] for i in range(len(inpt_filenames3))]
atables3 = [res_out3[i][4] for i in range(len(inpt_filenames3))]

<h3> Trial Set 4: DGP and Estimator Function Input Dictionary Comparison </h3> 

In [53]:
psd.indict_dsp(merged_dcts3,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=2, min=1, style=SliderStyle(description…

<h3> Trial Set 4: True Primary Equations Coefficients Comparison </h3>

In [54]:
psd.indict_dsp(true_bcoeffs2,1)

VBox(children=(IntSlider(value=1, description='Results Dataset: ', max=2, min=1, style=SliderStyle(description…

<h3> Trial Set 4: True Secondary Equation Coefficients Comparison </h3> 

In [55]:
psd.indict_dsp(true_acoeffs3,2)

VBox(children=(IntSlider(value=1, description='Equation: ', max=2, min=1, style=SliderStyle(description_width=…

<h3> Trial Set 4: Primary Function Coefficient Estimates </h3>

In [56]:
display(psd.cfs_dsp(bcoeff3,btables3,1,12,line_nms3))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…

<h3> Trial Set 3: Secondary Function Coefficient Estimates </h3>

In [57]:
display(psd.cfs_dsp(acoeff3,atables3,2,8,line_nms3))

VBox(children=(Output(), Output(), IntSlider(value=1, description='Coefficient:', layout=Layout(align_items='s…