**Method of Simulated Moments**

This tutorial walks you through setting up method of simulated moments (MSM) estimation in respy.
Besides the model specification, respy requires two additional objects.
One is a pandas Series containing all empirical moments; the other is a function that calculates the simulated moments
from the simulated data.
You can also supply your own weighting matrix; the default is the identity matrix.
Finally, you can ask respy to return the result in vector form, which some derivative-free optimization algorithms require.
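
To make the roles of the weighting matrix and the vector form concrete, here is a minimal sketch of an MSM-style criterion. It is not respy's implementation: the function `msm_criterion` and its `return_scalar` argument are hypothetical names chosen for illustration, and the moment values are made up. With `d` the deviations between simulated and empirical moments and `W` the weighting matrix, the scalar criterion is `d' W d`.

```python
import numpy as np


def msm_criterion(simulated, empirical, weighting_matrix=None, return_scalar=True):
    """Hypothetical helper: weighted squared distance between two moment vectors."""
    diff = np.asarray(simulated, dtype=float) - np.asarray(empirical, dtype=float)
    if weighting_matrix is None:
        # Default choice: the identity matrix.
        weighting_matrix = np.eye(len(diff))
    if return_scalar:
        return float(diff @ weighting_matrix @ diff)
    # Vector form: with W = L L', the vector L' d has squared norm d' W d.
    chol = np.linalg.cholesky(weighting_matrix)
    return chol.T @ diff


# Made-up moments, for illustration only.
empirical = np.array([0.2, 0.5, 0.3])
simulated = np.array([0.25, 0.45, 0.3])

scalar = msm_criterion(simulated, empirical)
vector = msm_criterion(simulated, empirical, return_scalar=False)
```

The squared entries of the vector form sum to the scalar criterion, which is the shape that least-squares style optimizers typically work with.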

In [4]:
#Import relevant packages
import itertools
from collections import defaultdict
from functools import partial

import numpy as np
import pandas as pd
import respy as rp

from respy.pre_processing.model_processing import process_params_and_options
from respy.tests.random_model import get_mock_moment_func
from respy.method_of_simulated_moments import get_msm_func


**Moment Function**

In the first step we create a moment function.
For this example we target only choice probabilities and
no other moments.
We therefore build a function that extracts the proportion of people taking each decision
in each period from a simulated dataset.
The results are stored in an indexed pandas Series.
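
As a small illustration of what such choice shares look like, the sketch below computes them on a toy dataset. The data are made up; only the column names `Identifier`, `Period` and `Choice` follow the simulated data used in this tutorial.

```python
import pandas as pd

# Toy data: three individuals observed in two periods (made-up values).
df_toy = pd.DataFrame(
    {
        "Identifier": [0, 1, 2, 0, 1, 2],
        "Period": [0, 0, 0, 1, 1, 1],
        "Choice": ["a", "a", "b", "b", "b", "b"],
    }
)

# Share of individuals taking each choice within each period.
shares = df_toy.groupby("Period")["Choice"].value_counts(normalize=True)

# Combinations that never occur are absent from the result,
# so look them up with a default of zero.
share_a_period_1 = shares.to_dict().get((1, "a"), 0.0)
```

Nobody picks `"a"` in period 1, so that combination is missing from `shares`. This is why a moment function of this kind needs to fill in zeros explicitly.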

In [5]:
def get_moment_func(df, optim_paras):
    """
    Args: 
        df: pd.DataFrame containing simulated outcomes.
        optim_paras: dict containing relevant information 
                     about the model specification. 
    Returns: 
        out: function mapping simulated data into moments
    """
    periods, choices, container_idx = _create_index_mock(df, optim_paras)
    out = partial(_get_moments,
                  container_idx=container_idx,
                  periods=periods,
                  choices=choices)
    return out

def _get_moments(df, container_idx, periods, choices):
    """
    Args: 
        df: pd.DataFrame containing simulated outcomes.
        container_idx: List of indices for the resulting pd.Series
        periods: List of periods
        choices: List of admissible choices
    Returns: 
        moments: pd.Series containing all moments
    """
    #Create a float container for the moments with the given index
    moments = pd.Series(index=container_idx, dtype=float)
    
    #Set up the input dataframe in a convenient way
    df_indexed = df.set_index(["Identifier", "Period"], drop=True)
    
    #We want moments every period thus we group by period 
    df_grouped_period = df_indexed.groupby(["Period"])
    
    #We want the proportion of people taking each decision in each period
    #Thus we also group by choice and apply a summary statistic
    info_period = df_grouped_period["Choice"].value_counts(normalize=True).to_dict()
    
    #We replace missing probabilities with zeros
    info_period = defaultdict(lambda: 0.00, info_period)

    #We put the results in the container we created before 
    for period in periods:
        for choice in choices:
            name = (period, choice)
            moments.loc[name] = info_period[name]
    return moments

def _create_index_mock(df, optim_paras):
    """
    Args: 
        df: pd.DataFrame containing simulated outcomes.
        optim_paras: dict containing relevant information 
                     about the model specification. 
    Returns: 
        periods: List of periods
        choices: List of admissible choices
        container_idx: List of indices for the resulting pd.Series containing the moments
    """
    periods = sorted(df["Period"].unique())
    choices = sorted(list(optim_paras["choices"].keys()))
    
    #In this example we only target choice probabilities thus the index 
    #consists of all combinations of periods and choices 
    container_idx = list(itertools.product(periods, choices))
    return periods, choices, container_idx

**Create Criterion**

Now we have all the ingredients to create a first simple criterion function:

In [6]:
#Get example model specification
params, options, _ = rp.get_example_model("kw_94_one")

#Build simulation function
simulate = rp.get_simulate_func(params, options)

#Simulate a model
df = simulate(params)

#Get auxiliary object for the creation of the moments
optim_paras, _ = process_params_and_options(params, options)


In [7]:
#Build the function that maps a simulated model into moments 
get_moments = get_mock_moment_func(df, optim_paras)

#Map a simulated dataset into corresponding moments 
moments_base = get_moments(df)

#Build the MSM criterion function. This function maps a model specification into
#the weighted sum of squared deviations of the simulated moments from the empirical moments.
msm = get_msm_func(params, options, moments_base, get_moments)

#Evaluate the function at the "true" parameters
rslt = msm(params)

#We expect the result to be 0
assert rslt == 0


**Weighting Matrix**

The example above does not specify a weighting matrix.
In that case the identity matrix is used by default. Here this
is unlikely to be a serious problem, since all moments are of the same order of magnitude.
If, however, we target a range of different moments, they may well differ by orders of magnitude,
which makes the identity matrix a poor choice of weighting matrix.
Respy expects a pandas DataFrame whose columns and index both match the index of the moments object.
The following function illustrates how to construct such a matrix.
It builds a diagonal matrix from the inverse of the bootstrapped variance of each moment, a commonly used choice.
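
Before turning to the full function, the following toy sketch shows the shape of the target object: a diagonal inverse-variance matrix stored in a DataFrame whose index and columns match the moment index. The moment labels and bootstrap values are made up for illustration and do not come from respy.

```python
import numpy as np
import pandas as pd

# Hypothetical moment labels; the third moment has a much larger scale.
index = pd.Index(["share_a", "share_b", "mean_wage"])

# Made-up moments from three bootstrap draws (one row per draw).
draws = pd.DataFrame(
    [[0.60, 0.40, 10.0], [0.70, 0.30, 20.0], [0.65, 0.35, 15.0]],
    columns=index,
)

# Variance of each moment across draws; its inverse goes on the diagonal.
variances = draws.var(axis=0, ddof=0).to_numpy()
weighting_matrix = pd.DataFrame(np.diag(1 / variances), index=index, columns=index)
```

The large-scale moment receives a much smaller weight than the two shares, so it no longer dominates the criterion as it would under the identity matrix.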


In [8]:
def get_weighting_matrix_func(df, num_boots, num_agents_smm, optim_paras):
    get_moments =  get_moment_func(df, optim_paras)
    out = partial(_get_weighing_matrix,
        num_boots=num_boots,
        num_agents_smm=num_agents_smm,
        get_moments=get_moments
    )
    return out


def _get_weighing_matrix(df, num_boots, num_agents_smm, get_moments, is_store=False):
    """This function constructs the weighing matrix."""
    # Ensure reproducibility
    np.random.seed(123)

    # Collect baseline information: the moment index and the unique agent identifiers.
    container_idx = get_moments(df).index
    df = df.set_index("Identifier")
    index_base = df.index.get_level_values('Identifier').unique()

    # Collect the moments computed on each bootstrap sample.
    moments_sample = []

    for _ in range(num_boots):
        sample_ids = np.random.choice(index_base, num_agents_smm, replace=False)
        moments_boot = get_moments(df.loc[sample_ids].reset_index())
        moments_sample.append(moments_boot)


    # Construct the weighing matrix based on the sampled moments.
    stats = pd.concat(moments_sample, axis = 1)

    moments_var = stats.to_numpy().T.var(axis=0)

    is_zero = moments_var <= 1e-10

    #Replace (near-)zero variances with 0.1 so that the inverse below stays finite.
    #Zero variance should only occur for moments that are identical in every bootstrap sample.
    moments_var[is_zero] = 0.1

    if np.all(is_zero):
        raise NotImplementedError('... all variances are zero')
    if np.any(is_zero):
        print('... some variances are zero')

    weighting_matrix = np.diag(moments_var ** (-1))
    #Build a container for weighting matrix
    weighting_matrix = pd.DataFrame(weighting_matrix, columns=container_idx, index=container_idx)

    return weighting_matrix


In [9]:
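#Specify required parameters. Agents are sampled without replacement, so
#num_agents_smm must be smaller than the number of simulated agents for the samples to differ.
num_boots = 10
num_agents_smm = 1000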

#Build the function
get_weighting_matrix = get_weighting_matrix_func(df, num_boots, num_agents_smm, optim_paras)

#Create the weighting matrix 
weighting_matrix = get_weighting_matrix(df)

#Display
print(weighting_matrix)
