## Parameter recovery

We simulated data for a set of design and various ground truth parameters. Now we will try to estimate those parameters from the simulated data

In [55]:
# Built-in/Generic Imports
import os,sys
import glob

# Libs
import numpy as np
import pandas as pd
import pymc as pm
import arviz as az

In [56]:

def estimate_bhm(index,design_df,choices):

    # save_dir = os.path.join('simul','bhm')
    
    delay_amt = design_df['cdd_delay_amt'].values
    delay_wait = design_df['cdd_delay_wait'].values
    immed_amt = design_df['cdd_immed_amt'].values
    immed_wait = design_df['cdd_immed_wait'].values
    subj_id = [int(index)]*len(choices)
    print(subj_id,choices)
    # We will fit a model for each subject
    with pm.Model() as model_simple:

        # Hyperparameters for kappa
        mu_kappa_hyper = pm.Beta('mu_kappa_hyper',mu=0.02,sigma=0.01)
        # use the same hyper SD for both parameters
        sd_hyper = pm.LogNormal('sd_hyper',sigma=1)

        kappa = pm.LogNormal('kappa',mu=mu_kappa_hyper,sigma=sd_hyper,shape=np.size(np.unique(subj_id)))
        gamma = pm.HalfNormal('gamma',sigma=sd_hyper,shape=np.size(np.unique(subj_id)))
        
        prob = pm.Deterministic('prob', 1 / (1 + pm.math.exp(-gamma[subj_id] * ( delay_amt/(1+(kappa[subj_id]*delay_wait)) 
                                                                                - immed_amt/(1+(kappa[subj_id]*immed_wait)) ))))

        y_1 = pm.Bernoulli('y_1',p=prob,observed=choices)

        trace_prior = pm.sample(1000, tune=100, cores=2,target_accept=0.98)


    # This is how you get a nice array. Note that this returns a pandas DataFrame, not a numpy array. Indexing is totally different.
    summary= az.summary(trace_prior,round_to=10)
    kappa_hat = summary['mean'].loc['kappa[{}]'.format(index)]
    gamma_hat = summary['mean'].loc['gamma[{}]'.format(index)]
    return kappa_hat,gamma_hat
    # fn = os.path.join(save_dir,'subj_{0:04d}.csv'.format(index))
    # print('Saving to : {}'.format(fn))
    # print(summary)
    # sys.exit()
    # # summary.to_csv(fn)



In [57]:

fn = os.path.join('simul','ground_truth.csv')
params_df = pd.read_csv(fn,index_col=0)

fn = os.path.join('simul','design_set.csv')
design_df = pd.read_csv(fn,index_col=0)

simulated_data = sorted(glob.glob(os.path.join('simul','response','*.csv')))
kappa_hat,gamma_hat = [],[]

for index,fn in enumerate(simulated_data):
    print(fn)
    df = pd.read_csv(fn,index_col=0)
    choices = df['response']
    kh,gh = estimate_bhm(index,design_df,choices)
    kappa_hat += [kh]
    gamma_hat += [gh]

params_df['kappa_bhm'] = kappa_hat
params_df['gamma_bhm'] = gamma_hat
params_df

simul/response/p0000.csv
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] 0      0
1      1
2      1
3      1
4      1
      ..
107    1
108    1
109    1
110    1
111    1
Name: response, Length: 112, dtype: int64


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [mu_kappa_hyper, sd_hyper, kappa, gamma]


Sampling 2 chains for 100 tune and 1_000 draw iterations (200 + 2_000 draws total) took 3 seconds.
We recommend running at least 4 chains for robust computation of convergence diagnostics


simul/response/p0001.csv
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] 0      0
1      1
2      1
3      1
4      1
      ..
107    1
108    1
109    1
110    1
111    1
Name: response, Length: 112, dtype: int64


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...


IndexError: index 1 is out of bounds for axis 0 with size 1
Apply node that caused the error: AdvancedSubtensor1(kappa_log___log, TensorConstant{(112,) of 1})
Toposort index: 8
Inputs types: [TensorType(float64, (1,)), TensorType(uint8, (112,))]
Inputs shapes: [(1,), (112,)]
Inputs strides: [(8,), (1,)]
Inputs values: [array([14.41322166]), 'not shown']
Outputs clients: [[Elemwise{Composite}(AdvancedSubtensor1.0, TensorConstant{[  2.   2...180. 180.]}, TensorConstant{(1,) of 1.0}, TensorConstant{[ 7. 14. 2... 50. 65.]}, TensorConstant{[10. 10. 1... 20. 20.]}, TensorConstant{(1,) of -1.0}, AdvancedSubtensor1.0, TensorConstant{(1,) of 1}, TensorConstant{(1,) of 0}, y_1{[0 1 1 1 1.. 1 1 1
 1]})]]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3009, in run_cell
    result = self._run_cell(
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3064, in _run_cell
    result = runner(coro)
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner
    coro.send(None)
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3269, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3448, in run_ast_nodes
    if await self.run_code(code, result, async_=asy):
  File "/Users/pizarror/opt/miniconda/envs/idm_jupy/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "/var/folders/ts/wpzrly5j2yxb42zf5v0w5rvh0000gs/T/ipykernel_49930/1396054709.py", line 14, in <module>
    kh,gh = estimate_bhm(index,design_df,choices)
  File "/var/folders/ts/wpzrly5j2yxb42zf5v0w5rvh0000gs/T/ipykernel_49930/2494971389.py", line 23, in estimate_bhm
    - immed_amt/(1+(kappa[subj_id]*immed_wait)) ))))

HINT: Use the PyTensor flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.

In [58]:
params_df

Unnamed: 0,kappa_gt,gamma_gt
0,0.00001,0.5
1,0.00001,1.0
2,0.00001,1.5
3,0.00001,2.0
4,0.00001,2.5
...,...,...
95,1.00000,3.0
96,1.00000,3.5
97,1.00000,4.0
98,1.00000,4.5
