# Modeling dataset bias in machine-learned theories of economic decision making - NNs
This notebook gives small examples of how to work with the NN models introduced in this paper.

In [None]:
import numpy as np
import pandas as pd

# add the src folder to the python path to import the classes there
import sys
sys.path.append("./src/")
from cognitive_prior_network import CognitivePriorNetwork
from context_dependant_network import ContextDependantNetwork

In [None]:
base_features = ["Ha", "pHa", "La", "Hb", "pHb", "Lb", "LotNumB",
                 "LotShapeB", "Corr", "Amb", "Block", "Feedback"]

## Using pretrained models

In [None]:
# initialize the class and then load the weights from a location
nn_cpc15 = CognitivePriorNetwork()
nn_cpc15.load("models/cpc_bourgin_prior")

# load a dataset to use the model on
choices_df = pd.read_csv("data/choices13k.csv")

# predict takes the base feature columns of the dataset as input
nn_cpc15_predictions = nn_cpc15.predict(choices_df[base_features])

In [None]:
# maximum absolute difference to the predictions that are precomputed in the dataset
np.max(np.abs(choices_df.cpc15_cog_prior_pred - nn_cpc15_predictions.flatten()))
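The `.flatten()` call in this check matters if `predict` returns a column of shape `(n, 1)`, which is an assumption here, suggested only by that call. A minimal NumPy sketch with made-up numbers shows what can go wrong without it: subtracting an `(n, 1)` array from an `(n,)` array broadcasts to an `(n, n)` matrix, so the maximum is taken over all pairs of rows instead of matching rows.

```python
import numpy as np

# Made-up values standing in for the precomputed column and the model output.
precomputed = np.array([0.2, 0.5, 0.9], dtype=np.float32)        # shape (3,)
predictions = np.array([[0.2], [0.5], [0.9]], dtype=np.float32)  # shape (3, 1), assumed

# Without flattening, broadcasting compares every row with every other row.
pairwise = precomputed - predictions
wrong_max = np.max(np.abs(pairwise))

# Flattening first gives the intended element-wise comparison.
elementwise = precomputed - predictions.flatten()
right_max = np.max(np.abs(elementwise))

print(pairwise.shape, elementwise.shape)
print(wrong_max, right_max)
```

In this toy case the two arrays hold identical values, so only the flattened comparison reports a maximum difference of zero.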

## Training new models
We show the code once, for training a cognitive prior network model on choices13k, because more models are able to fit that dataset. Fitting models on CPC15 required more patience and more pretraining in our experience. Models without pretraining did not work at all for us on CPC15.

### Training imports

In [None]:
from hyperopt import pyll, hp, STATUS_OK, fmin, tpe, Trials
import pickle
import os

### Data loading

In [None]:
def split_xy(dataframe):
    y = dataframe["Rate"].values.astype(np.float32)
    X = dataframe[base_features].values.astype(np.float32)
    return X, y
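As a quick sanity check, `split_xy` can be tried on a tiny hand-made frame. The sketch below repeats `base_features` and `split_xy` so that it runs on its own. The two rows are placeholder values invented for illustration, not real problems from any of the datasets.

```python
import numpy as np
import pandas as pd

base_features = ["Ha", "pHa", "La", "Hb", "pHb", "Lb", "LotNumB",
                 "LotShapeB", "Corr", "Amb", "Block", "Feedback"]

def split_xy(dataframe):
    y = dataframe["Rate"].values.astype(np.float32)
    X = dataframe[base_features].values.astype(np.float32)
    return X, y

# Two made-up rows: 12 feature values followed by the target "Rate".
toy_df = pd.DataFrame(
    [[3, 1.0, 3, 4, 0.8, 0, 1, 0, 0, 0, 1, 0, 0.35],
     [10, 0.5, 0, 5, 1.0, 5, 1, 0, 0, 0, 2, 1, 0.60]],
    columns=base_features + ["Rate"],
)

X_toy, y_toy = split_xy(toy_df)
print(X_toy.shape, y_toy.shape, X_toy.dtype, y_toy.dtype)
```

The feature matrix has one column per entry in `base_features`, and both arrays come back as `float32`.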

In [None]:
synth15_df = pd.read_csv("data/synth15.csv")
X_synth15, y_synth15 = split_xy(synth15_df)

In [None]:
cpc15_df = pd.read_csv("data/cpc15.csv", index_col=0)
X_cpc15_train, y_cpc15_train = split_xy(cpc15_df.iloc[:450])
X_cpc15_test, y_cpc15_test = split_xy(cpc15_df.iloc[450:])

### Pretraining - Hyperparameter Optimization
Pretrain multiple models in a principled way.

Every set of parameters is evaluated with 5 different random seeds. All models and their corresponding validation loss histories are saved under `../models/wide_pretraining`, so make sure this folder exists on your system.
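If you prefer to create the output folder from Python rather than by hand, a minimal sketch could look like this. `ensure_dir` is a hypothetical helper, not part of this repository, and the demo runs in a temporary directory so that it has no side effects.

```python
import os
import tempfile

def ensure_dir(path):
    """Create `path` (including parents) if missing; do nothing if it exists."""
    os.makedirs(path, exist_ok=True)
    return path

# Demonstrated in a temporary directory; in the notebook you would pass
# '../models/wide_pretraining' instead.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_dir(os.path.join(tmp, "models", "wide_pretraining"))
    created = os.path.isdir(target)
    ensure_dir(target)  # calling it a second time is harmless

print(created)
```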

In [None]:
# We optimize pretraining over the batch size used during training
# and over two parameters that control the connectivity and the rate of change of the network
hyperparams = {
    'batch_size': hp.uniformint('batch_size', 300, 1500),
    'epsilon': hp.uniformint('eps', 10, 100),
    'zeta': hp.uniform('zeta', 0.2, 0.8)
}

def f(space):
    # objective for hyperopt: the lowest loss reached by any of 5 runs
    # with the same hyperparameters
    best_loss = np.inf
    for i in range(5):
        cognitive_model = CognitivePriorNetwork(input_shape=12, batch_size=space['batch_size'],
                                                epsilon=space['epsilon'], zeta=space['zeta'])
        # the CPC15 training split is set as the model's test data during pretraining
        cognitive_model.X_test = X_cpc15_train
        cognitive_model.y_test = y_cpc15_train
        cognitive_model.fit(X_synth15, y_synth15, verbose=0, epochs=300, patience=300)
        cognitive_model.save('../models/wide_pretraining/bs_%d_eps_%d_zeta_%.4f_iter_%d'%
                             (space['batch_size'], space['epsilon'], space['zeta'], i))

        # keep the minimum of the loss history of this run
        loss = np.min(cognitive_model.loss_per_epoch)
        if loss < best_loss:
            best_loss = loss
    print('done searching combination: batch_size: %d, epsilon: %d, zeta %.4f'%
          (space['batch_size'], space['epsilon'], space['zeta']))
    return {'loss': best_loss, 'status': STATUS_OK}

# the optimization can be interrupted and resumed later:
# pickle the trials object to the path below and it is picked up here on the next run
trials_path = "models/cpc15_prior_training.hyperopt"
if os.path.isfile(trials_path):
    with open(trials_path, "rb") as fh:
        trials = pickle.load(fh)
else:
    trials = Trials()

best = fmin(
    fn=f,  # "Loss" function to minimize
    space=hyperparams,  # Hyperparameter space
    algo=tpe.suggest,  # Tree-structured Parzen Estimator (TPE)
    max_evals=50,  # Number of trials to perform
    trials=trials
)
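The cell above loads a pickled trials object if one exists, but it does not write one. A minimal sketch of the save and load round trip follows. `save_trials` and `load_trials` are hypothetical helpers, and a plain dict stands in for the hyperopt `Trials` object so that the example runs without hyperopt installed.

```python
import os
import pickle
import tempfile

def save_trials(trials, path):
    """Write the trials object to `path` with pickle."""
    with open(path, "wb") as fh:
        pickle.dump(trials, fh)

def load_trials(path, default):
    """Return the pickled object at `path`, or `default` if no file exists."""
    if os.path.isfile(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    return default

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.hyperopt")
    first = load_trials(path, default={"losses": []})  # nothing saved yet
    first["losses"].append(0.0123)                     # made-up result
    save_trials(first, path)
    resumed = load_trials(path, default={"losses": []})

print(resumed)
```

In the notebook, one option would be to call `save_trials(trials, "models/cpc15_prior_training.hyperopt")` after `fmin` returns, or in a `finally` block around it. As far as we understand hyperopt, `max_evals` counts trials already stored in the loaded object, so it may need to be raised when resuming a finished search.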

### Finetuning
Finetuning, on the other hand, is much simpler. We keep the same batch size, lower the learning rate, and no longer have to fit the SET parameters. The only thing we changed between CPC15 and choices13k is the number of epochs. CPC15 needs many more, and somewhere in that training process the random SET procedure can help you make the last jump to the loss reported in our paper and in Bourgin et al.'s paper.

In [None]:
# replace X with the batch size used in pretraining and "..." with the name of the pretrained model
pre_trained = CognitivePriorNetwork(input_shape=12, batch_size=X)
pre_trained.load("../models/wide_pretraining/...")

In [None]:
pre_trained.X_test = X_cpc15_test
pre_trained.y_test = y_cpc15_test
pre_trained.fit(X_cpc15_train,
                y_cpc15_train,
                learning_rate=1e-6,
                verbose=1,
                epochs=3000, # for choices13k, you only need 100 epochs
                patience=3000)
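Both `fit` calls in this notebook set `patience` equal to `epochs`. Assuming `patience` has the usual early-stopping meaning (stop after that many epochs without improvement, which the source does not confirm), this setting means training always runs for the full number of epochs. The sketch below illustrates that generic logic with made-up losses. `stopping_epoch` is a hypothetical helper and not the implementation used by `CognitivePriorNetwork`.

```python
def stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops, best loss seen up to then)."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best
    return len(val_losses), best

losses = [0.9, 0.5, 0.6, 0.7, 0.4, 0.3]  # made-up loss history

early = stopping_epoch(losses, patience=2)           # stops during the plateau
full = stopping_epoch(losses, patience=len(losses))  # never stops early

print(early, full)
```

With a small patience the toy run stops before the later improvement arrives, which is one reason to keep patience high when a late jump in the loss is expected.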