# Title of Your Group Project

This Python notebook serves as a template for your group project for the course "Modeling in Cognitive Science".

This is the practical part of the group project where you get to implement the computational modeling workflow. In this part, you are expected to:


*   Implement at least two computational models relevant for your hypothesis. *(3 points)*
*   Simulate behavior from the two models. *(3 points)*
*   Implement a procedure for fitting the models to data. *(4 points)*
*   Implement a procedure for parameter recovery. *(5 points)*
*   (Implement a procedure for model recovery.) *(optional; 2 bonus points)*
*   Implement a model comparison. *(5 points)*

You can gain a total of 20 points for the practical part of the group project.

**Note:** *Some of the exercises below (e.g., Model Simulation) rely on code from previous exercises (e.g., Model Implementation). In such cases, you are encouraged to reuse the functions implemented for previous exercises. That is, you don't have to produce redundant code.*



## Model Implementation *(3 points)*

For this exercise you should:

*   Implement and simulate data from two* models that are suitable to test your hypothesis. *(3 points)*

<font size=2>*You may implement more than two models if you wish. However, two models are sufficient for this group project.</font>

Make sure to comment your code and provide an explanation for each code block in a preceding text block.


In [236]:
import numpy as np

# environment of the experiment
class TowStepEnv:
    # TODO
    # - random seed
    action_space = [0, 1]
    state_space = [0, 1, 2]
    def __init__(self):
        self.state = 0
        # self.action_space = [0, 1]
        # self.state_space = [0, 1, 2]
        self.transition_prob = 0.7
        self.reward = 1
        self.terminal = False
        self.info = {}
        
        # matrix of transition probabilities
        # 0(action left) -> [0(stay in 0), p(go to 1), 1-p(go to 2)]
        # 1(action right) -> [0(stay in 0), 1-p(go to 1), p(go to 2)]
        self.stage_1_transition_matrix = np.array([[0, self.transition_prob, 1 - self.transition_prob], # action left
                                           [0, 1 - self.transition_prob, self.transition_prob]]) # action right
        
        # self.seed = 0
        # np.random.seed(self.seed)
        self.min_reward_prob = 0.25
        self.max_reward_prob = 0.75
        # matrix of reward probabilities
        # 0(state 0) -> [0 (left), 0(right)]
        # 1(state 1) -> [p1 (left), p2(right)]
        # 2(state 2) -> [p3 (left), p4(right)]
        p1 = np.random.uniform(self.min_reward_prob, self.max_reward_prob) 
        p2 = np.random.uniform(self.min_reward_prob, self.max_reward_prob)
        p3 = np.random.uniform(self.min_reward_prob, self.max_reward_prob)
        p4 = np.random.uniform(self.min_reward_prob, self.max_reward_prob)
        # p1 = 0.75
        # p2 = 0.75
        # p3 = 0.25
        # p4 = 0.25

        self.reward_prob_matrix = np.array([[0, 0], # first stage (state 0) for both actions
                                            [p1, p2], # second stage (state 1) for both actions
                                            [p3, p4]]) # second stage (state 2) for both actions
        
        # 1 -> fixed reward prob.
        # 0 -> reward prob. can change along the trials
        self.fixed_reward_prob_matrix = np.array([[1, 1],
                                            [0, 0],
                                            [0, 0]])
        
    
    def reset(self):
        self.state = 0
        self.terminal = False
        self.info = {}
        return self.state

    def step(self, action):
        if self.terminal:
            raise ValueError("Episode has already terminated")
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")

        # if in stage 1
        if self.state == 0:
            reward = self.reward_function(self.state, action) # reward will be 0
            self.state = np.random.choice(self.state_space, p=self.stage_1_transition_matrix[action])

            self.info["common_transition"] = self.is_common_state(self.state, action)
            self.info["state_transition_to"] = self.state
            self.info["reward_stage_1"] = reward
            self.info["action_stage_1"] = action
            # self.info["reward_probabilities_stage_1"] = self.reward_prob_matrix.flatten()
        
        # if in stage 2
        elif self.state in [1,2]:
            reward = self.reward_function(self.state, action)
            self.terminal = True
            self.info["reward_stage_2"] = reward
            self.info["action_stage_2"] = action
            # self.info["reward_probabilities"] = self.reward_prob_matrix.flatten()
            self.info["reward_probabilities"] = self.reward_prob_matrix.flatten()

        
        else:
            raise ValueError(f"state:{self.state} is an invalid state, state space: {self.state_space}")
        
        
        return self.state, reward, self.terminal, self.info
    
    def reward_function(self, state, action):
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")
        if state not in self.state_space:
            raise ValueError(f"state:{state} is an invalid state, state space: {self.state_space}")
        
        # give a reward according to the probability of getting a reward
        # for the action taken in the state ( state-action pair )
        reward = np.random.uniform() < self.reward_prob_matrix[state][action]
        # scale the reward to a custom reward value equal to self.reward
        # makes no difference in case self.reward = 1
        reward = reward * self.reward
        return reward
    
    def state_transition_function(self, state, action):
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")
        
        new_state = None
        terminal = False
        if state == 0:
            new_state = np.random.choice(self.state_space, p=self.stage_1_transition_matrix[action])
        elif state in [1,2]:
            terminal = True
        else:
            raise ValueError(f"state:{state} is an invalid state, state space: {self.state_space}")
        
        return new_state, terminal

    def is_common_state(self, state, action):
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")
        if state not in self.state_space:
            raise ValueError(f"state:{state} is an invalid state, state space: {self.state_space}")
        
        # return self.stage_1_transition_matrix[action, state] >= 0.5
        return self.stage_1_transition_matrix[action, state] == np.max(self.stage_1_transition_matrix[action])
    
    def set_reward_probabilities(self, reward_prob_matrix):
        if reward_prob_matrix.shape != self.reward_prob_matrix.shape:
            raise ValueError(f"reward_prob_matrix shape: {reward_prob_matrix.shape} is not valid, shape should be {self.reward_prob_matrix.shape}")
        # clip the reward probabilities to be between min_reward_prob and max_reward_prob
        reward_prob_matrix = np.clip(reward_prob_matrix, self.min_reward_prob, self.max_reward_prob)
        
        # update the reward_prob_matrix
        # if the reward_prob_matrix is fixed -> do not update it, else update it with from the new reward_prob_matrix
        self.reward_prob_matrix = np.where(self.fixed_reward_prob_matrix, self.reward_prob_matrix, reward_prob_matrix)
        return self.reward_prob_matrix

    def set_seed(self, seed):
        pass

    def plot(self):
        pass


In [237]:
# agent / models

class RandomAgent:
    def __init__(self, action_space, state_space, alpha=0.1, gamma=0.9):
        # the state space can be inferred but here it is given for simplicity
        self.action_space = action_space
        self.state_space = state_space
        self.q_table = np.zeros((len(self.state_space), len(self.action_space)))
        self.alpha = alpha
        self.gamma = gamma

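    def policy(self, state, method=None):
        # `method` is accepted (and ignored) so this agent can be called like the other agents
        return np.random.choice(self.action_space)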
    
    def update_q_table_sarsa(self, state, action, reward, next_state, terminal):
        if state not in self.state_space or next_state not in self.state_space:
            raise ValueError(f"state:{state} is an invalid state, state space: {self.state_space}")
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")
        
        if terminal:
            self.q_table[state, action] += self.alpha * (reward - self.q_table[state, action])
        else:
            next_action = self.policy(next_state)
            self.q_table[state, action] += self.alpha * (reward + self.gamma * self.q_table[next_state, next_action] - self.q_table[state, action])
        return self.q_table
    
    def update_beliefs(self, state, action, reward, next_state, terminal):
        # call the bound method without passing `self` a second time
        self.update_q_table_sarsa(state, action, reward, next_state, terminal)
    
    def reset(self):
        pass
# -------------------------------------------------------------------------------------------------------------
# -------------------------------------------------------------------------------------------------------------
# -------------------------------------------------------------------------------------------------------------
        
class AgentModelFree:
    def __init__(self, action_space, state_space, alpha=0.1, gamma=0.9, beta=1.0):
        # the state space can be inferred but here it is given for simplicity
        self.action_space = action_space
        self.state_space = state_space
        self.alpha = alpha # Learning rate
        self.gamma = gamma # Discount factor
        self.beta = beta  # Temperature parameter for softmax policy
        self.epsilon = 0.2  # Epsilon for epsilon-greedy policy
        self.q_table = np.zeros((len(self.state_space), len(self.action_space)))

    def softmax(self, arr, beta):
        e_x = np.exp(beta * (arr - np.max(arr)))  # subtract max value to prevent overflow
        return e_x / e_x.sum(axis=0)  # axis=0 for column-wise operation if arr is 2D, otherwise it's not needed

    def policy(self, state, beta=None, epsilon=None,method="softmax"):
        q_values = self.q_table[state, :]
        beta = self.beta if beta is None else beta
        epsilon = self.epsilon if epsilon is None else epsilon
        # calculate the probability of each action in the state with softmax
        if method == "softmax":
            action_probabilities = self.softmax(q_values, beta)
            action = np.random.choice(self.action_space, p=action_probabilities)
        
        # with epsilon-greedy policy
        else:
            if np.random.uniform() < epsilon:
                action = np.random.choice(self.action_space)
            else:
                action = np.argmax(q_values)

        return action
    
    def update_q_table_sarsa(self, state, action, reward, next_state, terminal):
        if state not in self.state_space or next_state not in self.state_space:
            raise ValueError(f"state:{state} is an invalid state, state space: {self.state_space}")
        if action not in self.action_space:
            raise ValueError(f"The action: {action} is not valid, action space: {self.action_space}")
        
        next_action = self.policy(next_state)
        self.q_table[state, action] += self.alpha * self.reward_prediction_error(state, action, reward, next_state, next_action, terminal)
        
        return self.q_table

    def reward_prediction_error(self, state, action, reward, next_state, next_action, terminal):
        if terminal:
            return reward - self.q_table[state, action]
        return reward + self.gamma * self.q_table[next_state, next_action] - self.q_table[state, action]
    
    def update_beliefs(self, state, action, reward, next_state, terminal):
        self.update_q_table_sarsa(state, action, reward, next_state, terminal)
    
    def reset(self):
        pass

# -------------------------------------------------------------------------------------------------------------
# -------------------------------------------------------------------------------------------------------------
# -------------------------------------------------------------------------------------------------------------

class AgentModelBased:
    def __init__(self, action_space, state_space, alpha=0.1, gamma=0.9, beta=1.0, epsilon=0.2):
        self.action_space = action_space
        self.state_space = state_space
        self.alpha = alpha  # Learning rate
        self.gamma = gamma  # Discount factor
        self.beta = beta  # Temperature parameter for softmax policy
        self.epsilon = epsilon  # Epsilon for epsilon-greedy policy
        self.q_table = np.zeros((len(state_space), len(action_space)))

        # Initialize transition model as a 3D numpy array
        # Dimensions: [current_state, action, next_state]
        # All entries start at zero; probabilities are estimated from observed transition counts
        self.transition_model = np.zeros((len(state_space),
                                         len(action_space),
                                         len(state_space)))
        self.transition_counts = np.zeros((len(state_space),
                                            len(action_space),
                                            len(state_space)))

    def softmax(self, arr, beta):
        e_x = np.exp(beta * (arr - np.max(arr)))  # subtract max value to prevent overflow
        return e_x / e_x.sum(axis=0)  # axis=0 for column-wise operation if arr is 2D, otherwise it's not needed

    def policy(self, state, beta=None, epsilon=None,method="softmax"):
        q_values = self.q_table[state, :]
        beta = self.beta if beta is None else beta
        epsilon = self.epsilon if epsilon is None else epsilon
        # calculate the probability of each action in the state with softmax
        if method == "softmax":
            action_probabilities = self.softmax(q_values, beta)
            action = np.random.choice(self.action_space, p=action_probabilities)
        
        # with epsilon-greedy policy
        else:
            if np.random.uniform() < epsilon:
                action = np.random.choice(self.action_space)
            else:
                action = np.argmax(q_values)

        return action
    
    def update_transition_model(self, current_state, action, next_state, terminal):
        # Simple counting method to update transition probabilities
        # TODO - Implement more sophisticated methods like Bayesian updating
        #      - at least insure no 0 probabilities for stage 1 to stage 2 transitions
        if terminal:
            return
        # Increment the count for the observed transition
        self.transition_counts[current_state, action, next_state] += 1

        # Normalize the transition probabilities for the current state-action pair
        total_transitions = self.transition_counts[current_state, action, :].sum()
        self.transition_model[current_state, action, :] = self.transition_counts[current_state, action, :] / total_transitions

    def update_q_table(self, state, action, reward, next_state, terminal):
        # Update Q-table using the transition model
        if terminal: # -> second stage -> update with TD
            self.q_table[state, action] += self.alpha * self.reward_prediction_error(state, action, reward, next_state, terminal)

        else:
            # self.q_table[state, action] +=  self.transition_model[state, action, next_state] * self.alpha * self.reward_prediction_error(state, action, reward, next_state, terminal)
            # self.q_table[state, action] =  self.transition_model[state, action, next_state] * self.alpha * self.reward_prediction_error(state, action, reward, next_state, terminal)

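            # expected value over the possible next states, weighted by the learned transition model;
            # the inner loop variable is named `a` so that it does not shadow the `action` argument
            self.q_table[state, action] = np.sum([self.transition_model[state, action, possible_state] * np.max(
                [self.q_table[possible_state, a] + self.alpha * self.reward_prediction_error(state, a, reward, next_state, terminal) for a in self.action_space]
                ) for possible_state in self.state_space])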

    def reward_prediction_error(self, state, action, reward, next_state, terminal):
        if terminal:
            return reward - self.q_table[state, action]
        
        next_action = self.policy(next_state)
        return reward + self.gamma * self.q_table[next_state, next_action] - self.q_table[state, action]
            
    def update_beliefs(self, state, action, reward, next_state, terminal):
        self.update_transition_model(state, action, next_state, terminal)
        self.update_q_table(state, action, reward, next_state, terminal)

    def reset(self):
        pass

# one agent who can have different evaluation algos / policies / models?

In [238]:
import pandas as pd
from IPython.display import display
from datetime import datetime

# simulate data
# (for now from a random agent, to test the environment and task implementation)
def simulate_tow_step_task(env:TowStepEnv, agent=None, trails=200, policy_method="epsilon-greedy"):
    env.reset()
    task_data = {}
    
    sd_for_random_walk = 0.025
    time_step = 0
    while time_step < trails:
        # run one trial (first and second stage) until the episode terminates
        terminal = False
        while not terminal:
            current_state = env.state
            if agent:
                action = agent.policy(env.state, method=policy_method)
            else: # if no agent is given -> random action
                action = np.random.choice(env.action_space)

            next_state, reward, terminal, info = env.step(action)
            
            if agent:
                agent.update_beliefs(current_state, action, reward, next_state, terminal)
            
        task_data[time_step] = info
        env.reset()
        new_reward_prob_matrix = random_walk_gaussian(env.reward_prob_matrix, sd_for_random_walk)
        env.set_reward_probabilities(new_reward_prob_matrix)
        time_step += 1

    return task_data

def random_walk_gaussian(prob, sd, min_prob=0, max_prob=1):
    new_prob = prob + np.random.normal(scale = sd, size=np.shape(prob))
    new_prob = np.clip(new_prob, min_prob, max_prob)
    return new_prob

In [239]:
# simulate the task
# agent = RandomAgent(action_space=TowStepEnv.action_space, state_space=TowStepEnv.state_space)
# agent = AgentModelFree(action_space=TowStepEnv.action_space, state_space=TowStepEnv.state_space)
agent = AgentModelBased(action_space=TowStepEnv.action_space, state_space=TowStepEnv.state_space)
env = TowStepEnv()
task_data = simulate_tow_step_task(env, agent, trails=200)

# (state, action) -> reward
print("qtable:\n", agent.q_table)

# (state, action, new state) -> transition probability
if hasattr(agent, "transition_model"):
    # only the relevant transition probabilities should be non-zero, all others should be zero
    print("transition model for relevant states-action:\n", agent.transition_model[0])
    print("transition model for other states-actions:\n", agent.transition_model[1:])


qtable:
 [[0.53557562 0.31877775]
 [0.66426142 0.14739877]
 [0.2557607  0.081     ]]
transition model for relevant states-action:
 [[0.         0.72955975 0.27044025]
 [0.         0.34146341 0.65853659]]
transition model for other states-actions:
 [[[0. 0. 0.]
  [0. 0. 0.]]

 [[0. 0. 0.]
  [0. 0. 0.]]]


In [240]:
# convert the data to a dataframe
task_df = pd.DataFrame.from_dict(task_data, orient='index')
task_df['trail_index'] = task_df.index

# save the data to a csv file
# (timestamp without spaces or colons, so the file name is valid on all platforms)
time_identifier = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
task_df.to_csv(f"data/task_data_{time_identifier}.csv", index=False)

# print some statistics 
# TODO stay probabilities?
print("common transitions percentage:", np.mean(task_df["common_transition"])*100, "%")
print("rewarded trails percentage:", np.mean(task_df["reward_stage_2"] > 0)*100, "%")
print("transition prob. from state 0 action 0 to state 1:", np.mean(task_df[task_df["action_stage_1"] == 0]["state_transition_to"] == 1)*100, "%")
print("transition prob. from state 0 action 1 to state 2:", np.mean(task_df[task_df["action_stage_1"] == 1]["state_transition_to"] == 2)*100, "%")

visited_states_counts_df = task_df['state_transition_to'].value_counts().reset_index()
visited_states_counts_df.columns = ['State', 'Counts']
display(visited_states_counts_df)

action_first_stage_counts_df = task_df['action_stage_1'].value_counts().reset_index()
action_first_stage_counts_df.columns = ['Action Stage 1', 'Counts']
display(action_first_stage_counts_df)


display(task_df)


common transitions percentage: 71.5 %
rewarded trails percentage: 41.5 %
transition prob. from state 0 action 0 to state 1: 72.95597484276729 %
transition prob. from state 0 action 1 to state 2: 65.85365853658537 %


Unnamed: 0,State,Counts
0,1,130
1,2,70


Unnamed: 0,Action Stage 1,Counts
0,0,159
1,1,41


Unnamed: 0,common_transition,state_transition_to,reward_stage_1,action_stage_1,reward_stage_2,action_stage_2,reward_probabilities,trail_index
0,True,1,0,0,0,0,"[0.0, 0.0, 0.28926673057756686, 0.377241230728...",0
1,True,1,0,0,0,0,"[0.0, 0.0, 0.29231358747512753, 0.377822222932...",1
2,True,1,0,0,0,1,"[0.0, 0.0, 0.2799033908597144, 0.3914184289877...",2
3,True,1,0,0,1,0,"[0.0, 0.0, 0.3245719080131001, 0.3928764692649...",3
4,True,1,0,0,0,0,"[0.0, 0.0, 0.3138006775248742, 0.4015376240067...",4
...,...,...,...,...,...,...,...,...
195,True,1,0,0,1,0,"[0.0, 0.0, 0.5261839688961829, 0.3543094191420...",195
196,False,2,0,0,0,0,"[0.0, 0.0, 0.4729856870751432, 0.3639637265564...",196
197,True,1,0,0,1,0,"[0.0, 0.0, 0.4840573038940689, 0.3955094164030...",197
198,True,1,0,0,1,0,"[0.0, 0.0, 0.4473081726634353, 0.4026836114425...",198


In [241]:
# load and inspect human data
human_data = pd.read_csv("data/experiment_data_andrei.csv")
display(human_data)

# print some statistics
common_transitions_percentage = np.mean(
    np.where(human_data["stepOneChoice"] == 0, human_data["isHighProbOne"], human_data["isHighProbTwo"])
) * 100
print("common transitions percentage:", common_transitions_percentage, "%")
print("rewarded trails percentage:", np.mean(human_data["reward"] > 0)*100, "%")

Unnamed: 0,stepOne_Param,stepOneTwo_Param,stepTwoTwo_Param,rewards_Param,stepOneChoice,stepTwoChoice,reward,rewardProbabilities,isHighProbOne,isHighProbTwo,trial_type,trial_index,time_elapsed,internal_node_id
0,"[1,0]","[1,0]","[3,2]","[true,true,false,false]",1,3,False,"[0.42698213416971215,0.4627012728182879,0.3133...",True,True,dawsTwoStep,0,199187,0.0-0.0
1,"[0,1]","[0,1]","[2,3]","[true,false,true,true]",0,0,True,"[0.41278756730835614,0.488285663822515,0.30118...",True,True,dawsTwoStep,1,201995,0.0-1.0
2,"[0,1]","[2,3]","[2,3]","[false,true,false,false]",0,2,False,"[0.4161663996959537,0.45233292242685597,0.25,0...",False,True,dawsTwoStep,2,204233,0.0-2.0
3,"[1,0]","[1,0]","[0,1]","[true,true,true,false]",1,0,True,"[0.3682809695357653,0.485058262290801,0.255683...",True,False,dawsTwoStep,3,206293,0.0-3.0
4,"[1,0]","[1,0]","[0,1]","[false,false,false,false]",1,0,False,"[0.35795686167755414,0.4919457226516181,0.2563...",True,False,dawsTwoStep,4,208362,0.0-4.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,"[1,0]","[1,0]","[2,3]","[false,true,true,true]",1,2,True,"[0.5824104590833694,0.75,0.4648248909993357,0....",True,True,dawsTwoStep,195,632885,0.0-195.0
196,"[1,0]","[3,2]","[1,0]","[true,false,true,true]",1,0,True,"[0.5536965041858476,0.7413805858314367,0.47677...",False,False,dawsTwoStep,196,634934,0.0-196.0
197,"[0,1]","[2,3]","[1,0]","[true,true,false,false]",0,2,False,"[0.5564089642172411,0.71873663019763,0.4886982...",False,False,dawsTwoStep,197,641077,0.0-197.0
198,"[1,0]","[2,3]","[3,2]","[false,true,true,true]",1,3,True,"[0.600021338675008,0.7060795212734965,0.486364...",False,True,dawsTwoStep,198,643497,0.0-198.0


common transitions percentage: 71.5 %
rewarded trails percentage: 50.5 %


In [242]:
# simple one run test
# np.random.seed(0)
env = TowStepEnv()

terminal = False
while not terminal:
    action = np.random.choice([0,1])
    s, r, terminal, info = env.step(action)

print(env.info)

{'common_transition': True, 'state_transition_to': 2, 'reward_stage_1': 0, 'action_stage_1': 1, 'reward_stage_2': 0, 'action_stage_2': 0, 'reward_probabilities': array([0.        , 0.        , 0.73582736, 0.52061943, 0.40996409,
       0.50099739])}


## Model Simulation *(3 points)*

For this exercise you should:

*   Simulate data from both models for a single set of parameters. The simulation should mimic the experiment you are trying to model. *(2 points)*

*   Plot the simulated behavior of both models. *(1 point)*

Make sure to comment your code and provide an explanation for each code block in a preceding text block.
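
One common way to summarize simulated two-step behavior is the probability of repeating the first-stage choice, split by whether the previous trial was rewarded and whether its transition was common or rare. The following is a minimal sketch, assuming a DataFrame with the column names used in `task_df` above (`action_stage_1`, `reward_stage_2`, `common_transition`). The helper name `stay_probabilities` and the toy data are hypothetical and only illustrate the expected input format.

```python
import numpy as np
import pandas as pd

def stay_probabilities(df):
    """Probability of repeating the first-stage action, grouped by the
    previous trial's reward and transition type (hypothetical helper)."""
    prev = df.iloc[:-1].reset_index(drop=True)  # trials 0 .. n-2
    curr = df.iloc[1:].reset_index(drop=True)   # trials 1 .. n-1
    summary = pd.DataFrame({
        "prev_reward": np.where(prev["reward_stage_2"] > 0, "rewarded", "unrewarded"),
        "prev_transition": np.where(prev["common_transition"].astype(bool), "common", "rare"),
        "stay": (curr["action_stage_1"] == prev["action_stage_1"]).astype(float),
    })
    return summary.groupby(["prev_reward", "prev_transition"])["stay"].mean()

# toy data only, not simulated from any of the agents above
toy = pd.DataFrame({
    "action_stage_1":    [0, 0, 1, 1, 0, 0],
    "reward_stage_2":    [1, 0, 1, 1, 0, 1],
    "common_transition": [True, True, False, True, True, False],
})
stay = stay_probabilities(toy)
print(stay)
```

The resulting series can be passed to a bar plot to compare both models. Combinations that never occur in the data are simply absent from the result.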


In [None]:
# YOUR MODEL SIMULATION CODE GOES HERE

## Parameter Fitting *(4 points)*

For this exercise you should:

*   Set up a suitable parameter search space *(1 point)*

*   Implement a procedure to evaluate the fit of a model based on data *(2 points)*

*   Implement a procedure for searching the parameter space. *(1 point)*

Make sure to comment your code and provide an explanation for each code block in a preceding text block.
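
The three steps above (search space, fit evaluation, search procedure) can be sketched as follows. This is an illustration under simplifying assumptions, not the required solution: it uses a single-stage, two-armed Q-learning model with a softmax policy instead of the two-step agents, evaluates fit with the negative log-likelihood of the observed choices, and searches a plain grid. The function names and the toy data are hypothetical.

```python
import numpy as np

def negative_log_likelihood(alpha, beta, actions, rewards, n_actions=2):
    """Negative log-likelihood of observed choices under a simple
    Q-learning model with a softmax policy (simplified stand-in model)."""
    q = np.zeros(n_actions)
    nll = 0.0
    for a, r in zip(actions, rewards):
        logits = beta * (q - np.max(q))               # subtract max for numerical stability
        log_probs = logits - np.log(np.sum(np.exp(logits)))
        nll -= log_probs[a]                           # log-probability of the observed choice
        q[a] += alpha * (r - q[a])                    # update the chosen action's value
    return nll

def grid_search(actions, rewards, alphas, betas):
    """Evaluate every (alpha, beta) pair on the grid and keep the best one."""
    best = (None, None, np.inf)
    for alpha in alphas:
        for beta in betas:
            nll = negative_log_likelihood(alpha, beta, actions, rewards)
            if nll < best[2]:
                best = (alpha, beta, nll)
    return best

# toy data only, to show the expected input format
actions = [0, 1, 0, 0, 1, 0]
rewards = [1, 0, 1, 1, 0, 1]

# parameter search space: learning rate in [0, 1], inverse temperature in [0, 10]
alphas = np.linspace(0.0, 1.0, 11)
betas = np.linspace(0.0, 10.0, 21)

best_alpha, best_beta, best_nll = grid_search(actions, rewards, alphas, betas)
print(best_alpha, best_beta, best_nll)
```

A grid is easy to inspect but scales poorly with the number of parameters. A gradient-free or gradient-based optimizer started from several initial points is a common alternative.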



In [None]:
# YOUR PARAMETER FITTING CODE GOES HERE

## Parameter Recovery *(5 points)*

For this exercise you should:

*   Set up a suitable space of parameters relevant for parameter recovery *(1 point)*

*   Use the functions above to generate behavior from a model, for a given set of (randomly sampled) parameters, and then fit the model to its generated data. Make sure to evaluate the parameter fit in a quantitative manner. *(3 points)*

*   Plot the parameter recovery results for both models. *(1 point)*

Make sure to comment your code and provide an explanation for each code block in a preceding text block.
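
The recovery loop described above can be sketched as: sample parameters, simulate, fit, and correlate the true with the recovered values. The example below is a sketch under assumptions: it again uses a simplified two-armed Q-learning model rather than the two-step agents, a grid search for fitting, and the Pearson correlation as the quantitative measure. All function names, parameter ranges, and reward probabilities are hypothetical.

```python
import numpy as np

def simulate(alpha, beta, n_trials, reward_probs, rng):
    """Simulate choices and rewards from a softmax Q-learner (stand-in model)."""
    q = np.zeros(len(reward_probs))
    actions = np.zeros(n_trials, dtype=int)
    rewards = np.zeros(n_trials)
    for t in range(n_trials):
        p = np.exp(beta * (q - q.max()))
        p /= p.sum()
        a = rng.choice(len(q), p=p)
        r = float(rng.random() < reward_probs[a])
        q[a] += alpha * (r - q[a])
        actions[t], rewards[t] = a, r
    return actions, rewards

def nll(alpha, beta, actions, rewards):
    """Negative log-likelihood of the choices under the same model."""
    q = np.zeros(2)
    total = 0.0
    for a, r in zip(actions, rewards):
        logits = beta * (q - q.max())
        total -= logits[a] - np.log(np.exp(logits).sum())
        q[a] += alpha * (r - q[a])
    return total

def fit(actions, rewards, alphas, betas):
    """Grid search: return the (alpha, beta) pair with the lowest NLL."""
    grid = [(nll(al, be, actions, rewards), al, be) for al in alphas for be in betas]
    _, alpha_hat, beta_hat = min(grid)
    return alpha_hat, beta_hat

rng = np.random.default_rng(0)
alphas = np.linspace(0.05, 0.95, 10)
betas = np.linspace(0.5, 8.0, 6)

true_params, recovered = [], []
for _ in range(10):
    # sample "true" parameters from the recovery space
    alpha, beta = rng.uniform(0.05, 0.95), rng.uniform(0.5, 8.0)
    actions, rewards = simulate(alpha, beta, 150, [0.3, 0.7], rng)
    true_params.append((alpha, beta))
    recovered.append(fit(actions, rewards, alphas, betas))

true_params, recovered = np.array(true_params), np.array(recovered)
for i, name in enumerate(["alpha", "beta"]):
    r = np.corrcoef(true_params[:, i], recovered[:, i])[0, 1]
    print(f"{name}: Pearson r = {r:.2f}")
```

Plotting true against recovered values in a scatter plot, one panel per parameter, shows at a glance which parameter ranges are recovered poorly.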





In [None]:
# YOUR PARAMETER RECOVERY CODE GOES HERE

## *Optional*: Model Recovery *(2 bonus points)*

In this bonus exercise, you may examine model recovery. The bonus points count towards your total group project points. That is, you may accumulate up to 22 points in the practical part of the group project.

Make sure to comment your code and provide an explanation for each code block in a preceding text block.
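
Model recovery is often reported as a confusion matrix: data are simulated from each model, every model is fitted to each simulated data set, and the best-fitting model is tallied. The sketch below assumes BIC as the selection criterion and uses two deliberately simple toy models (unbiased versus biased random choice) as stand-ins for the two-step agents. All names are hypothetical.

```python
import numpy as np

def bic(nll, n_params, n_obs):
    """Bayesian information criterion from a negative log-likelihood."""
    return 2.0 * nll + n_params * np.log(n_obs)

def fit_unbiased(choices):
    # no free parameters: each of the two actions has probability 0.5
    return bic(len(choices) * np.log(2), 0, len(choices))

def fit_biased(choices):
    # one free parameter: probability of action 1, set to its maximum-likelihood estimate
    p = np.clip(np.mean(choices), 1e-6, 1 - 1e-6)
    k = np.sum(choices)
    nll = -(k * np.log(p) + (len(choices) - k) * np.log(1 - p))
    return bic(nll, 1, len(choices))

def model_recovery(simulators, fitters, n_runs, rng):
    """Rows: generating model, columns: best-fitting model (lowest BIC)."""
    confusion = np.zeros((len(simulators), len(fitters)))
    for i, simulate in enumerate(simulators):
        for _ in range(n_runs):
            choices = simulate(rng)
            scores = [fit(choices) for fit in fitters]
            confusion[i, int(np.argmin(scores))] += 1
    return confusion / n_runs

rng = np.random.default_rng(1)
simulators = [
    lambda rng: rng.integers(0, 2, size=200),          # data from the unbiased model
    lambda rng: (rng.random(200) < 0.8).astype(int),   # data from the biased model, p = 0.8
]
confusion = model_recovery(simulators, [fit_unbiased, fit_biased], n_runs=50, rng=rng)
print(confusion)
```

A matrix close to the identity indicates that the models can be told apart with the given number of trials. Large off-diagonal entries suggest that the models make overlapping predictions.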





In [None]:
# YOUR MODEL RECOVERY CODE GOES HERE

## Model Comparison *(5 points)*

For this exercise you should:

*   Load and (potentially) preprocess the experimental data. *(1 point)*

*   Fit the two models to the data.  *(1 point)*

*   Evaluate which model performs better, taking into account fit and model complexity. *(2 points)*

*   Plot the behavior of the winning model against the data. *(1 point)*

Make sure to comment your code and provide an explanation for each code block in a preceding text block.
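
Taking both fit and complexity into account is commonly done with an information criterion such as AIC or BIC, where lower values are better. The following sketch assumes that each model has already been fitted and yields a negative log-likelihood and a number of free parameters. The model names and the numbers are made up for illustration and are not results from this project.

```python
import numpy as np

def information_criteria(nll, n_params, n_obs):
    """Return (AIC, BIC) for a fitted model.

    AIC = 2k + 2 * NLL, BIC = k * ln(n) + 2 * NLL,
    with k free parameters and n observations."""
    aic = 2.0 * n_params + 2.0 * nll
    bic = n_params * np.log(n_obs) + 2.0 * nll
    return aic, bic

# Made-up values for illustration only; replace them with your own fit results.
n_trials = 200
fits = {
    "model_a": {"nll": 120.0, "n_params": 2},
    "model_b": {"nll": 118.5, "n_params": 4},
}

scores = {
    name: information_criteria(f["nll"], f["n_params"], n_trials)
    for name, f in fits.items()
}
for name, (aic, bic) in scores.items():
    print(f"{name}: AIC = {aic:.1f}, BIC = {bic:.1f}")

# pick the model with the lowest BIC
winner = min(scores, key=lambda name: scores[name][1])
print("lowest BIC:", winner)
```

In this toy example the model with more parameters fits slightly better but is penalized more strongly, so the simpler model has the lower BIC.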





In [None]:
# YOUR MODEL COMPARISON CODE GOES HERE