## CCNSSS 2018 Module 2: Perceptual inference and motor control

# Tutorial 1: Signal Detection Theory & Drift Diffusion Modelling

*Please execute the cells below to initialize the notebook environment*

In [1]:
from IPython.display import HTML
from IPython.display import display
import matplotlib.pyplot as plt    # import matplotlib
import numpy as np                 # import numpy
import scipy as sp                 # import scipy
import math                        # import basic math functions
import random                      # import basic random number generator functions


fig_w, fig_h = (6, 4)
plt.rcParams.update({'figure.figsize': (fig_w, fig_h)})
plt.style.use('ggplot')

In [3]:
# Define 'hide_toggle', a helper that shows/hides the solution for each exercise

def hide_toggle(for_next=False):
    this_cell = """$('div.cell.code_cell.rendered.selected')"""
    next_cell = this_cell + '.next()'

    toggle_text = 'Show/hide Solution below'  # text shown on toggle link
    target_cell = this_cell  # target cell to control with toggle
    js_hide_current = ''  # bit of JS to permanently hide code in current cell (only when toggling next cell)

    if for_next:
        target_cell = next_cell
        toggle_text += ' '
        js_hide_current = this_cell + '.find("div.input").hide();'

    js_f_name = 'code_toggle_{}'.format(str(random.randint(1,2**64)))

    html = """
        <script>
            function {f_name}() {{
                {cell_selector}.find('div.input').toggle();
            }}

            {js_hide_current}
        </script>

        <a href="javascript:{f_name}()">{toggle_text}</a>
    """.format(
        f_name=js_f_name,
        cell_selector=target_cell,
        js_hide_current=js_hide_current, 
        toggle_text=toggle_text
    )

    return HTML(html)

---


## Objectives


In this notebook we'll look at *Signal Detection Theory (SDT)* and implement a *Drift diffusion model (DDM)* to simulate some data.

SDT:

- use random distributions in Python
- visualize data in Python
- practice $d'$ sensitivity analysis on mock data
- practice Receiver Operating Characteristic (ROC) analysis on mock data

DDM:


- What do reaction time distributions look like?
- How do distributions for correct and incorrect trials differ?
- How can these properties be understood in terms of the Drift Diffusion Model?
- simulate the Drift Diffusion Model

---



## Background (SDT)

Signal detection theory (SDT) is used to measure how we make decisions under uncertainty (for example, how well we detect the presence of a car in front of us while driving in fog).

SDT assumes that the decision maker is not a passive receiver of information, but an active decision-maker who makes difficult perceptual judgments under conditions of uncertainty. 

In our example, we are forced to decide whether there is an object in front of us based solely on a visual stimulus that is degraded by the fog. Detecting the car becomes harder as the fog gets denser and as our distance from the car increases.

Signal Detection Theory can be applied to a data set where stimuli were either present or absent (e.g. stim = car, or no_car), and the observer categorized each trial as detecting the stimulus as being present or absent (detect=car, or no_car). These tasks are also known as 2-Alternative Forced Choice tasks (2-AFC for short). In such tasks, the trials can be sorted into one of four categories:

![](./figures/2AFC_table.png)

Signal detection theory is a means to quantify the ability to differentiate between valid information (signal) and noise. Multiple measures can be extracted using SDT: 

* the $d'$ (pronounced dee-prime): a measure of sensitivity (how hard/easy it is to perceive a stimulus under uncertainty). 
            *How easy/hard is it to see a car under foggy conditions for each participant?*
* the bias (sometimes called 'threshold') '$c$': a measure of bias in discriminating signal from noise. 
            *Does each participant have a tendency to overestimate or underestimate a car being present?*
* the Receiver Operating Characteristic curve (ROC): illustrates the ability of a participant to discriminate between signal and noise as the threshold and/or uncertainty is varied. 
            *How does the participant's ability to detect a car change as a function of fog density?*
            or
            *How does the participant's ability to detect a car change as a function of their threshold/bias?*

___

Graphically, you may think of the signal and the noise as overlapping distributions (signal: red distribution, noise: blue distribution). The threshold (or bias) is a boundary that separates the signal from the noise and defines whether the participant responds 'present' or 'not present'. 

When the threshold is set very low, noise might inadvertently be classified as signal (i.e. many false positives (FP)). 

Conversely, when the threshold is set very high, signal might be classified as noise (i.e. many false negatives (FN)).

![](./figures/roc.png)
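The effect of the threshold on the two kinds of error can be checked numerically. The sketch below is only an illustration: it assumes normal signal and noise distributions with the parameters used in Exercise 1, and the helper name `error_rates` and the three threshold values are arbitrary choices.

```python
from scipy.stats import norm

# Illustrative parameters (the same as in Exercise 1 below)
mu_noise, sd_noise = 10, 2.5
mu_signal, sd_signal = 15, 3

def error_rates(threshold):
    """Return (false positive rate, false negative rate) for a given threshold."""
    fp_rate = norm.sf(threshold, loc=mu_noise, scale=sd_noise)     # noise above threshold
    fn_rate = norm.cdf(threshold, loc=mu_signal, scale=sd_signal)  # signal below threshold
    return fp_rate, fn_rate

for threshold in (8.0, 12.5, 17.0):
    fp, fn = error_rates(threshold)
    print('threshold = {:4.1f}: FP rate = {:.2f}, FN rate = {:.2f}'.format(threshold, fp, fn))
```

Moving the threshold up trades false positives for false negatives; no threshold removes both as long as the distributions overlap.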

ps: SDT measures can also be used to study any kind of binary classifier (say, how good a routine test is at detecting cancer). In this case, a liberal bias (low threshold) would result in more false positives, but depending on the application this can be desirable: a false alarm leading to a follow-up in a clinic is better than missing a cancer that is actually present.

For more info: [https://en.wikipedia.org/wiki/Receiver_operating_characteristic](https://en.wikipedia.org/wiki/Receiver_operating_characteristic)

**EXERCISE 1**

Using normal distributions, we will create a synthetic dataset of noise and signal distributions.

Let the distributions have the following form:
\begin{align*} \mathcal{N}_{signal}\left(\mu,\sigma\right),\qquad \mu=15, \sigma=3 \end{align*}
\begin{align*} \mathcal{N}_{noise}\left(\mu,\sigma\right),\qquad \mu=10, \sigma=2.5 \end{align*}

**Suggestions**
* For reproducibility, set the seed to 0
* Draw 10,000 samples from a normal distribution with mean '$\mu$' and standard deviation '$\sigma$' for the signal distribution (hint: you may use numpy.random functions)
* Now do the same but for the noise distribution
* Plot the histograms of the signal and noise distributions on the same plot (hint: you may use the matplotlib.pyplot.hist function). You may play around with the arguments 'bins', 'density', 'color', 'alpha', 'label' until you can reproduce the expected figure below.
* Plot on top of the histogram the true probability density function that generated the data (i.e. the normal distributions with mean '$\mu$' and standard deviation '$\sigma$'). Hint: you may import scipy.stats.
* Add a vertical line representing the decision criterion (a.k.a. 'threshold') used to separate data as being either noise or signal (hint: you may use matplotlib.pyplot.axvline). Use 12.5 as the decision criterion for now.
* Show the legend in the top right corner of the plot

In [4]:
#insert your code here

hide_toggle(for_next=True)

In [None]:
### SOLUTION

import scipy.stats

np.random.seed(0)  # seed NumPy's generator, which np.random.normal draws from

mu1, s1 = 10, 2.5
mu2, s2 = 15, 3
n_samples = 10000

y1 = np.random.normal(mu1, s1, size=n_samples) 
y2 = np.random.normal(mu2, s2, size=n_samples) 

plt.hist(y1, bins=25, density = True, 
                          histtype='stepfilled', color='r',
                          alpha=0.4, linewidth=0, label='noise')

plt.plot(np.arange(0,30,0.1), sp.stats.norm.pdf(np.arange(0,30,0.1), mu1, s1), color = 'r', linewidth=2)

plt.hist(y2, bins=25, density = True, 
                          histtype='stepfilled', color = 'b',
                          alpha=0.4, linewidth=0, label='signal')

plt.plot(np.arange(0,30,0.1), sp.stats.norm.pdf(np.arange(0,30,0.1), mu2, s2), color = 'b', linewidth=2)

plt.axvline(12.5, color='black', linewidth=1)
plt.legend()

***Expected output***

![](./figures/expected_ex1.png)

***EXERCISE 2: Sensitivity ($d'$) and specificity analysis.***

$d'$ is a dimensionless statistic. A higher $d'$ indicates that the signal can be more readily detected.

The sensitivity index $d'$ is the separation between the means of the signal and noise distributions, expressed in units of their standard deviation. For normally distributed signal ($S$) and noise ($N$), with means and standard deviations $\mu _{S}$ and $\sigma _{S}$, and $\mu _{N}$ and $\sigma _{N}$, respectively, $d'$ is defined as:

\begin{align*} d'=\frac {\mu _{S}-\mu _{N}}{\sqrt {{\frac {1}{2}}\left(\sigma _{S}^{2}+\sigma _{N}^{2}\right)}} \end{align*}

![](./figures/deeprime.png)
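As a worked example, the definition above can be written as a small function. This is a minimal sketch; the function name `d_prime` is an illustrative choice, and the parameter values are those of Exercise 1. Note that the standard deviations enter the denominator squared.

```python
import numpy as np

def d_prime(mu_s, sd_s, mu_n, sd_n):
    """Sensitivity index from the means and standard deviations of signal and noise."""
    return (mu_s - mu_n) / np.sqrt(0.5 * (sd_s**2 + sd_n**2))

# Parameters of Exercise 1: signal N(15, 3), noise N(10, 2.5)
print(round(d_prime(15, 3, 10, 2.5), 2))

# With equal standard deviations the formula reduces to (mu_s - mu_n) / sd
print(d_prime(15, 2.5, 10, 2.5))
```

For the Exercise 1 parameters this gives a $d'$ of about 1.81; with both standard deviations equal to 2.5 it gives exactly $5/2.5 = 2$.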

An estimate of $d'$ can be also found from measurements of the hit rate and false-alarm rate. It is calculated as:

\begin{align*} d' = Z(\text{hit rate}) - Z(\text{false alarm rate}) \end{align*}

where function $Z(p)$, $p \in \left[0,1\right]$, is the inverse of the cumulative distribution function of the Gaussian distribution.
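The sketch below illustrates this estimate for an equal-variance case, assuming a criterion halfway between the two means (all parameter values are illustrative). $Z$ is available in SciPy as `scipy.stats.norm.ppf`. The last line uses a common textbook definition of the criterion, $c = -\frac{1}{2}\left[Z(\text{hit rate}) + Z(\text{false alarm rate})\right]$, which is an assumption here and is not the quantity printed by the solution code in this notebook.

```python
from scipy.stats import norm

mu_noise, mu_signal, sd = 10, 15, 2.5   # equal-variance example, true d' = 2
threshold = 12.5                        # decision criterion (midpoint of the means)

hit_rate = norm.sf(threshold, loc=mu_signal, scale=sd)         # P(respond 'present' | signal)
false_alarm_rate = norm.sf(threshold, loc=mu_noise, scale=sd)  # P(respond 'present' | noise)

# Z is the inverse of the normal cumulative distribution function
d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

# Textbook criterion (assumption, see lead-in); 0 means no bias
c = -0.5 * (norm.ppf(hit_rate) + norm.ppf(false_alarm_rate))

print("HR = {:.3f}, FAR = {:.3f}, d' = {:.2f}".format(hit_rate, false_alarm_rate, d_prime))
```

With rates estimated from a finite sample, a hit rate of exactly 1 or a false-alarm rate of exactly 0 makes $Z$ infinite, so extreme thresholds need some care.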

**Suggestions**
* Set your seed to 0.
* Draw 1000 samples from a normal distribution with mean '$\mu_1=10$', and standard deviation '$\sigma_1=2.5$'
* Draw 1000 samples from another normal distribution with standard deviation '$\sigma_2=2.5$' and a different mean '$\mu_2$' (vary the mean '$\mu_2$' from 11 to 19 in steps of 2).
* For each value of the second mean '$\mu_2$', calculate the $d'$ using the samples from each distribution (hint: you may want to use np.mean and np.std).
* For each value of the second mean '$\mu_2$', calculate the $d'$ using the true mean and std of the distributions.
* Change the number of samples of each distribution to 100, 1000, and 10000. See how having more data leads to better estimates of $d'$. In real situations we do not know the true summary statistics (mean and standard deviation) of the noise and signal distributions, so having more data will yield better estimates of $d'$.
* Change the decision threshold such that you have 5 threshold values interspersed between the minimum of the noise distribution, up to the maximum of the signal distribution. 
* Print out the number of samples, threshold, estimated d_prime, true d_prime, and error 
* Calculate the Optimal threshold, Hit Rate, False Alarm rate, and bias '$c$' for each true decision-threshold and print it out.  

In [5]:
#insert your code here

hide_toggle(for_next=True)

In [None]:
### SOLUTION 

np.random.seed(0)  # seed NumPy's generator, which np.random.normal draws from

mu1, s = 10, 2.5
n_values = [100, 1000, 10000]
for i_n in range(len(n_values)):
    n = n_values[i_n]
    for mu2 in range(mu1 + 1, 2 * mu1, 2):
    
        # true d': the standard deviations enter as variances (squared)
        dPrime = (mu2 - mu1) / np.sqrt(.5 * (s**2 + s**2))
    
        noise = np.random.normal(mu1, s, size=n) 
        signal = np.random.normal(mu2, s, size=n) 
    
        mu1_hat = np.mean(noise)
        sd1_hat = np.std(noise)
    
        mu2_hat = np.mean(signal)
        sd2_hat = np.std(signal)
    
        # estimated d' from the sample means and standard deviations
        dPrime_hat = (mu2_hat - mu1_hat) / np.sqrt(.5 * (sd1_hat**2 + sd2_hat**2))

        data_min = np.min([noise, signal])
        data_max = np.max([noise, signal])
        
        # decision criterion z
        z_range = np.linspace(data_min, data_max, num=5)

        FalseAlarmRate_samples = np.zeros_like(z_range)
        HitRate_samples = np.zeros_like(z_range)
    
        print('Num samples: ' + str(n) + ', mu2: ' + str(mu2) + ', Estimated dprime: ' + str(round(dPrime_hat,2)) + ', True dprime: ' + str(round(dPrime,2)) + ', error: ' + str(round(dPrime-dPrime_hat,3))  )
        
        for idx, z in enumerate(z_range):

            FalseAlarmRate_samples[idx] = np.mean(noise >= z)
            HitRate_samples[idx] = np.mean(signal >= z)
        
            # d' could also be estimated from these rates as Z(hit rate) - Z(false alarm rate)
    
            print('True Threshold: ' + str(round(z,2)) + ', Optimal threshold: ' + str((mu1+((mu2-mu1)/2))) + ', bias c: ' + str(round(z-(mu1+((mu2-mu1)/2)),2)))
        print(' ')
    print('----')

***Expected output***

![](./figures/expected_ex2.png)

**EXERCISE 3**

In statistics, a receiver operating characteristic curve, i.e. ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.

The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true-positive rate is also known as sensitivity, recall or probability of detection in machine learning. The false-positive rate is also known as the fall-out or probability of false alarm.

ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently from (and prior to specifying) the cost context or the class distribution. ROC analysis is related in a direct and natural way to cost/benefit analysis of diagnostic decision making.

--

$d'$ can also be calculated with respect to the Area Under Curve (AUC) of the Receiver operating characteristic (ROC) curve, via

\begin{align*} d'=\sqrt {2}Z\left({\mbox{AUC(ROC)}}\right) \end{align*}
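The following sketch checks this relation numerically for normal distributions with equal standard deviations, the case in which it holds. It builds the ROC curve from the true distributions, integrates it with the trapezoidal rule, and converts the AUC back to $d'$. The parameter values and the grid of criteria are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

mu_noise, mu_signal, sd = 10, 15, 2.5   # equal-variance example, true d' = 2

# Sweep the decision criterion well beyond both distributions
z = np.linspace(mu_noise - 6 * sd, mu_signal + 6 * sd, 2001)
fpr = norm.sf(z, loc=mu_noise, scale=sd)    # false positive rate at each criterion
tpr = norm.sf(z, loc=mu_signal, scale=sd)   # true positive rate at each criterion

# Both rates decrease with z, so reverse them to integrate over increasing FPR
fpr, tpr = fpr[::-1], tpr[::-1]
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)   # trapezoidal rule

d_prime_auc = np.sqrt(2) * norm.ppf(auc)
d_prime_true = (mu_signal - mu_noise) / sd
print("AUC = {:.3f}, d' from AUC = {:.3f}, true d' = {:.3f}".format(auc, d_prime_auc, d_prime_true))
```

The same integration can be applied to the sample-based ROC points of the exercise; with few samples or a coarse grid of criteria, the AUC and the resulting $d'$ become noisier.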


**Suggestions**
* Set your seed to 0.
* Draw 1000 samples from a normal distribution with mean '$\mu_1=10$', and standard deviation '$\sigma_1=2.5$'
* Draw 1000 samples from another normal distribution with standard deviation '$\sigma_2=2.5$' and a different mean '$\mu_2$' (vary the mean '$\mu_2$' from 11 to 19 in steps of 2).
* For each value of the second mean '$\mu_2$', calculate the ROC curve for varying decision criterion, and plot all ROC curves in the same plot as dots.
* For each value of '$\mu_2$', also calculate the ROC curve based on the normal pdf (instead of samples from the pdf) and plot as lines (hint you may use scipy.stats.norm.cdf, scipy.stats.norm.sf). 
* Repeat with 100, 500, and 10000 samples per distribution. Look at how having more data changes the estimates.
* For each value of '$\mu_2$', calculate the sensitivity index $d'$ from the AUC of the ROC curve computed from the true distribution parameters.
* Compare the $d'$ calculated from the AUC with the one calculated with the equation in exercise 2.
* Add axes labels, title, legends.

In [6]:
#insert your code here

hide_toggle(for_next=True)

In [None]:
### SOLUTION 

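np.random.seed(0)  # seed NumPy's generator, which np.random.normal draws from
mu1, s = 10, 2.5

n_values = [100, 500, 10000]
for i_n in range(len(n_values)):
    plt.figure()
    n = n_values[i_n]
    for mu2 in range(mu1 + 1, 2 * mu1, 2):
    
        # true d': the standard deviations enter as variances (squared)
        dPrime = (mu2 - mu1) / np.sqrt(.5 * (s**2 + s**2))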
    
        no_flash = np.random.normal(mu1, s, size=n) 
        flash = np.random.normal(mu2, s, size=n)    

        data_min = np.min([no_flash, flash])
        data_max = np.max([no_flash, flash])

        # decision criterion z
        z_range = np.linspace(data_min, data_max, num=50)

        falsePositiveRate_samples = np.zeros_like(z_range)
        truePositiveRate_samples = np.zeros_like(z_range)

        falsePositiveRate_distr = np.zeros_like(z_range)
        truePositiveRate_distr = np.zeros_like(z_range)
    
        for idx, z in enumerate(z_range):

            falsePositiveRate_samples[idx] = np.mean(no_flash >= z)
            truePositiveRate_samples[idx] = np.mean(flash >= z)
        
            falsePositiveRate_distr[idx] = sp.stats.norm.sf(z, loc=mu1, scale=s)
            truePositiveRate_distr[idx] = sp.stats.norm.sf(z, loc=mu2, scale=s) 

        plt.plot(falsePositiveRate_distr, truePositiveRate_distr, '-', color='black', linewidth=1)
        # raw string so that '\m' in the LaTeX label is not treated as an escape sequence
        plt.plot(falsePositiveRate_samples, truePositiveRate_samples, '.', label=r"$\mu_2=%s, d'=%.1f$" % (str(mu2), dPrime))
    
    plt.legend(loc='lower right')
    plt.xlim(0, 1)
    plt.ylim(0,1)
    plt.title('ROC curves, n=' + str(n))
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')

***Expected Output***

![](./figures/expected_ex3_1.png)

# Part 2: Drift Diffusion Model (DDM)

Now that we have looked at the monkey's raw reaction time and accuracy data from this discrimination task, we can begin to think about how to model these data. As you saw in the lecture, DDMs predict both RT and choice accuracy and have been used to model behavior in this sort of sequential discrimination task.

**Short summary of the Drift diffusion model**

The Drift Diffusion Model arises from the Sequential Probability Ratio Test in the limit where discretely presented evidence becomes continuously presented evidence. 

Let's take an example. Say a participant is shown a blurry stimulus that is either a face or a house, and the participant needs to respond either 'face' or 'house' (2-AFC). Once a stimulus is displayed, the participant accumulates information over time by looking at the stimulus (the longer the participant looks at the stimulus, the more confident he/she will be that it is either a house or a face). The information is accumulated with a drift rate '$v$', and when the participant's accumulation trace hits a decision boundary '$0$' or '$a$', the participant responds 'house' or 'face' respectively. We can change the bias '$z$' of a given participant by changing the starting point of the accumulation trace with respect to the decision boundaries '$0$' and '$a$'. If we move '$z$' closer to '$a$' than to '$0$', the participant will be more likely to respond 'face' (i.e. a bias towards responding 'face').

![](./figures/DDM.png)
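The effect of the starting point '$z$' can be illustrated with a small simulation. The sketch below is not part of the exercises: the function name `fraction_upper`, the time step of 1, and all parameter values are assumptions chosen for illustration. With zero drift, the probability that a continuous-time trace started at $z$ reaches '$a$' before '$0$' is $z/a$, so the simulated fractions should lie close to the starting point divided by $a$.

```python
import numpy as np

def fraction_upper(z, a=1.0, v=0.0, sigma=0.05, n_trials=2000, max_steps=5000, seed=0):
    """Fraction of trials absorbed at the upper boundary 'a' when starting from z."""
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, float(z))           # all traces start at z
    active = np.ones(n_trials, dtype=bool)    # traces that have not hit a boundary yet
    for _ in range(max_steps):
        if not active.any():
            break
        # one time step (dt = 1): drift plus Gaussian noise, for active traces only
        x[active] += v + sigma * rng.standard_normal(active.sum())
        active &= (x > 0) & (x < a)           # traces outside (0, a) are absorbed
    return np.mean(x >= a)                    # absorbed at 'a' ('face' response)

for z in (0.25, 0.5, 0.75):
    print('z = {:.2f}: fraction of face responses = {:.2f}'.format(z, fraction_upper(z)))
```

Setting a positive drift `v` shifts all three fractions upwards, which is how drift and starting-point bias combine in the model.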

Mathematically, the DDM is given as a stochastic differential equation

\begin{eqnarray}
dx = \mu dt + \sigma dW,
\end{eqnarray}

where
$\mu$ : Drift rate, 
$\sigma$ : Noise standard deviation, 
$dW$ : White Noise.

Ignoring the boundary conditions, the distribution of increments of $x$ from time $s$ to time $t$ is a normal distribution $\mathcal{N}$ with mean $\mu (t-s)$ and standard deviation $\sigma \sqrt{t-s}$, $(s\leq t)$:
\begin{eqnarray}
X_t-X_s \sim \mathcal{N}(\mu (t-s), \sigma \sqrt{t-s}).
\end{eqnarray}

In discrete time form, the increment of the decision variable $\Delta x$ after time $\Delta t$ is
\begin{eqnarray}
\Delta x \sim \mathcal{N}(\mu \Delta t, \sigma \sqrt{\Delta t}).
\end{eqnarray}
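As a quick numerical check of these two statements, the sketch below sums many discrete increments (without boundaries) and compares the mean and standard deviation of the total displacement with $\mu (t-s)$ and $\sigma \sqrt{t-s}$. The number of paths and steps are arbitrary illustrative choices; $\mu$ and $\sigma$ are the values used in the exercises below.

```python
import numpy as np

mu, sigma = 1.5e-3, 0.05      # drift rate and noise standard deviation
dt, n_steps = 1.0, 400        # time step and number of steps, so t - s = 400
n_paths = 10000

rng = np.random.default_rng(0)
# Each step is an independent draw from N(mu*dt, sigma*sqrt(dt))
steps = rng.normal(mu * dt, sigma * np.sqrt(dt), size=(n_paths, n_steps))
total = steps.sum(axis=1)     # X_t - X_s for each path (no boundaries)

elapsed = n_steps * dt
print('mean: simulated {:.3f}, predicted {:.3f}'.format(total.mean(), mu * elapsed))
print('std:  simulated {:.3f}, predicted {:.3f}'.format(total.std(), sigma * np.sqrt(elapsed)))
```

The standard deviation grows with the square root of elapsed time while the mean grows linearly, which is why the drift eventually dominates the noise on long trials.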

Now consider two absorbing boundaries at $\pm B$. A decision is committed once the decision variable reaches one of the boundaries. In other words, the decision variable is "absorbed" by the boundary.

**References:**

* Ratcliff, Roger. "A theory of memory retrieval." Psychological review 85.2 (1978): 59.

* Bogacz, Rafal, et al. "The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks." Psychological review 113.4 (2006): 700.

**In this exercise** we'll write a function that simulates RTs and choices using a DDM with constant decision boundaries. We will plot these data, and in doing so we will begin to see how the model-generated behavior compares to the monkey RT and choice data. In particular, we will play with some of the parameters of the DDM and look at the effects this has on the simulated data. In order to simulate the data, remember the key equation of the DDM:

\begin{eqnarray}
\Delta x \sim \mathcal{N}(\mu \Delta t, \sigma \sqrt{\Delta t}).
\end{eqnarray}

which describes the incremental change in the decision variable $\Delta x$ after time $\Delta t$.

Next, we will relax certain conditions of the DDM and see how this affects the simulated behavior. In particular, we will introduce time-varying drift and collapsing decision bounds. 

Finally, we will look at the analytical DDM and graphically compare its predictions to our simulated RT data.

***EXERCISE 4: Constant bound DDM simulation***

***Suggestions***

* Write a function that simulates one trial of the DDM. The function should take parameters $\mu$, $\sigma$, and a boundary $B$ as inputs and return the choice, correctness, and reaction time for that trial, as well as the simulated trace of the decision variable (x in the equation above) and the times at which the decision variable was sampled in the simulation. 
* Plot the decision variable trajectories for 200 trials in the same figure with the following parameters: $\mu=1.5 \cdot 10^{-3}$, $\sigma=0.05$, $B=1$.
* (Optional) Change the parameters $\mu, \sigma$, and observe the change in the decision variable density.

Hints: 

- adjust the alpha value of a plot to show more trajectories
- to obtain reproducible results in code that uses random numbers, set the "random seed" (np.random.seed) to an integer value of your choice (we used the trial number as seed)
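One detail about seeding is worth illustrating: NumPy's random functions are controlled by `np.random.seed`, while `random.seed` only affects Python's built-in `random` module. The short sketch below (the seed value 3 is arbitrary) shows the difference.

```python
import random
import numpy as np

np.random.seed(3)
first = np.random.randn(5)

np.random.seed(3)            # same seed again: NumPy repeats the same draws
second = np.random.randn(5)

random.seed(3)               # seeds Python's 'random' module only, not NumPy
third = np.random.randn(5)   # NumPy simply continues its current stream

print(np.array_equal(first, second))   # identical draws
print(np.array_equal(first, third))    # different draws
```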

In [7]:
#insert your code here

hide_toggle(for_next=True)

In [8]:
### Solution function

def sim_DDM_constant(mu, sigma, B, dt=1, tMax=2500, seed=1):
    """
    Function that simulates one trial of the constant bound DDM
    
    Parameters
    ----------
    mu: float
        DDM drift rate
    sigma: float
        DDM standard-deviation
    B: float
        DDM boundary
    dt: float, optional
        time step in msec with which DDM will be integrated
    tMax: float, optional
        DDM is integrated from t=0 to t=tMax [in msec], should be multiple of dt
    seed: integer, optional
        random seed
    
    Returns
    -------
    choice: str
        'left' if the upper boundary (+B) was reached, 'right' if the lower boundary (-B) was reached;
        if no boundary is reached before tMax, the sign of the decision variable decides
    correct: bool
        whether or not the left boundary (which is assumed to be the target boundary) was chosen
    rt: float
        reaction time in msec
    dvTrace: list
        trace of decision variable
    tTrace: list
        times at which decision variable was sampled in the simulation
        
    """
    
    
    # Set random seed
    np.random.seed(seed)
    
    # Additional parameters
    n_max = int(round(tMax / dt))          # maximum number of time steps
    tSimu = dt * np.arange(1, n_max + 1)   # sample times dt, 2*dt, ..., tMax
    
    sigma_dt = sigma * np.sqrt(dt)
    mu_dt    = mu * dt
    
    # Initialize decision variable x
    x = 0
    
    # Storage
    tTrace = [0]
    dvTrace = [x]
    
    # Looping through time
    for t in tSimu:
        x += mu_dt + sigma_dt * np.random.randn() # internal decision variable x
        
        tTrace.append(t)
        dvTrace.append(x) # save new x

        # check boundary conditions
        if x > B:
            rt = t  
            choice = 'left'
            break
        if x < -B:
            rt = t
            choice = 'right'
            break
    else: # executed if no break has occurred in the for loop
        # If no boundary is hit before maximum time, 
        # choose according to decision variable value
        rt = t
        choice = 'left' if x > 0 else 'right'
        
    correct = (choice == 'left') # assumes left choice is correct
    
    return choice, correct, rt, dvTrace, tTrace

hide_toggle(for_next=True)

In [None]:
### Solution 

# Loop over trials, for each trial call your function, plot trajectories

mu, sigma, B = 1.5*1e-3, 0.05, 1
plt.figure()
for i_trial in np.arange(200):
    choice, correct, rt, dvTrace, tTrace = sim_DDM_constant(mu, sigma, B, seed=i_trial)
    plt.plot(tTrace, dvTrace, '-', color='k', alpha=0.1)
    
# beautify plot
plt.xlabel('Time (ms)')
plt.ylabel('Decision variable')
plt.ylim((-B,B))

**Expected Output**

![](./figures/expected_ex4.png)

***EXERCISE 5: Reaction time distribution***

***Suggestions***

* Simulate the DDM for 5000 trials with $\mu=0.0015, \sigma=0.05, B=1$.
* Plot the reaction time distribution, separating correct from error trials (use the function provided below)
* (optional) Edit the function 'plot_rt_distribution' so that you have the raw traces of sim_DDM sandwiched between the distribution of correct RTs in blue, and incorrect RTs in red (see expected output; hint: you may want to look-up subplots in matplotlib).

In [9]:
# insert your code here

hide_toggle(for_next=True)

In [10]:
### Solution function

def plot_rt_distribution(rt1, rt0, bins=None):
    '''
    Function that takes RT data as input and plots a histogram

    rt1/rt0 : array of reaction time for correct/error trials
    bins: if given, the bins for plotting
    '''
    if bins is None:
        maxrt = max((max(rt1),max(rt0)))
        bins = np.linspace(0,maxrt,26)
    count1, bins_edge = np.histogram(rt1, bins=bins)
    count0, bins_edge = np.histogram(rt0, bins=bins)
    n_rt = len(rt0) + len(rt1)
    
    plt.figure()
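    # align='edge' makes each bar span its bin instead of being centered on the left edge
    plt.bar(bins_edge[:-1], count1/n_rt, np.diff(bins_edge), align='edge', color='blue', edgecolor='white')
    plt.bar(bins_edge[:-1], -count0/n_rt, np.diff(bins_edge), align='edge', color='red', edgecolor='white')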
    
    titletxt = 'Prop. correct {:0.2f}, '.format(sum(count1)/n_rt)
    titletxt += 'Mean RT {:0.0f}/{:0.0f} ms'.format(np.mean(rt1),np.mean(rt0))
    
    plt.ylabel('Proportion')
    plt.xlabel('Reaction Time')
    plt.title(titletxt)
    plt.xlim((bins.min(),bins.max()))
    
hide_toggle(for_next=True)

In [None]:
### Solution

# No global seed is needed here: sim_DDM_constant seeds NumPy on every call (seed=i_trial)

mu, sigma, B = 0.0015, 0.05, 1
n_trial = 5000
rts      = np.zeros(n_trial)
corrects = np.zeros(n_trial)
for i_trial in range(n_trial):
    choice, correct, rt, _, _ = sim_DDM_constant(mu, sigma, B, seed=i_trial)
    rts[i_trial] = rt
    corrects[i_trial] = correct
    
# Plot the RT distributions
plot_rt_distribution(rts[corrects==1], rts[corrects==0],np.linspace(0,2500,51))

**Expected Output**
![](./figures/expected_ex5.png)


***EXERCISE 6: Analytical solution of classical DDM***

One can obtain an analytic solution for the reaction time distribution of the drift diffusion model, which we provide here. Don't worry about the inner workings of the function, just how to use it.

***Suggestions***

* import the function 'analytic_ddm' from the ddm module in the 'src' folder.
* look at the docstring of the function to see what parameters it takes and what it returns (in the notebook you can append '?' to the function name)

In [11]:
#insert your code here

hide_toggle(for_next=True)

In [None]:
### Solution
from src.ddm import analytic_ddm

analytic_ddm?

***EXERCISE 7: Comparison between analytic and simulated solution***

***Suggestions***

* Compare the analytical solution with simulation results (use 10,000 trials): $\mu=1.5 \cdot 10^{-3}, \sigma=0.05, B=1$.
* Compare the time taken by the simulation and analytical calculation.

Hints:

- When comparing analytical and simulated RT histograms, make sure the normalizations of the histograms are consistent
- Useful function: time.time()
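To illustrate the first hint: a histogram can report the proportion of trials per bin or a probability density (probability per ms), and the two differ by a factor of the bin width. The sketch below uses gamma-distributed stand-in reaction times, not DDM output, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
rts = rng.gamma(shape=4.0, scale=150.0, size=5000)   # stand-in RTs in ms, not DDM output
bins = np.linspace(0, 2500, 51)
bin_width = bins[1] - bins[0]

counts, _ = np.histogram(rts, bins=bins)
proportion = counts / counts.sum()                        # fraction of binned trials per bin
density, _ = np.histogram(rts, bins=bins, density=True)   # probability per ms

# A density multiplied by the bin width is directly comparable to a per-bin proportion
print(np.allclose(density * bin_width, proportion))
```

This is why an analytical density has to be multiplied by the bin width before it is drawn on top of a histogram that shows proportions.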

In [12]:
# insert your code here

hide_toggle(for_next=True)

In [None]:
### Solution 


import time

# Parameters
mu, sigma, B = 0.0015, 0.05, 1

# Simulation results
n_trial  = 10000
rts      = np.zeros(n_trial)
corrects = np.zeros(n_trial)
t_start  = time.time()
for i_trial in range(n_trial):
    choice, correct, rt, _, _ = sim_DDM_constant(mu, sigma, B, seed=i_trial)
    rts[i_trial] = rt
    corrects[i_trial] = correct
print('Time taken: ' + str(time.time() - t_start) + 's (Simulations results)')

bins = np.linspace(0,2500,51)
plot_rt_distribution(rts[corrects==1], rts[corrects==0], bins)

# Analytical results
teval = np.arange(0,bins[-1],1)[1:]
t_start = time.time()
dist_cor, dist_err = analytic_ddm(mu, sigma, B, teval)
print('Time taken: ' + str(time.time() - t_start) + 's (Analytical results)')

plt.plot(teval, dist_cor*(bins[1]-bins[0]), color = 'blue', lw=2)
plt.plot(teval,-dist_err*(bins[1]-bins[0]), color = 'red', lw=2)

**Expected Output**

![](./figures/expected_ex7.png)