<a href="https://colab.research.google.com/github/CoAxLab/BiologicallyIntelligentExploration/blob/main/Labs/Lab3_Signal_detection_theory.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab 3 - Signal Detection Theory

This lab has three main components designed to go over Signal Detection Theory (SDT).

1. Review the foundational concepts and parameters behind SDT.
1. Get an interactive understanding of how the parameters and measures behave and relate to one another.
1. Plot receiver operating characteristic (ROC) curves to learn about them as a measure of sensitivity.

## Section - Setup

First let's set things up for the three parts of the lab.

### Install the ADMCode library

Install the code for running the adaptive decision making simulations (ADMCode). We'll use this library for visualizing SDT concepts.

In [None]:
# ADMCode first
# ADMCode uses an old version of numba
!pip install numba==0.48
!pip install --upgrade git+https://github.com/CoAxLab/AdaptiveDecisionMaking_2018

### Import Modules

Here we will bring in all the modules and libraries that we will need for this lab.

In [None]:
from __future__ import division
from ADMCode import visualize as vis
from ADMCode import ddm, sdt

import numpy as np
import pandas as pd

from ipywidgets import interactive
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

warnings.simplefilter('ignore', np.RankWarning)
warnings.filterwarnings("ignore", module="matplotlib")
warnings.filterwarnings("ignore")
sns.set(style='white', font_scale=1.3)

%matplotlib inline

## Section 1 - Reviewing Signal Detection Theory Concepts

Here, we will review the essential concepts of Signal Detection Theory (SDT).

(image source: https://iujur.iu.edu/features/archives/2017-2018/signal-detection-theory.html)

![](https://iujur.iu.edu/features/archives/2017-2018/signal-detection-theory-title)

### Signal Detection Theory Basics

- *Signal Detection Theory*, or *SDT*, centers around determining whether a stimulus is a **signal** or just **noise**.
- When a signal is present, stimulus values follow a distribution (SDT assumes this distribution is normal, i.e., Gaussian).
- When only noise is present, stimulus values follow a *different* distribution (SDT assumes this noise distribution has the same standard deviation as the signal distribution).
- The crux is that the signal and noise distributions *overlap*, so a given stimulus value does not reveal with certainty whether it came from signal or noise.
- SDT posits that a decision-maker determines whether a stimulus is signal or noise depending on whether the stimulus value exceeds a certain **threshold** (also referred to as the **criterion**).
- Depending on the decision and the true state of the stimulus, there are 4 possible results:
  - If the decision-maker responds "*Yes*" and the signal *is* present: **Hit**
  - If the decision-maker responds "*Yes*" and the signal *is NOT* present: **False Alarm (FA)**
  - If the decision-maker responds "*No*" and the signal *is* present: **Miss**
  - If the decision-maker responds "*No*" and the signal *is NOT* present: **Correct Rejection (CR)**

![](https://raw.githubusercontent.com/CoAxLab/BiologicallyIntelligentExploration/main/Labs/SDT_screenshot.png)
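
The four outcomes above can be reproduced with a short simulation. The sketch below is illustrative only: the noise mean of 0, signal mean of 1.5, shared standard deviation of 1, threshold of 0.75, and trial count are assumed values chosen for the example, not values taken from the lab.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable

# Assumed illustration values (not from the lab materials)
noise_mean, signal_mean, sd = 0.0, 1.5, 1.0
threshold = 0.75
n_trials = 10000

# Roughly half of the trials contain a signal; the rest are noise only
signal_present = rng.rand(n_trials) < 0.5
stimulus = rng.normal(np.where(signal_present, signal_mean, noise_mean), sd)

# The decision-maker says "Yes" whenever the stimulus exceeds the threshold
said_yes = stimulus > threshold

outcomes = {
    "Hit": np.sum(said_yes & signal_present),
    "False Alarm": np.sum(said_yes & ~signal_present),
    "Miss": np.sum(~said_yes & signal_present),
    "Correct Rejection": np.sum(~said_yes & ~signal_present),
}

# Rates are conditioned on whether the signal was actually present
hit_rate = outcomes["Hit"] / signal_present.sum()
fa_rate = outcomes["False Alarm"] / (~signal_present).sum()

print(outcomes)
print("P(hit) = {:.3f}, P(fa) = {:.3f}".format(hit_rate, fa_rate))
```

Moving `threshold` up or down in this sketch trades hits against false alarms, while changing `signal_mean` changes how much the two distributions overlap.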

### Signal Detection Theory Parameters:

- $d'   $: Strength of the signal relative to the noise.
- $B    $: Position of the threshold relative to the noise distribution. The criterion.
- $C    $: The strategy of the participant relative to an ideal observer threshold.
- $\beta$: A variant of $C$. The ratio of the height of the signal distribution to the height of the noise distribution at $B$.
- $D    $: Separation between threshold ($B$) and $d'$ (i.e., $D=B-d'$)

(source: class slides)
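
To see how these parameters relate numerically, here is a minimal sketch. It assumes a particular convention that the class slides may or may not share: the noise distribution is a standard normal centered at 0, the signal distribution is a unit-variance normal centered at $d'$, and $C$ is measured from the midpoint $d'/2$ (the ideal-observer threshold when signal and noise are equally likely). The example values of `d_prime` and `B` are arbitrary.

```python
import numpy as np
from scipy.stats import norm

d_prime = 1.5  # assumed example value
B = 1.0        # assumed threshold position, measured from the noise mean

# C: offset of the threshold from the midpoint between the two means
# (assumed convention, see the lead-in)
C = B - d_prime / 2

# beta: height of the signal density divided by the noise density at B
beta = norm.pdf(B, loc=d_prime) / norm.pdf(B, loc=0)

# D: separation between the threshold and d'
D = B - d_prime

print("C = {:.3f}, beta = {:.3f}, D = {:.3f}".format(C, beta, D))
```

Under these assumptions $\beta = e^{d' C}$, so $\beta$ and $C$ carry the same information about bias once $d'$ is known, which is one way to read "a variant of $C$".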

## Section 2 - Interacting with Signal Detection Theory Concepts

### Widget 1

We will start by playing with a simple widget for visualizing the SDT concepts:
- probability of a hit: P(hit), or "pH" below
- probability of a false alarm: P(fa), or "pFA" below

These are experimentally measurable values. From them, one can calculate other SDT parameters (given the assumptions of SDT, such as normally distributed, equal-variance signal and noise distributions).

Before you use the widget, look through and run the next code cell to see how d' and c values can be computed using hit and FA rates. When playing with the widget, you can check its calculations by setting different p_hit and p_FA rates and re-running the code block below.

**Note:** the `norm.ppf()` function below is an inverse CDF function for the standard normal distribution. It's the same as the function in the reading denoted as $\Phi^{-1}$, the "inverse phi" function. Basically, the function takes in a probability and outputs the Z-score corresponding to where the CDF of the standard normal distribution would equal that probability.

In [None]:
from scipy.stats import norm # import statistical library for inverse norms

# set your hit and FA rates to calculate resulting d' and c values
p_hit = 0.8
p_FA  = 0.1

# calculate and print d' value
d_prime = norm.ppf(p_hit) - norm.ppf(p_FA)

print("d' = {}".format(d_prime))

# calculate and print c value
c = -1 * ( (norm.ppf(p_hit) + norm.ppf(p_FA)) / 2)

print("c  = {}".format(c))
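
The calculation above can also be run in reverse: given $d'$ and $c$, the hit and FA rates follow from the normal CDF. The sketch below uses a hypothetical helper, `rates_from_params`, that is not part of ADMCode; it relies on the same equal-variance assumptions as the formulas above.

```python
from scipy.stats import norm

def rates_from_params(d_prime, c):
    """Hypothetical helper: recover (P(hit), P(FA)) from d' and c."""
    p_hit = norm.cdf(d_prime / 2 - c)   # inverts z(hit) = d'/2 - c
    p_FA = norm.cdf(-d_prime / 2 - c)   # inverts z(FA) = -d'/2 - c
    return p_hit, p_FA

# Forward direction, same formulas as the cell above
p_hit, p_FA = 0.8, 0.1
d_prime = norm.ppf(p_hit) - norm.ppf(p_FA)
c = -(norm.ppf(p_hit) + norm.ppf(p_FA)) / 2

# Reverse direction should return the rates we started from
recovered_hit, recovered_FA = rates_from_params(d_prime, c)
print(recovered_hit, recovered_FA)
```

This round trip is a quick way to confirm that `norm.cdf()` and `norm.ppf()` undo each other.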

Run the code below to load the widget. Play around with it to get some understanding of how pH and pFA interact to produce different $d'$ and $c$ values. Use the widget and your understanding of SDT to answer the questions below it.

If you'd like to look at the code for this widget, you can find it at the following path: /usr/local/lib/python3.7/dist-packages/ADMCode/visualize.py

In [None]:
interactive_plot = interactive(vis.sdt_interact, pH=(0.,1.,.1), pFA=(0.,1.,.1))
output = interactive_plot.children[-1]
output.layout.height = '300px'
interactive_plot
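
Note that the slider ranges above include 0 and 1. At those extremes the inverse normal CDF is infinite, so $d'$ and $c$ computed with the earlier formulas are not finite. The sketch below shows this and one possible workaround, clipping the rates away from 0 and 1. The `clip_rate` helper and its `eps` margin are hypothetical choices for illustration; how `vis.sdt_interact` itself handles these extremes is not shown here.

```python
import numpy as np
from scipy.stats import norm

# The inverse CDF diverges at probabilities of exactly 0 and 1
z_low, z_high = norm.ppf(0.0), norm.ppf(1.0)
print(z_low, z_high)

def clip_rate(p, eps=0.01):
    """Hypothetical helper: keep a rate inside [eps, 1 - eps]."""
    return float(np.clip(p, eps, 1 - eps))

# With clipping, a perfect observer gets a large but finite d'
p_hit, p_FA = 1.0, 0.0
d_prime = norm.ppf(clip_rate(p_hit)) - norm.ppf(clip_rate(p_FA))
print("clipped d' = {:.3f}".format(d_prime))
```

The resulting $d'$ depends on the arbitrary `eps`, so values obtained this way are best read as "very large" rather than as precise estimates.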

###  Question 1: 
#### (double click on the cells below to edit)

### Question 1.1

Describe the relationship between number of **Misses** and the criterion parameter ($c$) in SDT (use the interactive visualization at the top to help get some intuition).

In [None]:
# Write your answer here, as a python comment

### Question 1.2

Describe the relationship between number of **Misses** and $d'$.

In [None]:
# Write your answer here, as a python comment

### Question 1.3

Describe in plain words why $c=\frac{1}{2}d'$ when the **Hit** and **Miss** counts are equal.

In [None]:
# Write your answer here, as a python comment

## Section 3 - Building Receiver Operating Characteristic (ROC) Curves

### Widget 2

Run the code block below to load the second widget we will play with in this lab. In this next widget, you can change any SDT parameter and see the effect on all the others. We will use this widget to get an interactive understanding of what goes into ROC plots and how to interpret them.

This widget is loaded in from the web (it's written in the programming language JavaScript, not Python), so we can't look at its source code here. However, there is a link to the source code for this widget (a GitHub page) at its bottom left.

**Note:** if you want to return to default values, just re-run the code block. 

In [None]:
# The following widget is from Daniel Dickison (https://github.com/danieldickison/sdt-visualization)
from IPython.display import IFrame
IFrame("https://danieldickison.github.io/sdt-visualization/", 820, 580)

Play around with the widget. Then, when you're ready, re-run the code block to return the widget to default values. Time to make an ROC plot.

A **receiver operating characteristic (ROC)** depicts the trade-off between the false alarm rate (x-axis) and hit rate (y-axis) at different criteria.

We'll manually record hit and FA rates at different $C$ values (from -1.5 to 1.5 in increments of 0.75) in Python arrays and then plot them. A Python array (more precisely, a list) is a sequence of values separated by commas and enclosed in square brackets. An example is `[1, 2, 3]`.

**Note:** Avoid setting $C$ less than -1.5 or greater than 1.5, as it can break the widget. If that happens, re-run the code block to reload it.

In [None]:
c_vals = [-1.5, -0.75, 0, 0.75, 1.5] # this is an array of c values

p_hits = [] # fill in the P(hit) values you get at each of the c values here

p_FAs  = [] # fill in the P(FA) values you get at each of the c values here

Run the code block above once you have recorded the P(hit) and P(fa) values in their Python arrays. Then run the code block below to plot what you've entered!

In [None]:
import matplotlib.pyplot as plt # load plotting library

# plot hit rates over false alarm rates
plt.plot(p_FAs, p_hits)
# set axes to each go from 0 to 1
plt.xlim([0,1])
plt.ylim([0,1])
# add axis labels and title
plt.xlabel("P(fa)", size='xx-large')
plt.ylabel("P(hit)", size='xx-large')
plt.title('ROC curve', size='xx-large')

# annotate plot for which C values yield which P(hit) vs. P(fa) trade-offs
# zip stops at the shortest list, so partially filled lists do not raise an error
for c_val, p_fa, p_hit in zip(c_vals, p_FAs, p_hits):
  plt.text(p_fa, p_hit, "c = " + str(c_val))

Congrats! You've made an ROC plot.

You can see that with more *liberal* criteria (lower $C$ values), hit rates rise quite high but false alarm rates also rise.

With more *conservative* criteria (higher $C$ values), false alarm rates become quite low but hit rates also "take a hit".
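
As a cross-check on the values you recorded by hand, the same points can be computed from the equal-variance model. This is a sketch under two assumptions: that the widget measures $C$ from the midpoint between the two distributions (matching the $c$ formula used in Section 2), and that its $d'$ is 1.561, the value mentioned in Question 2. The names `p_hits_theory` and `p_FAs_theory` are introduced here for illustration.

```python
import numpy as np
from scipy.stats import norm

d_prime = 1.561  # assumed widget d', taken from the wording of Question 2
c_vals = np.array([-1.5, -0.75, 0, 0.75, 1.5])

# Equal-variance SDT: z(hit) = d'/2 - c and z(FA) = -d'/2 - c
p_hits_theory = norm.cdf(d_prime / 2 - c_vals)
p_FAs_theory = norm.cdf(-d_prime / 2 - c_vals)

for c_val, p_hit, p_fa in zip(c_vals, p_hits_theory, p_FAs_theory):
    print("c = {:5.2f}: P(hit) = {:.3f}, P(fa) = {:.3f}".format(c_val, p_hit, p_fa))
```

If your recorded values differ noticeably from these, the widget may use a different convention for $C$ than the one assumed here.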

If you have time, 

### Question 2

If you changed $d'$ from 1.561 to 1, how would the ROC curve change, if at all? What would happen to the **area under the curve (AUC)**?

In [None]:
# Write your answer here, as a python comment

This lab was substantially adapted by Jack Burgess in the fall of 2022 from materials by Matthew Clapp.