# Signal Detection Tutorial

Signal detection theory is among the most successful theories in all of cognitive science and psychology. You will run a little signal detection experiment on yourself and analyze the data.

When you're at the ear doctor, she will increase the intensity of a tone until you can hear it, that is, until you can detect it. She does this to measure how sensitive your hearing is. Now imagine you are a radiologist and have to decide whether a tumor is present on an x-ray. This is also a detection task: you have to detect the tumor. We will be simulating this last task with our stimuli, but what we will learn applies more generally to all detection tasks.

## Stimuli

Instead of detecting a tumor, in our task you have to detect the letter *A* in a noisy image. Running the code in the next cell will generate a plot. In it you see a noisy image *without* the letter *A* on the left. We call this the noise-only condition. On the right you see a noisy image *with* the letter *A*. This is the signal-plus-noise condition. 

In [None]:
# we need psychopy and a few other libraries (pandas, numpy, matplotlib, etc.)
# they are all imported in `signal_detection.py` in the same folder 
from signal_detection import *

# in this cell press shift+enter to run only this cell and generate the plot!
imshow_stimuli(125)

The one parameter of the function `imshow_stimuli` allows you to change the intensity of the stimulus. Play with it! When it is at zero there is no signal (just like on the left). The *A* is clearly visible with the intensity being at `125`. You can probably still kinda see the *A* when you set the parameter to `50`. You should know you're looking at grayscale images where `0` is black and `255` is white. The noise-only pixels have a mean of `127`, i.e. gray, and the standard deviation of the noise was set to `25`. The pixels of the *A* have a mean of `127` plus the intensity that you've set and the same standard deviation. 
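To make these pixel statistics concrete, here is a minimal sketch of how such stimuli could be generated with NumPy. This is not the code in `signal_detection.py`: the function name `make_stimulus`, the image size, the square patch standing in for the letter *A*, and the clipping to the range 0 to 255 are all assumptions for illustration.

```python
import numpy as np

def make_stimulus(intensity, size=64, noise_mean=127, noise_sd=25, rng=None):
    """Return (image, mask): a noisy grayscale image and the signal region."""
    rng = np.random.default_rng() if rng is None else rng
    # hypothetical signal mask: a centered square stands in for the letter A
    mask = np.zeros((size, size))
    mask[size // 4: 3 * size // 4, size // 4: 3 * size // 4] = 1.0
    # Gaussian pixel noise around mid-gray, plus the intensity on signal pixels
    image = rng.normal(noise_mean, noise_sd, (size, size)) + intensity * mask
    # assumption: values are clipped to the grayscale range (0 = black, 255 = white)
    return np.clip(image, 0, 255), mask.astype(bool)

rng = np.random.default_rng(0)
noise_only, mask = make_stimulus(0, rng=rng)          # noise-only condition
signal_plus_noise, _ = make_stimulus(50, rng=rng)     # signal-plus-noise condition

# mean gray value inside the signal region for both conditions
print(round(noise_only[mask].mean()), round(signal_plus_noise[mask].mean()))
```

Inside the signal region the mean gray value is shifted by roughly the intensity, while the pixel-to-pixel noise stays the same. That shift is all that distinguishes the two conditions.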

## Some Cautionary Notes

The question we want to answer is: How well can you detect the *A*? Or more generally, how can we measure your sensitivity in any detection task? In a second step, which we will not address here, we could then compare your sensitivity to everyone else's in the class to find out who's the best *A*-detector (or the best radiologist, or who has a hearing problem). The reason why we won't attempt a comparison is that you're all doing this task on different computers and with different monitors. Some monitors might be better than others and allow for a higher contrast. Your detection performance will also depend on the luminance of the monitor and the lighting in your room. Hence, if we measure your performance on the *A*-detection task we're really measuring your performance *on your setup*. Therefore, any comparison is not meaningful psychologically (but note that some radiologists might also have better setups than other radiologists, and we might therefore also care about the setup in some applications).

Given that we expect the lighting in your room to also play a role, you should try to keep all conditions as constant as possible, e.g. by sitting at the same desk, closing the curtains, and switching off all lights. Now is also a good opportunity to clean your screen. Also make sure that you're well rested and able to concentrate while you're doing the experiments. We want to measure your best possible performance and not your performance when you're tired or worn out. Psychological data are already noisy enough as they are; we don't want to introduce additional, unnecessary variability.

With these comments out of the way: How do we measure sensitivity?

## The Yes-No Task

The first idea one could have is the following: We choose an intensity, show the *A* with this intensity to a subject 100 times, and record how often the subject reports seeing the *A*. The trouble with this experiment is that the subject could just lie and always say "I see it". A simple fix is to introduce catch trials. On each trial we flip a coin, and there either is (signal trial) or is not (noise-only or catch trial) an *A*. In this way we can assess the performance objectively.
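A quick simulation shows why catch trials help. The sketch below is not part of `signal_detection.py`, and both observers are made up for illustration. It compares someone who always says "yes" with someone who actually sees the *A* on most signal trials.

```python
import random

def proportion_correct(respond, n_trials=10000, seed=1):
    """Simulate yes-no trials where a fair coin decides signal vs. catch trial."""
    rng = random.Random(seed)
    n_correct = 0
    for _ in range(n_trials):
        signal_present = rng.random() < 0.5       # the coin flip
        said_yes = respond(signal_present, rng)
        n_correct += (said_yes == signal_present)
    return n_correct / n_trials

def always_yes(signal_present, rng):
    return True  # claims to see the A on every single trial

def honest_observer(signal_present, rng):
    # hypothetical observer: says yes on 90% of signal trials
    # and (wrongly) on 10% of catch trials
    return rng.random() < (0.9 if signal_present else 0.1)

pc_liar = proportion_correct(always_yes)
pc_honest = proportion_correct(honest_observer)
print(pc_liar, pc_honest)
```

Without catch trials the always-yes observer would score 100%. With them, the proportion correct drops to about one half, which is exactly chance level.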

A block in the experiment will always consist of 50 trials. In each trial you will first see a fixation cross that you should look at. Then, very briefly, the stimulus will show up. With probability `p=0.5` there will be an *A*. Your task is to then press either `y` (for yes) or `n` (for no) to answer the question whether you detected the *A*. Let's try this:

In [None]:
data = run_block(subject='test',intensity=100) # run this cell by shift+enter

This was an easy block with a high intensity (`intensity=100`). Let's look at how well you did.

In [None]:
data = load_data('test')
summarize(data,'block','pc').T

The first row (block) shows you the blocks you've done and the other rows how many trials you did and how well you did in each block. The first thing you want to look at is the row N. It says you did 50 trials. The row PC shows the proportion correct. Since this was an easy block, hopefully the number will be close to 1. If not, you need to practice this task a little more until you make only 1 or 2 mistakes in a block of 50 trials. These mistakes are not because you didn't see the *A* but because you accidentally pressed the wrong button. As the task is pretty fast paced this can happen, and it's really hard to bring the number of mistakes down to zero even if you see all *A*s. As we want to measure how well you can see the *A*, it's important that you really learn the response mapping (which button is which) and make as few "finger errors" as possible. So really do practice this task by re-running the `run_block`-cell above (with `subject='test'` and a clearly visible `intensity=100`) before you move on.
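Keep in mind that a proportion correct from a single block of 50 trials is itself a noisy estimate. As a rough sketch (the helper name `standard_error` is made up for illustration, and trials are assumed to be independent), the binomial standard error tells you how much PC would vary from block to block even if nothing about you changed:

```python
import math

def standard_error(p, n):
    """Standard error of a proportion correct p measured over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

# compare a single block (50 trials) with ten blocks (500 trials)
for n in (50, 500):
    for p in (0.5, 0.75, 0.96):
        se = standard_error(p, n)
        print(f"N = {n:3d}, true PC = {p:.2f}: standard error = {se:.3f}")
```

For a pure guesser (true PC of 0.5) in one block of 50 trials this comes out at about 0.07, so single-block values that differ by only a few percentage points should not be over-interpreted.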

## Psychometric Functions

The obvious experiment to do now is to vary the intensity (our independent variable) and see how that affects the proportion of correct responses (our dependent variable). Because we are keeping the stimuli constant in each block, i.e. we show either the catch trial or a stimulus with a fixed intensity, people call this method of measuring detection performance the *method of constant stimuli*. I usually start the experiment with a warm-up block that is pretty easy for subjects to get right. For example, set the `intensity=60`:

In [None]:
subject = 'xy' # change xy to your initials

In [None]:
data = run_block(subject,intensity=60) 

In [None]:
data = load_data(subject)
summarize(data,'intensity','pc').T

That was easy. Note that instead of arranging the table by block we have now arranged it by intensity. Let's make it harder by re-running the last two cells with `intensity=40` and see how hard that is. For me (on my setup) that was still pretty easy. So I made it harder and went down to `intensity=30`. Still very easy although I made 1 or 2 real mistakes. So I then tried `intensity=20`. That was already quite difficult for me with about 75% correct. I often had to guess. Still 75% seems above chance level. So I made it even harder with `intensity=10`. With 48% correct I was pretty much at chance level. So I made it a little easier again, `intensity=15`. When you do this yourself you want to choose the intensities such that some data points are close to chance level and some will give you perfect performance and you also want some data points in between. It's a little cumbersome to always look at the table so let's make a plot instead.

In [None]:
thresholds = psychometric_function(data)

We have also fit a function to your data. This function is called a *psychometric function*. A psychometric function has some property of the stimulus that we vary on the x-axis. Here that's the stimulus intensity. On the y-axis there's always some proportion of the subject's responses, usually the proportion of correct responses. Where the dotted lines intersect the x-axis are the 55%, 65%, 75%, 85%, and 95% thresholds. Those are the stimulus intensities that are necessary to achieve the respective percentage of correct responses. You should have at least 5 data points around these performance levels and a couple of data points with lower and higher performance so that you have the full psychometric function covered with your data. You should have collected at least 500 trials with good coverage of all performance levels to get a good estimate of the psychometric function.
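If you are curious how thresholds can be read off a fitted curve, here is a minimal sketch. It is not the implementation behind `psychometric_function`, which may use a different curve and fitting method. The logistic shape, the brute-force grid search, the parameter ranges, and the example data are all assumptions made for illustration.

```python
import numpy as np

def psychometric(x, midpoint, slope):
    """Logistic psychometric function rising from 0.5 (chance) to 1.0."""
    return 0.5 + 0.5 / (1.0 + np.exp(-(x - midpoint) / slope))

def threshold(pc, midpoint, slope):
    """Intensity at which the fitted function reaches proportion correct pc."""
    q = (pc - 0.5) / 0.5                 # rescale PC to the 0..1 range of the logistic
    return midpoint + slope * np.log(q / (1.0 - q))

# made-up example data: intensity, number correct, number of trials per intensity
intensity = np.array([10, 15, 20, 30, 40, 60])
n_correct = np.array([25, 31, 38, 47, 49, 50])
n_trials = np.full(6, 50)

# maximum-likelihood fit by grid search over both parameters
best = (-np.inf, None, None)
for m in np.linspace(5, 40, 141):
    for s in np.linspace(1, 15, 141):
        # clip to avoid log(0) when the curve saturates at 1.0
        p = np.clip(psychometric(intensity, m, s), 1e-6, 1 - 1e-6)
        log_lik = np.sum(n_correct * np.log(p)
                         + (n_trials - n_correct) * np.log(1 - p))
        if log_lik > best[0]:
            best = (log_lik, m, s)

_, midpoint, slope = best
for pc in (0.55, 0.65, 0.75, 0.85, 0.95):
    print(f"{pc:.0%} threshold: {threshold(pc, midpoint, slope):.1f}")
```

With this particular curve the 75% threshold is simply the midpoint parameter, and the other thresholds follow by inverting the fitted function at the desired performance level.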

## Threshold and Sensitivity