# 🧠 Week 1 – Reaction Time Analyzer (Solution)

This notebook contains the **full instructor version** of the Week 1 seminar assignment. It demonstrates how to perform simple data analysis and sanity checks on simulated **reaction time (RT)** data.

In behavioral and cognitive neuroscience, reaction times are one of the most basic measurements of performance. Before interpreting such data, it’s crucial to perform some *data sanity checks*: identify outliers, compute summary statistics, and verify plausibility. This notebook shows how to do this step by step using Python basics.

### Expected Output Example
```
Trials analyzed: 11
Mean RT: 305.0 ms
Median RT: 290 ms
Fastest RT: 250 ms
Slowest RT: 500 ms
Performance: Average
```

## Step 1 – Define the Data
In an experiment, each trial records a *reaction time* — how long the participant takes to respond to a stimulus. These values are collected in a list.

Real datasets can contain hundreds of trials, but here we only take **12** for clarity.

In [None]:
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]
print('Reaction times:', reaction_times)

## Step 2 – Count the Trials
Before analyzing, always check how many trials were recorded.

In [None]:
# TODO: Count how many trials were recorded and print the result
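
One possible way to fill in this step is sketched below. It counts the trials with a manual counter and then checks the result against `len()`. The variable name `num_trials` is an assumption chosen for illustration; any name works.

```python
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]

# Manual count: add 1 for every trial in the list
num_trials = 0
for rt in reaction_times:
    num_trials += 1

print("Trials recorded:", num_trials)
print("Check with len():", len(reaction_times))
```

Both lines should report the same number; if they differ, the loop is skipping or double-counting trials.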


## Step 3 – Summary Statistics: Describe the Data
We first compute **mean** and **median** reaction times to describe the dataset.

Mathematically:

$$
\text{mean} = \frac{1}{N}\sum_{i=1}^{N} RT_i, \quad
\text{median} =
\begin{cases}
RT_{\frac{N+1}{2}}, & N \text{ odd} \\
\frac{RT_{\frac{N}{2}} + RT_{\frac{N}{2}+1}}{2}, & N \text{ even}
\end{cases}
$$

The **mean** is simply the average reaction time across all trials, while the **median** is the middle value — *but only after the data have been sorted in ascending order.*  
Sorting ensures that smaller reaction times come first, so we can correctly identify the middle point.

We’ll now implement these calculations manually. This helps you understand how these operations actually work internally.

### 💡 Note on Median Indexing
In mathematical formulas, list positions start at **1**, e.g. $RT_1$ is the first element.

In **Python**, indices start at **0**, so we must shift every index by one:

$$RT_{\frac{N+1}{2}} \;\text{(math)} \;\Rightarrow\; RT_{\frac{N+1}{2}-1} = RT_{\frac{N-1}{2}} \;\text{(Python)}$$

Thus, we simply **subtract one after applying the formula** to get the correct element. Because Python’s `/` operator always returns a float (e.g. `10 / 2` gives `5.0`), we wrap the result in `int()` to convert it into a valid integer index.
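
The index shift can be checked on two small lists, one with odd and one with even length. This is a minimal sketch: the helper name `median_sorted` and both example lists are made up for illustration, and the input is assumed to be sorted already.

```python
def median_sorted(values):
    """Median of an already sorted list, using 0-based Python indices."""
    n = len(values)
    if n % 2 == 1:
        # math position (N+1)/2  ->  Python index (N-1)/2
        return values[int((n - 1) / 2)]
    # math positions N/2 and N/2+1  ->  Python indices N/2-1 and N/2
    lower = values[int(n / 2) - 1]
    upper = values[int(n / 2)]
    return (lower + upper) / 2

odd_example = [250, 265, 275, 280, 285]        # N = 5, middle is index 2
even_example = [250, 265, 275, 280, 285, 290]  # N = 6, middle pair is indices 2 and 3

print(median_sorted(odd_example))   # 275
print(median_sorted(even_example))  # 277.5
```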

In [None]:
# Sort the data first
sorted_rts = sorted(reaction_times)

# TODO: compute the sum of all RTs manually
# hint: use a for-loop and a variable to keep track of the running total

# TODO: compute the mean manually and store it in the variable mean_rt

# TODO: compute the median manually and store it in the variable median_rt (use int() for indices)

print("Sorted RTs:", sorted_rts)
print("Mean RT:", mean_rt, "ms")
print("Median RT:", median_rt, "ms")
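
A possible way to complete the TODOs above, sketched on the uncleaned 12-trial list. The names `total` and `n` are assumptions introduced here; `mean_rt` and `median_rt` follow the cell above.

```python
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]
sorted_rts = sorted(reaction_times)

# Running total: add each RT to the sum
total = 0
for rt in sorted_rts:
    total += rt

n = len(sorted_rts)
mean_rt = total / n

# Median: middle value (odd N) or average of the two middle values (even N)
if n % 2 == 1:
    median_rt = sorted_rts[int((n - 1) / 2)]
else:
    median_rt = (sorted_rts[int(n / 2) - 1] + sorted_rts[int(n / 2)]) / 2

print("Sorted RTs:", sorted_rts)
print("Mean RT:", mean_rt, "ms")
print("Median RT:", median_rt, "ms")
```

With 12 trials the even branch is used, so the median is the average of the 6th and 7th sorted values (290 and 295), i.e. 292.5 ms.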

## Step 4 – Fastest and Slowest Trial
We now scan through the list to find the **fastest** and **slowest** trials.

In [None]:
# TODO: go through all reaction times and find the fastest and slowest values
# and store them in separate variables (fastest, slowest)


print('Fastest RT:', fastest, 'ms')
print('Slowest RT:', slowest, 'ms')
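
One way to do the scan is sketched below. It starts from the first trial and updates the two extremes while looping, which assumes the list is not empty.

```python
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]

# Use the first trial as the starting guess for both extremes
fastest = reaction_times[0]
slowest = reaction_times[0]

for rt in reaction_times:
    if rt < fastest:      # found a smaller (faster) RT
        fastest = rt
    if rt > slowest:      # found a larger (slower) RT
        slowest = rt

print('Fastest RT:', fastest, 'ms')
print('Slowest RT:', slowest, 'ms')
```

On the uncleaned list the slowest value is the 1080 ms trial, which is exactly the kind of value the next step removes.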

## Step 5 – Remove Implausible Trials (Outliers)
Some reaction times are implausible — **too short** (<150 ms) or **too long** (>1000 ms). These are considered *outliers*.

There are two possible ways to handle them:
1. **Create a new list** containing only valid reaction times (recommended for clarity).
2. **Remove invalid values directly** using `.pop()`, looping backwards so that removing an element does not shift the indices that still have to be visited.

In [None]:
# Option 1 – create a new list with valid reaction times
cleaned_rts = []
# TODO: loop through reaction_times and append only plausible RTs
# (between 150 and 1000 ms)

print("Cleaned RTs (new list):", cleaned_rts)
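
A minimal sketch of Option 1. It treats 150 ms and 1000 ms themselves as valid, since only values below 150 or above 1000 are described as implausible.

```python
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]

cleaned_rts = []
for rt in reaction_times:
    # keep only plausible RTs; the original list is left untouched
    if 150 <= rt <= 1000:
        cleaned_rts.append(rt)

print("Cleaned RTs (new list):", cleaned_rts)
print("Removed trials:", len(reaction_times) - len(cleaned_rts))
```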

In [None]:
# Option 2 – pop elements directly (loop backwards)
reaction_times_pop = reaction_times[:]  # copy original
# TODO: loop backwards and remove invalid RTs

print("Cleaned RTs (pop method):", reaction_times_pop)
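
A possible version of Option 2 is sketched below. The loop runs from the last index down to 0, so popping an element only moves items that have already been checked.

```python
reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]
reaction_times_pop = reaction_times[:]  # copy, so the original stays intact

# range(start, stop, step): from the last index down to 0
for i in range(len(reaction_times_pop) - 1, -1, -1):
    rt = reaction_times_pop[i]
    if rt < 150 or rt > 1000:
        reaction_times_pop.pop(i)   # remove the invalid trial at index i

print("Cleaned RTs (pop method):", reaction_times_pop)
```

Looping forwards instead would skip the element directly after each removed one, because everything behind the removed index moves one position to the left.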

Now that the data are cleaned, **copy your code from Step 3** (for mean and median) and run it again using `cleaned_rts`. This shows how cleaning affects the summary statistics.

## Step 6 – Recalculate Summary Statistics (After Cleaning)
Let’s repeat the same computations from Step 3 with the cleaned data.

In [None]:
# TODO: Copy your code from Step 3 and re-run it here using cleaned_rts
# Store your results in new variables called mean_cleaned and median_cleaned
# so you can compare before and after cleaning.
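
To make the before/after comparison concrete, the Step 3 logic can be wrapped in a small helper and applied to both lists. This is only a sketch: the function name `mean_and_median` is hypothetical, and the list comprehension is a compact stand-in for the Option 1 loop from Step 5.

```python
def mean_and_median(values):
    """Manual mean and median, following the same steps as Step 3."""
    ordered = sorted(values)
    total = 0
    for v in ordered:
        total += v
    n = len(ordered)
    if n % 2 == 1:
        median = ordered[int((n - 1) / 2)]
    else:
        median = (ordered[int(n / 2) - 1] + ordered[int(n / 2)]) / 2
    return total / n, median

reaction_times = [250, 310, 295, 280, 275, 305, 290, 265, 300, 1080, 285, 500]
cleaned_rts = [rt for rt in reaction_times if 150 <= rt <= 1000]

mean_rt, median_rt = mean_and_median(reaction_times)
mean_cleaned, median_cleaned = mean_and_median(cleaned_rts)

print("Before cleaning:", round(mean_rt, 1), "ms mean,", median_rt, "ms median")
print("After cleaning: ", round(mean_cleaned, 1), "ms mean,", median_cleaned, "ms median")
```

Removing the single 1080 ms trial lowers the mean from about 369.6 ms to 305.0 ms, while the median only moves from 292.5 ms to 290 ms. The median is far less sensitive to outliers than the mean.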


## Step 7 – Classify Overall Performance
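We classify the subject’s performance based on the mean reaction time after cleaning (`mean_cleaned`). Both boundaries of the middle range (280 and 310 ms) count as "Average":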

| Mean RT (ms) | Performance |
|---------------|-------------|
| < 280 | Excellent |
| 280–310 | Average |
| > 310 | Needs improvement |

In [None]:
# TODO: classify based on mean_cleaned value
# < 280 -> "Excellent", between 280–310 -> "Average", > 310 -> "Needs improvement"

if ... :
    performance = "..."
elif ... :
    performance = "..."
else:
    performance = "..."

print("Performance:", performance)
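
One possible way to fill in the conditions is sketched below, wrapped in a hypothetical helper `classify_performance` so the boundaries can be tried out. The value `305.0` is the cleaned mean from the expected output and is hard-coded here only to keep the example standalone.

```python
def classify_performance(mean_rt_ms):
    """Map a mean RT in ms to the labels from the table above."""
    if mean_rt_ms < 280:
        return "Excellent"
    elif mean_rt_ms <= 310:       # 280 up to and including 310
        return "Average"
    else:
        return "Needs improvement"

mean_cleaned = 305.0              # assumed value, taken from the expected output
performance = classify_performance(mean_cleaned)
print("Performance:", performance)

# Boundary checks
for value in [279.9, 280, 310, 310.1]:
    print(value, "->", classify_performance(value))
```

Because the `elif` is only reached when the first condition is false, it does not need to repeat the `>= 280` check.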

## Step 8 – Print Summary of Results
We can now print a full summary of all the key results.

In [None]:
print('Trials analyzed:', len(cleaned_rts))
print('Mean RT:', round(mean_cleaned, 1), 'ms')
print('Median RT:', median_cleaned, 'ms')
print('Fastest RT:', min(cleaned_rts), 'ms')
print('Slowest RT:', max(cleaned_rts), 'ms')
print('Performance:', performance)

## Step 9 – Store Results in a Dictionary
Now that we have all results, we can place them together in a **dictionary**. This lets us keep everything organized in one variable, making it easier to access or print later.

In [None]:
results = {
    "num_trials": ...,      # TODO
    "mean": ...,            # TODO
    "median": ...,          # TODO
    "fastest": ...,         # TODO
    "slowest": ...,         # TODO
    "performance": ...      # TODO
}

print("Results dictionary:")
print(results)
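
A filled-in version of the dictionary could look like the sketch below. To keep it standalone, the cleaned list is written out directly and `performance` is set to the Step 7 result for a mean of 305.0 ms; in the notebook these values come from the earlier cells.

```python
cleaned_rts = [250, 310, 295, 280, 275, 305, 290, 265, 300, 285, 500]

ordered = sorted(cleaned_rts)
mean_cleaned = sum(cleaned_rts) / len(cleaned_rts)
median_cleaned = ordered[len(ordered) // 2]   # N = 11 is odd, so there is one middle value
performance = "Average"                        # assumed: Step 7 result for 305.0 ms

results = {
    "num_trials": len(cleaned_rts),
    "mean": round(mean_cleaned, 1),
    "median": median_cleaned,
    "fastest": min(cleaned_rts),
    "slowest": max(cleaned_rts),
    "performance": performance,
}

# Access a single entry by its key, or loop over all key/value pairs
print("Mean RT:", results["mean"], "ms")
for key, value in results.items():
    print(key, "->", value)
```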


## Step 10 – Built-In Functions Teaser
All the analysis steps above (counting trials, computing mean and median, finding fastest and slowest) can also be done in one line each using Python’s built-in functions and the standard-library `statistics` module. Understanding the manual approach first is still valuable, because it shows what these functions do internally.
Here, we use the cleaned list of reaction times `cleaned_rts` directly as input.
The only step we still do manually is the performance classification, since it depends on our own thresholds and interpretation.

In [None]:
import statistics

print('Trials analyzed:', len(cleaned_rts))
print('Mean RT:', round(statistics.mean(cleaned_rts),1), 'ms')
print('Median RT:', statistics.median(cleaned_rts), 'ms')
print('Fastest RT:', min(cleaned_rts), 'ms')
print('Slowest RT:', max(cleaned_rts), 'ms')

## 🧩 Bonus 1 – Full Reaction Times (Larger Dataset)
We can test if our code still works for larger datasets. The following code generates **1000 trials** with mostly realistic reaction times and a few outliers.

Simply overwrite your `reaction_times` list at the top with this new dataset (e.g. `reaction_times = full_reaction_times`) and rerun your analysis. If it runs without errors and the results look plausible, your code generalizes beyond the 12-trial example.

In [None]:
import random

n_trials = 1000             # number of simulated trials
random.seed(42)             # ensures reproducibility

full_reaction_times = []
for i in range(n_trials):
    rt = random.gauss(275, 25)         # generate RT around 275 ms (SD 25 ms)
    if random.random() < 0.01:         # add ~1% random outliers
        rt = random.choice([random.randint(50, 120), random.randint(900, 1200)])
    full_reaction_times.append(round(rt, 1))

print('Example of first 20 reaction times:')
print(full_reaction_times[:20])
print('Total trials generated:', len(full_reaction_times))
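
The sketch below regenerates the same simulated data and runs the Step 5 cleaning rule and the summary statistics on it. The names `cleaned_full` and `removed` are assumptions introduced for this example.

```python
import random
import statistics

random.seed(42)
full_reaction_times = []
for _ in range(1000):
    rt = random.gauss(275, 25)
    if random.random() < 0.01:
        rt = random.choice([random.randint(50, 120), random.randint(900, 1200)])
    full_reaction_times.append(round(rt, 1))

# Same plausibility window as in Step 5
cleaned_full = [rt for rt in full_reaction_times if 150 <= rt <= 1000]
removed = len(full_reaction_times) - len(cleaned_full)

print("Trials generated:", len(full_reaction_times))
print("Trials removed as outliers:", removed)
print("Mean RT:", round(statistics.mean(cleaned_full), 1), "ms")
print("Median RT:", round(statistics.median(cleaned_full), 1), "ms")
```

Note that the slow outliers are drawn from 900 to 1200 ms, so any that land between 900 and 1000 ms pass the plausibility check and stay in the cleaned data.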

## ⚙️ Bonus 2 – Sorting Algorithm (Bubble Sort)
Finally, we can implement our own sorting algorithm — **Bubble Sort**. This algorithm repeatedly compares two neighboring elements and swaps them if they are in the wrong order.

Each pass pushes the largest remaining value toward the end of the list, like a bubble rising to the surface.

In [None]:
# Basic bubble sort: always runs all passes (no early exit when a pass makes no swaps)

bubble_sorted = cleaned_rts[:]
for i in range(len(bubble_sorted)-1):
    for j in range(len(bubble_sorted)-1-i):
        if bubble_sorted[j] > bubble_sorted[j+1]:
            bubble_sorted[j], bubble_sorted[j+1] = bubble_sorted[j+1], bubble_sorted[j]
print('Bubble-sorted RTs:', bubble_sorted)
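
A common refinement, sketched here as an optional variant, stops as soon as a full pass makes no swaps, because the list is then already sorted. The `swapped` flag and the `passes` counter are additions for illustration. The result is compared with Python's `sorted()` as a sanity check.

```python
cleaned_rts = [250, 310, 295, 280, 275, 305, 290, 265, 300, 285, 500]

bubble_sorted = cleaned_rts[:]          # work on a copy
n = len(bubble_sorted)
passes = 0

for i in range(n - 1):
    swapped = False
    # after i passes, the last i elements are already in their final place
    for j in range(n - 1 - i):
        if bubble_sorted[j] > bubble_sorted[j + 1]:
            bubble_sorted[j], bubble_sorted[j + 1] = bubble_sorted[j + 1], bubble_sorted[j]
            swapped = True
    passes += 1
    if not swapped:                     # no swaps in this pass: list is sorted
        break

print('Bubble-sorted RTs:', bubble_sorted)
print('Passes needed:', passes)
print('Matches sorted():', bubble_sorted == sorted(cleaned_rts))
```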