# Lab 3: GSR Signal Processing and Stress Analysis

**BioRobotics**

---

## Learning Objectives

By the end of this lab, you will be able to:

1. **Load and explore GSR data** from the BioRadio 150
2. **Apply appropriate signal processing** for slowly varying DC signals (low-pass filtering, NOT the EMG pipeline)
3. **Decompose EDA into tonic and phasic components** (SCL and SCR)
4. **Detect and measure SCR peaks** — amplitude, latency, rise time
5. **Compare autonomic responses** across stress/relaxation conditions
6. **Analyze Stroop test effects** on GSR and relate to reaction time

---

## How to Use This Notebook

**Phase 1 (Parts 1-2):** Learn GSR fundamentals with your recorded data  
**Phase 2 (Parts 3-4):** Analyze the guided protocol and Stroop test  
**Phase 3 (Part 5):** Comparative discussion and conclusions  

**Code guidance:**
- **Completed code** demonstrates techniques
- **`# TODO` sections** require your implementation
- **Questions (Q1-Q10)** must be answered in the designated cells

---

## Part 0: Setup and Imports

In [None]:
# Standard libraries
import numpy as np
import pandas as pd
from pathlib import Path
import os
import glob
import warnings
warnings.filterwarnings('ignore')

# Signal processing
from scipy import signal
from scipy import stats

# EDA/GSR processing (research-grade library)
import neurokit2 as nk

# Visualization
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns

# Our GSR processing module
import sys
sys.path.insert(0, '..')
from src.gsr_processing import (
    lowpass_filter, decompose_eda, detect_scr_peaks,
    compute_gsr_features, process_gsr_pipeline, segment_by_events
)

# Configure matplotlib
%matplotlib inline
plt.rcParams['figure.figsize'] = (14, 6)
plt.rcParams['font.size'] = 12
plt.rcParams['axes.grid'] = True
plt.rcParams['grid.alpha'] = 0.3
sns.set_style('whitegrid')

np.random.seed(42)

# Constants
SAMPLE_RATE = 250  # Hz (BioRadio GSR sample rate)

# Condition colors for consistent plotting
CONDITION_COLORS = {
    'baseline': '#4ECDC4',
    'breathing': '#45B7D1',
    'arithmetic': '#FF6B6B',
    'recovery': '#96CEB4',
}

print('All imports successful!')
print(f'NeuroKit2 version: {nk.__version__}')

---

# Phase 1: Understanding GSR Signals

## Part 1: Loading and Exploring GSR Data

### 1.1 GSR Data File Format

The data files from `gsr_collect.py` have this structure:

```
# participant_id: P01
# protocol: guided_stress_relaxation
# stream_name: BioRadio_GSR
# stream_type: GSR
# samples: 120000
# duration_sec: 480.000
# nominal_sample_rate: 250
# effective_sample_rate: 249.85
#
timestamp,gsr_1,condition
0.000000,5.234567,baseline
0.004000,5.234612,baseline
...
```

Key differences from EMG data:
- **1 GSR channel** (not 8 EMG channels)
- **Condition column** marks which experimental phase each sample belongs to
- **Signal range**: Typically 1-20 microsiemens (not microvolts)
- **Slowly varying**: Changes happen over seconds, not milliseconds
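
As a quick illustration of this layout, the sketch below parses a tiny in-memory sample with pandas. It assumes every header line starts with `#`, so `comment='#'` can skip them. The `sample_text` string is made up for the example and is not real data.

```python
import io
import pandas as pd

# Made-up miniature file in the format shown above (not real data)
sample_text = """# participant_id: P01
# nominal_sample_rate: 250
#
timestamp,gsr_1,condition
0.000000,5.234567,baseline
0.004000,5.234612,baseline
0.008000,5.234701,breathing
"""

# comment='#' tells pandas to ignore the metadata header lines
demo_df = pd.read_csv(io.StringIO(sample_text), comment='#')

print(demo_df.dtypes)
print(demo_df['condition'].value_counts())
```

This shortcut discards the metadata; `load_gsr_file()` in the next section keeps it in a dictionary.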

### 1.2 Load GSR File

In [None]:
def load_gsr_file(filepath):
    """
    Load a GSR CSV file with metadata header.
    
    Parameters
    ----------
    filepath : str or Path
        Path to the CSV file
    
    Returns
    -------
    data : pd.DataFrame
        GSR data with 'time', 'gsr_1', and 'condition' columns
    metadata : dict
        File metadata (participant, sample rate, etc.)
    """
    metadata = {}
    header_lines = 0
    
    with open(filepath, 'r') as f:
        for line in f:
            if line.startswith('#'):
                header_lines += 1
                if ':' in line:
                    key, value = line[1:].split(':', 1)
                    key = key.strip()
                    value = value.strip()
                    try:
                        value = int(value)
                    except ValueError:
                        try:
                            value = float(value)
                        except ValueError:
                            pass
                    metadata[key] = value
            else:
                break
    
    data = pd.read_csv(filepath, skiprows=header_lines)
    
    # Use timestamp column as time (already relative to recording start)
    if 'timestamp' in data.columns:
        data['time'] = data['timestamp']
    else:
        sr = metadata.get('effective_sample_rate', 
                          metadata.get('nominal_sample_rate', SAMPLE_RATE))
        data['time'] = np.arange(len(data)) / sr
    
    # Store effective sample rate
    if 'effective_sample_rate' not in metadata:
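        if len(data) > 1 and 'timestamp' in data.columns:
            duration = data['timestamp'].iloc[-1] - data['timestamp'].iloc[0]
            # Guard against a zero-length span (e.g. repeated timestamps),
            # which would otherwise give an infinite sample rate
            metadata['effective_sample_rate'] = (
                (len(data) - 1) / duration if duration > 0 else SAMPLE_RATE
            )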
        else:
            metadata['effective_sample_rate'] = SAMPLE_RATE
    
    return data, metadata

print('load_gsr_file() defined')

### 1.3 Load Your Recording

In [None]:
# TODO: Set this to the path of YOUR GSR recording from gsr_collect.py
GSR_FILE = '../data/P01_gsr_guided_XXXXXXXX_XXXXXX.csv'  # <-- CHANGE THIS

# Load the recording
gsr_df, gsr_meta = load_gsr_file(GSR_FILE)

print('Recording Metadata:')
print('-' * 50)
for key, value in gsr_meta.items():
    print(f'  {key}: {value}')

print(f'\nData shape: {gsr_df.shape}')
print(f'Columns: {list(gsr_df.columns)}')
print(f'Duration: {gsr_df["time"].iloc[-1]:.1f} seconds')
print(f'Conditions: {gsr_df["condition"].unique()}')

display(gsr_df.head())

### 1.4 Visualize the Full GSR Recording

In [None]:
fig, ax = plt.subplots(figsize=(16, 5))

# Plot GSR signal with condition shading
conditions = gsr_df['condition'].unique()
for cond in conditions:
    mask = gsr_df['condition'] == cond
    cond_data = gsr_df[mask]
    color = CONDITION_COLORS.get(cond, '#999999')
    
    # Shade the condition region
    t_start = cond_data['time'].iloc[0]
    t_end = cond_data['time'].iloc[-1]
    ax.axvspan(t_start, t_end, alpha=0.15, color=color, label=f'{cond}')

# Plot the signal on top
ax.plot(gsr_df['time'], gsr_df['gsr_1'], linewidth=0.5, color='black')

ax.set_xlabel('Time (s)')
ax.set_ylabel('GSR (raw units)')
ax.set_title('Full GSR Recording with Condition Markers', fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
plt.tight_layout()
plt.show()

### 1.5 Basic Signal Statistics

In [None]:
# Per-condition basic statistics
stats_df = gsr_df.groupby('condition')['gsr_1'].agg(
    ['mean', 'std', 'min', 'max']
).round(4)

# Add sample count and duration
sr = gsr_meta.get('effective_sample_rate', SAMPLE_RATE)
stats_df['samples'] = gsr_df.groupby('condition')['gsr_1'].count()
stats_df['duration_sec'] = (stats_df['samples'] / sr).round(1)

print('Per-Condition Statistics:')
display(stats_df)

---

## Part 2: Signal Processing for GSR

### Key Insight: GSR is NOT like EMG!

| Property | EMG | GSR |
|----------|-----|-----|
| **Frequency** | 20-450 Hz | 0-5 Hz |
| **Signal type** | Oscillatory (AC) | Slowly varying (DC) |
| **Processing** | Bandpass, rectify, envelope | Low-pass, tonic/phasic decomposition |
| **Rectification** | Required (bipolar signal) | NOT needed (unipolar) |
| **Notch filter** | Yes (60 Hz powerline) | Not needed (signal is below 5 Hz) |
| **Time scale** | Milliseconds | Seconds |

### 2.1 Low-Pass Filtering (Student Exercise)

Because the useful GSR content lies below about 5 Hz, a single low-pass filter is enough to remove high-frequency noise, including 60 Hz powerline interference.

In [None]:
def student_lowpass_filter(data, cutoff_freq, sample_rate, order=4):
    """
    Apply a low-pass Butterworth filter to GSR data.
    
    Parameters
    ----------
    data : np.ndarray
        Input signal
    cutoff_freq : float
        Cutoff frequency in Hz
    sample_rate : float
        Sampling rate in Hz
    order : int
        Filter order
    
    Returns
    -------
    filtered : np.ndarray
        Filtered signal
    """
    # TODO: Calculate the Nyquist frequency
    
    
    # TODO: Normalize the cutoff frequency to Nyquist (0 to 1 range)
    
    
    # TODO: Design Butterworth low-pass filter
    # Use: signal.butter(order, normalized_cutoff, btype='lowpass')
    
    
    # TODO: Apply zero-phase filtering
    # Use: signal.filtfilt(b, a, data)
    
    
    return filtered

print('student_lowpass_filter() defined (complete the TODOs above)')
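
One way to sanity-check your implementation is to run it on a synthetic signal where the answer is known. The sketch below is a hypothetical helper (`check_lowpass` is not part of `src.gsr_processing`): it feeds a slow GSR-like component and a 60 Hz interference component through any function with the `(data, cutoff_freq, sample_rate)` signature and reports how well each is preserved or removed. A crude moving average stands in for the filter so the example runs on its own; it is not the Butterworth solution.

```python
import numpy as np

def check_lowpass(filter_fn, sample_rate=250, cutoff_freq=5.0):
    """Return (slow_error, noise_gain) for a candidate low-pass function."""
    t = np.arange(20 * sample_rate) / sample_rate
    slow = 5.0 + 0.5 * np.sin(2 * np.pi * 0.2 * t)   # GSR-like content to keep
    noise = 0.1 * np.sin(2 * np.pi * 60.0 * t)       # powerline-like content to remove

    # Filter each component separately (valid for linear filters)
    slow_out = np.asarray(filter_fn(slow, cutoff_freq, sample_rate))
    noise_out = np.asarray(filter_fn(noise, cutoff_freq, sample_rate))

    core = slice(sample_rate, -sample_rate)  # skip 1 s at each edge
    slow_error = np.max(np.abs(slow_out[core] - slow[core]))
    noise_gain = np.std(noise_out[core]) / np.std(noise[core])
    return slow_error, noise_gain

def moving_average(data, cutoff_freq, sample_rate):
    """Crude stand-in low-pass: boxcar average about half a cutoff period long."""
    window = max(1, int(sample_rate / (2 * cutoff_freq)))
    kernel = np.ones(window) / window
    return np.convolve(data, kernel, mode='same')

slow_error, noise_gain = check_lowpass(moving_average)
print(f'Max error on slow component: {slow_error:.4f}')
print(f'Gain at 60 Hz: {noise_gain:.4f}')
```

Once `student_lowpass_filter()` is complete, `check_lowpass(student_lowpass_filter)` should report a small error on the slow component and a gain near zero at 60 Hz.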

In [None]:
# Apply low-pass filter to the GSR signal
raw_gsr = gsr_df['gsr_1'].values
time_vec = gsr_df['time'].values

# Use the module version (or your own once implemented)
filtered_gsr = lowpass_filter(raw_gsr, cutoff_freq=5.0, sample_rate=SAMPLE_RATE)

# Compare raw vs filtered
fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

axes[0].plot(time_vec, raw_gsr, linewidth=0.5)
axes[0].set_ylabel('GSR (raw)')
axes[0].set_title('Raw GSR Signal', fontweight='bold')

axes[1].plot(time_vec, filtered_gsr, linewidth=0.8, color='green')
axes[1].set_ylabel('GSR (filtered)')
axes[1].set_title('Low-Pass Filtered (5 Hz cutoff)', fontweight='bold')
axes[1].set_xlabel('Time (s)')

plt.tight_layout()
plt.show()

### 2.2 Frequency Spectrum: Why Low-Pass is Enough

In [None]:
# Compare frequency content of raw vs filtered
freqs_raw, psd_raw = signal.welch(raw_gsr, fs=SAMPLE_RATE, nperseg=2048)
freqs_filt, psd_filt = signal.welch(filtered_gsr, fs=SAMPLE_RATE, nperseg=2048)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Full spectrum
axes[0].semilogy(freqs_raw, psd_raw, 'b', alpha=0.7, label='Raw')
axes[0].semilogy(freqs_filt, psd_filt, 'g', alpha=0.9, label='Filtered')
axes[0].axvline(x=5, color='r', linestyle='--', alpha=0.5, label='5 Hz cutoff')
axes[0].set_xlabel('Frequency (Hz)')
axes[0].set_ylabel('PSD (log)')
axes[0].set_title('Power Spectrum: Full Range', fontweight='bold')
axes[0].legend()
axes[0].set_xlim([0, 30])

# Zoomed to GSR band
axes[1].plot(freqs_raw, psd_raw, 'b', alpha=0.7, label='Raw')
axes[1].plot(freqs_filt, psd_filt, 'g', alpha=0.9, label='Filtered')
axes[1].set_xlabel('Frequency (Hz)')
axes[1].set_ylabel('PSD')
axes[1].set_title('Zoomed: GSR Band (0-5 Hz)', fontweight='bold')
axes[1].legend()
axes[1].set_xlim([0, 5])

plt.tight_layout()
plt.show()

print('Notice: Almost ALL the signal power is below 5 Hz.')
print('This is why GSR needs a LOW-pass filter, not the 20-450 Hz bandpass used for EMG.')
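
To put a number on that observation, the fraction of total power below a cutoff can be read off the Welch PSD. The sketch below uses a synthetic signal (a 0.5 Hz component plus weak 60 Hz interference, both made up for illustration) so that it runs without a recording. `band_power_fraction` is a hypothetical helper name.

```python
import numpy as np
from scipy import signal

def band_power_fraction(x, fs, f_max, nperseg=2048):
    """Fraction of Welch PSD power at or below f_max (Hz)."""
    freqs, psd = signal.welch(x, fs=fs, nperseg=nperseg)
    return psd[freqs <= f_max].sum() / psd.sum()

fs = 250
t = np.arange(60 * fs) / fs
demo = 5.0 + np.sin(2 * np.pi * 0.5 * t) + 0.05 * np.sin(2 * np.pi * 60.0 * t)

frac = band_power_fraction(demo, fs, f_max=5.0)
print(f'Fraction of power below 5 Hz: {frac:.4f}')
```

With your own data the call would be `band_power_fraction(raw_gsr, SAMPLE_RATE, 5.0)`. Note that `signal.welch` removes each segment's mean by default, so the constant conductance level does not count toward the total.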

### 2.3 Tonic/Phasic Decomposition

The GSR signal has two components:

- **Tonic (SCL)**: Skin Conductance Level — the slow baseline that drifts over minutes. Reflects general arousal/stress level.
- **Phasic (SCR)**: Skin Conductance Responses — rapid transient peaks lasting 1-5 seconds. Reflect specific autonomic events.

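The decomposition plays a role similar to envelope extraction in the EMG pipeline, because both reduce the signal to the feature of interest. The method is fundamentally different, though: instead of rectifying and smoothing, the signal is split into a slow trend and the fast responses riding on it.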
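
To make the idea concrete, here is a minimal sketch of one simple decomposition: treat everything below a very low cutoff as tonic and the remainder as phasic. This is a simplified illustration on synthetic data with an assumed 0.05 Hz cutoff; it is not necessarily what `decompose_eda()` does internally.

```python
import numpy as np
from scipy import signal

fs = 250
t = np.arange(120 * fs) / fs

# Synthetic EDA: slow upward drift plus three SCR-like bumps (made-up values)
drift = 5.0 + 0.005 * t
bumps = sum(a * np.exp(-0.5 * ((t - c) / 1.0) ** 2)
            for a, c in [(0.3, 30.0), (0.6, 60.0), (0.4, 90.0)])
eda = drift + bumps

# Tonic: zero-phase low-pass at an assumed 0.05 Hz cutoff
sos = signal.butter(2, 0.05, btype='low', fs=fs, output='sos')
tonic_demo = signal.sosfiltfilt(sos, eda)

# Phasic: whatever remains after removing the slow trend
phasic_demo = eda - tonic_demo

print(f'Largest phasic deflection at t = {t[np.argmax(phasic_demo)]:.1f} s')
```

Because the bumps themselves contain some very slow content, part of each bump ends up in the tonic estimate. That leakage is one reason research-grade decompositions use more elaborate methods than a single filter.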

In [None]:
# Decompose using neurokit2 (research-grade algorithm)
components = decompose_eda(filtered_gsr, sample_rate=SAMPLE_RATE, method='neurokit')

tonic = components['tonic']
phasic = components['phasic']

# Plot the decomposition
fig, axes = plt.subplots(3, 1, figsize=(16, 10), sharex=True)

# Add condition shading to all subplots
for ax in axes:
    for cond in gsr_df['condition'].unique():
        mask = gsr_df['condition'] == cond
        t_start = time_vec[mask][0]
        t_end = time_vec[mask][-1]
        color = CONDITION_COLORS.get(cond, '#999')
        ax.axvspan(t_start, t_end, alpha=0.1, color=color)

axes[0].plot(time_vec, filtered_gsr, linewidth=0.8, color='black')
axes[0].set_ylabel('GSR')
axes[0].set_title('1. Filtered GSR Signal', fontweight='bold')

axes[1].plot(time_vec, tonic, linewidth=1.5, color='blue')
axes[1].set_ylabel('SCL')
axes[1].set_title('2. Tonic Component (Skin Conductance Level)', fontweight='bold')

axes[2].plot(time_vec, phasic, linewidth=0.8, color='orange')
axes[2].set_ylabel('SCR')
axes[2].set_title('3. Phasic Component (Skin Conductance Responses)', fontweight='bold')
axes[2].set_xlabel('Time (s)')

# Add condition legend
patches = [mpatches.Patch(color=c, alpha=0.3, label=n) 
           for n, c in CONDITION_COLORS.items()]
axes[0].legend(handles=patches, loc='upper right', ncol=4)

plt.tight_layout()
plt.show()

### 2.4 SCR Peak Detection

In [None]:
# Detect SCR peaks in the phasic component
scr_info = detect_scr_peaks(phasic, sample_rate=SAMPLE_RATE)

print(f'Detected {len(scr_info["peaks_idx"])} SCR peaks')

if len(scr_info['peaks_idx']) > 0:
    print(f'Mean amplitude: {np.mean(scr_info["amplitudes"]):.4f}')
    print(f'Mean rise time: {np.mean(scr_info["rise_times"]):.2f} sec')

# Plot phasic with detected peaks
fig, ax = plt.subplots(figsize=(16, 5))

# Condition shading
for cond in gsr_df['condition'].unique():
    mask = gsr_df['condition'] == cond
    t_start = time_vec[mask][0]
    t_end = time_vec[mask][-1]
    ax.axvspan(t_start, t_end, alpha=0.1, color=CONDITION_COLORS.get(cond, '#999'))

ax.plot(time_vec, phasic, linewidth=0.8, color='orange', label='Phasic (SCR)')

# Mark peaks
if len(scr_info['peaks_idx']) > 0:
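    peak_times = time_vec[scr_info['peaks_idx']]
    # Place markers at the phasic value of each peak so they sit on the trace
    # (the reported amplitude may be measured relative to the onset)
    peak_vals = phasic[scr_info['peaks_idx']]
    ax.scatter(peak_times, peak_vals, color='red', 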
              s=80, zorder=5, label=f'SCR Peaks (n={len(peak_times)})')
    
    # Mark onsets
    onset_times = time_vec[scr_info['onsets_idx']]
    onset_vals = phasic[scr_info['onsets_idx']]
    ax.scatter(onset_times, onset_vals, color='green', marker='^',
              s=50, zorder=5, label='Onsets')

ax.set_xlabel('Time (s)')
ax.set_ylabel('Phasic SCR')
ax.set_title('SCR Peak Detection', fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
plt.tight_layout()
plt.show()
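
For intuition about what a peak detector does, the sketch below finds SCR-like peaks in a synthetic phasic trace with `scipy.signal.find_peaks`. It is a stand-in for illustration only: the thresholds (0.05 prominence, 1 s minimum spacing) are assumed values, and `detect_scr_peaks()` may use a different algorithm and a different amplitude definition.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 250
t = np.arange(60 * fs) / fs

# Three synthetic SCR-like bumps (made-up times and heights)
centers = [10.0, 30.0, 45.0]
heights = [0.3, 0.6, 0.4]
phasic_demo = sum(h * np.exp(-0.5 * ((t - c) / 1.0) ** 2)
                  for h, c in zip(heights, centers))

# Keep peaks that stand out by at least 0.05 and are at least 1 s apart
peaks, props = find_peaks(phasic_demo, prominence=0.05, distance=fs)

demo_peak_times = t[peaks]
demo_amplitudes = props['prominences']  # height above the surrounding baseline

for pt, amp in zip(demo_peak_times, demo_amplitudes):
    print(f'Peak at {pt:5.1f} s, prominence {amp:.2f}')
```

Prominence is used here as the amplitude measure because it ignores slow offsets in the trace; raising the prominence threshold makes the detector stricter.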

### Question 1

**How does the GSR signal differ in character from the EMG signals you recorded in Lab 1? Why does GSR need a completely different processing pipeline (low-pass only, no rectification, no bandpass)?**

*Your answer here:*

> 

### Question 2

**Look at the tonic (SCL) component. Which experimental condition shows the highest mean skin conductance level? Is this what you expected based on the protocol? Explain why.**

*Your answer here:*

> 

---

# Phase 2: Experimental Analysis

## Part 3: Guided Protocol Analysis (Part A)

### 3.1 Segment Data by Condition

In [None]:
# Segment the data by condition
CONDITIONS = ['baseline', 'breathing', 'arithmetic', 'recovery']

condition_data = {}
for cond in CONDITIONS:
    mask = gsr_df['condition'] == cond
    condition_data[cond] = {
        'gsr': gsr_df.loc[mask, 'gsr_1'].values,
        'time': gsr_df.loc[mask, 'time'].values,
        'tonic': tonic[mask],
        'phasic': phasic[mask],
    }
    print(f'{cond:12s}: {mask.sum():6d} samples ({mask.sum()/SAMPLE_RATE:.1f}s)')

### 3.2 Tonic Analysis: Mean SCL per Condition (Student Exercise)

In [None]:
# TODO: Compute the mean SCL (tonic component) for each condition
# Store results in a dictionary: condition_name -> mean_scl

scl_means = {}
for cond in CONDITIONS:
    # TODO: Calculate the mean of the tonic component for this condition
    # Hint: use condition_data[cond]['tonic'] and np.mean()
    scl_means[cond] = None  # Replace with your implementation

print('Mean SCL per condition:')
for cond, val in scl_means.items():
    print(f'  {cond:12s}: {val}')

In [None]:
# Box plot of SCL across conditions
scl_data = []
for cond in CONDITIONS:
    # Downsample tonic for plotting (every 250th sample = 1 per second)
    tonic_downsampled = condition_data[cond]['tonic'][::SAMPLE_RATE]
    for val in tonic_downsampled:
        scl_data.append({'condition': cond, 'SCL': val})

scl_plot_df = pd.DataFrame(scl_data)

fig, ax = plt.subplots(figsize=(10, 6))
colors = [CONDITION_COLORS[c] for c in CONDITIONS]
sns.boxplot(data=scl_plot_df, x='condition', y='SCL', order=CONDITIONS,
            palette=colors, ax=ax)
ax.set_xlabel('Condition')
ax.set_ylabel('Skin Conductance Level (SCL)')
ax.set_title('Tonic SCL by Condition', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

### 3.3 Phasic Analysis: SCR Rate per Condition

In [None]:
# Detect SCR peaks within each condition
scr_stats = {}
for cond in CONDITIONS:
    phasic_segment = condition_data[cond]['phasic']
    scr = detect_scr_peaks(phasic_segment, sample_rate=SAMPLE_RATE)
    
    duration_min = len(phasic_segment) / SAMPLE_RATE / 60.0
    n_peaks = len(scr['peaks_idx'])
    
    scr_stats[cond] = {
        'count': n_peaks,
        'rate_per_min': n_peaks / duration_min if duration_min > 0 else 0,
        'mean_amp': np.mean(scr['amplitudes']) if n_peaks > 0 else 0,
        'mean_rise_time': np.mean(scr['rise_times']) if n_peaks > 0 else 0,
    }

# Display as DataFrame
scr_df = pd.DataFrame(scr_stats).T
scr_df.index.name = 'condition'
print('SCR Statistics per Condition:')
display(scr_df.round(3))

In [None]:
# Bar chart: SCR rate per condition
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

colors = [CONDITION_COLORS[c] for c in CONDITIONS]

# SCR Rate
rates = [scr_stats[c]['rate_per_min'] for c in CONDITIONS]
axes[0].bar(CONDITIONS, rates, color=colors, edgecolor='black')
axes[0].set_ylabel('SCR Rate (peaks/min)')
axes[0].set_title('SCR Frequency by Condition', fontweight='bold')

# Mean SCR Amplitude
amps = [scr_stats[c]['mean_amp'] for c in CONDITIONS]
axes[1].bar(CONDITIONS, amps, color=colors, edgecolor='black')
axes[1].set_ylabel('Mean SCR Amplitude')
axes[1].set_title('Mean SCR Amplitude by Condition', fontweight='bold')

plt.tight_layout()
plt.show()

### Question 3

**How does the SCR frequency (peaks per minute) differ between the mental arithmetic condition and the baseline? What does this tell you about sympathetic nervous system activation during cognitive stress?**

*Your answer here:*

> 

### 3.4 Statistical Comparison (Student Exercise)

In [None]:
# TODO: Perform a paired statistical test comparing SCL between conditions
#
# Compare baseline vs arithmetic using a paired t-test or Wilcoxon signed-rank test.
# 
# Steps:
# 1. Get per-second SCL values for baseline and arithmetic
#    (downsample the tonic: condition_data['baseline']['tonic'][::SAMPLE_RATE])
# 2. Make sure both arrays are the same length (use min length)
# 3. Use scipy.stats.ttest_rel() for paired t-test
#    OR scipy.stats.wilcoxon() for non-parametric test
# 4. Print the test statistic and p-value

# baseline_scl = condition_data['baseline']['tonic'][::SAMPLE_RATE]
# arithmetic_scl = condition_data['arithmetic']['tonic'][::SAMPLE_RATE]
# ...

print('TODO: Implement statistical comparison')
print('Compare baseline vs arithmetic SCL using a paired test')
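
The calls themselves can be tried on made-up numbers first. The sketch below is an illustration with synthetic values (not your data, and not a substitute for the exercise): it trims two arrays to a common length and runs both tests. Keep in mind that adjacent per-second SCL samples from one recording are strongly correlated, so p-values from either test will tend to look more convincing than they should.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic per-second SCL values (made-up means and spread)
baseline_demo = 5.0 + 0.2 * rng.standard_normal(60)
arithmetic_demo = 6.0 + 0.2 * rng.standard_normal(50)

# Paired tests need equal-length arrays, so trim both to the shorter one
n = min(len(baseline_demo), len(arithmetic_demo))
a, b = baseline_demo[:n], arithmetic_demo[:n]

t_res = stats.ttest_rel(a, b)   # parametric paired test
w_res = stats.wilcoxon(a, b)    # non-parametric alternative

print(f'Paired t-test: t = {t_res.statistic:.2f}, p = {t_res.pvalue:.3g}')
print(f'Wilcoxon:      W = {w_res.statistic:.1f}, p = {w_res.pvalue:.3g}')
```

The Wilcoxon test is the safer choice when the paired differences are clearly non-normal.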

### Question 4

**Is the difference in SCL between baseline and mental arithmetic statistically significant? What test did you use and why? Report the test statistic and p-value.**

*Your answer here:*

> 

### 3.5 Summary Statistics Table

In [None]:
# Comprehensive feature extraction using our module
all_features = []
for cond in CONDITIONS:
    features = compute_gsr_features(
        condition_data[cond]['gsr'], 
        sample_rate=SAMPLE_RATE,
        condition=cond
    )
    all_features.append(features)

summary_df = pd.DataFrame(all_features).set_index('condition')
print('Full GSR Feature Summary:')
display(summary_df.round(4).style.highlight_max(axis=0, color='#FFD700'))

---

## Part 4: Stroop Test Analysis (Part B)

### 4.1 Load Stroop Results

In [None]:
# TODO: Set this to the path of YOUR Stroop results CSV
STROOP_FILE = '../data/stroop_results_XXXXXXXX_XXXXXX.csv'  # <-- CHANGE THIS

# Load Stroop results (skip metadata header lines)
stroop_meta = {}
header_lines = 0
with open(STROOP_FILE, 'r') as f:
    for line in f:
        if line.startswith('#'):
            header_lines += 1
            if ':' in line:
                k, v = line[1:].split(':', 1)
                stroop_meta[k.strip()] = v.strip()
        else:
            break

stroop_df = pd.read_csv(STROOP_FILE, skiprows=header_lines)

print('Stroop Test Parameters:')
for k, v in stroop_meta.items():
    print(f'  {k}: {v}')

print(f'\nTrials: {len(stroop_df)}')
print(f'Columns: {list(stroop_df.columns)}')
display(stroop_df.head(10))

### 4.2 Stroop Behavioral Results

In [None]:
# Separate congruent vs incongruent
congruent = stroop_df[stroop_df['congruent'] == True]
incongruent = stroop_df[stroop_df['congruent'] == False]

# Keep correct responses only (drops both errors and timeouts)
cong_correct = congruent[congruent['correct'] == True]
incong_correct = incongruent[incongruent['correct'] == True]

print('Stroop Behavioral Results:')
print(f'  Congruent:    {len(cong_correct)}/{len(congruent)} correct '
      f'({len(cong_correct)/len(congruent)*100:.0f}%), '
      f'mean RT = {cong_correct["response_time_ms"].mean():.0f} ms')
print(f'  Incongruent:  {len(incong_correct)}/{len(incongruent)} correct '
      f'({len(incong_correct)/len(incongruent)*100:.0f}%), '
      f'mean RT = {incong_correct["response_time_ms"].mean():.0f} ms')

stroop_effect = incong_correct['response_time_ms'].mean() - cong_correct['response_time_ms'].mean()
print(f'\n  Stroop Effect (RT difference): {stroop_effect:.0f} ms')

In [None]:
# Reaction time distributions
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram
axes[0].hist(cong_correct['response_time_ms'], bins=15, alpha=0.6, 
            color='green', label='Congruent', edgecolor='black')
axes[0].hist(incong_correct['response_time_ms'], bins=15, alpha=0.6,
            color='red', label='Incongruent', edgecolor='black')
axes[0].set_xlabel('Response Time (ms)')
axes[0].set_ylabel('Count')
axes[0].set_title('Reaction Time Distribution', fontweight='bold')
axes[0].legend()

# Box plot
plot_data = pd.DataFrame({
    'RT (ms)': pd.concat([cong_correct['response_time_ms'], 
                          incong_correct['response_time_ms']]),
    'Type': ['Congruent'] * len(cong_correct) + ['Incongruent'] * len(incong_correct)
})
sns.boxplot(data=plot_data, x='Type', y='RT (ms)', 
           palette=['green', 'red'], ax=axes[1])
axes[1].set_title('Reaction Time by Trial Type', fontweight='bold')

plt.tight_layout()
plt.show()

### 4.3 Stroop GSR Analysis (Student Exercise)

If you recorded GSR during the Stroop test, load that recording and analyze how GSR responded to congruent vs. incongruent trials.

**Note:** If you ran the Stroop test separately from the GSR recording, you'll need to manually align the timestamps. Record the Stroop test start time and use it as an offset.
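
The windowing step can be sketched as follows. Everything here is hypothetical: `extract_window`, the `stroop_offset` value, and the onset times are placeholders, since the Stroop results file's onset column is not shown in this notebook. The trace is synthetic, with a single response placed 2-4 s after the first onset.

```python
import numpy as np

def extract_window(time, values, onset, start=1.0, stop=5.0):
    """Return samples with onset + start <= time < onset + stop."""
    mask = (time >= onset + start) & (time < onset + stop)
    return values[mask]

fs = 250
time_demo = np.arange(30 * fs) / fs

# Synthetic phasic trace: one flat 0.5-unit response from 12 s to 14 s
phasic_demo = np.zeros_like(time_demo)
phasic_demo[(time_demo >= 12.0) & (time_demo < 14.0)] = 0.5

# Hypothetical alignment: Stroop clock time + offset = GSR recording time
stroop_offset = 0.0
onsets = np.array([10.0, 20.0])  # made-up stimulus onsets (s)

window_means = [
    extract_window(time_demo, phasic_demo, onset + stroop_offset).mean()
    for onset in onsets
]
print(window_means)
```

With real data, `stroop_offset` would be the GSR-recording time at which the Stroop test started, and the per-trial means could then be grouped by the `congruent` column before testing.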

In [None]:
# TODO: If you have GSR data recorded during the Stroop test, load and analyze it
#
# Steps:
# 1. Load the GSR file recorded during the Stroop test
# 2. For each Stroop trial, extract a window of GSR data
#    (e.g., 1-5 seconds after stimulus onset)
# 3. Compute the mean SCR amplitude for congruent vs incongruent windows
# 4. Compare using a statistical test
#
# If you DON'T have simultaneous GSR + Stroop data, analyze the Stroop behavioral
# data (RT and accuracy) and discuss what you would EXPECT to see in GSR.

print('TODO: Implement Stroop GSR analysis or discuss expected GSR effects')

### Question 5

**Did incongruent Stroop trials produce different reaction times than congruent trials? Based on what you know about the autonomic nervous system, would you expect incongruent trials to also produce larger GSR responses? Why or why not?**

*Your answer here:*

> 

### Question 6

**You were able to adjust parameters in the Stroop test (stimulus time, inter-trial interval, congruent ratio). How would you expect changing the stimulus display time to affect both reaction time AND GSR? What hypothesis would you test?**

*Your answer here:*

> 

### 4.4 Response Time Trend Across Trials

In [None]:
# Plot response time over trial number to see if there's a learning/fatigue effect
fig, ax = plt.subplots(figsize=(14, 5))

colors = ['green' if c else 'red' for c in stroop_df['congruent']]
ax.scatter(stroop_df['trial'], stroop_df['response_time_ms'], 
          c=colors, alpha=0.6, s=50)

# Add trend line
correct_mask = stroop_df['correct'] == True
z = np.polyfit(stroop_df.loc[correct_mask, 'trial'], 
               stroop_df.loc[correct_mask, 'response_time_ms'], 1)
p = np.poly1d(z)
ax.plot(stroop_df['trial'], p(stroop_df['trial']), 
       'k--', alpha=0.5, label=f'Trend (slope={z[0]:.1f} ms/trial)')

ax.set_xlabel('Trial Number')
ax.set_ylabel('Response Time (ms)')
ax.set_title('Reaction Time Over Trials (Green=Congruent, Red=Incongruent)', fontweight='bold')
ax.legend()
plt.tight_layout()
plt.show()

slope_direction = 'decreasing (learning)' if z[0] < 0 else 'increasing (fatigue)'
print(f'RT trend: {slope_direction}, slope = {z[0]:.2f} ms per trial')

---

# Phase 3: Discussion

## Part 5: Comparative Analysis

### 5.1 Multi-Panel Summary Figure

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# (a) Full GSR trace with condition shading
ax = axes[0, 0]
for cond in gsr_df['condition'].unique():
    mask = gsr_df['condition'] == cond
    t_s, t_e = time_vec[mask][0], time_vec[mask][-1]
    ax.axvspan(t_s, t_e, alpha=0.2, color=CONDITION_COLORS.get(cond, '#999'), label=cond)
ax.plot(time_vec, filtered_gsr, linewidth=0.5, color='black')
ax.set_xlabel('Time (s)')
ax.set_ylabel('GSR')
ax.set_title('(a) Full GSR Recording', fontweight='bold')
ax.legend(loc='upper right', fontsize=9)

# (b) SCL box plots
ax = axes[0, 1]
colors = [CONDITION_COLORS[c] for c in CONDITIONS]
sns.boxplot(data=scl_plot_df, x='condition', y='SCL', order=CONDITIONS,
            palette=colors, ax=ax)
ax.set_title('(b) Tonic SCL by Condition', fontweight='bold')

# (c) SCR rate bar chart
ax = axes[1, 0]
rates = [scr_stats[c]['rate_per_min'] for c in CONDITIONS]
ax.bar(CONDITIONS, rates, color=colors, edgecolor='black')
ax.set_ylabel('SCR Rate (peaks/min)')
ax.set_title('(c) Phasic SCR Frequency', fontweight='bold')

# (d) Stroop RT comparison
ax = axes[1, 1]
if len(cong_correct) > 0 and len(incong_correct) > 0:
    means = [cong_correct['response_time_ms'].mean(), 
             incong_correct['response_time_ms'].mean()]
    sems = [cong_correct['response_time_ms'].sem(),
            incong_correct['response_time_ms'].sem()]
    ax.bar(['Congruent', 'Incongruent'], means, yerr=sems,
          color=['green', 'red'], edgecolor='black', capsize=5)
    ax.set_ylabel('Mean RT (ms)')
    ax.set_title('(d) Stroop Effect', fontweight='bold')

plt.suptitle('Lab 3 Summary: GSR Stress/Relaxation Analysis', 
            fontsize=16, fontweight='bold', y=1.01)
plt.tight_layout()
plt.show()

### Question 7

**Compare the GSR responses during mental arithmetic (Part A) vs the Stroop test (Part B). Which produced a stronger autonomic response? Why might this be? Consider both the nature of the stressor and the duration of exposure.**

*Your answer here:*

> 

### Question 8

**What are two potential confounds or sources of error in this experiment? How would you control for them in a future study? (Consider: electrode placement, movement artifacts, room temperature, habituation, individual differences.)**

*Your answer here:*

> 

### Question 9

**In what clinical or practical applications might GSR/EDA monitoring be useful? Give at least two examples and explain what GSR would measure in each case.**

*Your answer here:*

> 

### Question 10

**If you were to extend this experiment with one more session, what additional condition or stimulus would you test? State your hypothesis and describe how you would analyze the GSR data to test it.**

*Your answer here:*

> 

---

## Summary

### What We Learned

1. **GSR measures autonomic arousal** — skin conductance increases with sympathetic nervous system activation
2. **Tonic (SCL)** reflects overall arousal level; **Phasic (SCR)** reflects event-related responses
3. **Signal processing for GSR** is fundamentally different from EMG: low-pass only, no rectification
4. **Mental arithmetic** typically produces stress-related GSR increases, though the size of the effect varies between individuals
5. **The Stroop effect** creates cognitive interference measurable in both RT and (potentially) GSR
6. **Experimental design** requires careful control of confounds and statistical testing

### Key Comparison: EMG vs GSR

| | EMG (Lab 1) | GSR (Lab 3) |
|---|---|---|
| **Signal** | Muscle electrical activity | Skin conductance |
| **Nervous system** | Somatic (voluntary) | Autonomic (involuntary) |
| **Frequency** | 20-450 Hz | 0-5 Hz |
| **Processing** | Bandpass + rectify + envelope | Low-pass + tonic/phasic |
| **Application** | Gesture recognition | Stress/emotion detection |

---

## Submission Checklist

- [ ] Implemented `student_lowpass_filter()` (Section 2.1)
- [ ] Computed mean SCL per condition (Section 3.2)
- [ ] Performed statistical comparison (Section 3.4)
- [ ] Analyzed Stroop test results (Section 4.3)
- [ ] Answered all 10 questions
- [ ] Run all cells
- [ ] Exported to PDF