# Background

This assignment is on the temporal response function (TRF) framework for EEG analysis.

The following sources may be helpful for this assignment:
- Eelbrain [examples on TRFs](https://eelbrain.readthedocs.io/en/latest/auto_examples/#temporal-response-functions), in particular [TRF for Alice EEG Dataset](https://eelbrain.readthedocs.io/en/latest/auto_examples/temporal-response-functions/alice-trf.html)
- Paper describing more background: https://elifesciences.org/articles/85012


## General instructions
Same procedural instructions as for A1/2.

In addition: 

This assignment is in the form of questions about the data. 
For each question, decide on an approach to answer it at the group level. 
Generally, repeat the same analysis steps for each subject, 
collect relevant outcome variable(s) for each subject,
and then use an appropriate statistical test to answer the question 
(e.g., a related-measures *t*-test).
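
As an illustration of the group-level step, here is a minimal sketch of a related-measures *t*-test with `scipy.stats.ttest_rel`. The per-subject values are simulated and the names `condition_a` and `condition_b` are hypothetical; in your analysis they would be the outcome variables you collected for each subject.

```python
import numpy as np
from scipy import stats

# Simulated per-subject outcomes (hypothetical values, for illustration only).
# Both arrays must list the subjects in the same order.
rng = np.random.default_rng(0)
n_subjects = 12
condition_a = rng.normal(0.10, 0.03, n_subjects)
condition_b = condition_a + rng.normal(0.02, 0.01, n_subjects)

# Related-measures (paired) t-test across subjects
result = stats.ttest_rel(condition_b, condition_a)
t_value, p_value = result.statistic, result.pvalue
print(f"t({n_subjects - 1}) = {t_value:.2f}, p = {p_value:.4f}")
```

The degrees of freedom for this test are the number of subjects minus one, so each subject contributes exactly one value per condition.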

Basic statistics functions from [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html) should be sufficient
(alongside `matplotlib` visualization).
However, you are also allowed to use functions from `pandas`, `seaborn`, `mne`, and `eelbrain` if those are helpful.

Make sure to formulate your answer to the initial question, and make it clear how the data you present support your answer.

Document the reasoning behind your parameter choices.

## Setup
- Use the environment defined in `A3-environment.yml`.
  - Check the file for instructions on how to create the environment from the file
- Download the Alice dataset.
  - The steps are outlined in the repository's README file (https://github.com/Eelbrain/Alice). In the README, under **Setup**, follow **Download this repository** and **Download the Alice dataset**. You don't need to follow **Create the Python environment** because we are using the `4CN3-A3` environment created in the previous step.

# Decoding (Backward Model)

## 1. 

Question: *At what latencies does the cortical EEG response encode information about the acoustic envelope?*

Use cross-validation to fit decoders that can reconstruct the acoustic envelope from the EEG response.
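Use linear decoders as discussed in class:
$\hat{s}_t = \sum_{ch} \sum_{\tau} h_{ch,\tau} r_{ch,t+\tau}$,
where $\hat{s}_t$ is the reconstructed envelope at time $t$, $r_{ch,t+\tau}$ is the EEG response at channel $ch$ at lag $\tau$ after $t$, and $h_{ch,\tau}$ are the decoder weights.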

Test the reconstruction accuracy on held-out test data.

By restricting the decoder to use EEG in a narrow latency window (i.e., by restricting the range of values used for $\tau$),
you can determine how much information the brain response contains in that latency range about the stimulus envelope.
For example, a decoder that only uses EEG data points lagging the stimulus by 175-225 ms
($\tau$ between 175 and 225 ms)
can tell you how much information about the stimulus is contained in the EEG response in the latency 175-225 ms.
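
To make the role of $\tau$ concrete, the following is a minimal NumPy sketch of a lag-restricted linear decoder on simulated data. It is not the method used in the Eelbrain examples: it swaps in plain ridge regression and a single train/test split instead of cross-validation, and the helper names (`lagged_design`, `fit_ridge`, `decode_accuracy`), the sampling rate, and the simulated 200 ms response delay are all assumptions made for illustration.

```python
import numpy as np

def lagged_design(eeg, lags):
    """Stack time-shifted EEG copies; the block for lag k holds eeg[t + k] in row t."""
    n_times = eeg.shape[0]
    blocks = []
    for lag in lags:  # non-negative lags in samples
        shifted = np.zeros_like(eeg)
        shifted[:n_times - lag] = eeg[lag:]
        blocks.append(shifted)
    return np.hstack(blocks)

def fit_ridge(X, y, alpha):
    """Ridge-regression weights h minimizing ||X h - y||^2 + alpha ||h||^2."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)

def decode_accuracy(eeg, envelope, lags, n_train, alpha=1.0):
    """Fit on the first n_train samples, return Pearson r on the remaining samples."""
    X = lagged_design(eeg, lags)
    h = fit_ridge(X[:n_train], envelope[:n_train], alpha)
    prediction = X[n_train:] @ h
    return np.corrcoef(prediction, envelope[n_train:])[0, 1]

# Simulated data (assumption): 4 channels at 100 Hz, EEG follows the envelope by 200 ms
rng = np.random.default_rng(0)
n_times, n_channels = 6000, 4
envelope = np.convolve(rng.normal(size=n_times), np.ones(5) / 5, mode="same")
delay = 20  # samples, i.e. 200 ms at 100 Hz
eeg = rng.normal(size=(n_times, n_channels))
eeg[delay:] += np.outer(envelope[:-delay], [1.0, 0.5, -0.5, 0.2])

lags_near = np.arange(18, 23)  # 180-220 ms, roughly the 175-225 ms window
lags_far = np.arange(40, 45)   # 400-440 ms, same number of lags
r_near = decode_accuracy(eeg, envelope, lags_near, n_train=4800)
r_far = decode_accuracy(eeg, envelope, lags_far, n_train=4800)
print(f"r (180-220 ms) = {r_near:.2f}, r (400-440 ms) = {r_far:.2f}")
```

Because the simulated response sits at 200 ms, only the window that contains this lag should reconstruct the envelope well; both windows use the same number of EEG data points, so the comparison is fair.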

Create a plot that shows decoding performance as a function of the latency window 
(always using the same number of EEG data points, but varying the latencies used).
Use 6 different time ranges to cover the broad range between 0 and 600 ms relative to the stimulus.
Include some measure of confidence, like standard errors.
Evaluate which latency ranges lead to above-chance decoding.
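
One possible way to summarize the results across subjects is sketched below, assuming you have collected one reconstruction accuracy (Pearson *r*) per subject and latency window. The accuracies here are simulated, the six 100 ms windows are just one way to cover 0 to 600 ms, and treating *r* = 0 as the chance level is an assumption you should justify (a permutation-based baseline is an alternative).

```python
import numpy as np
from scipy import stats

# Six non-overlapping 100 ms windows covering 0-600 ms (one possible choice)
windows_ms = [(start, start + 100) for start in range(0, 600, 100)]

# Simulated accuracies with shape (n_subjects, n_windows); values are made up
rng = np.random.default_rng(1)
n_subjects = 10
simulated_means = np.array([0.02, 0.08, 0.10, 0.05, 0.01, 0.00])
accuracy = simulated_means + rng.normal(0, 0.03, (n_subjects, len(windows_ms)))

# Group mean and standard error per window
mean = accuracy.mean(axis=0)
sem = stats.sem(accuracy, axis=0)

# One-sample t-test per window against r = 0 (assumed chance level), one-sided
t_values, p_values = stats.ttest_1samp(accuracy, 0, axis=0, alternative="greater")

for (start, stop), m, s, p in zip(windows_ms, mean, sem, p_values):
    print(f"{start:3d}-{stop:3d} ms: r = {m:.3f} +/- {s:.3f} (SEM), p = {p:.4f}")
```

The `mean` and `sem` arrays can be passed directly to `matplotlib.pyplot.errorbar` to produce the requested plot. Since six windows are tested, consider whether a correction for multiple comparisons is appropriate.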

To speed up training for this assignment, you can set `delta=0.05`. 
Note that this setting trades some accuracy for training time, which is a trade-off you would weigh more carefully in a real experiment.

## 2. 

Question: *How much data do you need to distinguish between 2 trials?*

Choose parameters for an optimal decoder based on your observations above.

For each participant with *n* trials, use the first *n-2* trials to fit a decoder. 
Then, use the decoder to classify trial *n* and trial *n-1* as trial *n* vs trial *n-1*.
For example, a simple approach for classifying trial *n* would be to compare
$r(\hat{env_n}, env_n)$ with $r(\hat{env_n}, env_{n-1})$,
where *r* is the Pearson correlation, 
$env_n$ is the actual envelope of trial *n*,
and $\hat{env_n}$ is the envelope reconstructed from trial *n*.
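
The comparison described above can be sketched as follows. The envelopes and the reconstruction are simulated, and the helper names `pearson_r` and `classify` are hypothetical; cropping to the shorter signal is one simple way to handle trials of unequal length.

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation after cropping both signals to the shorter length."""
    n = min(len(a), len(b))
    return np.corrcoef(a[:n], b[:n])[0, 1]

def classify(reconstruction, candidate_envelopes):
    """Return the index of the candidate envelope that best matches the reconstruction."""
    scores = [pearson_r(reconstruction, env) for env in candidate_envelopes]
    return int(np.argmax(scores)), scores

# Simulated example: two envelopes and a noisy reconstruction of the first one
rng = np.random.default_rng(0)
env_n = rng.normal(size=3000)
env_n_minus_1 = rng.normal(size=2800)  # trials may differ in length
env_hat_n = 0.3 * env_n + rng.normal(size=3000)

predicted, scores = classify(env_hat_n, [env_n, env_n_minus_1])
print(f"predicted index: {predicted}, correlations: {np.round(scores, 3)}")
```

Classification is correct when the reconstruction correlates more strongly with the envelope of its own trial, i.e. when the returned index is 0 in this example.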

Determine how the length of the test data (trials *n* and *n-1*) affects the decoder's ability to distinguish between the two trials.
For example, if you're given only the first 5 s of each trial, your classification will presumably perform worse than if you are given the first 120 s.
Use at least 5 different lengths between 5 s and full trial length.
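
A minimal sketch of the length manipulation, again on simulated data: each reconstruction and envelope is truncated to its first few seconds before the two correlations are compared. The sampling rate, the 60 s trial length, the chosen durations, and the signal strength are all assumptions for illustration, and `is_correct` is a hypothetical helper.

```python
import numpy as np

def pearson_r(a, b):
    return np.corrcoef(a, b)[0, 1]

def is_correct(reconstruction, own_envelope, other_envelope, n_samples):
    """True if the truncated reconstruction matches its own envelope better."""
    r_own = pearson_r(reconstruction[:n_samples], own_envelope[:n_samples])
    r_other = pearson_r(reconstruction[:n_samples], other_envelope[:n_samples])
    return r_own > r_other

rng = np.random.default_rng(0)
fs = 100                          # sampling rate in Hz (assumption)
trial_seconds = 60                # simulated trial length (assumption)
durations = [5, 10, 20, 40, 60]   # test-data lengths in seconds
n_subjects = 20
n_total = trial_seconds * fs

correct = np.zeros((n_subjects, len(durations)))
for subject in range(n_subjects):
    # Simulated envelopes and weak, noisy reconstructions of the two test trials
    env_a = rng.normal(size=n_total)
    env_b = rng.normal(size=n_total)
    rec_a = 0.05 * env_a + rng.normal(size=n_total)
    rec_b = 0.05 * env_b + rng.normal(size=n_total)
    for i, seconds in enumerate(durations):
        n = seconds * fs
        hits = [is_correct(rec_a, env_a, env_b, n),
                is_correct(rec_b, env_b, env_a, n)]
        correct[subject, i] = np.mean(hits)

# Proportion of correctly classified trials per duration, averaged over subjects
accuracy = correct.mean(axis=0)
for seconds, acc in zip(durations, accuracy):
    print(f"{seconds:3d} s: {acc:.2f}")
```

Keeping the per-subject values in `correct` (rather than only the group average) lets you add a measure of confidence and test each duration against the 50% chance level.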