# EEG Source Reconstruction Pipeline

This notebook outlines a step-by-step procedure to:
1. Load and prepare EEG data from an EEGLAB `.set` file.
2. Compute a forward solution using MNE’s `fsaverage` template.
3. Invert sensor-level data to source space (i.e., estimate cortical current densities).
4. Parcellate those sources into anatomically defined ROIs (Desikan-Killiany atlas).
5. (Optionally) reshape the results back to an epoch-like structure for easier viewing.

Let's get started!

***

In [None]:
import mne
import numpy as np
import matplotlib.pyplot as plt
import os

%matplotlib qt

# (Optional) Path setup for your project
raw_data_path = '../data/raw/'
source_path = '../data/source_reconstruction/'


## 1. Load EEG Epochs

We load preprocessed EEG data stored in an EEGLAB `.set` file and convert it to an 
MNE `Epochs` object. This gives us a structured representation of the data with
event-based segmentation (epochs).



In [None]:
path = raw_data_path + 'PPT1/'
input_fname = path + 's_101_Coordination.set'
epochs = mne.io.read_epochs_eeglab(input_fname)
print(epochs)

# Subsample epochs to reduce memory usage (optional)
# epochs = epochs[:5]  # e.g., only keep first 5 epochs


In [None]:
epochs.plot_sensors(kind='3d')
plt.show()

## 2. Set Montage

We apply the `standard_1005` template montage so that every electrode gets a 3D position. The channel names in the data must match names in the montage; the BioSemi 64 montage is left commented out as an alternative.


In [None]:
#montage = mne.channels.make_standard_montage('biosemi64')
montage = mne.channels.make_standard_montage("standard_1005")
epochs.set_montage(montage)

## 3. Compute Forward Solution

A forward solution maps neural sources in the brain to the EEG electrodes. We use
the MNE-supplied `fsaverage` template, which includes:
- A source space (`ico-5`) describing dipole locations on the cortical mesh
- A BEM model describing how currents propagate through the scalp/skull/brain
- A standard head-to-MRI transform (`fsaverage`)

This step produces a gain matrix that we'll later invert (via the inverse operator).
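
To make the role of the gain matrix concrete, here is a toy NumPy sketch. The sizes, the random matrix `gain`, and the index `active_idx` are made up for illustration; the real matrix is the one computed by `make_forward_solution`.

```python
import numpy as np

rng = np.random.default_rng(42)

n_channels = 64      # hypothetical sensor count
n_sources = 500      # hypothetical number of source components

# Toy stand-in for the gain (lead field) matrix: one column per source component
gain = rng.standard_normal((n_channels, n_sources))

# Source activity with a single active component
source_amplitudes = np.zeros(n_sources)
active_idx = 123
source_amplitudes[active_idx] = 2.0

# Forward model: sensor data is a linear mixture of source activity
sensor_data = gain @ source_amplitudes

# With one active source, the scalp pattern is that source's column, scaled
print(np.allclose(sensor_data, 2.0 * gain[:, active_idx]))
```

Each column of the gain matrix is the scalp topography produced by one unit-strength source, which is why inverting it is ill-posed when there are far more sources than sensors.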

In [None]:
SUBJECTS_DIR = mne.datasets.fetch_fsaverage()
SUBJECT = 'fsaverage'

# If fetch_fsaverage() returns a path ending with "fsaverage/fsaverage",
# move one directory up to get the correct SUBJECTS_DIR:
if "fsaverage" in os.path.basename(SUBJECTS_DIR):
    SUBJECTS_DIR = os.path.dirname(SUBJECTS_DIR)
print(f"Fsaverage directory is at: {SUBJECTS_DIR}")

# Standard transformation: 'fsaverage'
trans = 'fsaverage'  

# The source space & BEM files for fsaverage
src = os.path.join(SUBJECTS_DIR, SUBJECT, 'bem', 'fsaverage-ico-5-src.fif') 
bem = os.path.join(SUBJECTS_DIR, SUBJECT, 'bem', 'fsaverage-5120-5120-5120-bem-sol.fif') 

# Build the forward solution (5 mm mindist from inner skull)
fwd = mne.make_forward_solution(
    info=epochs.info,
    trans=trans,
    src=src,
    bem=bem,
    eeg=True,
    mindist=5.0,
    n_jobs=4
)

print(fwd)

# Save forward operator for reuse
fwd_fname = os.path.join(source_path, 'fsaverage_64_fwd.fif')
mne.write_forward_solution(fwd_fname, fwd, overwrite=True)


In [None]:
mne.viz.plot_alignment(epochs.info, trans=trans, src=src, bem=bem)

In [None]:
# Check that the EEG electrode locations are correct with respect to the MRI
mne.viz.plot_alignment(
    epochs.info,
    src=src,
    eeg=["original", "projected"],
    trans=trans,
    show_axes=True,
    mri_fiducials=True,
    dig="fiducials",
)

### 3.5 Re-loading the Forward Solution (if needed)

In practice, you can load the saved forward solution instead of recomputing it 
each time. Here we show how to read it back in.

In [None]:
fname_fwd = os.path.join(source_path, 'fsaverage_64_fwd.fif')
fwd = mne.read_forward_solution(fname_fwd)

In [None]:
mne.viz.plot_alignment(
    epochs.info,
    trans='fsaverage',
    subject='fsaverage',
    subjects_dir=SUBJECTS_DIR,
    eeg=['original', 'projected'],
    src=fwd['src']
)

## 4. Parcellation - Desikan-Killiany Atlas

This atlas divides each hemisphere into 34 regions, giving 68 cortical ROIs total.

The file `aparc.annot` is part of the FreeSurfer segmentation for `fsaverage`.

In [None]:
labels = mne.read_labels_from_annot(
    subject="fsaverage",
    parc="aparc",
    subjects_dir=SUBJECTS_DIR
)
# Remove the "unknown" label(s) by name rather than by list position
labels = [label for label in labels if "unknown" not in label.name]
label_names = [label.name for label in labels]
print(label_names)

In [None]:
Brain = mne.viz.get_brain_class()
brain = Brain(
    "fsaverage",
    "both",
    "inflated",
    subjects_dir=SUBJECTS_DIR,
    background="white",
    size=(800, 600),
)
brain.add_annotation("aparc")
brain.add_label(labels[0])

## 5. Concatenate Epochs into One Raw

To avoid computing large STCs for each epoch, we concatenate all epochs in time, 
forming a single continuous `Raw` object. We then apply the inverse 
solution one label at a time, which drastically reduces memory usage.
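
The concatenation relies on a transpose followed by a reshape. The toy sketch below (NumPy only, with small made-up sizes and a hypothetical array `toy_epochs`) checks that each epoch ends up as a contiguous block on the long time axis and that the operation can be undone.

```python
import numpy as np

n_epochs, n_channels, n_times = 4, 3, 5   # small hypothetical sizes
toy_epochs = np.arange(n_epochs * n_channels * n_times, dtype=float).reshape(
    n_epochs, n_channels, n_times
)

# Move channels first, then flatten (epoch, time) into one long time axis
toy_concat = toy_epochs.transpose(1, 0, 2).reshape(n_channels, -1)

# Epoch k occupies samples [k * n_times, (k + 1) * n_times) of the long axis
k = 2
epoch_k = toy_concat[:, k * n_times:(k + 1) * n_times]
print(np.array_equal(epoch_k, toy_epochs[k]))

# The inverse operation restores the original epoch layout
toy_restored = toy_concat.reshape(n_channels, n_epochs, n_times).transpose(1, 0, 2)
print(np.array_equal(toy_restored, toy_epochs))
```

Reshaping without the transpose would interleave channels and epochs, so the order of the two steps matters.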


In [None]:
# 1) Keep only EEG channels (drop EOG, etc. if any)
epochs_eeg = epochs.copy().pick_types(eeg=True)

# 2) Take the shape from the EEG-only data, so that n_channels matches the
#    array reshaped below even if non-EEG channels were dropped
data_3d = epochs_eeg.get_data()  # shape (n_epochs, n_eeg_ch, n_times)
n_epochs, n_channels, n_times = data_3d.shape
print(f"Epochs shape: {n_epochs} epochs, {n_channels} channels, {n_times} time points each")

# 3) Convert [n_epochs, n_channels, n_times] -> [n_channels, n_epochs * n_times]
data_2d = data_3d.transpose(1, 0, 2).reshape(n_channels, -1)
print(f"Data shape after conversion: {data_2d.shape}")

# 4) Create a RawArray with the same Info (for EEG channels only)
info_eeg = epochs_eeg.info
raw = mne.io.RawArray(data_2d, info_eeg)
raw._filenames = [""]  # to avoid warnings about missing filename
raw.set_eeg_reference(projection=True)
print(raw)

## 6. Noise Covariance

Here we use a simple **ad-hoc covariance** (diagonal with default noise values). 

It is also possible to use `mne.compute_covariance`, but that requires a baseline (or other noise-only) period, which we don't have here.
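
As a rough picture of what a diagonal covariance implies, the NumPy sketch below builds one by hand and uses it to whiten simulated noise. The noise level `noise_std`, the sizes, and all variable names are assumptions chosen for illustration; the values MNE actually uses are the ones printed from `noise_cov.data` in the next cell.

```python
import numpy as np

rng = np.random.default_rng(0)

n_channels, n_samples = 8, 5000
noise_std = 0.2e-6   # assumed per-channel noise level in volts (illustrative)

# Diagonal covariance: same variance on every channel, no cross-channel terms
toy_cov = np.diag(np.full(n_channels, noise_std**2))

# Whitening with a diagonal covariance is a per-channel rescaling
whitener = np.diag(1.0 / np.sqrt(np.diag(toy_cov)))

toy_noise = noise_std * rng.standard_normal((n_channels, n_samples))
whitened = whitener @ toy_noise

# After whitening, each channel has roughly unit variance
print(whitened.var(axis=1))
```

Because the off-diagonal terms are zero, this model ignores correlated noise across channels, which is the main limitation compared with a covariance estimated from data.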

In [None]:
noise_cov = mne.make_ad_hoc_cov(raw.info, None)
print("Ad-hoc noise covariance diagonal:\n", noise_cov.data)

## 7. Construct the Inverse Operator

Combining the forward model, noise covariance, and sensor info yields an inverse 
operator that we can use to reconstruct cortical sources from scalp measurements.
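
The idea behind the inverse operator can be sketched with a textbook regularized minimum-norm estimate in NumPy. This is a simplified illustration, not MNE's implementation: MNE additionally whitens the data, applies depth weighting and orientation constraints, and rescales the regularization. The function `minimum_norm_kernel`, the random `gain`, and the identity `noise_cov` are all hypothetical; the `lambda2 = 1 / snr**2` convention is the one used later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

n_channels, n_sources = 16, 100          # hypothetical sizes
gain = rng.standard_normal((n_channels, n_sources))
noise_cov = np.eye(n_channels)           # toy noise covariance (identity)


def minimum_norm_kernel(gain, noise_cov, lambda2):
    """Return the (n_sources, n_channels) regularized minimum-norm kernel."""
    gram = gain @ gain.T + lambda2 * noise_cov
    # Solve instead of inverting explicitly: kernel = G^T (G G^T + lambda2 C)^-1
    return np.linalg.solve(gram, gain).T


sensor_data = rng.standard_normal(n_channels)

snr = 3.0
kernel = minimum_norm_kernel(gain, noise_cov, lambda2=1.0 / snr**2)
source_estimate = kernel @ sensor_data

# With almost no regularization the estimate reproduces the sensor data
kernel_unreg = minimum_norm_kernel(gain, noise_cov, lambda2=1e-12)
source_unreg = kernel_unreg @ sensor_data
print(np.allclose(gain @ source_unreg, sensor_data))

# Regularization shrinks the estimate relative to the unregularized one
print(np.linalg.norm(source_estimate) < np.linalg.norm(source_unreg))
```

A larger `lambda2` (lower assumed SNR) trades data fit for smaller, smoother source amplitudes.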


In [None]:
inverse_operator = mne.minimum_norm.make_inverse_operator(
    info=raw.info,
    forward=fwd,
    noise_cov=noise_cov,
    loose=1.0,    # free orientation
    depth=0.8
)

## 8. Obtain Time Series for Each ROI

We loop over each Desikan-Killiany label, computing the inverse solution 
restricted to that subset of cortical vertices. This is memory-efficient 
because we never handle the entire cortex at once.

- **`apply_inverse_raw(label=...)`**: Only solves for vertices in that label.
- **PCA**: Reduces the 3D dipole orientations to a single principal axis.
- **mean_flip**: Ensures consistent polarity so waveforms don't cancel out.
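
The last two bullets can be illustrated with a toy NumPy sketch. It is written in the spirit of what MNE does, not as a copy of its implementation, and every array here (`true_dir`, `normals`, `vertex_ts`) is invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_times = 200
signal = np.sin(2 * np.pi * 5 * np.arange(n_times) / n_times)

# --- PCA over orientations (one vertex) ---
true_dir = np.array([1.0, 2.0, 2.0]) / 3.0          # unit-length dipole direction
vec_data = np.outer(true_dir, signal)                # shape (3, n_times)
vec_data += 0.01 * rng.standard_normal(vec_data.shape)

# First left singular vector = direction explaining the most variance
u, _, _ = np.linalg.svd(vec_data, full_matrices=False)
pca_dir = u[:, 0]
scalar_ts = pca_dir @ vec_data                       # shape (n_times,)

# --- Sign flip before averaging (several vertices) ---
# Two vertices point "up", two point "down" (e.g. opposite walls of a sulcus)
normals = np.array([[0.0, 0.1, 1.0], [0.0, -0.1, 1.0],
                    [0.0, 0.1, -1.0], [0.0, -0.1, -1.0]])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
vertex_ts = np.array([signal, signal, -signal, -signal])

# Dominant normal direction across the label, then one sign per vertex
_, _, vh = np.linalg.svd(normals, full_matrices=False)
flips = np.sign(normals @ vh[0])

plain_mean = vertex_ts.mean(axis=0)                      # opposite signs cancel
flip_mean = (flips[:, None] * vertex_ts).mean(axis=0)    # signal survives

print(np.abs(plain_mean).max(), np.abs(flip_mean).max())
```

Both the PCA direction and the dominant normal direction come from an SVD, so the overall sign of the resulting time course is not fixed; only its shape is meaningful.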


In [None]:
n_labels = len(labels)
print(f"Number of labels (ROIs): {n_labels}")

snr = 3.0
lambda2 = 1.0 / snr**2

# label_ts will store the final time courses: [n_labels, total_time_points]
label_ts = np.zeros((n_labels, n_epochs * n_times))

for li, label in enumerate(labels):
    # Apply inverse for only these vertices
    stc = mne.minimum_norm.apply_inverse_raw(
        raw,
        inverse_operator,
        lambda2=lambda2,
        method='MNE',
        pick_ori='vector',  # unconstrained orientation
        label=label,
        verbose=False
    )

    # PCA to collapse the 3 orientation components
    stc_pca, pca_dir = stc.project(directions='pca', src=inverse_operator['src'])

    # Extract the mean time course for this ROI
    roi_data = mne.extract_label_time_course(
        stc_pca, [label], inverse_operator['src'],
        mode='mean_flip', return_generator=False, verbose=False
    )
    # Insert into array (roi_data[0] => shape = total_time_points)
    label_ts[li, :] = roi_data[0, :]

    # Free memory
    del stc, stc_pca

    # Progress logging
    if (li+1) % 5 == 0:
        print(f"Processed {li+1} / {n_labels} labels")

print("All labels processed. Shape of label_ts:", label_ts.shape)


### Quick Plot of One ROI

Here we just take one ROI (e.g., label index 20) and show its time course across 
the concatenated epochs.


In [None]:
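# Plot the time series of one ROI (label index 20)
plot_idx = 20
plot_times = np.arange(label_ts.shape[1]) / epochs.info['sfreq']  # seconds
plt.figure(figsize=(10, 5))
plt.plot(plot_times, 1e9 * label_ts[plot_idx, :])  # Am -> nAm
plt.xlabel("Time (s, concatenated epochs)")
plt.ylabel("Mean source amplitude (nAm)")
plt.title(f"Mean source amplitude for {labels[plot_idx].name}")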
plt.show()

## 9. (Optional) Reshape into Epochs

If you want to restore the data to an epoch-like structure, we can reshape 
`[label, total_time] -> [epochs, label, time]`. Then we can use `mne.EpochsArray`
to visualize it in MNE's usual epoch plot.

In [None]:
label_ts_reshaped = label_ts.reshape(n_labels, n_epochs, n_times).transpose(1, 0, 2)
print("New shape: ", label_ts_reshaped.shape)  # (n_epochs, n_labels, n_times)

# Create an EpochsArray object for ROI-based signals
info = mne.create_info(
    ch_names=[lbl.name for lbl in labels],
    sfreq=epochs.info['sfreq'],
    ch_types='eeg'
)
label_epochs = mne.EpochsArray(
    data=label_ts_reshaped,
    info=info,
    tmin=epochs.times[0],
    verbose=False
)
# save the epochs
label_epochs.save(os.path.join(source_path, 's101_Coordination-source-epo.fif'), overwrite=True)

In [None]:

# Plot the ROI-level epochs
label_epochs.plot(n_channels=10, n_epochs=5, scalings="auto", title="ROI-level epochs")


In [None]:
#print label names
print(label_names)
print(epochs.ch_names)

In [None]:

# label_ts has shape (n_labels, total_time)
# 'labels' is a list of label objects
# 'label_names' is the list of label names, or you can do [lbl.name for lbl in labels]

roi_info = mne.create_info(
    ch_names=[lbl.name for lbl in labels],    # e.g. "cuneus-lh", "insula-rh", ...
    sfreq=epochs.info['sfreq'],
    ch_types='eeg'  # treat each ROI time course as an EEG channel
)

roi_raw = mne.io.RawArray(label_ts, roi_info)
roi_raw._filenames = [""]  # to avoid filename warnings


# Plot the sensor-level concatenated data
raw.plot(
    n_channels=10,   # how many channels to view at once
    scalings='auto',
    title='Sensor-level (Raw)'
)

# Plot the ROI-level data
roi_raw.plot(
    n_channels=10,
    scalings='auto',
    title='ROI-level (Raw)'
)


In [None]:
# --- Choose an anatomically close pair ---
sensor_name = 'P1'       # Sensor channel (from raw.ch_names)
roi_name = 'superiorparietal-lh'   # ROI channel (from roi_raw.ch_names)

# --- Extract data from raw objects ---
sensor_idx = raw.ch_names.index(sensor_name)
roi_idx = roi_raw.ch_names.index(roi_name)

sensor_signal = raw.get_data(picks=[sensor_idx])[0]  # shape: (n_times,)
roi_signal = roi_raw.get_data(picks=[roi_idx])[0]    # shape: (n_times,)

# --- Time vector ---
# Use a separate name so the per-epoch n_times defined earlier is not overwritten
sfreq = raw.info['sfreq']
n_samples = sensor_signal.shape[0]
time = np.arange(n_samples) / sfreq  # in seconds


In [None]:
# --- Choose an anatomically close pair ---
sensor_name = 'C1'       # Sensor channel (from raw.ch_names)
roi_name = 'paracentral-lh'   # ROI channel (from roi_raw.ch_names)

# --- Extract data from raw objects ---
sensor_idx = raw.ch_names.index(sensor_name)

roi_idx = roi_raw.ch_names.index(roi_name)

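sensor_signal = raw.get_data(picks=[sensor_idx])[0]  # shape: (n_samples,)
# Sign flipped for visual comparison: the polarity of a mean_flip ROI
# time course is not guaranteed to match the sensor polarity
roi_signal = -roi_raw.get_data(picks=[roi_idx])[0]    # shape: (n_samples,)

# --- Time vector ---
# Use a separate name so the per-epoch n_times defined earlier is not overwritten
sfreq = raw.info['sfreq']
n_samples = sensor_signal.shape[0]
time = np.arange(n_samples) / sfreq  # in seconds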


In [None]:
# --- Plot side-by-side ---
fig, axes = plt.subplots(2, 1, figsize=(12, 6), sharex=True)

# MNE stores EEG in volts and source amplitudes in Am; rescale to match the axis labels
axes[0].plot(time, 1e6 * sensor_signal, color='blue')
axes[0].set_ylabel('Sensor amplitude (µV)')
axes[0].set_title(f'Sensor-level signal: {sensor_name}')

axes[1].plot(time, 1e9 * roi_signal, color='green')
axes[1].set_ylabel('Source amplitude (nAm)')
axes[1].set_xlabel('Time (s)')
axes[1].set_title(f'Source-level signal: {roi_name}')

plt.tight_layout()
plt.show()

In [None]:
from scipy.stats import zscore

sensor_signal_norm = zscore(sensor_signal)
roi_signal_norm = zscore(roi_signal)

plt.figure(figsize=(12, 5))
plt.plot(time, sensor_signal_norm, label=f'{sensor_name} (Sensor)', linewidth=2)
plt.plot(time, roi_signal_norm, label=f'{roi_name} (ROI)', linewidth=2)
plt.xlabel('Time (s)')
plt.ylabel('Z-scored amplitude')
plt.title('Normalized ROI vs Sensor waveforms')
plt.legend()
plt.show()


# Conclusion

This notebook demonstrates a **memory-efficient** approach to EEG source 
reconstruction in MNE using a template subject (`fsaverage`). We:

1. Loaded epochs from EEGLAB format.
2. Set a standard 10-05 template montage (`standard_1005`).
3. Computed (and saved) a forward solution.
4. **Looped label-by-label** to apply the inverse, reducing memory usage.
5. Extracted label time series and optionally reshaped them for epoch-style plotting.

From here, you could:
- Compute power spectra in each ROI,
- Investigate connectivity between ROIs,
- or visualize full-brain STCs for single epochs if you want 3D interactive plots 
  (by not restricting to `label=...` in `apply_inverse_raw`).
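
As a starting point for the first suggestion, here is a minimal periodogram sketch in NumPy. The array `toy_label_ts`, the sampling rate, and the two test frequencies are invented stand-ins for `label_ts`; MNE and SciPy offer more robust estimators (for example Welch or multitaper methods).

```python
import numpy as np

rng = np.random.default_rng(0)
sfreq = 250.0                       # hypothetical sampling rate in Hz
n_samples = 1000                    # 4 s of data
t = np.arange(n_samples) / sfreq

# Toy stand-in for label_ts: two "ROIs" oscillating at 10 Hz and 20 Hz
toy_label_ts = np.vstack([
    np.sin(2 * np.pi * 10.0 * t),
    np.sin(2 * np.pi * 20.0 * t),
]) + 0.1 * rng.standard_normal((2, n_samples))

# Periodogram per ROI: squared FFT magnitude, scaled by sfreq * n_samples
spectrum = np.fft.rfft(toy_label_ts, axis=1)
psd = (np.abs(spectrum) ** 2) / (sfreq * n_samples)
freqs = np.fft.rfftfreq(n_samples, d=1.0 / sfreq)

# Frequency with the most power in each ROI
peak_freqs = freqs[np.argmax(psd, axis=1)]
print(peak_freqs)
```

For real data, computing the spectrum per epoch and averaging across epochs avoids the discontinuities that concatenation introduces at epoch boundaries.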

Happy analyzing!
