# Visualizing PhysioNet EEG Motor Movement/Imagery Dataset

This notebook demonstrates how to download, load, and visualize EEG data from the PhysioNet EEG Motor Movement/Imagery Dataset.

## Dataset Information

The EEG Motor Movement/Imagery Dataset contains 64-channel EEG data from 109 subjects performing various motor/imagery tasks. The dataset is available at:
https://physionet.org/content/eegmmidb/1.0.0/

**Note**: The recordings are distributed in EDF+ format (not MATLAB `.mat` files). EDF is a widely used container for physiological signals.

## Dataset Details
- **Subjects**: 109 volunteers
- **Channels**: 64 EEG channels
- **Sampling Rate**: 160 Hz
- **Tasks**: Baseline, motor execution, and motor imagery tasks
- **Format**: EDF (European Data Format)
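
The figures above fix the approximate size of each recording. A rough sketch of that arithmetic follows, assuming a nominal 60-second baseline run and the 16-bit samples that EDF uses. The helper name `recording_size` is illustrative only, and real files are slightly larger because they also carry a header and an annotation signal.

```python
def recording_size(n_channels=64, sfreq_hz=160, duration_s=60, bytes_per_sample=2):
    """Return (samples per channel, approximate signal payload in bytes).

    duration_s=60 is an assumed nominal run length, not a measured value.
    """
    samples_per_channel = sfreq_hz * duration_s
    payload_bytes = n_channels * samples_per_channel * bytes_per_sample
    return samples_per_channel, payload_bytes


samples, size_bytes = recording_size()
print(f"{samples} samples per channel, about {size_bytes / 1e6:.1f} MB of signal data")
```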

## Requirements

Before running this notebook, install the following packages. The PSD cell in Section 4.3 calls `raw.compute_psd`, which requires MNE 1.2 or later:

```bash
pip install numpy matplotlib scipy mne requests
```

These dependencies are also listed in `requirements.txt`.

## 1. Import Required Libraries

In [None]:
import os
import requests
import numpy as np
import matplotlib.pyplot as plt
import mne
from pathlib import Path

# Set up plotting style
try:
    plt.style.use('seaborn-v0_8-darkgrid')
except OSError:
    # Fall back to default style if seaborn style is not available
    plt.style.use('default')
    print("Note: Using default matplotlib style")
%matplotlib inline

print("Libraries imported successfully!")
print(f"MNE version: {mne.__version__}")

## 2. Download EEG Data from PhysioNet

We'll download a sample EDF file from the PhysioNet database. For this example, we'll use Subject 1, Run 1 (baseline task with eyes open).

In [None]:
# Create a directory to store the downloaded data
data_dir = Path('physionet_eeg_data')
data_dir.mkdir(exist_ok=True)

# Define the URL for the EEG file
# Subject 1, Run 1 (baseline, eyes open)
subject = 'S001'
run = 'R01'
filename = f'{subject}{run}.edf'
url = f'https://physionet.org/files/eegmmidb/1.0.0/{subject}/{filename}'

file_path = data_dir / filename

# Download the file if it doesn't exist
if not file_path.exists():
    print(f"Downloading {filename}...")
    try:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        
        with open(file_path, 'wb') as f:
            f.write(response.content)
        
        print(f"Download complete! File saved to {file_path}")
    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 404:
            print("Error: File not found (404). The file may have been moved or the URL is incorrect.")
        elif e.response.status_code == 403:
            print("Error: Access denied (403). You may not have permission to access this file.")
        else:
            print(f"HTTP Error {e.response.status_code}: {e}")
        print("\nAlternative: You can manually download the file from:")
        print(url)
        print(f"And place it in the '{data_dir}' directory")
        raise
    except requests.exceptions.RequestException as e:
        print(f"Error downloading file: {e}")
        print("This may be due to network issues or firewall restrictions.")
        raise
else:
    print(f"File already exists: {file_path}")
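
The run number in the file name determines which task was recorded. The sketch below generalizes the URL pattern used in the cell above to any subject and run. The run-to-task table follows the dataset's published description as I understand it, so verify it against the PhysioNet page before relying on it. `RUN_TASKS`, `task_for_run`, and `edf_url` are hypothetical names introduced for illustration.

```python
BASE_URL = "https://physionet.org/files/eegmmidb/1.0.0"

# Assumed run-to-task mapping (check against the dataset documentation)
RUN_TASKS = {
    "baseline, eyes open": [1],
    "baseline, eyes closed": [2],
    "motor execution: left vs. right fist": [3, 7, 11],
    "motor imagery: left vs. right fist": [4, 8, 12],
    "motor execution: both fists vs. both feet": [5, 9, 13],
    "motor imagery: both fists vs. both feet": [6, 10, 14],
}


def task_for_run(run):
    """Return the task description for a run number (1-14)."""
    for task, runs in RUN_TASKS.items():
        if run in runs:
            return task
    raise ValueError(f"run must be between 1 and 14, got {run}")


def edf_url(subject, run):
    """Build the download URL for one recording, e.g. subject 1, run 1."""
    if not 1 <= subject <= 109:
        raise ValueError(f"subject must be between 1 and 109, got {subject}")
    task_for_run(run)  # raises ValueError for an invalid run number
    subject_id = f"S{subject:03d}"
    return f"{BASE_URL}/{subject_id}/{subject_id}R{run:02d}.edf"


print(edf_url(1, 1), "->", task_for_run(1))
```

Keeping the mapping in one table makes it easier to select, say, all motor imagery runs for a subject without hard-coding file names.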

## 3. Load EEG Data

We'll use the MNE-Python library to load and process the EDF file. MNE-Python is an open-source package for EEG/MEG analysis and reads EDF files directly via `mne.io.read_raw_edf`.

In [None]:
# Load the EDF file
# Note: Using preload=False initially to check metadata without loading all data
# For large files, this saves memory. We'll load data as needed later.
try:
    raw = mne.io.read_raw_edf(file_path, preload=False, verbose=False)
    print("EEG data file opened successfully!")
    print(f"\nDataset Information:")
    print(f"  Number of channels: {len(raw.ch_names)}")
    print(f"  Sampling rate: {raw.info['sfreq']} Hz")
    print(f"  Duration: {raw.times[-1]:.2f} seconds")
    print(f"  Total samples: {len(raw.times)}")
    print(f"\nFirst 10 channel names: {raw.ch_names[:10]}")
    
    # Load the data into memory now (for this small example file it's fine)
    # For larger files, consider loading only specific time segments
    raw.load_data()
    print("\nData loaded into memory.")
except Exception as e:
    print(f"Error loading EEG data: {e}")
    raise

## 4. Visualize EEG Channels

### 4.1 Plot Multiple Channels - Time Series View

We'll create a time-series plot showing multiple EEG channels. This gives us an overview of the signal patterns across different brain regions.

In [None]:
# Select a subset of channels for visualization
# We'll focus on the frontocentral (FC) row, from left (Fc5) to right (Fc6).
# Labels in these EDF files are padded with trailing dots, e.g. 'Fc5.'
channels_to_plot = ['Fc5.', 'Fc3.', 'Fc1.', 'Fcz.', 'Fc2.', 'Fc4.', 'Fc6.']

# Filter to only include channels that exist in the data
available_channels = [ch for ch in channels_to_plot if ch in raw.ch_names]

if not available_channels:
    # If specified channels are not available, use the first 7 channels
    available_channels = raw.ch_names[:7]

print(f"Plotting channels: {available_channels}")

# Extract data for selected channels (first 10 seconds)
duration = 10.0  # seconds
start_time = 0.0
end_time = start_time + duration

# Get data
data, times = raw.get_data(picks=available_channels, 
                           start=int(start_time * raw.info['sfreq']),
                           stop=int(end_time * raw.info['sfreq']),
                           return_times=True)

# Convert from Volts to microvolts for better readability
data_uv = data * 1e6

# Create the plot
fig, axes = plt.subplots(len(available_channels), 1, 
                         figsize=(14, 2*len(available_channels)), 
                         sharex=True)

if len(available_channels) == 1:
    axes = [axes]

for idx, (ax, ch_name) in enumerate(zip(axes, available_channels)):
    ax.plot(times, data_uv[idx], linewidth=0.5, color='steelblue')
    ax.set_ylabel(f'{ch_name}\n(µV)', fontsize=10)
    ax.grid(True, alpha=0.3)
    ax.set_xlim(times[0], times[-1])
    
    # Add zero line for reference
    ax.axhline(y=0, color='red', linestyle='--', alpha=0.3, linewidth=0.8)

axes[-1].set_xlabel('Time (seconds)', fontsize=12)
fig.suptitle('EEG Time Series - Multiple Channels', fontsize=14, fontweight='bold', y=0.995)
plt.tight_layout()
plt.show()

print(f"\nDisplayed {duration} seconds of EEG data from channels: {', '.join(available_channels)}")
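
The cell above turns a time window in seconds into sample indices by multiplying by the sampling rate. A minimal sketch of that conversion with bounds checking is shown here. The function name `window_to_samples` and the sample count in the usage line are illustrative assumptions, not values taken from the file.

```python
def window_to_samples(start_s, duration_s, sfreq, n_samples):
    """Convert a time window in seconds to clamped (start, stop) sample indices."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    start = max(0, int(round(start_s * sfreq)))
    stop = min(n_samples, int(round((start_s + duration_s) * sfreq)))
    if start >= stop:
        raise ValueError("window lies outside the recording")
    return start, stop


# Hypothetical recording with 9760 samples at 160 Hz
print(window_to_samples(0.0, 10.0, 160.0, 9760))
print(window_to_samples(55.0, 10.0, 160.0, 9760))  # stop is clamped to the end
```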

### 4.2 Plot Single Channel with Detailed View

Let's examine a single channel in more detail to see the EEG waveform characteristics.

In [None]:
# Select a single channel for detailed visualization
single_channel = available_channels[0]

# Get 5 seconds of data
duration = 5.0
data_single, times_single = raw.get_data(picks=[single_channel],
                                         start=0,
                                         stop=int(duration * raw.info['sfreq']),
                                         return_times=True)

# Convert to microvolts
data_single_uv = data_single[0] * 1e6

# Create detailed plot
fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(times_single, data_single_uv, linewidth=1.0, color='darkblue', label=single_channel)
ax.set_xlabel('Time (seconds)', fontsize=12)
ax.set_ylabel('Amplitude (µV)', fontsize=12)
ax.set_title(f'Detailed EEG Signal - Channel {single_channel}', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3)
ax.legend(loc='upper right')
ax.axhline(y=0, color='red', linestyle='--', alpha=0.3, linewidth=1.0)

plt.tight_layout()
plt.show()

# Print statistics
print(f"\nChannel {single_channel} Statistics:")
print(f"  Mean: {np.mean(data_single_uv):.2f} µV")
print(f"  Std Dev: {np.std(data_single_uv):.2f} µV")
print(f"  Min: {np.min(data_single_uv):.2f} µV")
print(f"  Max: {np.max(data_single_uv):.2f} µV")
print(f"  Range: {np.max(data_single_uv) - np.min(data_single_uv):.2f} µV")

### 4.3 Power Spectral Density (PSD) Plot

Analyze the frequency content of the EEG signals. This helps identify dominant brain rhythms (delta, theta, alpha, beta, gamma).

In [None]:
# Compute power spectral density
fig, ax = plt.subplots(figsize=(12, 6))

# Select channels for PSD analysis
psd_channels = available_channels[:4]  # Use first 4 channels

# Compute and plot PSD for each channel
for ch in psd_channels:
    # Get PSD using Welch's method
    spectrum = raw.compute_psd(picks=[ch], fmin=0.5, fmax=50, method='welch', verbose=False)
    psds, freqs = spectrum.get_data(return_freqs=True)
    
    # Convert to dB
    psds_db = 10 * np.log10(psds[0])
    
    ax.plot(freqs, psds_db, linewidth=2, label=ch, alpha=0.8)

# Mark frequency bands
bands = {
    'Delta (0.5-4 Hz)': (0.5, 4, 'gray'),
    'Theta (4-8 Hz)': (4, 8, 'blue'),
    'Alpha (8-13 Hz)': (8, 13, 'green'),
    'Beta (13-30 Hz)': (13, 30, 'orange'),
    'Gamma (30-50 Hz)': (30, 50, 'red')
}

for band_name, (fmin, fmax, color) in bands.items():
    ax.axvspan(fmin, fmax, alpha=0.1, color=color)

ax.set_xlabel('Frequency (Hz)', fontsize=12)
ax.set_ylabel('Power Spectral Density (dB re 1 V²/Hz)', fontsize=12)
ax.set_title('Power Spectral Density - EEG Frequency Bands', fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)
ax.set_xlim(0.5, 50)

plt.tight_layout()
plt.show()

print("\nEEG Frequency Bands:")
print("  Delta (0.5-4 Hz): Deep sleep")
print("  Theta (4-8 Hz): Drowsiness, meditation")
print("  Alpha (8-13 Hz): Relaxed, eyes closed")
print("  Beta (13-30 Hz): Active thinking, focus")
print("  Gamma (30-50 Hz): High-level cognitive processing")
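
To put numbers on the bands listed above, the power in each band can be integrated from a spectrum. The sketch below uses a plain FFT periodogram on a synthetic signal rather than Welch's method or MNE, so it only illustrates the idea. `band_powers` is a hypothetical helper, and the 10 Hz test signal is made up.

```python
import numpy as np

BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 50.0),
}


def band_powers(x, sfreq, bands=BANDS):
    """Integrate a one-sided periodogram over each band (units: signal units squared)."""
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / sfreq)
    psd = np.abs(np.fft.rfft(x)) ** 2 / (sfreq * n)
    # Fold negative frequencies into the one-sided spectrum
    # (DC is never doubled; the Nyquist bin exists only for even n)
    if n % 2 == 0:
        psd[1:-1] *= 2
    else:
        psd[1:] *= 2
    df = freqs[1] - freqs[0]
    return {
        name: float(psd[(freqs >= lo) & (freqs < hi)].sum() * df)
        for name, (lo, hi) in bands.items()
    }


# Synthetic test signal: a 10 Hz rhythm (amplitude 20) plus weak white noise
rng = np.random.default_rng(0)
sfreq = 160.0
t = np.arange(0, 10, 1 / sfreq)
x = 20.0 * np.sin(2 * np.pi * 10.0 * t) + rng.normal(0.0, 2.0, t.size)

powers = band_powers(x, sfreq)
dominant = max(powers, key=powers.get)
print("Dominant band:", dominant)
```

A sine of amplitude A carries power A²/2, so the alpha band here should hold roughly 200 units, which gives a quick sanity check on the scaling.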

### 4.4 Topographic Map (Scalp Topography)

Visualize the spatial distribution of EEG amplitude across the scalp, averaged over a short time window.

In [None]:
# Create a topographic map showing the average signal across a time window
try:
    # Select a time window (e.g., 2-3 seconds)
    tmin, tmax = 2.0, 3.0
    
    # Get the data for all EEG channels in that window
    data_for_topo = raw.get_data(start=int(tmin * raw.info['sfreq']),
                                 stop=int(tmax * raw.info['sfreq']))
    
    # Average over time, keeping a single "sample" per channel
    data_avg = np.mean(data_for_topo, axis=1, keepdims=True)
    
    # The EDF labels carry trailing dots (e.g. 'Fc5.'), which do not match
    # the names in the standard montage, so strip them first
    clean_names = [ch.rstrip('.') for ch in raw.ch_names]
    
    # Create evoked structure for plotting
    info = mne.create_info(ch_names=clean_names, sfreq=raw.info['sfreq'], ch_types='eeg')
    evoked = mne.EvokedArray(data_avg, info, tmin=0)
    
    # Set standard montage for electrode positions
    # match_case=False lets 'Fc5' match the montage entry 'FC5'
    montage = mne.channels.make_standard_montage('standard_1020')
    evoked.set_montage(montage, match_case=False, on_missing='ignore')
    
    # Plot topographic map. The title is set on the returned figure because
    # newer MNE releases no longer accept a `title` argument here.
    fig = evoked.plot_topomap(times=[0], ch_type='eeg', 
                              time_format='',
                              colorbar=True, size=4,
                              show=False)
    fig.suptitle('EEG Topographic Map\n(Average 2-3 seconds)')
    plt.show()
    
    print("Topographic map shows the spatial distribution of EEG activity across the scalp.")
    print("Warmer colors indicate higher amplitude, cooler colors indicate lower amplitude.")
    
except Exception as e:
    print(f"Note: Topographic plotting requires proper electrode positions.")
    print(f"Error: {e}")
    print("This is expected if the electrode positions are not standard or missing.")
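
Electrode positions are looked up by channel name, so the labels must match the montage's naming. The sketch below maps dot-padded labels such as `'Fc5.'` to 10-20 style names. It assumes that trailing dots are only padding and that only the `Fp` prefix and a trailing `z` are lower-case. `normalize_label` is a hypothetical helper, not an MNE function.

```python
def normalize_label(label):
    """Map a dot-padded EDF label such as 'Fc5.' to a 10-20 style name such as 'FC5'."""
    name = label.strip().rstrip(".").upper()
    if name.startswith("FP"):
        name = "Fp" + name[2:]   # frontopolar sites keep a lower-case 'p'
    if name.endswith("Z"):
        name = name[:-1] + "z"   # midline sites end in a lower-case 'z'
    return name


for raw_label in ["Fc5.", "Fcz.", "Fp1.", "Cz..", "T10."]:
    print(f"{raw_label!r} -> {normalize_label(raw_label)!r}")
```

With MNE, a mapping built this way could be passed to `raw.rename_channels` before calling `set_montage`.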

## 5. Interactive Visualization with MNE

MNE provides an interactive browser for exploring EEG data. This is useful for detailed inspection of the signals.

In [None]:
# Note: The interactive plot works best in local Jupyter environments
# It may not display properly in some cloud-based notebooks

print("Creating interactive EEG browser...")
print("Note: This visualization works best in local Jupyter notebooks.")
print("If running in a cloud environment, you may see a static image instead.")
print("\nIn the interactive view, you can:")
print("  - Scroll through time")
print("  - Zoom in/out")
print("  - Select/deselect channels")
print("  - Adjust scaling")

# Create static plot for non-interactive environments
fig = raw.plot(duration=10.0, n_channels=20, scalings='auto', 
               title='EEG Data Browser (First 10 seconds, 20 channels)',
               show=False, block=False)
plt.show()

## 6. Summary Statistics

Let's compute some summary statistics for all channels to get an overview of the data quality and characteristics.

In [None]:
# Get all data
all_data = raw.get_data()

# Convert to microvolts
all_data_uv = all_data * 1e6

# Compute statistics for each channel
print("\nSummary Statistics for All Channels:")
print("=" * 80)
print(f"{'Channel':<15} {'Mean (µV)':<12} {'Std (µV)':<12} {'Min (µV)':<12} {'Max (µV)':<12}")
print("-" * 80)

for idx, ch_name in enumerate(raw.ch_names[:10]):  # Show first 10 channels
    mean_val = np.mean(all_data_uv[idx])
    std_val = np.std(all_data_uv[idx])
    min_val = np.min(all_data_uv[idx])
    max_val = np.max(all_data_uv[idx])
    
    print(f"{ch_name:<15} {mean_val:>11.2f} {std_val:>11.2f} {min_val:>11.2f} {max_val:>11.2f}")

if len(raw.ch_names) > 10:
    print(f"... and {len(raw.ch_names) - 10} more channels")

print("\nOverall Dataset Statistics:")
print(f"  Total recording duration: {raw.times[-1]:.2f} seconds ({raw.times[-1]/60:.2f} minutes)")
print(f"  Number of channels: {len(raw.ch_names)}")
print(f"  Sampling rate: {raw.info['sfreq']} Hz")
print(f"  Total samples per channel: {len(raw.times):,}")
print(f"  Overall mean amplitude: {np.mean(all_data_uv):.2f} µV")
print(f"  Overall std amplitude: {np.std(all_data_uv):.2f} µV")
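
Per-channel statistics like these can also serve as a first data-quality screen. The sketch below flags channels whose standard deviation is a robust outlier relative to the other channels, using the median absolute deviation. The threshold of 5, the helper name `flag_outlier_channels`, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def flag_outlier_channels(data_uv, z_thresh=5.0):
    """Return indices of channels whose std deviates strongly from the median std."""
    stds = data_uv.std(axis=1)              # one value per channel
    med = np.median(stds)
    mad = np.median(np.abs(stds - med))     # median absolute deviation
    if mad == 0:
        return np.flatnonzero(stds != med)
    robust_z = 0.6745 * (stds - med) / mad  # 0.6745 scales MAD to a normal sigma
    return np.flatnonzero(np.abs(robust_z) > z_thresh)


# Synthetic data: 8 channels of a 10 Hz sine, one flat (index 2), one very large (index 5)
sfreq = 160.0
t = np.arange(1600) / sfreq
amplitudes_uv = np.array([10.0, 11.0, 0.0, 9.0, 10.5, 200.0, 9.5, 10.0])
data_uv = amplitudes_uv[:, None] * np.sin(2 * np.pi * 10.0 * t)[None, :]

flagged = flag_outlier_channels(data_uv)
print("Flagged channel indices:", flagged.tolist())
```

Computing along `axis=1` processes all channels at once, which avoids the explicit Python loop when the full table is not needed.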

## 7. Conclusion

This notebook demonstrated:

1. **Data Download**: How to programmatically download EEG data from PhysioNet
2. **Data Loading**: Using MNE-Python to load EDF files
3. **Visualization**: Multiple approaches to visualize EEG data:
   - Time-series plots of multiple channels
   - Detailed single-channel analysis
   - Power spectral density analysis
   - Topographic maps (spatial distribution)
   - Interactive browser
4. **Analysis**: Basic statistical analysis of EEG signals

### Next Steps

You can extend this notebook by:
- Downloading and comparing multiple runs from the same subject
- Comparing different tasks (baseline vs. motor execution vs. motor imagery)
- Applying filters (bandpass, notch) to suppress slow drift and power-line interference
- Performing event-related potential (ERP) analysis
- Using machine learning for classification of motor tasks
- Analyzing connectivity between different brain regions
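
As a starting point for the filtering step listed above, here is a minimal sketch using SciPy on a synthetic signal. It assumes 60 Hz mains interference and a 1-40 Hz band of interest. Both are illustrative choices, and MNE's own `raw.filter` and `raw.notch_filter` methods would be the more natural tools on a real `Raw` object.

```python
import numpy as np
from scipy import signal

sfreq = 160.0                      # sampling rate stated in this notebook
t = np.arange(0, 10, 1 / sfreq)    # 10 s of synthetic data

# Synthetic "EEG": a 10 Hz rhythm plus 60 Hz line noise (assumed mains frequency)
noisy = 20.0 * np.sin(2 * np.pi * 10.0 * t) + 15.0 * np.sin(2 * np.pi * 60.0 * t)

# 4th-order Butterworth band-pass (1-40 Hz), run forward and backward for zero phase
sos = signal.butter(4, [1.0, 40.0], btype="bandpass", fs=sfreq, output="sos")
bandpassed = signal.sosfiltfilt(sos, noisy)

# Narrow notch centered on 60 Hz
b, a = signal.iirnotch(60.0, Q=30.0, fs=sfreq)
filtered = signal.filtfilt(b, a, bandpassed)


def amplitude_at(x, freq_hz):
    """Amplitude of the FFT bin closest to freq_hz."""
    spectrum = np.abs(np.fft.rfft(x)) * 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1 / sfreq)
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


print(f"10 Hz amplitude: {amplitude_at(noisy, 10):.1f} -> {amplitude_at(filtered, 10):.1f}")
print(f"60 Hz amplitude: {amplitude_at(noisy, 60):.1f} -> {amplitude_at(filtered, 60):.1f}")
```

Zero-phase filtering (`sosfiltfilt`, `filtfilt`) avoids shifting waveform timing, which matters when signals are later aligned to task events.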

### Additional Resources

- [MNE-Python Documentation](https://mne.tools/stable/index.html)
- [PhysioNet EEG Dataset](https://physionet.org/content/eegmmidb/1.0.0/)
- [EEG Analysis Tutorial](https://mne.tools/stable/auto_tutorials/index.html)