# EEG Motor Movement/Imagery Classification Using Random Forest and Convolutional Neural Networks

### Loading .env variables

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

INPUT_DIR = os.getenv("INPUT_DIR", "./data/raw")
OUTPUT_DIR = os.getenv("OUTPUT_DIR", "./result")

SFREQ = int(os.getenv("SFREQ", 160))
WINDOW_SEC = float(os.getenv("WINDOW_SEC", 2))
OVERLAP = float(os.getenv("OVERLAP", 0.5))

DEBUG = os.getenv("DEBUG", "1").strip().lower() in {"1", "true", "yes", "y"}
# DEBUG = False
print(f"DEBUGGING is {'ON' if DEBUG else 'OFF'}")

DEBUGGING is ON


In [2]:
window_samples = int(SFREQ * WINDOW_SEC)
print(f"Window samples: {window_samples}")

records_file = os.path.join(INPUT_DIR, "RECORDS")
print(f"Records file: {records_file}")

Window samples: 320
Records file: ./data/raw/RECORDS
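
The configuration above defines `OVERLAP`, but this section does not yet show how it is used. A minimal sketch of one plausible use follows, assuming `OVERLAP` is the fraction of each window shared with the next one. The helper name `window_starts` and the sample count 9760 (61 s at 160 Hz) are illustrative assumptions, not part of the notebook.

```python
def window_starts(n_samples, window_samples, overlap):
    """Return the start index of every full window (hypothetical helper).

    `overlap` is assumed to be the fraction of a window shared with its successor.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Step between consecutive window starts; at least one sample.
    step = max(1, int(round(window_samples * (1 - overlap))))
    # Only windows that fit entirely inside the recording are kept.
    return list(range(0, n_samples - window_samples + 1, step))


# 160 Hz * 2 s = 320 samples per window; 50% overlap gives a 160-sample step.
starts = window_starts(n_samples=9760, window_samples=320, overlap=0.5)
print(len(starts), starts[:3])
```

With these values, consecutive windows start 160 samples (1 s) apart, and any trailing samples that do not fill a whole window are dropped.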


## 1. Data Preparation

This data set consists of over 1500 one- and two-minute EEG recordings, obtained from 109 volunteers, as described below.

Subjects performed different motor/imagery tasks while 64-channel EEG was recorded using the BCI2000 system (http://www.bci2000.org).

The experimental runs were:
1. Baseline, eyes open
2. Baseline, eyes closed
3. Task 1 (open and close left or right fist)
4. Task 2 (imagine opening and closing left or right fist)
5. Task 3 (open and close both fists or both feet)
6. Task 4 (imagine opening and closing both fists or both feet)
7. Task 1
8. Task 2
9. Task 3
10. Task 4
11. Task 1
12. Task 2
13. Task 3
14. Task 4

Each annotation includes one of three codes (**T0**, **T1**, or **T2**):
- **T0** corresponds to rest
- **T1** corresponds to onset of motion (real or imagined) of
    - the left fist (in runs 3, 4, 7, 8, 11, and 12)
    - both fists (in runs 5, 6, 9, 10, 13, and 14)
- **T2** corresponds to onset of motion (real or imagined) of
    - the right fist (in runs 3, 4, 7, 8, 11, and 12)
    - both feet (in runs 5, 6, 9, 10, 13, and 14)
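
Because the meaning of **T1** and **T2** depends on the run number, a label lookup needs both values. The mapping described above can be sketched as follows. The function name `describe_event` and the label strings are hypothetical choices for illustration.

```python
# Run groups taken from the annotation description above.
LEFT_RIGHT_RUNS = {3, 4, 7, 8, 11, 12}
FISTS_FEET_RUNS = {5, 6, 9, 10, 13, 14}


def describe_event(run, code):
    """Map a (run number, annotation code) pair to a label (hypothetical helper)."""
    if code == "T0":
        return "rest"  # T0 means rest regardless of the run
    if code not in ("T1", "T2"):
        raise ValueError(f"Unknown annotation code: {code!r}")
    if run in LEFT_RIGHT_RUNS:
        return "left_fist" if code == "T1" else "right_fist"
    if run in FISTS_FEET_RUNS:
        return "both_fists" if code == "T1" else "both_feet"
    # Runs 1 and 2 are baselines, so T1/T2 are not expected there.
    raise ValueError(f"Code {code} is not expected in run {run}")


print(describe_event(3, "T1"), describe_event(14, "T2"))
```

This sketch does not distinguish real from imagined movement; that information would also have to be derived from the run number.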

### 1.1. Loading RECORDS file and verifying EDF files' existence

In [3]:
with open(records_file, "r") as f:
    records = [line.strip() for line in f if line.strip()]

print(f"Number of RECORDS entries: {len(records)}")
for r in records[:4]:
    print(" ", r)
print("  ...\n ", records[-1])

edf_paths = []
missing = []

for rel in records:
    p = os.path.join(INPUT_DIR, rel)
    if os.path.exists(p):
        edf_paths.append(p)
    else:
        missing.append(p)

print(f"\nResolved EDF files: {len(edf_paths)}")
print(f"Missing EDF files: {len(missing)}")

if missing:
    print("Example missing path:", missing[0])

Number of RECORDS entries: 1526
  S001/S001R01.edf
  S001/S001R02.edf
  S001/S001R03.edf
  S001/S001R04.edf
  ...
  S109/S109R14.edf

Resolved EDF files: 1526
Missing EDF files: 0
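
The entries printed above follow the pattern `SxxxRyy.edf`, so the subject and run numbers can likely be recovered from the file name alone. A minimal sketch, assuming every entry matches that pattern; `RECORD_PATTERN` and `parse_record` are hypothetical names, not part of the notebook.

```python
import re

# Three-digit subject number, two-digit run number, as in "S001/S001R03.edf".
RECORD_PATTERN = re.compile(r"S(\d{3})R(\d{2})\.edf$")


def parse_record(rel_path):
    """Return (subject, run) as integers for a RECORDS entry (hypothetical helper)."""
    match = RECORD_PATTERN.search(rel_path)
    if match is None:
        raise ValueError(f"Unexpected record name: {rel_path!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_record("S001/S001R03.edf"))
print(parse_record("S109/S109R14.edf"))
```

The run number obtained this way is what decides how the annotation codes of a file should be interpreted.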


### 1.2. Testing loading and preprocessing of the first EDF file

In [4]:
import mne

if DEBUG:
    test_edf = edf_paths[0]
    print("Testing EDF:", test_edf)

    raw = mne.io.read_raw_edf(test_edf, preload=False, verbose=False)

    print("\n--- EDF INFO ---")
    print("Channels:", len(raw.ch_names))
    print("Sampling freq:", raw.info["sfreq"])
    # raw.times[-1] is the timestamp of the last sample, one sample period short of the full duration
    print("Duration (sec):", raw.times[-1])
    print("First 10 channels:", raw.ch_names[:10])

    assert len(raw.ch_names) >= 64, "Expected ~64 EEG channels"
    assert abs(raw.info["sfreq"] - SFREQ) < 1e-3, "Sampling frequency mismatch"

Testing EDF: ./data/raw/S001/S001R01.edf

--- EDF INFO ---
Channels: 64
Sampling freq: 160.0
Duration (sec): 60.99375
First 10 channels: ['Fc5.', 'Fc3.', 'Fc1.', 'Fcz.', 'Fc2.', 'Fc4.', 'Fc6.', 'C5..', 'C3..', 'C1..']
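
The channel names printed above carry trailing dots (`'Fc5.'`, `'C5..'`), which appear to be padding. A minimal sketch of stripping them in plain Python, assuming the dots carry no meaning; `clean_channel_name` is a hypothetical helper, and any further renaming (for example case changes to match a montage) is left out.

```python
def clean_channel_name(name):
    """Remove trailing padding dots from a channel name (hypothetical helper)."""
    return name.rstrip(".")


# The first 10 channel names from the output above.
raw_names = ['Fc5.', 'Fc3.', 'Fc1.', 'Fcz.', 'Fc2.', 'Fc4.', 'Fc6.', 'C5..', 'C3..', 'C1..']
clean_names = [clean_channel_name(n) for n in raw_names]
print(clean_names)
```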
