# MODMA Depression EEG: Exploration

This notebook is a lightweight, reproducible sanity check. It:
- Loads the precomputed feature table (one row per subject)
- Checks the dataset shape, label counts, and missing values
- Optionally plots the distribution of a few features

If you installed the package in editable mode:
```bash
pip install -e .
```
then imports should work without any `sys.path` hacks.
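
To confirm that the kernel actually sees the installed package, a minimal check along these lines can be run first. It uses only the standard library. The helper name `is_importable` is illustrative and not part of the package; the module name is taken from the import used later in this notebook.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical pre-flight check; the hint assumes the project root holds the package metadata.
if not is_importable("modma_depression_eeg"):
    print("modma_depression_eeg not found; try `pip install -e .` from the project root")
```

If the message prints, the notebook kernel is probably running in a different environment from the one where the package was installed.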


In [None]:
from pathlib import Path
import pandas as pd

from modma_depression_eeg.dataset_builder import build_dataset

DATA_DIR = Path("../data/MODMA")
META_PATH = DATA_DIR / "subjects_information_EEG_128channels_resting_lanzhou_2015.xlsx"

df = build_dataset(DATA_DIR, META_PATH)
df.shape
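
If the data directory or the metadata spreadsheet is missing, the failure may only surface somewhere inside `build_dataset`. A minimal pre-flight sketch that assumes nothing about `build_dataset` beyond the two path arguments shown above; `missing_inputs` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def missing_inputs(*paths: Path) -> list[Path]:
    """Return the subset of the given paths that do not exist on disk."""
    return [p for p in paths if not p.exists()]


# Same locations as in the cell above.
data_dir = Path("../data/MODMA")
meta_path = data_dir / "subjects_information_EEG_128channels_resting_lanzhou_2015.xlsx"

for p in missing_inputs(data_dir, meta_path):
    # resolve() shows the absolute location, which helps when the relative path is off.
    print(f"Missing input: {p.resolve()}")
```

Nothing is printed when both paths exist.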


In [None]:
# Basic QA
display(df.head())
print("Label counts:\n", df["label"].value_counts())
print("Missing values (total):", int(df.isna().sum().sum()))
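
The total above does not say where the gaps are. Below is a minimal sketch of a per-column breakdown, shown on a toy frame so that it runs on its own. The helper `missing_by_column` and the toy feature names and values are illustrative; only the `label` column name comes from the cell above.

```python
import numpy as np
import pandas as pd


def missing_by_column(frame: pd.DataFrame) -> pd.Series:
    """Count NaNs per column, keeping only columns that have at least one."""
    counts = frame.isna().sum()
    return counts[counts > 0].sort_values(ascending=False)


# Toy stand-in for the real feature table (values are made up for illustration).
toy = pd.DataFrame(
    {
        "label": [0, 1, 0, 1],
        "feat_a": [0.1, np.nan, 0.3, np.nan],
        "feat_b": [1.0, 2.0, np.nan, 4.0],
        "feat_c": [5.0, 6.0, 7.0, 8.0],
    }
)

print(missing_by_column(toy))
# Class fractions are easier to compare across datasets than raw counts.
print(toy["label"].value_counts(normalize=True))
```

On the real table the equivalent call would be `missing_by_column(df)`.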


In [None]:
# Quick distribution sanity check for a few features (optional)
import matplotlib.pyplot as plt

example_cols = [c for c in df.columns if c.startswith("rbp_alpha_")][:4]
example_cols


In [None]:
plt.figure(figsize=(8, 4))
for c in example_cols:
    plt.hist(df[c], bins=30, alpha=0.4, label=c)
plt.title("Example feature distributions")
plt.legend(fontsize=8)
plt.xlabel("Relative alpha band power")
plt.ylabel("Number of subjects")
plt.grid(True, alpha=0.25)
plt.show()
