v0.2.0 — smarter, more honest cards + more datasets
v0.2.0 makes the dataset cards smarter and more honest, and broadens dataset coverage.
Added
- Chance level + significance: cards report the chance level (1 / n_classes) and an
  approximate one-sided binomial test of whether the score beats chance (closes #1).
- Per-class metrics + confusion matrix: per-class precision/recall/F1 table and a
  confusion-matrix plot, so a multi-class result is interpretable beyond overall accuracy.
- License / DOI / citation surfaced in cards, read defensively from MOABB metadata (closes #2).
- More motor-imagery datasets: BNCI2014_004, Zhou2016, Weibo2014 (closes #3).
- .pre-commit-config.yaml (Ruff) and extended mocked tests for every new output.
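
To make the chance-level check concrete, here is a minimal sketch of a one-sided binomial test against the naive chance level 1 / n_classes. It is not the project's actual implementation, which these notes describe only as "approximate". The function names `chance_level` and `binomial_p_above_chance` and the trial counts in the example are hypothetical, chosen for illustration.

```python
from math import comb


def chance_level(n_classes: int) -> float:
    """Naive chance level for a balanced n-class problem."""
    return 1.0 / n_classes


def binomial_p_above_chance(n_correct: int, n_trials: int, p_chance: float) -> float:
    """One-sided p-value: P(X >= n_correct) for X ~ Binomial(n_trials, p_chance).

    Exact tail sum using only the standard library. Suitable for a few
    hundred trials; for much larger n, converting the binomial coefficient
    to float can overflow, and a library routine is the safer choice.
    """
    return sum(
        comb(n_trials, k) * p_chance**k * (1.0 - p_chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical example: 60 correct out of 100 trials on a 4-class problem.
p0 = chance_level(4)
p_value = binomial_p_above_chance(n_correct=60, n_trials=100, p_chance=p0)
print(f"chance={p0:.2f}, one-sided p={p_value:.3g}")
```

The test assumes independent trials and balanced classes, which is why it serves as a sanity check rather than a substitute for a permutation test. If SciPy is available, `scipy.stats.binomtest(k, n, p, alternative="greater")` computes the same one-sided tail.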
Verified on BNCI2014_001 (subjects 1-3, leave-one-subject-out, seed 42)
- Accuracy 0.429 vs a 0.25 chance level (binomial p < 0.001), but the per-class breakdown
  shows the baseline barely detects the "feet" class, a detail that a single accuracy number hides.
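
As an illustration of how a per-class breakdown can expose a weak class, the sketch below derives precision, recall, and F1 from a confusion matrix. The matrix is a toy example with made-up counts, not the BNCI2014_001 result reported above. The helper name `per_class_metrics` and the class label order are assumptions for illustration.

```python
def per_class_metrics(confusion, labels):
    """Per-class precision, recall, and F1 from a square confusion matrix.

    confusion[i][j] is the number of trials whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    n = len(labels)
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp                        # rest of the true-class row
        fp = sum(confusion[r][i] for r in range(n)) - tp   # rest of the predicted column
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy counts (hypothetical): 10 trials per true class, 4 classes.
labels = ["left_hand", "right_hand", "feet", "tongue"]
toy_confusion = [
    [6, 2, 1, 1],
    [2, 6, 1, 1],
    [4, 3, 1, 2],
    [2, 2, 1, 5],
]

report = per_class_metrics(toy_confusion, labels)
for name, m in report.items():
    print(f"{name:>10}  P={m['precision']:.2f}  R={m['recall']:.2f}  F1={m['f1']:.2f}")
```

In this toy matrix the overall accuracy is 18/40 = 0.45, well above the 0.25 chance level, yet recall for "feet" is only 0.10. An overall score alone cannot show that kind of gap.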
Honesty note
The "above chance" check is an approximate binomial test versus the naive chance level, not
a permutation test — a sanity check, not proof. Still not medical software; no diagnostic claims.