The current primary use of the PNPL library is for the LibriBrain competition. Click here to learn more and get started!
Welcome to PNPL — a Python toolkit for loading and processing brain datasets for deep learning. It provides ready‑to‑use dataset classes (PyTorch Dataset) and utilities with a simple, consistent API.
- Friendly dataset APIs backed by real MEG recordings
- Batteries‑included standardization, clipping, and windowing
- LibriBrain 2025 dataset support with optional on‑demand download
- Works with PyTorch DataLoader out of the box
- Clean namespace and lazy imports to keep startup fast
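To give a feel for what standardization, clipping, and windowing mean for a (channels, time) recording, here is a minimal NumPy sketch. It is not PNPL's implementation: the helper names `standardize`, `clip`, and `window`, the 250 Hz sampling rate, the clipping limit, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def standardize(x, eps=1e-8):
    """Z-score each channel (row) over the time axis."""
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)
    return (x - mean) / (std + eps)


def clip(x, limit=10.0):
    """Clamp every value to the range [-limit, limit]."""
    return np.clip(x, -limit, limit)


def window(x, sfreq, tmin, tmax, onset=0.0):
    """Cut out the samples between onset + tmin and onset + tmax (seconds)."""
    start = int(round((onset + tmin) * sfreq))
    stop = int(round((onset + tmax) * sfreq))
    return x[:, start:stop]


# Synthetic stand-in for a recording: 4 channels, 1000 time points.
rng = np.random.default_rng(0)
recording = rng.normal(loc=5.0, scale=3.0, size=(4, 1000))
sfreq = 250.0  # Hz, assumed for this example only

z = clip(standardize(recording), limit=10.0)
segment = window(z, sfreq, tmin=0.0, tmax=0.2, onset=1.0)
print(segment.shape)  # (4, 50)
```

With these assumed values, a 0.2 s window at 250 Hz yields 50 time points per channel. The dataset classes expose comparable options (for example `tmin`, `tmax`, and `standardize`) as constructor arguments, so this step does not have to be written by hand.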
pip install pnpl
This will also take care of all requirements.
The core functionality of the library is contained in the two Dataset classes LibriBrainSpeech and LibriBrainPhoneme.
Check out the basic usage:
This wraps the LibriBrain dataset for use in speech detection problems.
from pnpl.datasets import LibriBrainSpeech
speech_example_data = LibriBrainSpeech(
    data_path="./data/",
    include_run_keys=[("0", "1", "Sherlock1", "1")],
)
sample_data, label = speech_example_data[0]
# Print out some basic info about the sample
print("Sample data shape:", sample_data.shape)
print("Label shape:", label.shape)
This wraps the LibriBrain dataset for use in phoneme classification problems.
from pnpl.datasets import LibriBrainPhoneme
phoneme_example_data = LibriBrainPhoneme(
    data_path="./data/",
    include_run_keys=[("0", "1", "Sherlock1", "1")],
)
sample_data, label = phoneme_example_data[0]
# Print out some basic info about the sample
print("Sample data shape:", sample_data.shape)
print("Label shape:", label.shape)
In case of any questions or problems, please get in touch through our Discord server.
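Because the dataset classes are PyTorch Dataset objects, they can be handed to a DataLoader for batching and shuffling. The sketch below uses a hypothetical stand-in, `ToyMEGDataset`, filled with synthetic tensors so that it runs without downloading any data. Its shapes (8 channels, 50 time points) and binary per-time-point labels are arbitrary assumptions, not properties of LibriBrain.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyMEGDataset(Dataset):
    """Hypothetical stand-in returning (data, label) pairs of synthetic tensors."""

    def __init__(self, n_samples=32, n_channels=8, n_times=50):
        generator = torch.Generator().manual_seed(0)
        # Data: (n_samples, n_channels, n_times); labels: one value per time point.
        self.data = torch.randn(n_samples, n_channels, n_times, generator=generator)
        self.labels = torch.randint(0, 2, (n_samples, n_times), generator=generator)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]


# The default collate function stacks individual samples along a new batch axis.
loader = DataLoader(ToyMEGDataset(), batch_size=16, shuffle=True)
for batch_data, batch_labels in loader:
    print(batch_data.shape, batch_labels.shape)
```

In real use, the stand-in would be replaced by an instance such as `speech_example_data` or `phoneme_example_data` from the examples above; the DataLoader call itself stays the same.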
Load a single run of the LibriBrain Speech dataset and iterate samples:
from pnpl.datasets.libribrain2025 import constants
from pnpl.datasets import LibriBrainSpeech
ds = LibriBrainSpeech(
    data_path="./data/LibriBrain",
    preprocessing_str="bads+headpos+sss+notch+bp+ds",
    include_run_keys=[constants.RUN_KEYS[0]],  # pick a single run
    tmin=0.0,
    tmax=0.2,
    standardize=True,
    include_info=True,
)
print(len(ds), "samples")
x, y, info = ds[0]
print(x.shape, y.shape, info["dataset"])  # (channels, time), (time,), "libribrain2025"
We publish documentation with Jupyter Book and GitHub Pages.
- Local preview: pip install -r docs/requirements.txt && jupyter-book build docs/ then open docs/_build/html/index.html.
- GitHub Pages: when made public, enable Pages via repo settings to publish automatically from the existing workflow.
We welcome contributions from the community!
- Read the Contributor Guide in docs/contributing.md for setup, coding style, and PR workflow.
- Open issues for bugs and enhancements with clear, minimal repros when possible.
- Tests: add/update pytest tests for any feature or fix.
Quick dev setup:
git clone https://github.com/neural-processing-lab/pnpl-public.git
cd pnpl-public
python -m venv .venv && source .venv/bin/activate
pip install -e .
pip install pytest
pytest -q
- Check the FAQ at docs/faq.md.
- If something is unclear in the docs, please open a documentation issue.
BSD‑3‑Clause. See LICENSE for details.