Music Error Detection with Iterative Inter-Stream Alignment and Symbolic Score Prompting
LadderSym is a transformer-based system for end-to-end music performance error detection.
Clone the repository and set up the environment:
git clone https://github.com/ben2002chou/LadderSYM.git
cd LadderSYM
conda create -n laddersym python=3.11 -y
conda activate laddersym
pip install -r requirements.txt
wandb login

Download datasets and official checkpoints (recommended):
python setup_laddersym_assets.py --datasets all --checkpoints all
source .env.laddersym

Manual setup is also supported (see Data and Checkpoints below).

Train on MAESTRO-E:
python train_laddersym.py \
--config-name=config_maestro \
model=laddersym_MT3Net \
dataset=MAESTRO \
split_frame_length=2000

Train on CocoChorales-E:
python train_laddersym.py \
--config-name=config_coco \
model=laddersym_MT3Net \
dataset=CocoChorales \
split_frame_length=2000

Evaluate on MAESTRO-E:
python test_laddersym.py \
--config-dir=config \
--config-name=config_maestro \
model=laddersym_MT3Net \
path=/path/to/model.ckpt \
eval.eval_dataset=MAESTRO \
eval.exp_tag_name=laddersym_maestro \
hydra/job_logging=disabled \
eval.contiguous_inference=True \
split_frame_length=2000

Evaluate on CocoChorales-E:
python test_laddersym_coco.py \
--config-dir=config \
--config-name=config_coco \
model=laddersym_MT3Net \
path=/path/to/model.ckpt \
eval.eval_dataset=CocoChorales \
eval.exp_tag_name=laddersym_coco \
hydra/job_logging=disabled \
eval.contiguous_inference=True \
split_frame_length=2000

Batch inference over piece folders (recommended):
python laddersym_test_inference.py \
--config-dir=config \
--config-name=config_maestro_prompted \
model=laddersym_MT3Net \
path=pretrained/laddersym/checkpoints/maestro/prompted/model.ckpt \
dataset_dir=$LADDERSYM_MAESTRO_ROOT \
output_mid_name=laddersym_output.mid \
overwrite=true \
hydra/job_logging=disabled

Single-piece inference with explicit file paths:
python laddersym_test_inference.py \
--config-dir=config \
--config-name=config_maestro_prompted \
model=laddersym_MT3Net \
path=pretrained/laddersym/checkpoints/maestro/prompted/model.ckpt \
mistake_file=$LADDERSYM_MAESTRO_ROOT/<piece_id>/mistake.wav \
score_file=$LADDERSYM_MAESTRO_ROOT/<piece_id>/score.wav \
prompt_file=$LADDERSYM_MAESTRO_ROOT/<piece_id>/score.mid \
output_dir=$LADDERSYM_MAESTRO_ROOT/<piece_id> \
output_mid_name=laddersym_output.mid \
hydra/job_logging=disabled

The output MIDI contains three semantic tracks:
- Track 1: Extra notes
- Track 2: Missing notes
- Track 3: Correct notes

Example output path:
<piece_dir>/laddersym_output.mid
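
To check an output file quickly, it can help to count the notes in each track. The following is a minimal sketch, not part of the repository: it parses a Standard MIDI File using only the Python standard library and returns the number of note-on events per track chunk. The function names `count_note_ons`, `_count_in_track`, and `_read_varlen` are hypothetical. How the chunk order maps onto the three semantic tracks (for example, whether a tempo-only track comes first) is an assumption to verify against a real output file.

```python
import struct


def _read_varlen(data, pos):
    """Decode a MIDI variable-length quantity starting at pos."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def _count_in_track(data, pos, end):
    """Count note-on events with non-zero velocity inside one track chunk."""
    notes = 0
    status = 0
    while pos < end:
        _delta, pos = _read_varlen(data, pos)
        byte = data[pos]
        if byte == 0xFF:  # meta event: type byte, length, payload
            length, pos = _read_varlen(data, pos + 2)
            pos += length
            continue
        if byte in (0xF0, 0xF7):  # sysex event: length, payload
            length, pos = _read_varlen(data, pos + 1)
            pos += length
            continue
        if byte & 0x80:  # explicit status byte; otherwise running status applies
            status = byte
            pos += 1
        kind = status & 0xF0
        n_data = 1 if kind in (0xC0, 0xD0) else 2
        if kind == 0x90 and data[pos + 1] > 0:  # velocity 0 means note-off
            notes += 1
        pos += n_data
    return notes


def count_note_ons(midi_bytes):
    """Return a list with the note-on count of each MTrk chunk, in file order."""
    if midi_bytes[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    header_len, _fmt, _ntrks, _division = struct.unpack(">IHHH", midi_bytes[4:14])
    pos = 8 + header_len
    counts = []
    while pos + 8 <= len(midi_bytes):
        chunk_id = midi_bytes[pos:pos + 4]
        (chunk_len,) = struct.unpack(">I", midi_bytes[pos + 4:pos + 8])
        pos += 8
        if chunk_id == b"MTrk":
            counts.append(_count_in_track(midi_bytes, pos, pos + chunk_len))
        pos += chunk_len
    return counts
```

Typical use would be `count_note_ons(open("laddersym_output.mid", "rb").read())`, which returns one count per track chunk in the order the chunks appear in the file.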

Data and Checkpoints:
- Datasets: CocoChorales-E, MAESTRO-E
- Checkpoints repo: ben2002chou/laddersym-checkpoints
- Viewer-friendly previews: CocoChorales-E-preview, MAESTRO-E-preview
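
After a dataset is downloaded, the per-piece layout can be sanity-checked before running inference. The sketch below is not part of the repository. It assumes that every direct child folder of the dataset root is a piece folder holding `mistake.wav`, `score.wav`, and `score.mid`. Those names are inferred only from the single-piece inference command above, so they may not hold for every dataset or split. The helper name `find_incomplete_pieces` is hypothetical.

```python
from pathlib import Path

# File names taken from the single-piece inference example; treating them as
# required in every piece folder is an assumption, not a documented guarantee.
EXPECTED_FILES = ("mistake.wav", "score.wav", "score.mid")


def find_incomplete_pieces(dataset_root):
    """Map each piece folder name to the list of expected files it lacks."""
    incomplete = {}
    for piece_dir in sorted(Path(dataset_root).iterdir()):
        if not piece_dir.is_dir():
            continue  # ignore stray files next to the piece folders
        missing = [name for name in EXPECTED_FILES
                   if not (piece_dir / name).is_file()]
        if missing:
            incomplete[piece_dir.name] = missing
    return incomplete
```

For example, `find_incomplete_pieces(os.environ["LADDERSYM_MAESTRO_ROOT"])` (with `import os`) returns an empty dict when every piece folder is complete. This presumes the variable is exported by `source .env.laddersym`, as the inference commands suggest.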
Recommended setup:
python setup_laddersym_assets.py --datasets all --checkpoints all
source .env.laddersym

Checkpoint variants:
| Dataset | Prompted | Unprompted |
|---|---|---|
| CocoChorales-E | model.ckpt | model.ckpt |
| MAESTRO-E | model.ckpt | model.ckpt |
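
Since every variant in the table is named `model.ckpt`, the variants are told apart by their folders. The only folder this README confirms is `pretrained/laddersym/checkpoints/maestro/prompted/`, from the inference example. Rather than guess the remaining paths, this hypothetical helper (`list_checkpoints`, not part of the repository) lists whichever folders under a checkpoint root contain a `model.ckpt`.

```python
from pathlib import Path


def list_checkpoints(checkpoint_root, filename="model.ckpt"):
    """Return folders (relative to checkpoint_root) that contain `filename`."""
    root = Path(checkpoint_root)
    return sorted(
        path.parent.relative_to(root).as_posix()
        for path in root.rglob(filename)
    )
```

Calling `list_checkpoints("pretrained/laddersym/checkpoints")` after the setup script has run should show which value to pass as `path=` for each dataset and variant, assuming the checkpoints are downloaded under that root.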
Advanced setup options:
# only datasets
python setup_laddersym_assets.py --datasets all --checkpoints none
# only checkpoints
python setup_laddersym_assets.py --datasets none --checkpoints all

@inproceedings{chou2026laddersym,
title={LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection},
author={Benjamin Shiue-Hal Chou and Purvish Jajal and Nicholas John Eliopoulos and James C. Davis and George K Thiruvathukal and Kristen Yeon-Ji Yun and Yung-Hsiang Lu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=cizuvfyQXs}
}