
SPRINT: Sequence-based Immunogenicity Prediction Networks

📄 RECOMB 2025 Supplementary Material: The file appendix_recomb.pdf contains the supplementary materials for our RECOMB submission.

A unified PyTorch-based benchmarking framework for deep learning methods in T-cell receptor (TCR) and peptide-MHC (pMHC) binding prediction.

Table of Contents

  • Quick Start
  • Available Resources
  • Usage
  • Command Options
  • Citation
  • License
  • Contact

Quick Start

Installation

git clone https://github.com/Computational-Machine-Intelligence/SPRINT.git
cd SPRINT
pip install -r requirements.txt

Requirements: Python >= 3.8, PyTorch >= 2.0.0
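A quick way to confirm the interpreter meets the stated minimum before installing; this is a sketch, and the PyTorch check is left as a comment so it runs even in a bare environment:

```python
import sys

def meets_minimum(version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= minimum

assert meets_minimum(sys.version_info), "SPRINT requires Python >= 3.8"
# After `pip install -r requirements.txt`, also confirm PyTorch >= 2.0.0:
# import torch; print(torch.__version__)
```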

Set HuggingFace Token (Required)

# Linux/Mac
export HF_TOKEN="your_huggingface_token"

# Windows PowerShell
$env:HF_TOKEN = "your_huggingface_token"
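Since the token is read from the `HF_TOKEN` environment variable, a small helper (hypothetical, not part of SPRINT) can fail fast with a clear message instead of a mid-run download error:

```python
import os

def require_hf_token(env=os.environ):
    """Return the HuggingFace token, or raise a clear error if it is unset."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before running SPRINT.")
    return token
```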

Available Resources

Methods (7)

  • pmtnet: Multi-head attention for pMHC-TCR binding
  • piste: Pre-trained immune system transformer encoder
  • fusionpmt: Fusion model for pMHC-TCR interactions
  • fusionpm: Peptide-MHC binding prediction
  • transphla: Transformer for pHLA binding
  • ergo2: LSTM/Autoencoder TCR specificity model
  • nettcr: CNN-based TCR-pMHC prediction

Datasets (6)

  • pmt: Large-scale pMHC-TCR training set
  • pm: Peptide-MHC binding dataset
  • pt: Peptide-TCR binding dataset
  • allelic_ood: Out-of-distribution test (unseen alleles)
  • modality_ood: Cross-modality test (BA vs. EL, i.e. binding-affinity vs. eluted-ligand assays)
  • temporal_ood: Temporal test (post-2021 data)

Modes (3)

  • train: Train models from scratch
  • eval: Evaluate models on test data
  • both: Train then evaluate

Usage

1. Evaluate Pre-trained Models

Evaluate a pre-trained model on test data:

# Evaluate on standard dataset
python scripts/run_benchmark.py --method pmtnet --dataset pmt --mode eval --pretrain

# Evaluate on OOD datasets
python scripts/run_benchmark.py --method transphla --dataset temporal_ood --mode eval --pretrain
python scripts/run_benchmark.py --method fusionpm --dataset modality_ood --mode eval --pretrain

What happens:

  • Downloads pre-trained model from HuggingFace (first time only)
  • Loads test data automatically
  • Evaluates and saves results to outputs/<method>/pre_train_results/evaluations/
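To run the same evaluation over every bundled method, the command above can be generated in a loop. This sketch only prints the commands so the sweep can be inspected (or piped to `sh`) before launching:

```shell
# Print one eval command per method for a sweep over the temporal_ood split.
sprint_sweep() {
  for method in pmtnet piste fusionpmt fusionpm transphla ergo2 nettcr; do
    echo "python scripts/run_benchmark.py --method $method --dataset temporal_ood --mode eval --pretrain"
  done
}

sprint_sweep
# To actually launch the sweep:  sprint_sweep | sh
```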

2. Train Models from Scratch

Train a model on a dataset:

# Basic training
python scripts/run_benchmark.py --method pmtnet --dataset pmt --mode train

# Train with custom config
python scripts/run_benchmark.py --method ergo2 --dataset pt --mode train --config configs/methods/ergo2.yaml

Output: Trained model saved to outputs/<method>/<dataset>_<timestamp>/
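Given the documented layout outputs/<method>/<dataset>_<timestamp>/, a small helper (hypothetical, not part of SPRINT) can locate the newest run directory for a method:

```shell
# Print the most recently modified run directory for a given method.
latest_run() {
  ls -1dt outputs/"$1"/*/ 2>/dev/null | head -n 1
}

# Example:  latest_run pmtnet   ->  outputs/pmtnet/<newest run>/
```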

3. Train and Evaluate

Train a model and immediately evaluate:

python scripts/run_benchmark.py --method fusionpmt --dataset pmt --mode both

4. List Available Resources

# List all methods
python scripts/run_benchmark.py --list-methods

# List all datasets
python scripts/run_benchmark.py --list-datasets

Command Options

| Option | Description | Example |
| --- | --- | --- |
| `--method` | Model to use | `pmtnet`, `ergo2`, `piste` |
| `--dataset` | Dataset to use | `pmt`, `temporal_ood` |
| `--mode` | Operation mode | `train`, `eval`, `both` |
| `--config` | Path to a custom method config | `configs/methods/ergo2.yaml` |
| `--pretrain` | Use a pre-trained model (eval only) | flag, takes no value |
| `--device` | Computing device | `cuda`, `cpu`, `auto` |
| `--seed` | Random seed | `42` |

Citation

If you use SPRINT in your research, please cite:

@software{yin2025sprint,
  title={SPRINT for Benchmarking Sequence-based Immunogenicity Prediction Networks},
  author={Yin, Yujia and Li, Hongzong and Ma, Jiahao and Chen, Weijia and Yu, Yingying and Zhang, Xiaoyuan and Qu, Tianyi and Wu, Xinhong and Li, Junyi and Huang, Jian-Dong and Hu, Ye-Fan and Chen, Yifan},
  year={2025},
  url={https://github.com/Computational-Machine-Intelligence/SPRINT}
}

License

This project is licensed under the MIT License.

Contact
