Masked peptides for low-data peptide drug discovery (BiB 2023)

horsepurve/DeepB3P3

DeepB3P3: masked peptide transformer for low-data peptide drug discovery

Installation

Please see requirements.txt.

Datasets

| Source | Total number | BBBPs | non-BBBPs |
| --- | --- | --- | --- |
| B3Pred Training set | 2367 | 215 | 2152 |
| B3Pred Testing set | 592 | 54 | 538 |
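
The counts above imply a strongly imbalanced dataset. As a quick check, the short sketch below recomputes the share of BBBPs in each split from the table; the dictionary layout and the function name `positive_fraction` are illustrative only and do not come from the repository.

```python
# Counts copied from the dataset table; the structure is illustrative only.
splits = {
    "train": {"total": 2367, "bbbp": 215, "non_bbbp": 2152},
    "test": {"total": 592, "bbbp": 54, "non_bbbp": 538},
}


def positive_fraction(split):
    """Return the fraction of peptides in a split that are BBBPs."""
    return split["bbbp"] / split["total"]


for name, split in splits.items():
    # Sanity check: the two classes should add up to the reported total.
    assert split["bbbp"] + split["non_bbbp"] == split["total"]
    print(f"{name}: {positive_fraction(split):.1%} BBBPs")
```

Both splits contain roughly one BBBP for every ten non-BBBPs, so the positive class is the scarce one.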

Masking peptides for small data challenge

Drug discovery datasets can be extremely small because the experiments are costly (1,2), yet training modern neural networks typically requires large-scale, high-quality data. In this paper, we introduce 'masked peptides', a data augmentation technique that substantially alleviates this issue (Fig. (A)).

Unlike other data augmentation methods, our peptide masking technique does not involve any substitution, insertion, or deletion, yet it can significantly change the latent distribution.
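
To make the idea concrete, here is a minimal sketch of masking-based augmentation. It assumes that masking means hiding randomly chosen positions behind a placeholder token while leaving the sequence length and all other residues unchanged. The mask symbol `_`, the mask rate, the number of copies, and the function names `mask_peptide` and `augment` are all hypothetical; the procedure actually used in this repository may differ.

```python
import random

MASK_TOKEN = "_"  # hypothetical placeholder; the repository may use another symbol


def mask_peptide(seq, mask_rate=0.15, rng=None):
    """Return a copy of seq with a random subset of positions masked.

    The length is preserved and no residue is swapped for another amino
    acid, inserted, or deleted.
    """
    if not seq:
        return seq
    rng = rng or random.Random()
    # Mask at least one position, but never more than the sequence length.
    n_mask = min(len(seq), max(1, round(len(seq) * mask_rate)))
    positions = set(rng.sample(range(len(seq)), n_mask))
    return "".join(MASK_TOKEN if i in positions else aa for i, aa in enumerate(seq))


def augment(seq, n_copies=8, mask_rate=0.15, seed=0):
    """Return the original peptide followed by n_copies masked variants."""
    rng = random.Random(seed)
    return [seq] + [mask_peptide(seq, mask_rate, rng) for _ in range(n_copies)]


if __name__ == "__main__":
    for variant in augment("ACDEFGHIKLMNPQRSTVWY", n_copies=3, mask_rate=0.2):
        print(variant)
```

Each variant keeps the label of its source peptide, so a small labelled set yields several training examples per sequence. To avoid leakage, augmentation would normally be applied after the train/test split.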

Training

mkdir temp
python DeepB3P3.py \
    --train_path 'bbbp/d3_train_a1x8.txt' \
    --test_path 'bbbp/d3_test_a1x8.txt' \
    --result_path 'temp/d1_test.pred.txt' \
    --log_path 'temp/d1_test.txt.log' \
    --max_length 75 \
    --conv1_kernel 10 \
    --conv2_kernel 10 \
    --regCLASS --LR 0.001 --EVALUATE_ALL --NUM_EPOCHS 50
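
For scripted runs, the same command can be assembled from Python. The sketch below only rebuilds the argument list shown above, using the flags exactly as they appear in this README. The helper names `build_command` and `run_training` are hypothetical and not part of the repository, and the example assumes it is run from the repository root.

```python
import subprocess
import sys


def build_command(train_path, test_path, result_path, log_path,
                  num_epochs=50, lr=0.001):
    """Assemble the DeepB3P3.py command line from the README as a list."""
    return [
        sys.executable, "DeepB3P3.py",
        "--train_path", train_path,
        "--test_path", test_path,
        "--result_path", result_path,
        "--log_path", log_path,
        "--max_length", "75",
        "--conv1_kernel", "10",
        "--conv2_kernel", "10",
        "--regCLASS",
        "--LR", str(lr),
        "--EVALUATE_ALL",
        "--NUM_EPOCHS", str(num_epochs),
    ]


def run_training(cmd):
    """Launch the training script and raise if it exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_command(
        "bbbp/d3_train_a1x8.txt",
        "bbbp/d3_test_a1x8.txt",
        "temp/d1_test.pred.txt",
        "temp/d1_test.txt.log",
    )
    # Print the command for inspection; call run_training(cmd) to execute it.
    print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting issues and makes it easy to vary paths or hyperparameters in a loop.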

Alternatively, run experiments across multiple magnitudes of data augmentation with a single script:

mkdir collect
bash run.sh

Analysis

Pretrained model files: Google Drive. Please download the file (163 MB) and unzip it to 'DeepB3P3/collect/8/max75'. Then follow the Jupyter notebook 'DeepB3P3_Analysis.ipynb'.

Reference

@article{ma2023prediction,
  title={A prediction model for blood-brain barrier penetrating peptides based on masked peptide transformers with dynamic routing},
  author={Ma, Chunwei and Wolfinger, Russ},
  journal={Briefings in Bioinformatics},
  volume={24},
  number={6},
  pages={bbad399},
  year={2023},
  publisher={Oxford University Press}
}

Please let me know if you have any questions about this research.
