gv20-therapeutics/antibody-in-pytorch

Machine learning models for antibody sequences in PyTorch

Introduction

Machine learning, and deep learning in particular, is increasingly used to understand antibody sequences in terms of binding specificity, therapeutic potential, and developability. Several models have been proposed and have shown excellent performance on different datasets. We believe there should be an optimal solution for modeling antibody sequences; failing that, people can use transfer learning to keep the good "knowledge" and train a minimal number of parameters for specific tasks. Therefore, we created this public repo to collect and, where needed, re-implement publicly available machine learning models in PyTorch.
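As a minimal sketch of the transfer-learning idea mentioned above (freeze what was already learned, train only a small task-specific part), the snippet below uses plain PyTorch. The encoder is a hypothetical stand-in with random weights, not one of the models in this repo; the vocabulary size, sequence length, and layer sizes are assumptions chosen only for illustration.

```python
import torch
from torch import nn

# Hypothetical stand-in for a pretrained sequence encoder (NOT a model from
# this repo). Assumes 21 token ids and fixed-length sequences of 10 residues.
encoder = nn.Sequential(
    nn.Embedding(num_embeddings=21, embedding_dim=16),
    nn.Flatten(),                 # (batch, 10, 16) -> (batch, 160)
    nn.Linear(10 * 16, 32),
    nn.ReLU(),
)

# Freeze the encoder so its "knowledge" is kept unchanged.
for param in encoder.parameters():
    param.requires_grad = False

# Small task-specific head: the only part that will be trained.
head = nn.Linear(32, 1)
model = nn.Sequential(encoder, head)

# Give the optimizer only the head's parameters.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# Toy batch: 8 random integer-encoded sequences with random binary labels.
tokens = torch.randint(1, 21, (8, 10))
labels = torch.randint(0, 2, (8, 1)).float()

# One training step on the head.
loss = nn.functional.binary_cross_entropy_with_logits(model(tokens), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} of {total}")
```

In this toy setup only the head's weights and bias receive gradients, so the number of trained parameters is a small fraction of the whole model.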

Requirements

  • PyTorch
  • pandas
  • NumPy
  • Scikit-learn

Download and install the package

git clone https://github.com/gv20-therapeutics/antibody-in-pytorch.git
cd antibody-in-pytorch
python setup.py install

Ways to run the machine learning models

Directly run from the command line:

AIPT --help

Run from a local Python script:

python runner.py --help

Run as a module:

python -m AIPT --help

Run from the entry point in the module:

python -m AIPT.entry_point --help

Running without any parameters goes through all the test() functions.
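The behavior described above (an entry point that runs every test() function when no parameters are given) could be sketched as follows. This is not the actual AIPT entry point: the `--model` flag, the `TESTS` registry, and the two test functions are hypothetical names introduced only to illustrate the dispatch pattern with `argparse`.

```python
import argparse


def test_model_a():
    """Hypothetical self-test for one model."""
    return "model_a ok"


def test_model_b():
    """Hypothetical self-test for another model."""
    return "model_b ok"


# Hypothetical registry mapping a model name to its test() function.
TESTS = {"model_a": test_model_a, "model_b": test_model_b}


def main(argv=None):
    """Parse arguments; with no --model given, run every registered test."""
    parser = argparse.ArgumentParser(
        prog="AIPT",
        description="Sketch of a command-line entry point (illustrative only).",
    )
    # `--model` is an assumed flag, not taken from the real CLI.
    parser.add_argument("--model", choices=sorted(TESTS),
                        help="run the test for one model only")
    args = parser.parse_args(argv)

    if args.model is None:
        # No parameter: go through all test functions in registration order.
        return [run_test() for run_test in TESTS.values()]
    return [TESTS[args.model]()]


# Passing an empty argument list mimics running the command with no parameters.
all_results = main([])
one_result = main(["--model", "model_a"])
print(all_results)
print(one_result)
```

A single `main(argv)` function like this can back all four invocation styles: a console script, a local runner script, a package `__main__.py`, and a direct `python -m` call on the module that defines it.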

Antibody datasets

Included deep learning studies

  1. Mason et al. 2019, Deep learning enables therapeutic antibody optimization in mammalian cells by deciphering high-dimensional protein sequence space. [https://www.biorxiv.org/content/10.1101/617860v3]
  2. Liu et al. 2019, Antibody complementarity determining region design using high-capacity machine learning. [https://doi.org/10.1093/bioinformatics/btz895]
  3. Wollacott et al. 2019, Quantifying the nativeness of antibody sequences using long short-term memory networks. [https://academic.oup.com/peds/advance-article/doi/10.1093/protein/gzz031/5554642]

Other deep learning studies

  1. Hu and Liu 2019, DeepBCR: Deep learning framework for cancer-type classification and binding affinity estimation using B cell receptor repertoires. [https://www.biorxiv.org/content/10.1101/731158v1]
  2. Davidsen et al. 2019, Deep generative models for T cell receptor protein sequences. [https://elifesciences.org/articles/46935]
  3. Hu et al. 2019, ACME: pan-specific peptide–MHC class I binding prediction through attention-based deep neural networks. [https://academic.oup.com/bioinformatics/article-abstract/35/23/4946/5497763]
  4. Chen et al. 2020, Predicting Antibody Developability from Sequence using Machine Learning. [https://www.biorxiv.org/content/10.1101/2020.06.18.159798v1]
  5. Beshnova et al. 2020, De novo prediction of cancer-associated T cell receptors for noninvasive cancer detection. [https://stm.sciencemag.org/content/12/557/eaaz3738]
