Efficient evolution from general protein language models

Scripts for running the analysis described in the paper "Efficient evolution of human antibodies from general protein language models".

Running the model

To evaluate the model on a new sequence, clone this repository and run

python bin/recommend.py [sequence]

where [sequence] is the wildtype protein sequence you want to evolve. The script outputs a list of recommended substitutions, each with the number of language models that recommend it.
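To make the "number of recommending language models" concrete, here is a minimal sketch of how such a count could be tallied. It is not the code in `bin/recommend.py`. The function name, the model names, the substitution notation, and the example recommendations are all invented for illustration.

```python
from collections import Counter


def count_recommendations(per_model):
    """Map each substitution to the number of models that recommend it.

    `per_model` maps a model name to the substitutions that model recommends.
    Both the names and the substitutions used below are hypothetical.
    """
    counts = Counter()
    for substitutions in per_model.values():
        # Use a set so that each model counts at most once per substitution.
        counts.update(set(substitutions))
    # Most-recommended first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-model recommendations (not real model output).
example = {
    "model_a": ["A40P", "G55S"],
    "model_b": ["A40P"],
    "model_c": ["A40P", "G55S", "T28I"],
}

for substitution, n_models in count_recommendations(example):
    print(substitution, n_models)
```

Substitutions recommended by more models appear first, which matches the idea of ranking candidates by how many language models agree on them.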

To recommend mutations to antibody variable domain sequences, we ran the above script separately on the heavy chain and light chain sequences.
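The two-chain workflow can be scripted. The following is a minimal sketch, assuming it is run from the repository root and that `bin/recommend.py` takes the sequence as its only positional argument, as in the command above. The helper names `build_command` and `recommend_for_antibody` are hypothetical and not part of this repository, and the script's output is returned as raw text because its exact format is not described here.

```python
import subprocess
import sys


def build_command(sequence):
    """Build the argument list for one run of bin/recommend.py."""
    return [sys.executable, "bin/recommend.py", sequence]


def recommend_for_antibody(heavy_chain, light_chain):
    """Run the script once per chain and return the raw text output of each run."""
    outputs = {}
    for chain_name, sequence in (("heavy", heavy_chain), ("light", light_chain)):
        result = subprocess.run(
            build_command(sequence),
            capture_output=True,  # collect stdout and stderr instead of printing
            text=True,            # decode output as str rather than bytes
            check=True,           # raise CalledProcessError on a non-zero exit
        )
        outputs[chain_name] = result.stdout
    return outputs
```

Calling `recommend_for_antibody(heavy, light)` with your own variable domain sequences returns a dictionary with one entry per chain. Because both runs happen on the same machine, the downloaded models can be reused between them.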

We have also made a Google Colab notebook available. However, this notebook downloads and installs the language models from scratch on every run, and it requires a Colab Pro instance with more memory than the free version of Colab provides. When making many predictions, we recommend the local installation above, which lets you cache and reuse the models.

Paper analysis scripts

To reproduce the analysis in the paper, first download and extract data with the commands:

wget https://zenodo.org/record/6968342/files/data.tar.gz
tar xvf data.tar.gz

To obtain recommended mutations for a given antibody, run the command

bash bin/eval_models.sh [antibody_name]

where [antibody_name] is one of medi8852, medi_uca, mab114, mab114_uca, s309, regn10987, or c143.
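To evaluate every antibody in one pass, the command above can be wrapped in a loop. This is a minimal sketch, assuming it is run from the repository root after the data has been downloaded and extracted. The names `ANTIBODY_NAMES`, `eval_command`, and `run_all` are hypothetical helpers, not part of this repository.

```python
import subprocess

# Antibody names exactly as listed in this README.
ANTIBODY_NAMES = [
    "medi8852",
    "medi_uca",
    "mab114",
    "mab114_uca",
    "s309",
    "regn10987",
    "c143",
]


def eval_command(antibody_name):
    """Build the command for one antibody, rejecting names not listed above."""
    if antibody_name not in ANTIBODY_NAMES:
        raise ValueError(f"unknown antibody name: {antibody_name!r}")
    return ["bash", "bin/eval_models.sh", antibody_name]


def run_all(names=ANTIBODY_NAMES):
    """Run bin/eval_models.sh for each name in turn, stopping at the first failure."""
    for name in names:
        subprocess.run(eval_command(name), check=True)
```

Validating the name before launching the script catches typos early, before any models are loaded.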

Deep mutational scanning (DMS) experiments can be run with the command

bash bin/dms.sh
