Updated on 2024.07.24
This repository provides the official implementation of Prime (Protein language model for Intelligent Masked pretraining and Environment (temperature) prediction).
Key features:
- Zero-shot mutant effect prediction.
- OGT prediction.
Pro-Prime is a novel protein language model that predicts the Optimal Growth Temperature (OGT) and enables zero-shot prediction of protein thermostability and activity. It relies on temperature-guided language modeling.
Main Requirements
biopython==1.81
torch==2.0.1
Installation
pip install -r requirements.txt
https://drive.google.com/file/d/1AEpK3TmgFNszZXJQWwRPkHUugrdHrTgk/view?usp=sharing
- To run the ProteinGym benchmark or zero-shot mutant effect prediction, see this notebook.
- For OGT prediction, see this notebook.
- For Tm prediction, see this notebook.
- For Topt prediction, see this notebook.
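For readers new to zero-shot mutant effect prediction, the idea can be sketched independently of this repository. A common way to score a mutant with a protein language model is the log-odds between the mutant and wild-type residue at each mutated position. The snippet below is a minimal sketch of that scoring rule only. It is not the Prime API: the function names, the `log_probs` layout (one dict of per-residue log-probabilities per position) and the `:` separator for multiple substitutions are assumptions made for illustration. The notebooks remain the reference for how the model is actually run.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_mutant(mutant):
    """Split a substitution such as 'K2R' into (wild type, 1-based position, mutant)."""
    return mutant[0], int(mutant[1:-1]), mutant[-1]


def score_mutant(sequence, mutant, log_probs):
    """Sum log p(mutant residue) - log p(wild-type residue) over all substitutions.

    log_probs[i][aa] is assumed to hold the model's log-probability of residue
    `aa` at 0-based position i of the wild-type sequence (hypothetical layout).
    Multiple substitutions are assumed to be joined with ':' (e.g. 'K2R:T3A').
    """
    total = 0.0
    for substitution in mutant.split(":"):
        wt, pos, mt = parse_mutant(substitution)
        if sequence[pos - 1] != wt:
            raise ValueError(
                f"{substitution}: expected {wt} at position {pos}, "
                f"found {sequence[pos - 1]}"
            )
        total += log_probs[pos - 1][mt] - log_probs[pos - 1][wt]
    return total


if __name__ == "__main__":
    # Toy log-probabilities standing in for real model output.
    sequence = "MKT"
    uniform = {aa: math.log(1 / 20) for aa in AMINO_ACIDS}
    log_probs = [dict(uniform) for _ in sequence]
    log_probs[1]["K"] = math.log(0.1)
    log_probs[1]["R"] = math.log(0.4)
    print(score_mutant(sequence, "K2R", log_probs))
```

A positive score means the model assigns the mutant residue a higher probability than the wild-type residue at that position. Summing over positions treats substitutions as independent, which is a simplification.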
This project is under the MIT license. See LICENSE for details.
Much of the code is adapted from 🤗 transformers and esm.
If you find this repository useful, please consider citing this paper:
@misc{tan2023,
title={Engineering Enhanced Stability and Activity in Proteins through a Novel Temperature-Guided Language Modeling.},
author={Pan Tan and Mingchen Li and Liang Zhang and Zhiqiang Hu and Liang Hong},
year={2023},
eprint={2304.03780},
archivePrefix={arXiv},
primaryClass={q-bio.QM}
}