GPD

The Graphormer-based Protein Design (GPD) model applies the Transformer to a graph-based representation of 3D protein structures and supplements it with Gaussian noise and a random sequence mask applied to node features, thereby enhancing sequence recovery and diversity. On multiple independent tests, GPD performed significantly better than the state-of-the-art model ProteinMPNN, especially in sequence diversity.

Install

Quick Start

The package can be installed directly with pip:

pip install fair-GPD

Note:

For 40-series GPUs, we recommend installing with one of the methods below, because the PyTorch build currently installed by pip may produce errors on these GPUs.

Install with conda

conda create -n GPD
conda activate GPD
conda install pytorch==1.12.1 -c pytorch
conda install -c conda-forge mdtraj==1.9.9
conda install -c anaconda networkx==3.1

GPD can run with CUDA; install the cudatoolkit package that matches your GPU. Alternatively, create the environment from the provided environment.yml file:

conda env create -f environment.yml

Install with pip

Use the provided requirements.txt file to install the dependencies with pip:

pip install -r requirements.txt
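
After either installation route, it can be useful to confirm that the dependencies are visible to Python. The following is a minimal sketch and not part of GPD: the helper name `missing_packages` is hypothetical, and the package list is assumed from the conda commands above (the contents of requirements.txt may differ).

```python
import importlib.util

# Import names of the packages pinned in the conda commands above
# (assumption: requirements.txt lists the same dependencies).
REQUIRED = ("torch", "mdtraj", "networkx")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        # Only import torch once we know it is installed.
        import torch

        print("CUDA available:", torch.cuda.is_available())
```

If the script reports that CUDA is unavailable on a machine with a GPU, the installed PyTorch build or cudatoolkit version is a likely place to look.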

Example

cd test/
sh submit_example_2_fixed.sh  # simple example
sh submit_example_1.sh        # fix some residue positions

Output example:

outputs/example_1_outputs/1tca.fasta

> predicted model_0	acc: 0.3501577287066246	length: 317
APTGAAPPLTLPPATLRAQLAAKGASPEDLKNPVLILHGPGTDGAEDFAGFLVRLLKSKGYTPAYVDPDPN
ALDDIADDLEALALAAKYLAAGLGNKPFNVITHSLGGVALLTALAYHPELRDKIKRVVLVSPLPTGSDSLR
ALLAANTLRLLQFLSVKGSALDDAARKAGALTPLVPTTVIGHANDPLHYPTSLGSPASGAYVPDARVIDLY
SVYGPDFTVDHAEAVFSSLVRKALKAALTSSSGYARASDVGKSLRVSDPAKDLSAEQREAFLNLLAPAAAA
IANGKTGNACPPLPPEYLPAAPGAKGAGGVLTP
> predicted model_1	acc: 0.334384858044164	length: 317
APTGEPLPLLLPDATLLANVEADGADIDEVTNPVLLLHGLGSDGEEALGASLVALLKALGYTPLGVDPDPN
YTDDILDDAQALAAAARALAAGLGNKPLLVVGHSLGGVVVLLALRYNPALADLIASVILVAPAPRGSSEAR
PLIAAKILRPEDFLLLYGSALADALRAAGLDVPLVPTTVIDSADDPLHSPNALLSAESAAYVPGGTVVDLS
DIFGPDFTVSHAGAVLSPFLRKLLEAALASPTGVPREEDVGASLLDLDLAADLTAEERAAALNALAAYAAR
IAAGARFNAYPALPPELVPAAKGATDAAGTLKP
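
Files in this format can be read with a few lines of Python. The sketch below is not part of GPD: the function name `parse_design_fasta` is hypothetical, and the header pattern (name, then `acc:`, then `length:`, separated by whitespace) is an assumption inferred from the sample output above.

```python
import re
from typing import Any, Dict, List

# Header pattern inferred from the sample output, e.g.
# "> predicted model_0<TAB>acc: 0.35<TAB>length: 317".
# This is an assumption, not a documented format.
HEADER = re.compile(
    r"^>\s*(?P<name>.+?)\s+acc:\s*(?P<acc>[\d.]+)\s+length:\s*(?P<length>\d+)\s*$"
)


def parse_design_fasta(text: str) -> List[Dict[str, Any]]:
    """Parse designed sequences and their header fields from FASTA text."""
    records: List[Dict[str, Any]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"unrecognized header: {line!r}")
            records.append(
                {
                    "name": match["name"],
                    "acc": float(match["acc"]),
                    "length": int(match["length"]),
                    "sequence": "",
                }
            )
        elif records:
            # Sequence lines are wrapped, so join them onto the last record.
            records[-1]["sequence"] += line
    return records
```

For example, `parse_design_fasta(open("outputs/example_1_outputs/1tca.fasta").read())` would return one dictionary per designed sequence, assuming the file follows the layout shown above.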

acc is the sequence recovery: the proportion of positions at which the designed sequence has the same amino acid as the native sequence.
length is the length of the designed sequence.
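
The recovery definition can be written out as a short function. This is an illustrative sketch, not GPD's own implementation; the name `sequence_recovery` is hypothetical, and it assumes the native and designed sequences are already aligned position by position and have equal length.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count positions with identical amino acids, then normalize by length.
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Three of the four positions match.
print(sequence_recovery("ACDE", "ACDF"))  # 0.75
```

Under this definition, the acc value of 0.3501577287066246 at length 317 in the sample output corresponds to 111 matching positions.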
