juliecmitchell/beGAN

beGAN generates beta-hairpin peptide sequences of variable length, such as 14-mer, 16-mer, 18-mer, and 20-mer peptides. It is provided as a Jupyter Notebook and as a Python script. Running either requires several dependencies, including pytorch, numpy, pandas, and propy.

To create and activate the beGAN environment using conda:

  • conda create --name beGAN

  • conda activate beGAN

To clone the beGAN code repository to your local machine:

  • git clone https://github.com/juliecmitchell/beGAN.git

To install the required dependencies:

  • conda install seaborn scikit-learn ipywidgets
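Before running the code, it can help to confirm that the dependencies named above are importable in the active environment. The following is a minimal sketch, not part of the beGAN repository; the mapping from install names to import names (for example pytorch to `torch`, propy3 to `propy`) is an assumption based on how these packages are usually imported.

```python
import importlib.util

# Install name -> import name. These import names are assumptions: several
# packages import under a different name than the one used to install them.
REQUIRED = {
    "pytorch": "torch",
    "numpy": "numpy",
    "pandas": "pandas",
    "propy3": "propy",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "ipywidgets": "ipywidgets",
}


def find_missing(packages):
    """Return the install names whose import name cannot be located."""
    return [
        name
        for name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can then be installed with conda or pip before running the script or notebook.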

To run the code from the command line with Python:

  • cd beGAN
  • python beGAN_Pauling33k_run.py

This code will generate 16-mer beta-hairpin peptide sequences with corresponding GP scores.

To run the code interactively using JupyterLab:

  • conda install -c conda-forge jupyterlab

  • jupyter-lab

Then open beGAN_Pauling33k_run.ipynb and run it interactively.

Features for the beGAN model were collected from the AAindex database: [https://www.genome.jp/aaindex/]

  • Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., and Kanehisa, M.; AAindex: amino acid index database, progress report 2008. Nucleic Acids Res. 36, D202-D205 (2008)

Amino acid indices can be extracted using the propy3 package during model training:

  • pip install propy3

[https://github.com/MartinThoma/propy3]
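To illustrate how an amino acid index turns a peptide sequence into numeric features, here is a minimal sketch. It does not use propy3 and is not the feature pipeline used by beGAN; it hardcodes a single index, the Kyte-Doolittle hydropathy scale (AAindex entry KYTJ820101), and the function names `encode` and `mean_index` are hypothetical.

```python
# Kyte-Doolittle hydropathy values (AAindex entry KYTJ820101), hardcoded so
# the example runs offline. beGAN's actual feature set is not shown here.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence, index=KYTE_DOOLITTLE):
    """Map each residue of a peptide to its value in the given index."""
    return [index[residue] for residue in sequence.upper()]


def mean_index(sequence, index=KYTE_DOOLITTLE):
    """Average the per-residue index values over the whole peptide."""
    values = encode(sequence, index)
    return sum(values) / len(values)


if __name__ == "__main__":
    peptide = "GEWTYDDATKTFTVTE"  # illustrative 16-residue input, not beGAN output
    print(encode(peptide))
    print(round(mean_index(peptide), 3))
```

Each AAindex entry is a table of 20 values like the one above, so encoding with several indices gives one feature vector per residue.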

Validations:

The generated beta-hairpin peptide sequences can be further validated by predicting their 3D structures with AlphaFold2 and ESMFold.

Jumper, J., R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, A. Bridgland, C. Meyer, S.A.A. Kohl, A.J. Ballard, A. Cowie, B. Romera-Paredes, S. Nikolov, R. Jain, J. Adler, T. Back, S. Petersen, D. Reiman, E. Clancy, M. Zielinski, M. Steinegger, M. Pacholska, T. Berghammer, S. Bodenstein, D. Silver, O. Vinyals, A.W. Senior, K. Kavukcuoglu, P. Kohli, and D. Hassabis. 2021. Highly accurate protein structure prediction with AlphaFold. Nature. 596:583–589.

Hie, B., S. Candido, Z. Lin, O. Kabeli, R. Rao, N. Smetanin, T. Sercu, and A. Rives. 2022. A high-level programming language for generative protein design. Synthetic Biology
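Structure predictors such as AlphaFold2 and ESMFold commonly accept sequences in FASTA format. The sketch below shows one way to convert a list of peptide sequences into FASTA text; it assumes the sequences are already available as plain strings, and the function name `to_fasta`, the record prefix, and the example sequence are hypothetical rather than taken from the beGAN code.

```python
def to_fasta(sequences, prefix="beGAN_hairpin"):
    """Format peptide sequences as FASTA text, one numbered record each."""
    records = []
    for number, sequence in enumerate(sequences, start=1):
        # Header line starts with '>' followed by a unique identifier.
        records.append(f">{prefix}_{number}\n{sequence}")
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    # Illustrative 16-residue input, not actual beGAN output.
    peptides = ["GEWTYDDATKTFTVTE"]
    with open("hairpins.fasta", "w") as handle:
        handle.write(to_fasta(peptides))
```

The resulting file can then be supplied to whichever structure prediction tool is being used.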

ML-predicted solubility can be tested using Peptide-bio:

Ansari, M., and A.D. White. 2023. Serverless Prediction of Peptide Properties with Recurrent Neural Networks. J. Chem. Inf. Model. 63:2546–2553.
