<a href="https://colab.research.google.com/github/karvla/RFDesign/blob/main/RFDesign_Inpainting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RFDesign Inpainting
Notebook for running [RosettaCommons RFDesign](https://github.com/RosettaCommons/RFDesign).

Currently only inpainting is implemented, but adding support for hallucination should be fairly easy.


In [None]:
#@title ##Install dependencies and download models

import os
import sys
from IPython.utils import io
from google.colab import files

if not os.path.isdir("RFDesign"):
    %shell git clone https://github.com/RosettaCommons/RFDesign.git

    # Download the pretrained model weights
    %shell wget http://files.ipd.uw.edu/pub/rfdesign/weights/BFF_last.pt -P RFDesign/hallucination/weights/rf_Nov05
    %shell wget http://files.ipd.uw.edu/pub/rfdesign/weights/BFF_mix_epoch25.pt -P RFDesign/inpainting/weights/

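    # Install libraries; the installed torch version selects the matching
    # torch-scatter and torch-sparse wheel index below
    import torch
    torch_v = torch.__version__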

    %shell pip install -q dgl-cu113 -f https://data.dgl.ai/wheels/repo.html
    %shell pip install -q torch-scatter -f https://pytorch-geometric.com/whl/torch-{torch_v}.html
    %shell pip install -q torch-sparse -f https://pytorch-geometric.com/whl/torch-{torch_v}.html
    %shell pip install -q torch-geometric
    %shell pip install -q py3Dmol
    %shell pip install -q e3nn
    %shell pip install -q icecream



    # Re-importing torch after installing DGL seems to avoid a problem with DGL;
    # the cause is unknown.
    import torch


In [None]:
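#@title Upload PDB
from pathlib import Path
# files.upload() returns a dict mapping each uploaded filename to its contents
uploaded = files.upload()
if not uploaded:
    raise ValueError("No file was uploaded; please upload a PDB file.")
# Use the first uploaded file as the input structure
pdb_path = Path(next(iter(uploaded)))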
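
Before choosing a value for `contigs` in the next cell, it can help to see which chains and residue numbers the uploaded structure actually contains. The following is a minimal sketch, not part of RFDesign: `chain_residue_ranges` is a hypothetical helper that reads the fixed-width `ATOM` records of a PDB file. The default `contigs` value `A1-3` suggests a chain letter followed by a residue range, but that reading is an assumption; the inpainting README is the authority on the contig syntax.

```python
def chain_residue_ranges(pdb_text):
    """Map each chain ID to the (lowest, highest) residue number in its ATOM records.

    Hypothetical helper for inspecting an input structure; not part of RFDesign.
    """
    ranges = {}
    for line in pdb_text.splitlines():
        # Only ATOM records are considered; HETATM and other records are skipped.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain = line[21]  # PDB column 22: chain identifier
        try:
            resseq = int(line[22:26])  # PDB columns 23-26: residue sequence number
        except ValueError:
            continue  # skip malformed records
        low, high = ranges.get(chain, (resseq, resseq))
        ranges[chain] = (min(low, resseq), max(high, resseq))
    return ranges
```

In this notebook it could be called as `chain_residue_ranges(pdb_path.read_text())` once the upload cell has run, since the uploaded file is saved in the working directory.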

In [None]:
#@title Inpainting
#@markdown Full description: [RFDesign/inpainting/README.md](https://github.com/RosettaCommons/RFDesign/blob/main/inpainting/README.md)
#@markdown 
#@markdown Protein inpainting is a method for "conditional joint protein
#@markdown sequence/structure generation". This means that given some combination of
#@markdown protein sequence and structure that you have, you can use this method to
#@markdown simultaneously generate more sequence and structure conditioned on that input. 
#@markdown 
#@markdown **Things inpainting is good at:**
#@markdown 
#@markdown - Refining non-ideal parts of proteins 
#@markdown - Resampling protein structures near a starting structure 
#@markdown - Re-looping proteins (i.e., keeping tertiary/secondary structure but changing the
#@markdown   order in which elements appear in sequence space)
#@markdown - Rigidly fusing two protein domains 
#@markdown - Loop building 
#@markdown - Scaffolding medium-sized motifs
#@markdown 
#@markdown **Things that are currently challenging with inpainting:**
#@markdown - Generating large amounts of protein from very little or no protein structure
#@markdown   (it's worth a try, but don't expect whole proteins to come out consistently) 
#@markdown - Inpainting with excluded volumes (though it can be done)

assert 'pdb_path' in locals(), "Run the upload cell first to set pdb_path"

contigs = 'A1-3' #@param {type: "string"}
num_designs = 1 #@param {type: "integer"}

# inpaint.py can return a non-zero exit code even when it succeeds,
# so CalledProcessError is ignored below
from subprocess import CalledProcessError
try:
  %shell cd RFDesign/inpainting && \
  python inpaint.py --pdb  ../../{pdb_path.name} \
  --dump_pdb \
  --contigs  {contigs} \
  --num_designs {num_designs} \
  --out ../../{pdb_path.stem}
except CalledProcessError:
  pass
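
Because the inpainting cell ignores the exit status, a run that really failed passes silently. One way to catch that is to check that output files exist before downloading. The sketch below is a suggestion rather than part of RFDesign: `find_outputs` and `require_outputs` are hypothetical helpers, and they assume only what the notebook itself states, namely that outputs are written with the prefix passed to `--out` (the PDB stem) and that the download cell collects everything matching `{pdb_path.stem}_*` into `{pdb_path.stem}_inpaintings.zip`. The exact file names written by `inpaint.py` are not assumed.

```python
from pathlib import Path


def find_outputs(stem, directory="."):
    """Return sorted paths in `directory` whose names start with '<stem>_'.

    The archive created by the download cell is excluded so that a re-run
    does not count an old zip file as a design output.
    """
    prefix = f"{stem}_"
    archive = f"{stem}_inpaintings.zip"
    return sorted(
        p for p in Path(directory).iterdir()
        if p.name.startswith(prefix) and p.name != archive
    )


def require_outputs(stem, directory="."):
    """Return the output paths, or raise if the run produced nothing."""
    outputs = find_outputs(stem, directory)
    if not outputs:
        raise FileNotFoundError(
            f"No files starting with '{stem}_' were found; the inpainting run may have failed."
        )
    return outputs
```

In this notebook, calling `require_outputs(pdb_path.stem)` after the inpainting cell would surface a failed run before the download step.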

In [None]:
#@title Download results
zip_name = f'{pdb_path.stem}_inpaintings.zip'
%shell zip -q -r {zip_name} {pdb_path.stem}_*
files.download(zip_name)