# Fast & Accurate PDB Prediction with ESMFold

Access to AlphaFold2, ESM, and other recent structure-prediction neural networks is great, but what if you don't want to leave Python, don't want to spin up a GPU, want to avoid containerization, or need to massively scale out your PDB file prediction?

You can predict a PDB file for proteins up to 1024+ residues long using the highly accurate ESMFold, which is scaled out and pre-loaded into memory on BioLM.ai. The API docs show an [example protein and PDB string response](https://api.biolm.ai/#ef0eeaf6-380a-4535-98d2-de85cac6d1bb).

In [None]:
from helpers import api_caller  # Helper to make API calls to BioLM
from IPython.display import JSON

TOKEN = ''  # !!! YOUR API TOKEN HERE !!!

In [None]:
SEQ = "MAETAVINHKKRNSPRIVQSNDLEAAYSLSRDQKRMLYLFVDQIRKSDGTLQEHDGICEIHVAKYAEIFGLTSAEASKDIRQALKSFAGKEVVFYRPEEDAGDEKGYESFPWFIKRAHSPSRGLYSVHINPYLIPFFIGLQNRFTQFRLSETKEITNPYAMRLYESLCQYRKPDGSGIVSLKIDWIIERYQLPQSYQRMPDFRRRFLQVCVNEINSRTPMRLSYIEKKKGRQTTHIVFSFRDITSMTTG"

print("Sequence length: {}".format(len(SEQ)))

Sequence length: 249


In [None]:
SLUG = 'esmfold-multichain'  # Model on BioLM.ai to use

# JSON payload to send to the model endpoint
data = {
  "instances": [{
    "data": {"text": SEQ}
  }]
}
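
The payload above wraps a single sequence in an `instances` list. As a minimal sketch of how that structure could be built and checked before sending, the hypothetical helper below (`build_payload` is not part of the BioLM `helpers` module) creates one instance per sequence. Two things here are assumptions rather than documented behavior: that the endpoint accepts more than one instance per request, and that only the 20 standard amino-acid letters are valid input.

```python
# Hypothetical helper, not part of the BioLM helpers module.
# Assumption: only the 20 standard amino-acid letters are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def build_payload(sequences):
    """Return a request body with one instance per sequence."""
    instances = []
    for seq in sequences:
        seq = seq.strip().upper()
        unknown = set(seq) - VALID_RESIDUES
        if not seq or unknown:
            raise ValueError(
                f"Invalid sequence; unexpected characters: {sorted(unknown)}"
            )
        # Same shape as the hand-written payload: {"data": {"text": ...}}
        instances.append({"data": {"text": seq}})
    return {"instances": instances}

payload = build_payload(["MAETAVINHK"])
print(payload)
```

Catching a malformed sequence locally avoids spending an API call on a request that is likely to fail.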

## Make API Request

There is already a server on BioLM with ESMFold loaded into memory, so predictions should be fast.

In [None]:
import time

s = time.time()  # Start time
pdb_pred = await api_caller(
    model_slug=SLUG,
    action='predict',
    data=data,
    api_token=TOKEN
)
e = time.time()  # End time
d = e - s  # Duration

print(f'Response time: {d:.4}s')

Response time: 0.345s


If the model were starting cold, there would be an initial wait of several minutes while this large model loads into memory, after which subsequent API requests would respond normally, without delay. This is known as the model's cold-start time. It is generally not very noticeable, but ESMFold is an exception because it is one of the largest protein models to date.
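
To see the difference between a cold and a warm request, it can help to time several calls in a row. The sketch below wraps any coroutine in a small timing helper. Both `timed` and `fake_predict` are hypothetical names introduced for illustration; `fake_predict` only sleeps and stands in for the real API call, so the durations it prints are simulated, not measured.

```python
import asyncio
import time

async def timed(coro):
    """Await a coroutine and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start

# Stand-in for the real API call (hypothetical); it only sleeps.
async def fake_predict(delay):
    await asyncio.sleep(delay)
    return {"predictions": []}

async def main():
    # The first call simulates a cold start, the second a warm server.
    _, cold = await timed(fake_predict(0.2))
    _, warm = await timed(fake_predict(0.01))
    return cold, warm

cold, warm = asyncio.run(main())
print(f"cold: {cold:.3f}s, warm: {warm:.3f}s")
```

Inside Jupyter, where an event loop is already running, `await main()` replaces `asyncio.run(main())`.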

## Visualize Structure in 3D

We have the PDB file contents as a string. We can use it directly to visualize the structure.

In [None]:
# View the file contents first
import json

json.dumps(pdb_pred)[:1000]  # Look at the first 1000 characters, since PDBs are long...

'{"predictions": [{"pdb": ["PARENT N/A\\nATOM      1  N   MET A   1     -23.877  39.961   4.458  1.00 95.19           N  \\nATOM      2  CA  MET A   1     -23.050  39.282   3.464  1.00 96.49           C  \\nATOM      3  C   MET A   1     -21.917  38.512   4.134  1.00 95.32           C  \\nATOM      4  CB  MET A   1     -22.480  40.286   2.461  1.00 94.53           C  \\nATOM      5  O   MET A   1     -20.957  39.112   4.621  1.00 87.33           O  \\nATOM      6  CG  MET A   1     -23.528  40.905   1.550  1.00 90.04           C  \\nATOM      7  SD  MET A   1     -22.803  42.080   0.342  1.00 92.34           S  \\nATOM      8  CE  MET A   1     -23.056  43.644   1.226  1.00 90.35           C  \\nATOM      9  N   ALA A   2     -22.052  37.250   4.365  1.00 94.99           N  \\nATOM     10  CA  ALA A   2     -21.009  36.419   4.962  1.00 95.55           C  \\nATOM     11  C   ALA A   2     -19.907  36.114   3.952  1.00 92.68           C  \\nATOM     12  CB  ALA A   2     -21.606  35.122
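
In the `ATOM` records above, the last numeric column (for example `95.19`) is the PDB B-factor field, which ESMFold uses to store its confidence score (pLDDT). A minimal sketch for summarizing that confidence is shown below. `mean_plddt` is a hypothetical helper, and averaging over C-alpha atoms only is a choice made for this example, not something the API prescribes. The two sample lines are copied from the response shown above.

```python
def mean_plddt(pdb_text):
    """Average the B-factor column over C-alpha atoms of a PDB string."""
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns (1-indexed): atom name 13-16, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("No C-alpha ATOM records found")
    return sum(scores) / len(scores)

# Two C-alpha records taken from the truncated response above.
sample = (
    "ATOM      2  CA  MET A   1     -23.050  39.282   3.464  1.00 96.49           C  \n"
    "ATOM     10  CA  ALA A   2     -21.009  36.419   4.962  1.00 95.55           C  \n"
)
print(f"Mean pLDDT over C-alpha atoms: {mean_plddt(sample):.2f}")
```

With the full prediction, the same function could be called on `pdb_pred['predictions'][0]['pdb'][0]`.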

In [None]:
# If you wish to view the full result, you can expand the tree in the cell below
JSON(pdb_pred)

<IPython.core.display.JSON object>

In [None]:
# FOR IN-BROWSER JUPYTER-LITE ONLY #
import micropip
await micropip.install('py3Dmol')

In [None]:
import py3Dmol  # Install with `pip install py3Dmol` if running notebook elsewhere

In [None]:
pdb_string = pdb_pred['predictions'][0]['pdb'][0]
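
Since the prediction is a plain string, it can also be written to disk for use in other tools. The sketch below assumes nothing about the API; `save_pdb` is a hypothetical helper, the file name is arbitrary, and the one-line `example` string stands in for `pdb_string`.

```python
import tempfile
from pathlib import Path

def save_pdb(pdb_text, path):
    """Write a PDB string to disk and return the resolved path."""
    out = Path(path)
    out.write_text(pdb_text)
    return out.resolve()

# Stand-in content; in the notebook, pass pdb_string instead.
example = (
    "ATOM      1  N   MET A   1     -23.877  39.961   4.458  1.00 95.19           N  \n"
)
# Written to the system temp directory to keep the working directory clean.
saved = save_pdb(example, Path(tempfile.gettempdir()) / "prediction.pdb")
print(saved)
```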

In [None]:
view = py3Dmol.view(js='https://3Dmol.org/build/3Dmol-min.js', width=800, height=400)
view.addModel(pdb_string, 'pdb')
view.setStyle({'model': -1}, {"cartoon": {'color': 'spectrum'}})
view.zoomTo()

<py3Dmol.view at 0x2a14ca0>

In [None]:
#Link to 