<a href="https://colab.research.google.com/github/khanfs/ComputationalBiology-xGenomics/blob/main/PDB_APIs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Protein Data Bank APIs**

The RCSB Protein Data Bank (PDB) is a structural biology database of 3D structures of proteins, nucleic acids, and complex assemblies. The structural data are generated experimentally by X-ray crystallography (the most common method), nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM). Each entry contains:

1. the 3D coordinates of the atoms, and the bonds connecting them, for proteins, ligands, cofactors, water molecules, and ions
2. 3D visualisations of the protein structures, with ligand interactions if available
3. meta-information on the structural data, such as the PDB ID, the authors, the deposition date, and the structure determination method used
4. structural quality metrics, such as the resolution, which measures the quality of the collected data and is given in Å (angstrom); the lower the value, the higher the quality of the structure.
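
The fields listed above can be pictured as a small record per entry. The following is a minimal sketch, not PyPDB's data model: the `EntrySummary` class, its field names, and the placeholder IDs and values are all hypothetical, chosen only to show that a lower resolution value ranks as higher quality.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EntrySummary:
    """Hypothetical per-entry record; not a PyPDB type."""
    pdb_id: str
    method: str
    resolution_angstrom: Optional[float]  # None when no resolution is reported


def rank_by_resolution(entries: List[EntrySummary]) -> List[EntrySummary]:
    """Sort entries from best (lowest Å) to worst; entries without a value go last."""
    return sorted(
        entries,
        key=lambda e: (e.resolution_angstrom is None, e.resolution_angstrom or 0.0),
    )


# Placeholder entries with made-up values, for illustration only.
entries = [
    EntrySummary("XXX1", "X-ray crystallography", 2.8),
    EntrySummary("XXX2", "NMR", None),
    EntrySummary("XXX3", "cryo-EM", 1.9),
]
ranked = rank_by_resolution(entries)
print([e.pdb_id for e in ranked])
```

Entries with no reported resolution are placed last rather than dropped, so the ranking never silently loses an entry.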

**Resources**
* [Primary PyPDB functions](https://academic.oup.com/view-large/35641249)
* [Functions for searching the RCSB PDB for lists of PDB IDs](https://github.com/williamgilpin/pypdb/blob/master/pypdb/pypdb.py)

**Publications**

* [PyPDB: a Python API for the Protein Data Bank](https://academic.oup.com/bioinformatics/article/32/1/159/1743800)

In [1]:
# Install PyPDB, a Python API for the RCSB PDB
! pip install pypdb

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
! pytest

platform linux -- Python 3.7.13, pytest-3.6.4, py-1.11.0, pluggy-0.7.1
rootdir: /content, inifile:
plugins: typeguard-2.7.1
[1mcollecting 0 items                                                             [0m[1mcollected 0 items                                                              [0m



In [3]:
from IPython.display import HTML

In [4]:
# The wildcard (*) imports every public name from the pypdb package
from pypdb import *

In [8]:
# Return a list of PDB IDs matching a text search query
found_pdbs = Query("ribosome").search()
print(found_pdbs[:10])

['486D', '6C0F', '6CB1', '1T1O', '6EM1', '1T1M', '7OHS', '1QI7', '6YLX', '6EM5']
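
A search over the network may fail or return nothing, so it can help to wrap the call before slicing the result. The sketch below assumes only that the search callable returns a list of ID strings or `None`; `first_hits` is a hypothetical helper, not part of PyPDB, and the stubs stand in for a real query so the example runs offline.

```python
from typing import Callable, List, Optional, Sequence


def first_hits(search: Callable[[], Optional[Sequence[str]]], n: int = 10) -> List[str]:
    """Run a search callable and return at most n IDs, or [] if nothing came back."""
    results = search()
    if not results:  # covers both None and an empty list
        return []
    return list(results[:n])


# Offline stand-ins for a real query (assumption: a real search behaves the same way).
def stub_search() -> List[str]:
    return ["486D", "6C0F", "6CB1"]  # first three IDs from the output above


def failed_search() -> None:
    return None


print(first_hits(stub_search, n=2))
print(first_hits(failed_search))
```

In the notebook, the bound method could be passed directly, for example `first_hits(Query("ribosome").search)`.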


In [7]:
# Search PDB by PubMed ID Number
found_pdbs = Query(27499440, "PubmedIdQuery").search()
print(found_pdbs[:10])

['5IMT', '5IMW', '5IMY']
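
Since `HTML` is already imported from `IPython.display`, the returned IDs can be turned into clickable links. This is a sketch under one assumption not stated in the notebook: that each entry has a page at `https://www.rcsb.org/structure/<ID>`. The helper name `pdb_links_html` is hypothetical.

```python
from html import escape
from typing import Iterable

# Assumed URL pattern for RCSB entry pages (not taken from the notebook).
RCSB_ENTRY_URL = "https://www.rcsb.org/structure/{pdb_id}"


def pdb_links_html(pdb_ids: Iterable[str]) -> str:
    """Build an HTML unordered list with one link per PDB ID."""
    items = []
    for pdb_id in pdb_ids:
        # Normalise case and escape the ID before placing it in markup.
        safe_id = escape(pdb_id.strip().upper())
        url = RCSB_ENTRY_URL.format(pdb_id=safe_id)
        items.append(f'<li><a href="{url}" target="_blank">{safe_id}</a></li>')
    return "<ul>" + "".join(items) + "</ul>"


markup = pdb_links_html(["5IMT", "5IMW", "5IMY"])
print(markup)
```

In a notebook cell, ending with `HTML(markup)` would render the list instead of printing the raw string.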
