# rcsbsearchapi

Access the RCSB advanced search from Python: [rcsbsearchapi.readthedocs.io](https://rcsbsearchapi.readthedocs.io)

    pip install rcsbsearchapi
    
## Demo

We are interested in how the antiviral drug boceprevir interacts with the COVID-19 virus, so we search for structures that meet three criteria:
- Source Organism is "COVID-19 virus"
- Structure title contains "protease"
- Bound to ligand "Boceprevir"

[RCSB Query](http://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22logical_operator%22%3A%22and%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_entity_source_organism.taxonomy_lineage.name%22%2C%22operator%22%3A%22exact_match%22%2C%22value%22%3A%22COVID-19%22%2C%22negation%22%3Afalse%7D%2C%22node_id%22%3A0%7D%2C%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22value%22%3A%22protease%22%2C%22negation%22%3Afalse%7D%2C%22node_id%22%3A1%7D%2C%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22chem_comp.name%22%2C%22operator%22%3A%22contains_words%22%2C%22value%22%3A%22Boceprevir%22%2C%22negation%22%3Afalse%7D%2C%22node_id%22%3A2%7D%5D%7D%2C%22return_type%22%3A%22entry%22%2C%22request_info%22%3A%7B%22query_id%22%3A%2270e677a6376b4c5eba8b4f2b73866c92%22%2C%22src%22%3A%22ui%22%7D%7D)
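
The link above carries the whole search as URL-encoded JSON in its `request` parameter. As a rough sketch using only the standard library, the three criteria can be assembled into that JSON shape and encoded into a link. The field names are copied from the link itself; the helpers `terminal` and `build_search_url` are made up for this illustration and are not part of rcsbsearchapi. The `node_id` and `request_info` bookkeeping fields are left out here, on the assumption that they are optional.

```python
import json
from urllib.parse import parse_qs, quote, urlsplit


def terminal(parameters):
    """Hypothetical helper: wrap parameters in a text-service terminal node."""
    return {"type": "terminal", "service": "text", "parameters": parameters}


# The three criteria from the demo, using the field names seen in the link.
query = {
    "type": "group",
    "logical_operator": "and",
    "nodes": [
        terminal({
            "attribute": "rcsb_entity_source_organism.taxonomy_lineage.name",
            "operator": "exact_match",
            "value": "COVID-19",
            "negation": False,
        }),
        terminal({"value": "protease", "negation": False}),
        terminal({
            "attribute": "chem_comp.name",
            "operator": "contains_words",
            "value": "Boceprevir",
            "negation": False,
        }),
    ],
}
request = {"query": query, "return_type": "entry"}


def build_search_url(request):
    """Hypothetical helper: URL-encode the request JSON into a search link."""
    payload = json.dumps(request, separators=(",", ":"))
    return "https://www.rcsb.org/search?request=" + quote(payload, safe="")


url = build_search_url(request)

# Decoding the link recovers the same request.
decoded = json.loads(parse_qs(urlsplit(url).query)["request"][0])
print(decoded == request)
```

Reading an existing link works the same way in reverse: split off the `request` parameter, percent-decode it, and parse the JSON.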

    from rcsbsearchapi import rcsb_attributes as attrs, TextQuery
    import nglview

## Operator syntax
- Use Python comparison operators for basic attributes (`==`, `<`, `<=`, etc.)
- Combine queries with set operators (`&`, `|`, `~`, etc.)
- Execute a query by calling it like a function
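
These operators work because Python lets a class define what `==`, `>`, `&`, `|`, `~`, and a call mean for its instances. The following is a toy sketch of that mechanism, not rcsbsearchapi's actual implementation: the classes `ToyQuery`, `ToyTerminal`, `ToyGroup`, `ToyNot`, and `ToyAttr` are invented here, the records are made up, and calling a query filters an in-memory list instead of contacting the RCSB server.

```python
import operator


class ToyQuery:
    """Base class giving every query the &, |, ~ operators and call syntax."""

    def __and__(self, other):
        return ToyGroup(all, self, other)

    def __or__(self, other):
        return ToyGroup(any, self, other)

    def __invert__(self):
        return ToyNot(self)

    def __call__(self, records):
        # "Execute" by filtering local dicts; a real client would query a server.
        return [r["id"] for r in records if self.matches(r)]


class ToyTerminal(ToyQuery):
    """A single comparison such as resolution > 1.5."""

    def __init__(self, name, compare, value):
        self.name, self.compare, self.value = name, compare, value

    def matches(self, record):
        return self.compare(record[self.name], self.value)


class ToyGroup(ToyQuery):
    """Two sub-queries joined with all() for & or any() for |."""

    def __init__(self, combine, left, right):
        self.combine, self.left, self.right = combine, left, right

    def matches(self, record):
        return self.combine([self.left.matches(record), self.right.matches(record)])


class ToyNot(ToyQuery):
    """Negation produced by ~."""

    def __init__(self, inner):
        self.inner = inner

    def matches(self, record):
        return not self.inner.matches(record)


class ToyAttr:
    """An attribute name whose comparisons build queries instead of booleans."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        return ToyTerminal(self.name, operator.eq, value)

    def __gt__(self, value):
        return ToyTerminal(self.name, operator.gt, value)


# Made-up records, for illustration only.
records = [
    {"id": "A", "organism": "virus", "resolution": 1.2},
    {"id": "B", "organism": "virus", "resolution": 2.4},
    {"id": "C", "organism": "human", "resolution": 1.0},
]

q1 = ToyAttr("organism") == "virus"
q4 = ToyAttr("resolution") > 1.5
query = q1 & ~q4
print(query(records))  # ['A']
```

The key design point is that a comparison returns a query object rather than `True` or `False`, so nothing is evaluated until the combined query is called.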

    q1 = attrs.rcsb_entity_source_organism.taxonomy_lineage.name == "COVID-19 virus"
    q2 = TextQuery("protease")
    # q3 = attrs.chem_comp.name.contains_words("Boceprevir")
    q4 = attrs.rcsb_entry_info.resolution_combined > 1.5
    query = q1 & q2 & ~q4  # & q3, chem_comp is not an attribute of SchemaComp

    list(query())

    ['7JKV',
     '7K3T',
     '7LKT',
     '7P51',
     '5RGJ',
     '5RGK',
     '5RGR',
     '6XBG',
     '7JQ2',
     '5R82',
     '5RED',
     '5RF3',
     '5RF6',
     '5RF9',
     '5RFB',
     '5RFC',
     '5RFD',
     '5RFE',
     '5RFV',
     '5RFW',
     '5RHB',
     '6W79',
     '7BIJ',
     '7NTQ',
     '7T44',
     '7T46',
     '7AR5',
     '7AR6',
     '7D1M',
     '7K6D',
     '6Y84',
     '6YB7',
     '7DJR',
     '7K40',
     '7X6J',
     '8FWN',
     '5RF8',
     '7AOL',
     '7AWR',
     '7AXM',
     '7NFV',
     '8DI3',
     '7AEH',
     '7AQE',
     '5R8T',
     '5RGW',
     '5RH4',
     '6WNP',
     '6XHM',
     '7KPH',
     '7MB1',
     '7SFH',
     '7VAH',
     '8ACD',
     '8DRT',
     '8DRX',
     '8OKN',
     '7NTS',
     '8OKL',
     '5RL2',
     '5SMN',
     '6XR3',
     '7DVW',
     '7KYU',
     '8DOX',
     '7ZV7',
     '7UV5',
     '7T2T',
     '6WEY',
     '6XKH',
     '7THH',
     '8F46',
     '5RS7',
     '5RS8',
     '5RS9',
     '5RSB',
     '5RSC',
     '5RSD',
     '5RSE',
     '5RSF',
     '5RSG',
     '5RSH',
     '5RSI',
     '5RSJ',
     '5RSK',
     '5RSL',
     '5RSM',
     '5RSN',
     '5RSO',
     '5RSP',
     '5RSQ',
     '5RSR',
     '5RSS',
     '5RST',
     '5RSV',
     '5RSW',
     '5RSY',
     '5RSZ',
     '5RT0',
     '5RT1',
     '5RT2',
     '5RT3',
     '5RT4',
     '5RT5',
     '5RT7',
     '5RT8',
     '5RTA',
     '5RTB',
     '5RTC',
     '5RTD',
     '5RTE',
 

    nglview.show_pdbid('7brp')


## Fluent syntax

A second syntax offers a [fluent interface](https://en.wikipedia.org/wiki/Fluent_interface), similar to popular data science packages such as tidyverse and Apache Spark: function calls are chained together.

Here's an example built around a second antiviral, remdesivir. The drug interferes with RNA polymerase by taking the place of an adenosine nucleotide, which causes early chain termination. When integrated into RNA, the nucleotide formed from remdesivir has residue code F86.

    attrs.struct.title.contains_phrase("RNA polymerase")\
        .or_(attrs.struct.title).contains_words("RdRp")\
        .and_(attrs.rcsb_entity_source_organism.taxonomy_lineage.name).exact_match("COVID-19 virus")\
        .exec()\
        .iquery()

    # .and_(attrs.rcsb_chem_comp_container_identifiers.comp_id).exact_match("F86")\
    # AttributeError: 'SchemaGroup' object has no attribute 'rcsb_chem_comp_container_identifiers'



    ['6XQB',
     '7DTE',
     '7L1F',
     '7OZU',
     '7OZV',
     '7BZF',
     '7C2K',
     '7OYG',
     '6M71',
     '7BW4',
     '7D4F',
     '7DFG',
     '7DFH',
     '7DOI',
     '7DOK',
     '7BTF',
     '7AAP',
     '7B3B',
     '7B3C',
     '7B3D']
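
Chaining of this kind works when every call returns an object that accepts the next call. A toy sketch of the pattern follows. `ToyBuilder` and its `where` method are invented for illustration and only mimic the shape of the chain above; they are not the library's classes, and `exec()` here merely returns the accumulated conditions instead of running a search.

```python
class ToyBuilder:
    """Collects (connector, attribute, operator, value) steps in call order."""

    def __init__(self, steps=(), pending=None):
        self.steps = tuple(steps)
        self.pending = pending  # (connector, attribute) awaiting a comparison

    def where(self, attribute):
        return ToyBuilder(self.steps, ("start", attribute))

    def and_(self, attribute):
        return ToyBuilder(self.steps, ("and", attribute))

    def or_(self, attribute):
        return ToyBuilder(self.steps, ("or", attribute))

    def _compare(self, operator_name, value):
        # Complete the pending attribute with an operator and a value.
        connector, attribute = self.pending
        step = (connector, attribute, operator_name, value)
        return ToyBuilder(self.steps + (step,))

    def contains_phrase(self, value):
        return self._compare("contains_phrase", value)

    def contains_words(self, value):
        return self._compare("contains_words", value)

    def exact_match(self, value):
        return self._compare("exact_match", value)

    def exec(self):
        # A real client would send the query; this toy just returns the steps.
        return list(self.steps)


steps = (
    ToyBuilder()
    .where("struct.title").contains_phrase("RNA polymerase")
    .or_("struct.title").contains_words("RdRp")
    .and_("rcsb_entity_source_organism.taxonomy_lineage.name").exact_match("COVID-19 virus")
    .exec()
)
for step in steps:
    print(step)
```

Each method returns a new builder rather than mutating the old one, so a partial chain can be reused safely. Wrapping the chain in parentheses is an alternative to trailing backslashes.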

    view = nglview.show_pdbid('7B3C', default_representation=False)
    # view.get_state()['_camera_orientation']
    o = [6, 3, 23, 0, 23, 1, -6, 0, -2, 24, -2, 0, -84, -92, -109, 1]
    view.control.orient(o)
    view.add_surface(sele="protein", opacity=0.8, color="electrostatic")
    view.add_cartoon(sele="rna", color="cyan")
    view.add_licorice(sele="rna", color="cyan")
    view.add_spacefill(sele="F86")
    view


## Try it!

[rcsbsearchapi.readthedocs.io](https://rcsbsearchapi.readthedocs.io)