# rcsbsearchapi quickstart

This notebook contains examples from the rcsbsearchapi [quickstart](https://rcsbsearchapi.readthedocs.io/en/latest/quickstart.html).

In [None]:
from rcsbsearchapi.search import TextQuery, Terminal
from rcsbsearchapi import rcsb_attributes as attrs

## Operator syntax

Here is an example from the RCSB Search API page, using the operator syntax. This query finds symmetric dimers having a twofold rotation with the DNA-binding domain of a heat-shock transcription factor.

Note the use of standard comparison operators (`==`, `>`, etc.) on RCSB attributes, and of bitwise operators (`&`, `|`, `~`) to combine queries.

In [None]:
# Create terminals for each query
q1 = TextQuery('"heat-shock transcription factor"')  # same quoted phrase as the fluent example
q2 = attrs.rcsb_struct_symmetry.symbol == "C2"
q3 = attrs.rcsb_struct_symmetry.kind == "Global Symmetry"
q4 = attrs.rcsb_entry_info.polymer_entity_count_DNA >= 1

# Combine the terminals with bitwise operators (&, |, ~)
query = q1 & q2 & q3 & q4  # AND of all four queries

# Call the query to execute it
for assemblyid in query("assembly"): # return type specified as "assembly"
    print(assemblyid)
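
To make the operator syntax less magical, here is a minimal, standard-library-only sketch of how Python operator overloading can turn `&`, `|` and `~` into a query tree. It is not rcsbsearchapi's implementation: the `Node`, `Leaf` and `Group` classes and the dictionary layout are hypothetical, only loosely modeled on a terminal/group structure.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


class Node:
    """Base class that gives every query node the & and | operators."""

    def __and__(self, other: "Node") -> "Group":
        return _combine("and", self, other)

    def __or__(self, other: "Node") -> "Group":
        return _combine("or", self, other)

    def to_dict(self) -> Dict[str, Any]:
        raise NotImplementedError


@dataclass
class Leaf(Node):
    """A single attribute comparison (hypothetical stand-in for a terminal)."""

    attribute: str
    operator: str
    value: Any
    negation: bool = False

    def __invert__(self) -> "Leaf":
        # ~leaf flips the negation flag instead of wrapping the node.
        return Leaf(self.attribute, self.operator, self.value, not self.negation)

    def to_dict(self) -> Dict[str, Any]:
        return {
            "type": "terminal",
            "parameters": {
                "attribute": self.attribute,
                "operator": self.operator,
                "value": self.value,
                "negation": self.negation,
            },
        }


@dataclass
class Group(Node):
    """Several nodes joined by one logical operator ("and" or "or")."""

    logical_operator: str
    nodes: List[Node]

    def to_dict(self) -> Dict[str, Any]:
        return {
            "type": "group",
            "logical_operator": self.logical_operator,
            "nodes": [node.to_dict() for node in self.nodes],
        }


def _combine(op: str, left: Node, right: Node) -> Group:
    # Extend an existing group of the same kind so that a & b & c stays flat.
    if isinstance(left, Group) and left.logical_operator == op:
        return Group(op, [*left.nodes, right])
    return Group(op, [left, right])


symbol = Leaf("rcsb_struct_symmetry.symbol", "exact_match", "C2")
kind = Leaf("rcsb_struct_symmetry.kind", "exact_match", "Global Symmetry")
dna = Leaf("rcsb_entry_info.polymer_entity_count_DNA", "greater_or_equal", 1)

toy_query = symbol & kind & ~dna  # ~ binds tighter than &, so only dna is negated
tree = toy_query.to_dict()
print(tree["logical_operator"], len(tree["nodes"]))
```

Because `~` has higher precedence than `&` in Python, negating one term does not need extra parentheses, while mixing `&` and `|` in one expression usually does.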


Attribute names can be found in the [RCSB schema](http://search.rcsb.org/rcsbsearch/v2/metadata/schema). They can also be found via tab completion, or by iterating:

In [None]:
[a.attribute for a in attrs if "authors" in a.attribute]
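
If you prefer to work from the schema document directly, the sketch below shows one way dotted attribute names could be collected from a JSON Schema style structure. `SAMPLE_SCHEMA` is a hand-written fragment: the attribute names are taken from the examples in this notebook, but the nesting and types are assumptions, not a copy of the real RCSB schema. `iter_attribute_names` is a hypothetical helper.

```python
from typing import Any, Dict, Iterator

# Hand-written fragment in JSON Schema style (assumed layout, not the real schema).
SAMPLE_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "rcsb_struct_symmetry": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string"},
                    "kind": {"type": "string"},
                },
            },
        },
        "rcsb_entry_info": {
            "type": "object",
            "properties": {"polymer_entity_count_DNA": {"type": "integer"}},
        },
    },
}


def iter_attribute_names(schema: Dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield a dotted name for every leaf property in a schema fragment."""
    if schema.get("type") == "array":
        schema = schema.get("items", {})  # descend into the element schema
    properties = schema.get("properties")
    if properties is None:
        if prefix:
            yield prefix  # a leaf: no nested properties left
        return
    for name, child in properties.items():
        yield from iter_attribute_names(child, f"{prefix}.{name}" if prefix else name)


names = list(iter_attribute_names(SAMPLE_SCHEMA))
symmetry_names = [name for name in names if "symmetry" in name]
print(symmetry_names)
```

The final list comprehension mirrors the substring filter used in the cell above, applied to plain strings instead of attribute objects.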

## Fluent syntax

Here is the same example using the fluent syntax:

In [None]:
# Start with an Attr or TextQuery, then add terms
results = TextQuery('"heat-shock transcription factor"') \
    .and_("rcsb_struct_symmetry.symbol").exact_match("C2") \
    .and_("rcsb_struct_symmetry.kind").exact_match("Global Symmetry") \
    .and_("rcsb_entry_info.polymer_entity_count_DNA").greater_or_equal(1) \
    .exec("assembly")

# exec() returns an iterator of IDs
for assemblyid in results:
    print(assemblyid)
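
The fluent style relies on a two-step pattern: `and_(...)` names an attribute, and the following call supplies the comparison. A minimal sketch of that pattern, using only the standard library, is shown here. `ToyQuery` and `PendingTerm` are hypothetical classes written for illustration and say nothing about how rcsbsearchapi is implemented internally.

```python
from typing import Any, Sequence, Tuple

Term = Tuple[str, str, Any]  # (attribute, operator, value)


class ToyQuery:
    """Immutable collection of AND-ed terms (illustration only)."""

    def __init__(self, terms: Sequence[Term] = ()) -> None:
        self.terms: Tuple[Term, ...] = tuple(terms)

    def and_(self, attribute: str) -> "PendingTerm":
        # Step 1: remember the attribute; the comparison comes next.
        return PendingTerm(self, attribute)


class PendingTerm:
    """Half-built term that still needs an operator and a value."""

    def __init__(self, query: ToyQuery, attribute: str) -> None:
        self._query = query
        self._attribute = attribute

    def _finish(self, operator: str, value: Any) -> ToyQuery:
        # Step 2: return a new query so earlier objects are never mutated.
        return ToyQuery(self._query.terms + ((self._attribute, operator, value),))

    def exact_match(self, value: Any) -> ToyQuery:
        return self._finish("exact_match", value)

    def greater_or_equal(self, value: Any) -> ToyQuery:
        return self._finish("greater_or_equal", value)


base = ToyQuery()
chained = (
    base.and_("rcsb_struct_symmetry.symbol").exact_match("C2")
    .and_("rcsb_entry_info.polymer_entity_count_DNA").greater_or_equal(1)
)
print(len(base.terms), len(chained.terms))
```

Returning a fresh object at every step keeps intermediate queries reusable, which is one common design choice for fluent builders.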

## Computed Structure Models

The RCSB PDB Search API page describes how to include Computed Structure Models in a search query. The example below shows how to do this with rcsbsearchapi.

This query returns IDs for both experimental and computed models associated with "hemoglobin". To restrict the results to only computed models or only experimental models, pass a single value in the list.

In [9]:
q1 = TextQuery("hemoglobin")
# Pass return_content_type as a list containing "computational", "experimental", or both
q2 = q1(return_content_type=["computational", "experimental"])
list(q2)

['2GTL',
 '1HV4',
 '4YU4',
 '4V93',
 '1WMU',
 '1XQ5',
 '2Z6N',
 '4YU3',
 '3GOU',
 '3D4X',
 '2QLS',
 '1G08',
 '1G09',
 '1G0A',
 '3PEL',
 '1XZY',
 '3EU1',
 '3GQP',
 '3GQR',
 '3GYS',
 '3MJP',
 '3PI9',
 '3PIA',
 '1HDS',
 '2ZFB',
 '3A59',
 '3WR1',
 '3BCQ',
 '2D2M',
 '2D2N',
 '2PGH',
 '3CIU',
 '1V4W',
 '1V4X',
 '2RAO',
 '1V75',
 '3CY5',
 '1SHR',
 '1SI4',
 '3WTG',
 '1G0B',
 '3GDJ',
 '3K8B',
 '3PI8',
 '6IHX',
 '3BOM',
 '1Y01',
 '1HBR',
 '3D1A',
 '3DHR',
 '1V4U',
 '3FS4',
 '2B7H',
 '1NQP',
 '6II1',
 '1FHJ',
 '6SVA',
 '3GKV',
 '7E96',
 '7E97',
 '7E99',
 '2R80',
 '3EOK',
 '3LQD',
 '1LA6',
 '3NG6',
 '2D5X',
 '3MJU',
 '1IWH',
 '1SPG',
 '1W09',
 '1W0A',
 '1W0B',
 '2H8F',
 '4NI1',
 '1S5X',
 '1S5Y',
 '2AA1',
 '1QPW',
 '3IA3',
 '2QMB',
 '2QU0',
 '3A0G',
 '1FN3',
 '1FSX',
 '5C6E',
 '1Y8H',
 '1Y8I',
 '1Y8K',
 '3NFE',
 '5LFG',
 '1FAW',
 '2DHB',
 '4ODC',
 '4MQJ',
 '2H8D',
 '4H2L',
 '3HYU',
 '4NI0',
 '4MQK',
 '6R2O',
 '7E98',
 '6HBW',
 '5M3L',
 '2RI4',
 '3FH9',
 '3DHT',
 '3S65',
 '3S66',
 '6NQ5',
 '1RTX',
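
When a query mixes both content types, it can be handy to separate the identifiers afterwards. The sketch below does this with the standard library only. It rests on an assumption that is not stated in this notebook: that computed structure model IDs start with a source prefix such as `AF_` or `MA_`, while experimental entries use plain PDB IDs. `split_by_content_type` is a hypothetical helper, and the `AF_...` identifier in the sample is illustrative.

```python
from typing import Dict, Iterable, List

# Assumption: computed structure model IDs carry one of these prefixes.
CSM_PREFIXES = ("AF_", "MA_")


def split_by_content_type(ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group identifiers into experimental and computational lists by prefix."""
    groups: Dict[str, List[str]] = {"experimental": [], "computational": []}
    for identifier in ids:
        is_computed = identifier.upper().startswith(CSM_PREFIXES)
        groups["computational" if is_computed else "experimental"].append(identifier)
    return groups


# Two IDs from the output above plus one illustrative computed-model ID.
sample_ids = ["2GTL", "1HV4", "AF_AFP69905F1"]
groups = split_by_content_type(sample_ids)
print(groups)
```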
 

## Return Types

A query can return different kinds of results depending on the return type specified. The examples below request Polymer Entities, Non-polymer Entities, Polymer Instances, and Molecular Definitions. More information on return types can be found on the RCSB PDB Search API page.

In [12]:
q1 = Terminal("rcsb_entry_container_identifiers.entry_id", "in", ["4HHB"]) # query for 4HHB deoxyhemoglobin

print("Polymer Entities:")
for poly_entity_id in q1("polymer_entity"):  # pass the return type as a string when calling the query
    print(poly_entity_id)
print("Non-polymer Entities:")
for nonpoly_entity_id in q1("non_polymer_entity"):
    print(nonpoly_entity_id)
print("Polymer Instances:")
for poly_instance_id in q1("polymer_instance"):
    print(poly_instance_id)
print("Molecular Definitions:")
for mol_id in q1("mol_definition"):
    print(mol_id)

Polymer Entities:
4HHB_1
4HHB_2
Non-polymer Entities:
4HHB_3
4HHB_4
Polymer Instances:
4HHB.A
4HHB.B
4HHB.C
4HHB.D
Molecular Definitions:
ALA
ARG
ASN
ASP
CYS
GLN
GLU
GLY
HEM
HIS
LEU
LYS
MET
PHE
PO4
PRO
SER
THR
TRP
TYR
VAL
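
The identifiers printed above follow visibly different shapes: entities look like `4HHB_1`, instances like `4HHB.A`, and molecular definitions are bare codes such as `HEM`. A minimal sketch for splitting such strings back into their parts follows. `parse_result_id` and `ParsedId` are hypothetical helpers, and the patterns are inferred only from the output shown here, so other identifier formats fall through to `"other"`.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of an experimental entry ID: a digit plus three alphanumerics.
ENTRY = r"[0-9][A-Za-z0-9]{3}"
_ENTITY = re.compile(rf"({ENTRY})_(\d+)")      # e.g. 4HHB_1
_INSTANCE = re.compile(rf"({ENTRY})\.(\w+)")   # e.g. 4HHB.A


class ParsedId(NamedTuple):
    kind: str                 # "entity", "instance" or "other"
    entry_id: Optional[str]   # parent entry, when it can be recognized
    suffix: str               # entity number, chain label, or the raw ID


def parse_result_id(identifier: str) -> ParsedId:
    """Classify an identifier by the separator it uses."""
    for kind, pattern in (("entity", _ENTITY), ("instance", _INSTANCE)):
        match = pattern.fullmatch(identifier)
        if match:
            return ParsedId(kind, match.group(1), match.group(2))
    return ParsedId("other", None, identifier)


for result_id in ["4HHB_1", "4HHB.A", "HEM"]:
    print(parse_result_id(result_id))
```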


For a more practical example, see the [COVID-19 notebook](covid.ipynb).