<a href="https://colab.research.google.com/github/rcsb/py-rcsb-api/blob/master/notebooks/sequence_coord_quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RCSB PDB Sequence API: Examples

For an introduction, see the [readthedocs: Sequence](https://rcsbapi.readthedocs.io/en/dev-it-docs/search_api/quickstart.html) quickstart.

Start by installing the package:

```pip install rcsb-api```

In [None]:
%pip install rcsb-api

In [None]:
from rcsbapi.sequence import Alignments, Annotations
import json  # for easy-to-read output

## Getting Started

The RCSB PDB Sequence Coordinates API allows querying for alignments between structural and sequence databases, integrating protein positional features from multiple resources. The API supports requests using GraphQL, a language for API queries. This package simplifies generating queries in GraphQL syntax.
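
To make the GraphQL point concrete, here is a minimal sketch that assembles a flat query string by hand from a root field, its arguments, and a list of return fields. The helper `build_graphql_query` is hypothetical and is not part of `rcsbapi`. The root field and argument names below are illustrative only, and the package's own query construction may differ (for example, it may pass some arguments as unquoted enum values).

```python
import json


def build_graphql_query(root: str, args: dict, fields: list) -> str:
    """Assemble a flat GraphQL query string (hypothetical helper, not rcsbapi)."""
    # json.dumps yields a double-quoted, escaped literal, which is also a
    # valid GraphQL string literal.
    rendered_args = ", ".join(
        f"{name}: {json.dumps(value)}" for name, value in args.items()
    )
    body = " ".join(fields)
    return f"{{ {root}({rendered_args}) {{ {body} }} }}"


# Illustrative names only; check the API schema for the real ones.
query_text = build_graphql_query(
    root="annotations",
    args={"queryId": "2UZI.C"},
    fields=["target_id"],
)
print(query_text)
```

Writing such strings by hand gets error-prone as the field list grows, which is the work the query classes in this package take over.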

There are two main types of queries: Alignments and Annotations.

### Alignments

An Alignments query requests the alignments between one object in a supported database and all objects in another supported database.


In [None]:
from rcsbapi.sequence import Alignments

# Fetch alignments between a UniProt Accession and PDB Entities
query = Alignments(
    db_from="UNIPROT",
    db_to="PDB_ENTITY",
    query_id="P01112",
    return_data_list=["query_sequence", "target_alignments", "aligned_regions"]
)
query.exec()
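
The `json` import at the top of this notebook is there to make output easier to read. A minimal sketch of that idea, assuming the value returned by `exec()` is a JSON-serializable dictionary (an assumption, not something this notebook confirms). The names `pretty` and `placeholder_result` are hypothetical, and the placeholder dictionary is a stand-in, not real API output.

```python
import json


def pretty(result: dict) -> str:
    """Render a result dictionary as indented JSON text."""
    return json.dumps(result, indent=2)


# Stand-in value for illustration only; this is NOT real API output.
placeholder_result = {"target_id": "PLACEHOLDER", "features": []}
print(pretty(placeholder_result))
```

In a live session, passing the return value of `query.exec()` to `pretty` instead of the placeholder should give the same kind of indented view, provided the assumption above holds.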

### Annotations

In [None]:
from rcsbapi.sequence import Annotations

# Fetch all positional features for a particular PDB Instance
query = Annotations(
    reference="PDB_INSTANCE",
    sources=["UNIPROT"],
    query_id="2UZI.C",
    return_data_list=["target_id", "features"]
)
query.exec()


## Additional Examples

The sections below cover the other query types (GroupAlignments, GroupAnnotations, and GroupAnnotationsSummary) as well as queries that use filters.

### Annotations Query with Filter

In [None]:
from rcsbapi.sequence import Annotations, AnnotationFilterInput

# Fetch protein-ligand binding sites for PDB Instances that fall within Human Chromosome 1
query = Annotations(
    reference="NCBI_GENOME",
    sources=["PDB_INSTANCE"],
    query_id="NC_000001",
    filters=[
        AnnotationFilterInput(
            field="TYPE",
            operation="EQUALS",
            values=["BINDING_SITE"],
            source="UNIPROT"
        )
    ],
    return_data_list=["features.description"]
)
query.exec()
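
To illustrate what a `TYPE` / `EQUALS` filter expresses, here is a client-side analogy in plain Python. It is only a sketch: the real filtering happens on the server, `LocalFilter` and `matches` are hypothetical names, and two behaviors are assumptions made for this example (that `EQUALS` matches any entry of `values`, and that the `TYPE` field corresponds to a `type` key on each feature). The feature records are placeholders, not API output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalFilter:
    """Hypothetical local stand-in mirroring the filter arguments used above."""
    field: str
    operation: str
    values: list


def matches(feature: dict, flt: LocalFilter) -> bool:
    """Return True if the feature passes the filter (assumed semantics)."""
    # Assumption: the upper-case field name maps to a lower-case record key.
    value = feature.get(flt.field.lower())
    if flt.operation == "EQUALS":
        return value in flt.values
    raise ValueError(f"unsupported operation: {flt.operation}")


# Placeholder records for illustration only.
placeholder_features = [
    {"type": "BINDING_SITE"},
    {"type": "OTHER_TYPE"},
]
flt = LocalFilter(field="TYPE", operation="EQUALS", values=["BINDING_SITE"])
selected = [f for f in placeholder_features if matches(f, flt)]
print(selected)
```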

### Alignments with Range

In [None]:
from rcsbapi.sequence import Alignments

# Only return alignment data that falls within the given range
query = Alignments(
    db_from="NCBI_PROTEIN",
    db_to="PDB_ENTITY",
    query_id="XP_642496",
    range=[1, 100],
    return_data_list=["target_alignments"]
)
query.exec()


### GroupAlignments

In [None]:
from rcsbapi.sequence import GroupAlignments

# Fetch aligned regions and target IDs for the group matching UniProt accession P01112
query = GroupAlignments(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    return_data_list=["target_alignments.aligned_regions", "target_id"],
)
query.exec()


### GroupAlignments with Filter

The `filter` argument specifies which IDs to return results for.

In [None]:
from rcsbapi.sequence import GroupAlignments

# Same group query as above, restricted to the IDs listed in `filter`
query = GroupAlignments(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    return_data_list=["target_alignments.aligned_regions", "target_id"],
    filter=["8CNJ_1", "8FG4_1"]
)
query.exec()
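
Conceptually, the `filter` list acts like an allow-list on IDs. The sketch below shows an equivalent selection done locally in Python. It is an analogy rather than the package's implementation: `keep_targets` is a hypothetical helper, the records are placeholders, and matching on a `target_id` key is an assumption based on the field name requested in `return_data_list`.

```python
def keep_targets(records: list, wanted_ids: list) -> list:
    """Keep only the records whose target_id is in wanted_ids (hypothetical helper)."""
    wanted = set(wanted_ids)  # set membership keeps each lookup cheap
    return [record for record in records if record.get("target_id") in wanted]


# Placeholder records for illustration only; "OTHER_1" is a made-up ID.
placeholder_records = [
    {"target_id": "8CNJ_1"},
    {"target_id": "8FG4_1"},
    {"target_id": "OTHER_1"},
]
kept = keep_targets(placeholder_records, ["8CNJ_1", "8FG4_1"])
print([record["target_id"] for record in kept])
```

Passing the IDs in the query itself, as in the cell above, avoids requesting results that would only be discarded afterwards.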


### GroupAnnotations

Get annotations for the structures in a group.

In [None]:
from rcsbapi.sequence import GroupAnnotations

# Fetch feature names and positions from PDB entities in the group matching UniProt accession P01112
query = GroupAnnotations(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    sources=["PDB_ENTITY"],
    return_data_list=["features.name", "features.feature_positions", "target_id"]
)
query.exec()

### GroupAnnotations with Filter

The `filters` argument specifies which annotations are retrieved.

In [None]:
from rcsbapi.sequence import GroupAnnotations, AnnotationFilterInput

# Same group query as above, restricted to annotations of type BINDING_SITE
query = GroupAnnotations(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    sources=["PDB_ENTITY"],
    filters=[
        AnnotationFilterInput(
            field="TYPE",
            operation="EQUALS",
            values=["BINDING_SITE"],
            source="UNIPROT"
        )
    ],
    return_data_list=["features.name", "features.feature_positions", "target_id"],
)
query.exec()

### GroupAnnotationsSummary

In [None]:
from rcsbapi.sequence import GroupAnnotationsSummary

# Fetch a summary of annotations from PDB instances for the group matching UniProt accession P01112
query = GroupAnnotationsSummary(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    sources=["PDB_INSTANCE"],
    return_data_list=["target_id", "features.type"]
)
query.exec()

### GroupAnnotationsSummary with Filter

The `filters` argument specifies which annotations are retrieved.

In [None]:
from rcsbapi.sequence import GroupAnnotationsSummary, AnnotationFilterInput

# Same summary query as above, restricted to annotations of type BINDING_SITE
query = GroupAnnotationsSummary(
    group="MATCHING_UNIPROT_ACCESSION",
    group_id="P01112",
    sources=["PDB_INSTANCE"],
    filters=[
        AnnotationFilterInput(
            field="TYPE",
            operation="EQUALS",
            values=["BINDING_SITE"],
            source="UNIPROT"
        )
    ],
    return_data_list=["target_id", "features.type"]
)
query.exec()