<a href="https://colab.research.google.com/github/rcsb/rcsb-training-resources/blob/master/training-events/2025/python-rcsb-api/search_data_workflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Install `rcsb-api`
%pip install --upgrade rcsb-api

# Using Search and Data APIs Together

The Search and Data APIs are most powerful when used together.

You can use the Search API to identify structures of interest and then use the Data API to request information from that refined list of IDs.

In the example below, we will use the Search API to find structures that have the HIV protease inhibitor ritonavir bound. Then, we will use the Data API to identify which amino acid residues interact with ritonavir.

## Search API Query

In [None]:
from rcsbapi.search import search_attributes as attrs

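# Search for all structures with ritonavir bound.
# "J05AE03" is the ATC (Anatomical Therapeutic Chemical) code for ritonavir.
q1 = attrs.rcsb_chem_comp_annotation.annotation_lineage.id == "J05AE03"
q2 = attrs.rcsb_chem_comp_annotation.type == "ATC"

# Require both conditions (AND), run the search, and collect the matching IDs
search_query = q1 & q2
search_results = list(search_query())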

Once you have this list of structures, you can request data on each structure's interactions with ritonavir.

## Data API Query

In [None]:
from rcsbapi.data import DataQuery as Query

# Ligand interactions are reported among the polymer instance features
data_query = Query(
    input_type="entries",
    input_ids=search_results,
    return_data_list=["rcsb_polymer_instance_feature"]
)

data_results = data_query.exec()
print(data_results)

Some paths are being autocompleted based on the current API. If this code is meant for long-term use, use the set of fully qualified paths below:
    [
        "polymer_entities.polymer_entity_instances.rcsb_polymer_instance_feature",
    ]


{'data': {'entries': [{'rcsb_id': '1HXW', 'polymer_entities': [{'polymer_entity_instances': [{'rcsb_polymer_instance_feature': [{'type': 'CATH', 'assignment_version': 'v4_3_0', 'provenance_source': 'CATH', 'reference_scheme': None, 'ordinal': 1, 'name': 'Acid Proteases', 'feature_positions': [{'beg_comp_id': None, 'beg_seq_id': 1, 'values': None, 'end_seq_id': 99, 'value': None}], 'feature_id': '2.40.70.10', 'description': None, 'additional_properties': [{'name': 'CATH_NAME', 'values': ['Acid Proteases']}, {'name': 'CATH_DOMAIN_ID', 'values': ['1hxwA00']}]}, {'type': 'SCOP', 'assignment_version': '2.08-stable', 'provenance_source': 'SCOPe', 'reference_scheme': None, 'ordinal': 3, 'name': 'Human immunodeficiency virus type 1 protease', 'feature_positions': [{'beg_comp_id': None, 'beg_seq_id': 1, 'values': None, 'end_seq_id': 99, 'value': None}], 'feature_id': 'd1hxwa_', 'description': None, 'additional_properties': [{'name': 'SCOP_NAME', 'values': ['Human immunodeficiency virus type 1 p
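The fully qualified path in the warning mirrors the nesting of the response: each dot-separated segment is a dictionary key, and each level may hold a list. As a minimal sketch, the path can be followed through a single entry as shown below. The helper `walk_path` and the abbreviated `sample_entry` are illustrative assumptions, not part of `rcsb-api`; the sample keeps only a few fields from the output printed above.

```python
def walk_path(node, path):
    """Yield every item reached by following a dot-separated path.

    Lists are expanded at every level; missing or null keys yield nothing.
    Hypothetical helper for illustration, not part of rcsb-api.
    """
    keys = path.split(".") if isinstance(path, str) else list(path)
    if isinstance(node, list):
        # Apply the remaining path to every element of the list
        for item in node:
            yield from walk_path(item, keys)
        return
    if not keys:
        # Path exhausted: this node is a result
        yield node
        return
    if isinstance(node, dict) and node.get(keys[0]) is not None:
        yield from walk_path(node[keys[0]], keys[1:])


# Abbreviated stand-in for one element of data_results["data"]["entries"]
sample_entry = {
    "rcsb_id": "1HXW",
    "polymer_entities": [
        {
            "polymer_entity_instances": [
                {
                    "rcsb_polymer_instance_feature": [
                        {"type": "CATH", "name": "Acid Proteases",
                         "feature_id": "2.40.70.10"},
                        {"type": "SCOP",
                         "name": "Human immunodeficiency virus type 1 protease",
                         "feature_id": "d1hxwa_"},
                    ]
                }
            ]
        }
    ],
}

FEATURE_PATH = (
    "polymer_entities.polymer_entity_instances.rcsb_polymer_instance_feature"
)

features = list(walk_path(sample_entry, FEATURE_PATH))
for feature in features:
    print(feature["type"], feature["feature_id"])
```

With a real response, the same helper could be applied to each element of `data_results["data"]["entries"]`.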

## Parse Results

The Data API returns its responses as JSON, which `exec()` hands back as a nested Python dictionary, as the printed output above shows. Once you have a response, you can parse it into whatever format is most helpful for you.
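As one possible sketch of such parsing, the snippet below flattens the nested response into one row per feature position and then filters the rows by feature type. The function name `feature_rows` and the abbreviated `sample_response` are assumptions made for illustration; the sample keeps only a few of the fields visible in the printed output above, and a real response contains many more entries and feature types.

```python
def feature_rows(response):
    """Flatten a Data API response into one dict per feature position.

    Hypothetical helper for illustration, not part of rcsb-api.
    `or []` guards against keys that are missing or set to None.
    """
    rows = []
    for entry in response["data"]["entries"]:
        for entity in entry.get("polymer_entities") or []:
            for instance in entity.get("polymer_entity_instances") or []:
                features = instance.get("rcsb_polymer_instance_feature") or []
                for feature in features:
                    for pos in feature.get("feature_positions") or []:
                        rows.append({
                            "entry_id": entry["rcsb_id"],
                            "type": feature.get("type"),
                            "name": feature.get("name"),
                            "beg_seq_id": pos.get("beg_seq_id"),
                            "end_seq_id": pos.get("end_seq_id"),
                        })
    return rows


# Abbreviated stand-in for data_results, based on the output shown above
sample_response = {
    "data": {
        "entries": [
            {
                "rcsb_id": "1HXW",
                "polymer_entities": [
                    {
                        "polymer_entity_instances": [
                            {
                                "rcsb_polymer_instance_feature": [
                                    {
                                        "type": "CATH",
                                        "name": "Acid Proteases",
                                        "feature_positions": [
                                            {"beg_seq_id": 1, "end_seq_id": 99}
                                        ],
                                    },
                                    {
                                        "type": "SCOP",
                                        "name": "Human immunodeficiency virus type 1 protease",
                                        "feature_positions": [
                                            {"beg_seq_id": 1, "end_seq_id": 99}
                                        ],
                                    },
                                ]
                            }
                        ]
                    }
                ],
            }
        ]
    }
}

rows = feature_rows(sample_response)

# Keep only the rows for one feature type
cath_rows = [row for row in rows if row["type"] == "CATH"]
for row in cath_rows:
    print(row)
```

A flat list of dictionaries like this is easy to filter, sort, or load into a table; passing `data_results` instead of `sample_response` would apply the same flattening to the real query output.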

Below, we will