What are all the different ways that HuBMAP data can be accessed programmatically? This survey could be the basis of user-facing documentation, or it could guide efforts to align and unify resources.

In scope:
- HTTP APIs, including those without Python SDK wrappers
- Config files on GitHub, if there's nothing better

Out of scope:
- Code and data which are not freely available
- Tools that are only of use to HuBMAP developers
- Software libraries not in Python
- Bioinformatics tools for handling particular file types

## TSV Download

**Maintainer**: Harvard

**Description**: Simple HTTP interface to pull entity metadata. URL queries supported.

**Backing API**: Search API

**Doc Style**: Short paragraph of Markdown, checked in to `portal-ui`

**Doc URL**: https://portal.hubmapconsortium.org/apis

**Source URL**: https://github.com/hubmapconsortium/portal-ui/blob/main/context/app/routes_api.py#L34

In [2]:
import csv
import io
import urllib.parse

import requests

# Filter datasets by assay type; the query is passed as URL parameters.
query = {'assay_type': 'CODEX'}
url_base = 'https://portal.hubmapconsortium.org/metadata/v0/datasets.tsv'
url_query = urllib.parse.urlencode(query)
tsv_text = requests.get(f'{url_base}?{url_query}').text
# Parse the tab-separated response into one dict per row.
datasets = list(csv.DictReader(io.StringIO(tsv_text), dialect=csv.excel_tab))

datasets[0].keys()

odict_keys(['uuid', 'hubmap_id', 'acquisition_instrument_model', 'acquisition_instrument_vendor', 'analyte_class', 'assay_category', 'assay_type', 'donor.hubmap_id', 'execution_datetime', 'is_targeted', 'number_of_antibodies', 'number_of_channels', 'number_of_cycles', 'operator', 'operator_email', 'pi', 'pi_email', 'preparation_instrument_model', 'preparation_instrument_vendor', 'protocols_io_doi', 'reagent_prep_protocols_io_doi', 'resolution_x_unit', 'resolution_x_value', 'resolution_y_unit', 'resolution_y_value', 'resolution_z_unit', 'resolution_z_value', 'section_prep_protocols_io_doi'])

In [8]:
# The first row holds field descriptions, not a dataset.
datasets[0]['preparation_instrument_model']

'The model number/name of the instrument used to prepare the sample for the assay'

In [7]:
# Skip the description row and collect the distinct values.
{d['preparation_instrument_model'] for d in datasets[1:]}

{'prototype robot - Stanford/Nolan Lab', 'version 1 robot'}
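The cells above show that the first row of the TSV describes each field rather than a dataset. As a minimal sketch, that row can be separated from the data before further processing. The helper name `split_descriptions` and the inline sample are illustrative assumptions, not part of the portal code; the sample stands in for the text of a live response.

```python
import csv
import io


def split_descriptions(tsv_text):
    """Return (descriptions, rows) parsed from portal-style TSV text.

    Assumes the first row after the header describes each field,
    as observed in the output above.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), dialect=csv.excel_tab)
    rows = list(reader)
    if not rows:
        return {}, []
    return rows[0], rows[1:]


# Hypothetical sample: the uuid values and the uuid description are placeholders.
sample_tsv = (
    "uuid\tpreparation_instrument_model\n"
    "Placeholder description\tThe model number/name of the instrument "
    "used to prepare the sample for the assay\n"
    "example-uuid-1\tversion 1 robot\n"
    "example-uuid-2\tprototype robot - Stanford/Nolan Lab\n"
)

descriptions, rows = split_descriptions(sample_tsv)
models = {row['preparation_instrument_model'] for row in rows}
```

Keeping the descriptions in their own dict avoids the `datasets[1:]` slice at every call site.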

## Cells SDK

**AKA**: hubmap-api-py-client

**Maintainer**: CMU / Harvard

**Description**: Idiomatic wrapper around Cells API providing set operations and result filtering

**Backing API**: https://github.com/hubmapconsortium/cross_modality_query AKA "Cells API"

**Doc Style**: Python doctests in Markdown on GitHub

**Doc URL**: https://github.com/hubmapconsortium/hubmap-api-py-client

**Source URL**: https://github.com/hubmapconsortium/hubmap-api-py-client

In [None]:
%pip install hubmap-api-py-client

In [None]:
from hubmap_api_py_client import Client
client = Client('https://cells.dev.hubmapconsortium.org/api/')

gene_symbol = client.select_genes().get_list()[0]['gene_symbol']
cells_with_gene = client.select_cells(where='gene', has=[f'{gene_symbol} > 0.5'], genomic_modality='rna')
assert len(cells_with_gene) > 0

gene_symbol

# Reached out to Sean -- Not sure why it's not working.

## Entity API

**Maintainer**: PSC

**Description**: Wrapper around the Neo4j database-of-record. It has methods for traversing the provenance graph and can return the details of individual entities; it does not provide search functionality.

**Doc Style**: Smart API; Interactive

**Doc URL**: https://smart-api.info/ui/0065e419668f3336a40d1f5ab89c6ba3

**Source URL**: https://github.com/hubmapconsortium/entity-api/

In [21]:
import requests

entity_api_url = 'https://entity.api.hubmapconsortium.org/'

requests.get(f'{entity_api_url}entity-types').json()

['Collection', 'Dataset', 'Donor', 'Sample', 'Upload']

In [25]:
entity_id = 'HBM668.QFDW.774'  # UUID also supported
entity = requests.get(f'{entity_api_url}entities/{entity_id}').json()
entity['title']

'snATAC-seq (SNARE-seq2) [SnapATAC] data from the lung (right) of a 37.0-year-old black or african american male'

In [28]:
ancestors = requests.get(f'{entity_api_url}ancestors/{entity_id}').json()
[(a['entity_type'], a['uuid']) for a in ancestors]

[('Dataset', '4bc9b335040544bc76d87acb189e594a'),
 ('Sample', '27997171ea74885abbd91a99cac360d9'),
 ('Sample', '6e5e5be224d88f38aa390d0c389839c7'),
 ('Sample', '0e1c2d399477b244ac006eb58918ec0c'),
 ('Donor', '4397fcd072ac96299992b47da1dbae64'),
 ('Dataset', '18f644163d1114f46dc67cc75f0a8edd'),
 ('Dataset', 'c277864db8e229bb4336428b5e1e096d'),
 ('Dataset', 'b94df37c7a261274840750d994bc42a9')]
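The ancestors list mixes entity types, so a common next step is to group it by type. This is a minimal sketch: the helper name `group_by_entity_type` is an assumption for illustration, and the sample is a trimmed copy of the output above that keeps only the `entity_type` and `uuid` fields used in this notebook.

```python
from collections import defaultdict


def group_by_entity_type(entities):
    """Map each entity_type to the list of UUIDs of that type, in input order."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity['entity_type']].append(entity['uuid'])
    return dict(grouped)


# Trimmed copy of the ancestors output above; a live response carries more fields.
sample_ancestors = [
    {'entity_type': 'Dataset', 'uuid': '4bc9b335040544bc76d87acb189e594a'},
    {'entity_type': 'Sample', 'uuid': '27997171ea74885abbd91a99cac360d9'},
    {'entity_type': 'Sample', 'uuid': '6e5e5be224d88f38aa390d0c389839c7'},
    {'entity_type': 'Donor', 'uuid': '4397fcd072ac96299992b47da1dbae64'},
]

by_type = group_by_entity_type(sample_ancestors)
```

With a live response, the same helper could be applied directly to the `ancestors` list.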

## TODO

**AKA**:

**Maintainer**:

**Description**:

**Backing API**:

**Doc Style**:

**Doc URL**:

**Source URL**:
