# cellxgene-census

`cellxgene-census` is a Python client for querying the concatenated CELLxGENE datasets.

This notebook shows how to query registered `h5ad` files by their metadata.

For more background, see:

- [CELLxGENE Census](https://chanzuckerberg.github.io/cellxgene-census/)
- [TileDB-SOMA](https://github.com/single-cell-data/TileDB-SOMA)

## Setup



First, load the public instance:

```bash
lamin load laminlabs/cellxgene-census
```

In [None]:
import lamindb as ln
import lnschema_bionty as lb

In [None]:
lb.settings.organism = "human"

In [None]:
ln.track()

## Search metadata

In [None]:
lb.CellType.search("effector Tcell").head()

In [None]:
ln.Transform.filter().df()

## Ontological hierarchies

In [None]:
teff = lb.CellType.filter(id=617).one()

In [None]:
teff.view_parents(distance=2, with_children=True)

In [None]:
teff.children.df()

In [None]:
teff_with_children = [teff.name] + teff.children.list("name")
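
The list built above contains the term itself plus its children. As a minimal sketch of that expansion, the following uses a toy in-memory hierarchy rather than the registry. The mapping `toy_children` and both helper functions are hypothetical and exist only for illustration. The sketch assumes that `children` yields direct children only, so it also shows how a full descendant list would differ.

```python
# Toy parent -> children mapping; the names are illustrative, not registry contents.
toy_children = {
    "effector T cell": ["effector CD4-positive T cell", "effector CD8-positive T cell"],
    "effector CD4-positive T cell": ["T-helper 1 cell"],
    "effector CD8-positive T cell": [],
    "T-helper 1 cell": [],
}


def with_children(name, children_of):
    """Return the term followed by its direct children (one level only)."""
    return [name] + list(children_of.get(name, []))


def with_descendants(name, children_of):
    """Return the term followed by all of its descendants, depth-first, without duplicates."""
    seen = []
    stack = [name]
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.append(term)
        # reversed() keeps the left-to-right order when popping from the stack
        stack.extend(reversed(children_of.get(term, [])))
    return seen


direct = with_children("effector T cell", toy_children)
everything = with_descendants("effector T cell", toy_children)
print(direct)
print(everything)
```

If a query should also match grandchildren and deeper terms, the second form is the one to pass to a `name__in` filter.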

## Query `H5AD` files by metadata

In [None]:
features = ln.Feature.lookup()
assays = lb.ExperimentalFactor.lookup()
cell_types = lb.CellType.lookup()
tissues = lb.Tissue.lookup()
ulabels = ln.ULabel.lookup()
suspension_types = ulabels.is_suspension_type.children.all().lookup()

In [None]:
%%time

ln.File.filter(
    organism=lb.settings.organism,
    cell_types__name__in=teff_with_children,
    tissues=tissues.brain,
    ulabels=suspension_types.cell,
    experimental_factors=assays.ln_10x_3_v3,
).distinct().df()
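
The query above combines all keyword conditions with a logical AND, and `name__in` matches any name in the given list. The sketch below illustrates why `.distinct()` is useful, assuming the query behaves like a Django-style join over linked labels: a file linked to several matching cell types would otherwise appear once per match. The rows and identifiers are made up for illustration and do not come from the instance.

```python
# Toy join rows: one row per (file, tissue, cell type) link; all values are made up.
toy_rows = [
    ("file_a", "brain", "effector T cell"),
    ("file_a", "brain", "effector CD8-positive T cell"),
    ("file_b", "brain", "neuron"),
    ("file_c", "lung", "effector T cell"),
]
wanted_cell_types = {"effector T cell", "effector CD8-positive T cell"}

# AND across conditions; membership test plays the role of `name__in`.
matched = [
    file_id
    for file_id, tissue, cell_type in toy_rows
    if tissue == "brain" and cell_type in wanted_cell_types
]

# dict.fromkeys removes duplicates while keeping first-seen order,
# which is the effect `.distinct()` has on the result set.
distinct_matched = list(dict.fromkeys(matched))
print(matched)
print(distinct_matched)
```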

## Access a queried `H5AD` file

In [None]:
file = ln.File.filter(uid="uttAfutzAzJLIltepw1l").one()
file

Optionally, query for a collection you found on https://cellxgene.cziscience.com/collections:

```python
ln.File.filter(ulabels__name__contains="Mapping single-cell transcriptomes in the intra-tumoral and associated territories of kidney cancer").one()
```

Note that the most recent collections may not have been added yet.

Describe all linked metadata:

In [None]:
file.describe()

Check the corresponding collection/publication:

In [None]:
collection = file.labels.get(features.collection).one()
collection

```{tip}

Use `file.backed()`, `file.stage()`, or `file.load()` to access the underlying `h5ad` file.

See {class}`~lamindb.File` for details.
```

To see how the human part of the instance was created, see {doc}`census-registries`.

To query `cellxgene-census` using LaminDB registries, see {doc}`query-census`.

The full docs are available [here](https://cellxgene-census-lamin-c192.netlify.app/notebooks).

```{toctree}
:maxdepth: 1
:hidden:

census-registries
query-census
```