![scrna3/6](https://img.shields.io/badge/scrna3/6-lightgrey)
[![Jupyter Notebook](https://img.shields.io/badge/Source%20on%20GitHub-orange)](https://github.com/laminlabs/lamin-usecases/blob/main/docs/scrna2.ipynb)
[![lamindata](https://img.shields.io/badge/Source%20%26%20report%20on%20LaminHub-mediumseagreen)](https://lamin.ai/laminlabs/lamindata/record/core/Transform?id=agayZTonayqAz8)

# Query artifacts

Here, we'll query artifacts and inspect their metadata.

This guide can be skipped if you are only interested in how to leverage the overall collection.

In [None]:
import lamindb as ln
import bionty as bt
import anndata as ad

In [None]:
ln.transform.stem_uid = "agayZTonayqA"
ln.transform.version = "1"
ln.track()

## Query artifacts by provenance metadata

In [None]:
users = ln.User.lookup()

In [None]:
ln.Transform.filter(created_by=users.testuser1).search("scrna")

In [None]:
transform = ln.Transform.filter(uid="Nv48yAceNSh85zKv").one()

In [None]:
ln.Artifact.filter(transform=transform).df()

## Query artifacts by biological metadata

In [None]:
assays = bt.ExperimentalFactor.lookup()
organism = bt.Organism.lookup()
cell_types = bt.CellType.lookup()

In [None]:
query = ln.Artifact.filter(
    experimental_factors=assays.single_cell_rna_sequencing,
    organism=organism.human,
    cell_types=cell_types.gamma_delta_t_cell,
)

In [None]:
query.df()

## Inspect artifact metadata

In [None]:
# Fetch all artifacts, then pick the first two for a side-by-side comparison
query_set = ln.Artifact.filter().all()

artifact1, artifact2 = query_set[0], query_set[1]

In [None]:
artifact1.describe()

In [None]:
artifact1.view_lineage()

In [None]:
artifact2.describe()

In [None]:
artifact2.view_lineage()

## Compare features

Here, we compute the genes shared by both artifacts:

In [None]:
artifact1_genes = artifact1.features["var"]
artifact2_genes = artifact2.features["var"]

shared_genes = artifact1_genes & artifact2_genes
len(shared_genes)

In [None]:
shared_genes.list("symbol")[:10]
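The `&` used above keeps only the genes present in both feature sets. As a rough analogy, here is the same idea with plain Python sets. The gene symbols are hypothetical placeholders, and the sketch does not use the lamindb API or reflect the contents of the actual artifacts:

```python
# Hypothetical gene symbols standing in for the "var" features of two artifacts.
genes_1 = {"CD3E", "CD8A", "TRDC", "MS4A1", "NKG7"}
genes_2 = {"CD3E", "TRDC", "NKG7", "LYZ"}

# `&` keeps only the elements present in both sets.
shared = genes_1 & genes_2

# Sets are unordered, so sort before slicing to get a stable preview.
preview = sorted(shared)[:10]
print(len(shared), preview)
```

Sorting before slicing matters in this toy version because Python sets have no defined order. Whether a preview of the queried genes is ordered depends on the underlying query.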

## Compare cell types

In [None]:
artifact1_celltypes = artifact1.cell_types.all()
artifact2_celltypes = artifact2.cell_types.all()

shared_celltypes = artifact1_celltypes & artifact2_celltypes
shared_celltypes_names = shared_celltypes.list("name")
shared_celltypes_names

## Load the individual artifacts

We can either load the artifacts into memory with `.load()` or access them in backed mode with `.backed()`, which loads their content lazily.

Let's load them into memory:

In [None]:
adata1 = artifact1.load()
adata2 = artifact2.load()

We can now subset the two `AnnData` objects to the cell types they share:

In [None]:
adata1_subset = adata1[adata1.obs["cell_type"].isin(shared_celltypes_names)]
adata2_subset = adata2[adata2.obs["cell_type"].isin(shared_celltypes_names)]
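The subsetting above builds a boolean mask over the cells and keeps the rows where the mask is true. A minimal pure-Python sketch of that filtering logic follows. The cell-type labels and the helper `subset_by_labels` are hypothetical and only illustrate the idea; they are not part of AnnData or lamindb:

```python
from collections import Counter

# Hypothetical per-cell annotations standing in for adata.obs["cell_type"].
obs_1 = ["T cell", "B cell", "gamma-delta T cell", "T cell", "monocyte"]
obs_2 = ["gamma-delta T cell", "T cell", "NK cell", "T cell"]

# Cell types annotated in both datasets.
shared_names = set(obs_1) & set(obs_2)


def subset_by_labels(labels, keep):
    """Return (index, label) pairs whose label is in `keep`, like a boolean mask."""
    return [(i, label) for i, label in enumerate(labels) if label in keep]


subset_1 = subset_by_labels(obs_1, shared_names)
subset_2 = subset_by_labels(obs_2, shared_names)

# Count the remaining cells per type to check that both subsets are comparable.
counts_1 = Counter(label for _, label in subset_1)
counts_2 = Counter(label for _, label in subset_2)
print(counts_1, counts_2)
```

Counting cells per shared type after subsetting is a quick sanity check that each subset still contains enough cells of every type for a meaningful comparison.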