# Track sample-level metadata

We already saw how to link data objects to entities representing features during ingestion.

For sample-level metadata, the underlying schema is often more complicated, so linking is best done in a separate step after ingestion.

Here, we walk through this process.

In [None]:
import lamindb as ln
import lnschema_bionty as bt
import lnschema_lamin1 as ln1

ln.track()

Samples, i.e., metadata associated with observations, are linked with the same approach post-ingestion.

We'll need to lazily load relationships of objects, and hence we need to keep track of a session.

In [None]:
ss = ln.Session()

Let's first query an scRNA-seq dataset stored as an `.h5ad` file.

In [None]:
file = ss.select(ln.File, suffix=".h5ad").first()

In [None]:
file

For instance, let's annotate this scRNA-seq dataset with its readout type (scRNA-seq), the tissue, and the species.

## Readout

In [None]:
ro_lookup = bt.Readout.bionty.lookup()
scrnaseq = ro_lookup.single_cell_RNA_sequencing

scrnaseq

In [None]:
readout = bt.Readout(name=scrnaseq.name)

readout

Link the readout against the data object.

In [None]:
file.readouts.append(readout)

## Biosample

In [None]:
biosample = ln1.Biosample(name="Mouse Lymph Node")

### Species

Mouse is already in the database, so let's query it rather than create a new record.

In [None]:
species = ln.select(bt.Species, name="mouse").one()

species

In [None]:
biosample.species = species

### Tissue

In [None]:
tissue_lookup = bt.Tissue.bionty.lookup()

In [None]:
tissue_lookup.lymph_node

In [None]:
tissue = bt.Tissue(name=tissue_lookup.lymph_node.name)

In [None]:
tissue

In [None]:
biosample.tissue = tissue

## Link against file

Link against the data object:

In [None]:
file.biosamples.append(biosample)

## Add to the DB

We can add everything to the DB in one transaction:

In [None]:
ss.add([readout, biosample])
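To illustrate what "one transaction" buys us, here is a minimal sketch using Python's built-in `sqlite3` module rather than LaminDB itself. The two tables, the helper `add_all`, and the record names are hypothetical stand-ins and do not reflect LaminDB's actual schema or API. The point is only that records added in the same transaction are either all written or all rolled back.

```python
import sqlite3

# Hypothetical two-table layout for illustration; not LaminDB's schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readout (name TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE biosample (name TEXT PRIMARY KEY)")


def add_all(conn, readout_name, biosample_name):
    """Insert both records in a single transaction (hypothetical helper)."""
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("INSERT INTO readout VALUES (?)", (readout_name,))
        conn.execute("INSERT INTO biosample VALUES (?)", (biosample_name,))


add_all(conn, "single-cell RNA sequencing", "Mouse Lymph Node")

# The second call violates the biosample primary key, so the readout
# inserted earlier in the same transaction is rolled back as well.
try:
    add_all(conn, "bulk RNA sequencing", "Mouse Lymph Node")
except sqlite3.IntegrityError:
    pass

readout_count = conn.execute("SELECT COUNT(*) FROM readout").fetchone()[0]
biosample_count = conn.execute("SELECT COUNT(*) FROM biosample").fetchone()[0]
print(readout_count, biosample_count)
```

After the failed second call, each table still holds exactly one row: the partial insert did not leak into the database.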

Let us close the session.

In [None]:
ss.close()

````{Tip}

Manage `Session` closing with a context manager instead of closing it manually!

With a context manager, the above would look like:

```{code}
with ln.Session() as ss:
    # manipulate data
```
````
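The reason a context manager is safer than a manual `close()` can be sketched in plain Python. The class `DemoSession` below is a hypothetical stand-in, not the `ln.Session` implementation; it only assumes that a session exposes `close()` and supports the `with` protocol. It shows that the session is closed even when the code inside the block raises.

```python
class DemoSession:
    """Hypothetical stand-in for a session object; not the lamindb API."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close the session, then let any exception propagate.
        self.close()
        return False


try:
    with DemoSession() as ss:
        raise RuntimeError("something failed while manipulating data")
except RuntimeError:
    pass

# The session was closed despite the error inside the block.
closed_after_error = ss.closed
print(closed_after_error)
```

With a manual `ss.close()` at the end of the cell, the same error would have skipped the call and left the session open.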

## Query for linked metadata

In [None]:
ln.select(ln.File).where(
    ln.File.readouts,
    bt.Readout.name == scrnaseq.name,
).df()

In [None]:
ln.select(ln.File).join(ln.File.biosamples).where(
    ln1.Biosample.species, bt.Species.name == "mouse"
).df()
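Conceptually, the second query walks from files through their linked biosamples to the species of each biosample. A rough sketch of that traversal in plain SQL, using the built-in `sqlite3` module, could look as follows. All table and column names here (including the link table `file_biosample`) and the extra rows are assumptions made for illustration; they are not LaminDB's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: files and biosamples are linked many-to-many
# through a link table, and each biosample points to one species.
conn.executescript(
    """
    CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE biosample (
        id INTEGER PRIMARY KEY,
        name TEXT,
        species_id INTEGER REFERENCES species (id)
    );
    CREATE TABLE file_biosample (
        file_id INTEGER REFERENCES file (id),
        biosample_id INTEGER REFERENCES biosample (id)
    );
    INSERT INTO file VALUES (1, 'Mouse Lymph Node scRNA-seq'), (2, 'Unrelated file');
    INSERT INTO species VALUES (1, 'mouse'), (2, 'human');
    INSERT INTO biosample VALUES (1, 'Mouse Lymph Node', 1), (2, 'Other sample', 2);
    INSERT INTO file_biosample VALUES (1, 1), (2, 2);
    """
)

# Join file -> link table -> biosample -> species, then filter on species name.
rows = conn.execute(
    """
    SELECT file.name
    FROM file
    JOIN file_biosample ON file_biosample.file_id = file.id
    JOIN biosample ON biosample.id = file_biosample.biosample_id
    JOIN species ON species.id = biosample.species_id
    WHERE species.name = ?
    """,
    ("mouse",),
).fetchall()
mouse_files = [name for (name,) in rows]
print(mouse_files)
```

Only the file linked to a mouse biosample is returned; the file linked to the human sample is filtered out.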

## What's in the database?

### Biological entities

In [None]:
ln.view(schema="bionty")

### Wetlab

In [None]:
ln.view(schema="lamin1")

In [None]:
# integrity checks
with ln.Session() as ss:
    mouselymph = ss.select(ln.File, name="Mouse Lymph Node scRNA-seq").one()

    mouselymph_hash = mouselymph.hash
    assert mouselymph_hash == "Qprqj0O23197Ko-VobaZiw"

    # the first linked feature set is identified by its id, not a hash
    mouselymph_features_id = mouselymph.features[0].id
    assert mouselymph_features_id == "2Mv3JtH-ScBVYHilbLaQ"