# Track sample-level metadata

We already saw how to link data objects to entities representing features during ingestion.

For sample-level metadata, the underlying schema is often more complicated, so linking is best done in a separate step after ingestion.

Here, we walk through this process.

In [None]:
import lamindb as ln
import lnschema_bionty as lnbt

ln.track()

Samples, i.e., metadata associated with observations, are linked with the same approach post-ingestion.

We'll need to lazily load relationships of objects, and hence we need to keep track of a session.

In [None]:
ss = ln.Session()
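
To illustrate why lazy loading needs an open session, here is a minimal sketch in plain Python. The classes `DemoSession` and `DemoFile` are hypothetical stand-ins invented for this illustration and are not LaminDB's implementation; the point is only that related records are fetched on first access, which requires the session to still be open.

```python
class DemoSession:
    """Hypothetical session that gives access to stored links."""

    def __init__(self, stored_links):
        self._stored_links = stored_links
        self.is_open = True

    def load_links(self, file_id):
        if not self.is_open:
            raise RuntimeError("session is closed, cannot load relationships")
        return list(self._stored_links.get(file_id, []))

    def close(self):
        self.is_open = False


class DemoFile:
    """Hypothetical record whose `readouts` are fetched on first access."""

    def __init__(self, file_id, session):
        self.id = file_id
        self._session = session
        self._readouts = None  # nothing loaded yet

    @property
    def readouts(self):
        if self._readouts is None:  # lazy: ask the session only when needed
            self._readouts = self._session.load_links(self.id)
        return self._readouts


ss_demo = DemoSession({"f1": ["single-cell RNA sequencing"]})

file_open = DemoFile("f1", ss_demo)
loaded = file_open.readouts  # triggers the load while the session is open

file_late = DemoFile("f1", ss_demo)
ss_demo.close()
try:
    file_late.readouts  # first access happens after closing
    failed_after_close = False
except RuntimeError:
    failed_after_close = True

print(loaded, failed_after_close)
```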

Let's first query an scRNA-seq dataset stored as an `.h5ad` file.

In [None]:
file = ss.select(ln.File, suffix=".h5ad").first()

In [None]:
file

For instance, let's annotate an scRNA-seq dataset with its readout type (scRNA-seq) and a cell type; other sample-level metadata, such as the tissue or the species, can be linked analogously.

## Readout

In [None]:
readout_bionty = lnbt.Readout.bionty()  # equivalent to bionty.Readout()
readout_bionty_lookup = readout_bionty.lookup()

In [None]:
readout_bionty_lookup.single_cell_RNA_sequencing
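
The lookup attributes are valid Python identifiers derived from ontology term names. As a rough sketch of how such a mapping could work (the helper `to_identifier` and the exact normalization rule are assumptions for illustration, not Bionty's actual code, and the term names are assumed spellings), non-alphanumeric runs can be replaced with underscores:

```python
import re
from types import SimpleNamespace


def to_identifier(name: str) -> str:
    """Hypothetical helper: turn a term name into an attribute-style key."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name).strip("_")


# Term names assumed for illustration
terms = [
    "single-cell RNA sequencing",
    "CD8-positive, alpha-beta memory T cell",
]

# Attribute access (and tab completion) on the normalized names
lookup = SimpleNamespace(**{to_identifier(term): term for term in terms})

print(lookup.single_cell_RNA_sequencing)
print(lookup.CD8_positive_alpha_beta_memory_T_cell)
```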

In [None]:
readout_record = lnbt.Readout(readout_bionty_lookup.single_cell_RNA_sequencing)

readout_record

## CellType

In [None]:
celltype_bionty = lnbt.CellType.bionty()  # equivalent to bionty.CellType()
celltype_bionty_lookup = celltype_bionty.lookup()

In [None]:
celltype_bionty_lookup.CD8_positive_alpha_beta_memory_T_cell

In [None]:
celltype_record = lnbt.CellType(
    celltype_bionty_lookup.CD8_positive_alpha_beta_memory_T_cell
)

In [None]:
celltype_record

## Link against file

Link metadata records against the data object:

In [None]:
file.readouts.append(readout_record)

In [None]:
file.cell_types.append(celltype_record)

## Add to the DB

We can add everything to the DB in one transaction:

In [None]:
ss.add(file)
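
To make "one transaction" concrete, here is a small sketch with the standard-library `sqlite3` module. It is not what LaminDB runs internally, and the tables `readout` and `cell_type` are hypothetical. It shows the all-or-nothing behavior of a transaction: if any statement fails, none of the records are persisted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readout (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE cell_type (id TEXT PRIMARY KEY, name TEXT)")
conn.commit()

try:
    # Using the connection as a context manager commits on success
    # and rolls back if an exception escapes the block.
    with conn:
        conn.execute("INSERT INTO readout VALUES ('r1', 'single-cell RNA sequencing')")
        conn.execute("INSERT INTO cell_type VALUES ('c1', 'memory T cell')")
        # Duplicate primary key: raises sqlite3.IntegrityError
        conn.execute("INSERT INTO cell_type VALUES ('c1', 'duplicate')")
except sqlite3.IntegrityError:
    pass

n_readouts = conn.execute("SELECT COUNT(*) FROM readout").fetchone()[0]
n_cell_types = conn.execute("SELECT COUNT(*) FROM cell_type").fetchone()[0]
print(n_readouts, n_cell_types)  # the failed transaction left no rows behind
```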

Let us close the session.

In [None]:
ss.close()

````{Tip}

Manage `Session` closing with a context manager instead of closing it manually!

With a context manager, the above would look like:

```{code}
with ln.Session() as ss:
    # manipulate data
```
````
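
As a minimal sketch of why the context manager is the safer choice, consider the hypothetical class `ClosingSession` below (a stand-in written for this illustration, not LaminDB's `Session`). Its `__exit__` method calls `close()` even when the body raises an exception.

```python
class ClosingSession:
    """Hypothetical stand-in for a session object."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()  # runs on normal exit and on exceptions alike
        return False  # do not swallow the exception


try:
    with ClosingSession() as demo_ss:
        raise RuntimeError("something failed mid-session")
except RuntimeError:
    pass

print(demo_ss.closed)
```

With a manual `close()` call, the same exception would skip the closing step unless it were wrapped in `try`/`finally`.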

## Query file from linked metadata

In [None]:
ln.select(
    ln.File.name,
    ln.File.suffix,
    lnbt.Readout.name,
    lnbt.Readout.molecule,
    lnbt.Readout.instrument,
).where(
    ln.File.readouts,
    lnbt.Readout.name.contains("single-cell"),
).df()

In [None]:
ln.select(
    ln.File.name, ln.File.suffix, lnbt.CellType.name, lnbt.CellType.ontology_id
).where(
    ln.File.cell_types,
    lnbt.CellType.name.contains("T cell"),
).df()
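
Conceptually, these queries join files to metadata records through their links and then filter on the metadata. The sketch below expresses the cell type query as SQL using the standard-library `sqlite3` module. The table names, column names, and the `file_cell_type` link table are hypothetical and do not reflect LaminDB's actual schema; the ontology ID is a placeholder, not a real identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE file (id TEXT PRIMARY KEY, name TEXT, suffix TEXT);
    CREATE TABLE cell_type (id TEXT PRIMARY KEY, name TEXT, ontology_id TEXT);
    -- link table: one row per (file, cell type) pair
    CREATE TABLE file_cell_type (
        file_id TEXT REFERENCES file(id),
        cell_type_id TEXT REFERENCES cell_type(id),
        PRIMARY KEY (file_id, cell_type_id)
    );
    INSERT INTO file VALUES ('f1', 'Mouse Lymph Node scRNA-seq', '.h5ad');
    INSERT INTO file VALUES ('f2', 'unlinked file', '.h5ad');
    INSERT INTO cell_type VALUES ('c1', 'memory T cell', 'placeholder:0001');
    INSERT INTO file_cell_type VALUES ('f1', 'c1');
    """
)

# Join through the link table, then filter on the cell type name
rows = conn.execute(
    """
    SELECT file.name, file.suffix, cell_type.name, cell_type.ontology_id
    FROM file
    JOIN file_cell_type ON file_cell_type.file_id = file.id
    JOIN cell_type ON cell_type.id = file_cell_type.cell_type_id
    WHERE cell_type.name LIKE '%T cell%'
    """
).fetchall()

print(rows)
```

Only the linked file is returned; the file without a link is dropped by the inner join.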

## What's in the database?

### Biological entities

In [None]:
ln.view(schema="bionty")

In [None]:
# integrity checks
with ln.Session() as ss:
    mouselymph = ss.select(ln.File, name="Mouse Lymph Node scRNA-seq").one()

    mouselymph_hash = mouselymph.hash
    assert mouselymph_hash == "Qprqj0O23197Ko-VobaZiw"

    # this check compares the ID of the first linked feature set, not a hash
    mouselymph_features_id = mouselymph.features[0].id
    assert mouselymph_features_id == "2Mv3JtH-ScBVYHilbLaQ"
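
The file hash checked above is 22 characters long and uses URL-safe characters, which is what an unpadded URL-safe base64 encoding of a 16-byte digest looks like. The sketch below shows one way to produce a string of that shape. MD5 as the digest and the helper name `content_hash` are assumptions for illustration; this section does not state which algorithm LaminDB actually uses.

```python
import base64
import hashlib


def content_hash(data: bytes) -> str:
    """Hypothetical helper: 16-byte digest as unpadded URL-safe base64."""
    digest = hashlib.md5(data).digest()  # assumption: MD5, 16 bytes
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


h1 = content_hash(b"example file content")
h2 = content_hash(b"example file content")
h3 = content_hash(b"different content")

print(len(h1), h1 == h2, h1 == h3)
```

The same content always yields the same string, so comparing against a stored value detects any change to the underlying bytes.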