# Link data objects to sample-level metadata

We already saw how to link data objects to entities representing features during ingestion.

For sample-level metadata, the underlying schema is often more complicated, so linking is best done in a separate step after ingestion.

Here, we walk through this process.

In [None]:
import lamindb as ln
import lamindb.schema as lns
import lamindb.knowledge as lnk

ln.nb.header()

Sample-level metadata, i.e., metadata associated with observations, is linked with the same approach, but after ingestion.

Let's first query an scRNA-seq dataset stored as an `.h5ad` file.

In [None]:
dobject = ln.select(lns.DObject, suffix=".h5ad").first()

In [None]:
dobject

Let's annotate this scRNA-seq dataset with its species, its tissue, and its readout type (scRNA-seq).

## Species

In [None]:
species = lns.bionty.Species(common_name="mouse")

In [None]:
species

## Tissue

In [None]:
tissue_lookup = lnk.Tissue().lookup

In [None]:
tissue_lookup.afferent_lymphatic_vessel
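
To make the attribute-style lookup less opaque, here is a minimal sketch of how such an object could be built in plain Python. It is not how `lnk` implements lookups; `make_lookup`, `demo_terms`, and the `TISSUE:...` identifiers are hypothetical placeholders introduced for illustration only.

```python
import re
from types import SimpleNamespace


def make_lookup(terms):
    """Build an attribute-style lookup from {human-readable name: identifier}."""

    def to_field(name):
        # Collapse every run of non-word characters into a single underscore
        field = re.sub(r"\W+", "_", name.strip()).strip("_")
        # Attribute names cannot start with a digit, so prefix those (and empty names)
        if not field or field[0].isdigit():
            field = f"_{field}"
        return field

    return SimpleNamespace(**{to_field(name): ident for name, ident in terms.items()})


# Hypothetical placeholder identifiers, not real ontology IDs
demo_terms = {
    "afferent lymphatic vessel": "TISSUE:0001",
    "lymph node": "TISSUE:0002",
}
demo_lookup = make_lookup(demo_terms)
print(demo_lookup.afferent_lymphatic_vessel)
```

The benefit of this pattern is tab completion: term names become attributes, so a typo fails immediately instead of silently producing a wrong identifier.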

In [None]:
tissue = lns.bionty.Tissue(id=tissue_lookup.afferent_lymphatic_vessel)

In [None]:
tissue

## Readout

In [None]:
lnk.lookup.readout.single_cell_RNA_sequencing

In [None]:
readout = lns.wetlab.Readout(efo_id=lnk.lookup.readout.single_cell_RNA_sequencing)

In [None]:
readout

## Create a biosample record

In [None]:
biosample = lns.wetlab.Biosample(name="Mouse Lymph Node")

In [None]:
biosample.tissue = tissue

In [None]:
biosample.species_id = species.id  # need to have a better UX around this

Currently, we still work with an additional entity, `biometa`. We'll likely drop it.

In [None]:
biometa = lns.wetlab.Biometa(name="Mouse Lymph Node scRNA-seq")

In [None]:
biometa.readout = readout

In [None]:
biometa.biosample = biosample

Link against the data object:

In [None]:
biometa.dobjects.append(dobject)
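
The records built so far form a small graph: a biosample carries species and tissue, a biometa carries the readout and the biosample, and the biometa in turn points at one or more data objects. The sketch below mirrors that shape with plain dataclasses. It is an assumption-laden illustration, not the LaminDB schema: the class definitions, the use of strings in place of species, tissue, and readout records, and all `demo_` names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DObject:
    name: str
    suffix: str


@dataclass
class Biosample:
    name: str
    species: Optional[str] = None  # stands in for a species record
    tissue: Optional[str] = None  # stands in for a tissue record


@dataclass
class Biometa:
    name: str
    readout: Optional[str] = None  # stands in for a readout record
    biosample: Optional[Biosample] = None
    # One biometa can be linked to several data objects
    dobjects: List[DObject] = field(default_factory=list)


demo_dobject = DObject(name="demo", suffix=".h5ad")
demo_biosample = Biosample(
    name="Mouse Lymph Node", species="mouse", tissue="afferent lymphatic vessel"
)
demo_biometa = Biometa(
    name="Mouse Lymph Node scRNA-seq", readout="scRNA-seq", biosample=demo_biosample
)
demo_biometa.dobjects.append(demo_dobject)

# Everything is now reachable from the biometa record
print(demo_biometa.biosample.tissue, [d.name for d in demo_biometa.dobjects])
```

Because every record is reachable from the biometa, handing that single object to the database layer is enough to persist the whole graph.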

## Add to the DB

We can add everything to the DB in one transaction:

In [None]:
ln.add(biometa)
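
To illustrate what "one transaction" buys us, here is a minimal sketch using Python's built-in `sqlite3` module: either all related rows are written, or none are. This is not what `ln.add` does internally; the two-table schema and the helper `add_all` are simplified assumptions.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript(
    """
    CREATE TABLE biosample (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE biometa (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        biosample_id INTEGER REFERENCES biosample (id)
    );
    """
)


def add_all(conn, biosample_name, biometa_name):
    """Insert a biosample and its biometa atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO biosample (name) VALUES (?)", (biosample_name,)
        )
        conn.execute(
            "INSERT INTO biometa (name, biosample_id) VALUES (?, ?)",
            (biometa_name, cur.lastrowid),
        )


add_all(demo_conn, "Mouse Lymph Node", "Mouse Lymph Node scRNA-seq")

try:
    # The second insert violates NOT NULL, so the first insert is rolled back too
    add_all(demo_conn, "Orphan sample", None)
except sqlite3.IntegrityError:
    pass

n_biosamples = demo_conn.execute("SELECT COUNT(*) FROM biosample").fetchone()[0]
n_biometas = demo_conn.execute("SELECT COUNT(*) FROM biometa").fetchone()[0]
print(n_biosamples, n_biometas)
```

The failed call leaves no half-written biosample behind, which is the property that matters when several linked records are added together.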

## Query for linked metadata

In [None]:
stmt = ln.select(lns.wetlab.Biometa).join(
    lns.wetlab.Readout, efo_id=lnk.lookup.readout.single_cell_RNA_sequencing
)
stmt.df()

In [None]:
ln.select(lns.DObject).join(lns.DObject.biometas).where(
    lns.wetlab.Biometa.id == stmt.first().id
).df()
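
The two queries above can be read as ordinary SQL joins: first find the biometa record by its readout, then follow a link table to the data objects. The sketch below reproduces that logic with `sqlite3`. The table layout, the link table name `dobject_biometa`, and the `READOUT:...` identifiers are assumptions made for illustration and need not match the actual LaminDB schema.

```python
import sqlite3

query_conn = sqlite3.connect(":memory:")
query_conn.executescript(
    """
    CREATE TABLE readout (id INTEGER PRIMARY KEY, efo_id TEXT);
    CREATE TABLE biometa (
        id INTEGER PRIMARY KEY,
        name TEXT,
        readout_id INTEGER REFERENCES readout (id)
    );
    CREATE TABLE dobject (id INTEGER PRIMARY KEY, name TEXT, suffix TEXT);
    -- Link table for the many-to-many relation between dobject and biometa
    CREATE TABLE dobject_biometa (
        dobject_id INTEGER REFERENCES dobject (id),
        biometa_id INTEGER REFERENCES biometa (id)
    );
    INSERT INTO readout VALUES (1, 'READOUT:A'), (2, 'READOUT:B');
    INSERT INTO biometa VALUES (1, 'Mouse Lymph Node scRNA-seq', 1), (2, 'Other', 2);
    INSERT INTO dobject VALUES (1, 'lymph_node', '.h5ad'), (2, 'unrelated', '.h5ad');
    INSERT INTO dobject_biometa VALUES (1, 1), (2, 2);
    """
)

# Step 1: select the biometa record whose readout has the given identifier
biometa_id = query_conn.execute(
    """
    SELECT biometa.id
    FROM biometa
    JOIN readout ON biometa.readout_id = readout.id
    WHERE readout.efo_id = ?
    """,
    ("READOUT:A",),
).fetchone()[0]

# Step 2: follow the link table from that biometa to its data objects
linked_names = [
    row[0]
    for row in query_conn.execute(
        """
        SELECT dobject.name
        FROM dobject
        JOIN dobject_biometa ON dobject.id = dobject_biometa.dobject_id
        WHERE dobject_biometa.biometa_id = ?
        """,
        (biometa_id,),
    )
]
print(linked_names)
```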