# Link data objects to sample-level metadata

We already saw how to link data objects to entities representing features during ingestion.

For sample-level metadata, the underlying schema is often more complicated, so linking is best done in a separate step after ingestion.

Here, we walk through this process.

In [None]:
import lamindb as ln
import lamindb.schema as lns
import lamindb.knowledge as lnk

ln.nb.header()

Samples, i.e., metadata associated with observations, are linked with the same approach post-ingestion.

We'll need to lazily load relationships of objects, and hence we need to keep track of a session.

In [None]:
ss = ln.Session()

Let's first query an scRNA-seq dataset stored as an `.h5ad` file.

In [None]:
dobject = ss.select(ln.DObject, suffix=".h5ad").first()

In [None]:
dobject

For instance, let's annotate an scRNA-seq dataset with its readout type (scRNA-seq), the tissue, and the species.

## Readout

In [None]:
lnk.lookup.readout.single_cell_RNA_sequencing

In [None]:
readout = lns.wetlab.Readout(efo_id=lnk.lookup.readout.single_cell_RNA_sequencing)

Link the readout against the data object.

In [None]:
dobject.readouts.append(readout)

## Biosample

In [None]:
biosample = lns.wetlab.Biosample(name="Mouse Lymph Node")

### Species

In [None]:
species = lns.bionty.Species(name="mouse")

In [None]:
species

In [None]:
# the species record already exists, hence we only link the id instead of adding the full record
# need to have a more intuitive UX around this that clarifies why we have to do this
biosample.species_id = species.id
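The reason for linking by id can be illustrated without LaminDB. The following is a minimal sketch using Python's built-in `sqlite3`. The tables `species` and `biosample`, their columns, and the id `sp-1` are assumptions made up for illustration and do not reflect LaminDB's actual schema. Inserting a species row that already exists violates the primary key, whereas storing only its id on the biosample row reuses the existing record.

```python
import sqlite3

# Hypothetical two-table schema; not LaminDB's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE species (id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE biosample (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        species_id TEXT REFERENCES species (id)
    );
    -- the species record already exists in the database
    INSERT INTO species (id, name) VALUES ('sp-1', 'mouse');
    """
)

# Adding the full species record again fails because the row already exists.
try:
    conn.execute("INSERT INTO species (id, name) VALUES ('sp-1', 'mouse')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Linking by id reuses the existing species row instead.
conn.execute(
    "INSERT INTO biosample (name, species_id) VALUES (?, ?)",
    ("Mouse Lymph Node", "sp-1"),
)
linked = conn.execute(
    "SELECT b.name, s.name FROM biosample AS b "
    "JOIN species AS s ON s.id = b.species_id"
).fetchall()
print(duplicate_rejected, linked)
```
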

### Tissue

In [None]:
tissue_lookup = lnk.Tissue().lookup

In [None]:
tissue_lookup.lymph_node

In [None]:
tissue = lns.bionty.Tissue(ontology_id=tissue_lookup.lymph_node)

In [None]:
tissue

In [None]:
biosample.tissue = tissue

## Link against dobject

Link the biosample against the data object:

In [None]:
dobject.biosamples.append(biosample)

## Add to the DB

We can add everything to the DB in one transaction:

In [None]:
ss.add([readout, biosample])

Let us close the session.

In [None]:
ss.close()

````{Tip}

Manage `Session` closing with a context manager instead of closing it manually!

With a context manager, the above would look like:

```{code}
with ln.Session() as ss:
    # manipulate data
    ...
```

````
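To illustrate what a context manager guarantees, here is a minimal sketch with a hypothetical `DemoSession` class standing in for a session. It is not LaminDB code and only mimics a `close()` method. The `with` block closes the session even when its body raises an exception.

```python
class DemoSession:
    """Hypothetical stand-in for a session; not part of LaminDB."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close; returning False lets any exception propagate.
        self.close()
        return False


try:
    with DemoSession() as demo:
        raise RuntimeError("something went wrong while manipulating data")
except RuntimeError:
    pass

# The session was closed although the body raised.
print(demo.closed)
```

With manual closing, the same guarantee would require wrapping the body in `try`/`finally`.
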

## Query for linked metadata

In [None]:
ln.select(ln.DObject).where(
    ln.DObject.readouts,
    lns.wetlab.Readout.efo_id == lnk.lookup.readout.single_cell_RNA_sequencing,
).df()

In [None]:
ln.select(ln.DObject).join(ln.DObject.biosamples).where(
    lns.wetlab.Biosample.species, lns.bionty.Species.name == "mouse"
).df()
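Conceptually, such queries follow links between tables. The sketch below again uses `sqlite3` rather than LaminDB. The tables `dobject`, `biosample`, `species`, the link table `dobject_biosample`, and all rows in them are made up for illustration and are not LaminDB's actual schema. It shows the kind of join that a query for data objects by species name amounts to.

```python
import sqlite3

# Hypothetical schema and demo rows; not LaminDB's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dobject (id TEXT PRIMARY KEY, suffix TEXT);
    CREATE TABLE species (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE biosample (
        id TEXT PRIMARY KEY,
        name TEXT,
        species_id TEXT REFERENCES species (id)
    );
    -- many-to-many link between data objects and biosamples
    CREATE TABLE dobject_biosample (
        dobject_id TEXT REFERENCES dobject (id),
        biosample_id TEXT REFERENCES biosample (id),
        PRIMARY KEY (dobject_id, biosample_id)
    );
    INSERT INTO dobject VALUES ('do-1', '.h5ad'), ('do-2', '.fastq');
    INSERT INTO species VALUES ('sp-1', 'mouse'), ('sp-2', 'human');
    INSERT INTO biosample VALUES
        ('bs-1', 'Mouse Lymph Node', 'sp-1'),
        ('bs-2', 'Human Blood', 'sp-2');
    INSERT INTO dobject_biosample VALUES ('do-1', 'bs-1'), ('do-2', 'bs-2');
    """
)

# Select data objects whose linked biosample belongs to the given species.
mouse_dobjects = conn.execute(
    """
    SELECT d.id, d.suffix
    FROM dobject AS d
    JOIN dobject_biosample AS link ON link.dobject_id = d.id
    JOIN biosample AS b ON b.id = link.biosample_id
    JOIN species AS s ON s.id = b.species_id
    WHERE s.name = ?
    """,
    ("mouse",),
).fetchall()
print(mouse_dobjects)
```
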