# Knowledge-aware entities

To ease data integration, metadata fields are curated against standardized vocabularies. This process is often manual and time-consuming, because it requires looking up each metadata term and deciding which one to use.

With {class}`lamindb.knowledge` ([bionty](https://lamin.ai/docs/bionty)), we offer lookups via tab completion of names.

In [None]:
import lamindb as ln
import lamindb.schema as lns
import lamindb.knowledge as lnk

ln.nb.header()

## Knowledge lookup

For instance, you can retrieve the Cell Type ontology id of "gamma delta T cell":

In [None]:
ct_lookup = lnk.CellType().lookup

In [None]:
ct_lookup.gamma_delta_T_cell
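
One way to picture such a lookup object is a namespace whose attribute names are derived from term names, which is what makes tab completion possible. The following is a minimal sketch under that assumption, not how `lamindb.knowledge` is implemented: `build_lookup` is a hypothetical helper, and the three Cell Ontology entries are illustrative.

```python
import re
from types import SimpleNamespace


def build_lookup(terms):
    """Expose term names as attributes so an interactive shell can tab-complete them.

    Hypothetical helper for illustration; not part of lamindb.
    """
    attrs = {}
    for name, ontology_id in terms.items():
        # Replace every run of characters that is invalid in an identifier with "_".
        key = re.sub(r"\W+", "_", name).strip("_")
        attrs[key] = ontology_id
    return SimpleNamespace(**attrs)


# Illustrative name -> ontology id pairs (assumed for this example).
terms = {
    "T cell": "CL:0000084",
    "B cell": "CL:0000236",
    "gamma-delta T cell": "CL:0000798",
}
lookup = build_lookup(terms)
print(lookup.gamma_delta_T_cell)
```

Typing `lookup.` followed by the tab key in an interactive session would then list the available term names.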

## Knowledge-aware schema tables

You can also directly create a record for the CellType table in lamindb:

In [None]:
lns.bionty.CellType(ontology_id=ct_lookup.gamma_delta_T_cell)

Similarly, assay readouts can be looked up:

```{note}

Names in the lookup namespace can only start with a letter, so we prefix `LOOKUP_` to terms that start with any other character.
```

In [None]:
lnk.lookup.readout.LOOKUP_10x_3_v1
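
The naming rule from the note can be sketched in a few lines. This is an illustration only: `to_lookup_key` is a hypothetical helper, and the raw readout name `"10x 3' v1"` is an assumed spelling chosen so that it maps to the attribute used above.

```python
import re


def to_lookup_key(name, prefix="LOOKUP_"):
    """Turn a term name into a valid Python attribute name (hypothetical helper)."""
    # Collapse every run of non-word characters into a single underscore.
    key = re.sub(r"\W+", "_", name).strip("_")
    # Attribute names cannot start with a digit, so such names get a prefix.
    if not key[:1].isalpha():
        key = prefix + key
    return key


print(to_lookup_key("10x 3' v1"))
print(to_lookup_key("gamma delta T cell"))
```

Names that already start with a letter pass through unchanged apart from the underscore substitution.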

## Curate and add records to knowledge-aware schema tables

Now fast-forward: assuming we've done the curation (here we load a curated dataset with annotated ontology terms), let's link all the biosamples in a cross-tissue scRNA-seq dataset:

In [None]:
adata = ln.dev.datasets.anndata_human_immune_cells()

meta = adata.obs.drop_duplicates(subset=adata.obs.columns)
meta.shape

In [None]:
meta.head()

Let's first add all the cell types:

`.curate` lets you check that the passed ids are present in the knowledge table. It returns a new DataFrame indexed by the curated ids, with a boolean `__curated__` column.

Here, all terms can be linked. 🎉

In [None]:
celltype_curate = lnk.CellType().curate(meta, column="cell_type_ontology_term_id")
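
To make the idea behind the check concrete, here is a minimal pure-Python sketch of the same pattern. It is not the `lamindb.knowledge` implementation: `curate` below is a hypothetical stand-in that works on a list of dicts instead of a DataFrame, the reference ids are illustrative, and `CL:9999999` is a deliberately invalid id.

```python
def curate(rows, column, known_ids):
    """Key each row by its id and flag whether the id is known (hypothetical helper)."""
    curated = {}
    for row in rows:
        term_id = row[column]
        # Rows sharing an id collapse into one entry in this simplified version.
        curated[term_id] = {**row, "__curated__": term_id in known_ids}
    return curated


# Illustrative reference ids and metadata rows (assumed for this example).
known_ids = {"CL:0000084", "CL:0000236"}
rows = [
    {"cell_type_ontology_term_id": "CL:0000084"},
    {"cell_type_ontology_term_id": "CL:0000236"},
    {"cell_type_ontology_term_id": "CL:9999999"},
]
result = curate(rows, "cell_type_ontology_term_id", known_ids)

# Collect the ids that could not be matched against the reference set.
unmapped = [term_id for term_id, row in result.items() if not row["__curated__"]]
print(unmapped)
```

Filtering on the `__curated__` flag in this way is a quick means of listing terms that still need manual attention.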

We can go ahead and create records of the CellType table:

In [None]:
celltype_records = [
    lns.bionty.CellType(ontology_id=i) for i in celltype_curate.index.unique()
]

In [None]:
celltype_records[:3]

We can do the same for tissues:

In [None]:
tissue_curate = lnk.Tissue().curate(meta, column="tissue_ontology_term_id")

In [None]:
tissue_records = [
    lns.bionty.Tissue(ontology_id=i) for i in tissue_curate.index.unique()
]

In [None]:
tissue_records[:3]

Finally, let's add them to the database:

In [None]:
ln.add(celltype_records)
ln.add(tissue_records);

Check that they are in the database:

In [None]:
ln.select(lns.bionty.Tissue, name="blood").one()