# Manage biological registries

If you only work with pre-defined ontologies (public or in-house), [Bionty](https://lamin.ai/docs/bionty/) is sufficient!

If you'd like to maintain in-house registries for basic entities along with ontologies, manage them using `lnschema_bionty`.

```{toctree}
:hidden:
:maxdepth: 1

../lnschema-bionty
```

Let us start with an instance that has `lnschema_bionty` mounted:

In [None]:
!lamin init --storage ./test-registries --schema bionty

In [None]:
import lamindb as ln
import lnschema_bionty as lb

ln.settings.verbosity = 3  # show hints

## Look up an ontology entry

Let us first grab a public ontology for cell types:

In [None]:
celltype_bionty = lb.CellType.bionty()  # same as bionty.CellType()

In [None]:
celltype_bionty

And generate a lookup object:

In [None]:
celltype_bionty_lookup = celltype_bionty.lookup()

There are 2680 terms in it:

In [None]:
len(celltype_bionty_lookup)

In [None]:
celltype_bionty_lookup.gamma_delta_t_cell
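
To make the attribute access above less mysterious, here is a minimal sketch of how a lookup object could be built from term names. It is not Bionty's implementation: `to_attribute_name` and `build_lookup` are hypothetical helpers, and the ontology ids are placeholders.

```python
import re
from types import SimpleNamespace


def to_attribute_name(term: str) -> str:
    """Turn a term name into a lower-case string usable as a Python attribute."""
    name = re.sub(r"[^0-9a-zA-Z]+", "_", term).strip("_").lower()
    # identifiers cannot start with a digit, so prefix an underscore if needed
    return f"_{name}" if name[:1].isdigit() else name


def build_lookup(terms: dict) -> SimpleNamespace:
    """Expose each term under its sanitized name to enable tab completion."""
    return SimpleNamespace(
        **{to_attribute_name(name): value for name, value in terms.items()}
    )


# placeholder ids, for illustration only (not real ontology ids)
lookup = build_lookup({"gamma-delta T cell": "EX:0000001", "hepatocyte": "EX:0000002"})
print(lookup.gamma_delta_t_cell)
```

The point of such an object is discoverability: typing `lookup.` and pressing tab lists every term, which avoids typos in free-text names.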

## Create a record for an in-house registry

In [None]:
celltype_record = lb.CellType.from_bionty(celltype_bionty_lookup.gamma_delta_t_cell)

celltype_record

You can add it to the DB to seed an in-house ontology:

In [None]:
celltype_record.save()

In [None]:
lb.CellType.select().df()

In [None]:
lb.CellType.select(name=celltype_record.name).one()

You can now work with a lookup object that has far fewer terms: `lb.CellType.lookup()`

## Parse records from data

Often, you want to parse records from data and map them onto a reference. {func}`~lamindb.parse` takes any iterable and maps it onto your in-house reference.

Consider a DataFrame-based example:

In [None]:
adata = ln.dev.datasets.anndata_with_obs()

In [None]:
adata.obs.head()

In [None]:
adata.obs.cell_type.value_counts()

You can parse the cell types and create records in 3 ways:

1. parse based on cell type name column
2. parse based on cell type id column
3. parse based on both columns

Use the cell type name column:

In [None]:
cell_types = ln.parse(adata.obs.cell_type, lb.CellType.name)

cell_types
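
Conceptually, parsing by name comes down to deduplicating the values and checking each one against a reference. The sketch below illustrates that idea in plain Python. It is not how `ln.parse` is implemented, and `split_by_reference` as well as the sample data are hypothetical.

```python
def split_by_reference(values, reference):
    """Deduplicate values (keeping first-seen order) and split them by membership in reference."""
    unique = list(dict.fromkeys(values))
    known = [v for v in unique if v in reference]
    unknown = [v for v in unique if v not in reference]
    return known, unknown


# sample data, for illustration only
reference_names = {"hepatocyte", "T cell"}
observed = ["T cell", "hepatocyte", "my new cell type", "T cell"]

known, unknown = split_by_reference(observed, reference_names)
print(known)    # names found in the reference
print(unknown)  # names that would need a new in-house record
```

Values that have no match in the reference, like "my new cell type" here, are the ones that end up as purely in-house records.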

Use the cell type id column, which has an empty string for "my new cell type":

In [None]:
ln.parse(adata.obs.cell_type_id, lb.CellType.ontology_id)

Use both columns:

In [None]:
ln.parse(
    adata.obs,
    {"cell_type_id": lb.CellType.ontology_id, "cell_type": lb.CellType.name},
)

(Note: no additional fields are mapped from bionty if multiple columns are parsed.)

If we're happy with `cell_types`, we save them to the DB in one transaction:

In [None]:
ln.save(cell_types);

Our in-house registry grew a bit:

In [None]:
lb.CellType.select().df()

The same workflow works for all of `lnschema_bionty`'s ORMs.

## Track underlying ontology sources

Under the hood, ontology sources are tracked:

In [None]:
lb.BiontySource.select(currently_used=True).df()

Each record is linked to a versioned bionty source (if it was created from bionty):

In [None]:
cell_type_record = lb.CellType.select(name="hepatocyte").one()
cell_type_record.bionty_source
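
As a rough mental model of this link, the sketch below uses plain dataclasses. It does not reflect the actual `lnschema_bionty` schema: apart from `currently_used`, which appears in the query above, the class names, field names, and values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Source:
    entity: str           # which registry the source serves, e.g. "CellType"
    name: str             # name of the ontology (placeholder here)
    version: str          # version of the ontology (placeholder here)
    currently_used: bool  # whether new records are created from this version


@dataclass
class Record:
    name: str
    source: Optional[Source] = None  # None if the record was not created from an ontology


# placeholder sources, for illustration only
sources = [
    Source("CellType", "example-ontology", "v1", currently_used=False),
    Source("CellType", "example-ontology", "v2", currently_used=True),
]
current = [s for s in sources if s.currently_used]

from_ontology = Record("hepatocyte", source=current[0])
in_house = Record("my new cell type")

print(from_ontology.source.version)
print(in_house.source)
```

Keeping the version on the source rather than on each record means that a record's provenance stays unambiguous even after a newer ontology version becomes the one currently in use.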

In [None]:
!lamin delete test-registries
!rm -r test-registries