# Look up records of Bionty entities

Entities and ontologies can be complex, and a single entity often has many different identifiers.

Here we show Bionty's lookup model for species, genes, proteins, and cell markers. You'll see how to:

- initialize an Entity object
- access the reference table via `.df()`
- look up an entity record via `.lookup().{term}`

In [None]:
import bionty as bt

In [None]:
# Initialize the Species entity before inspecting its fields
species_bionty = bt.Species()
species_bionty.fields

## `.df()`: reference table

Data scientists love DataFrames, and every entity has a reference table containing all the fields.

In [None]:
species_bionty = bt.Species()

species_bionty

In [None]:
df = species_bionty.df()

In [None]:
df.head()

To access the records for, say, the human, mouse, and pig species, select the corresponding rows with pandas:

In [None]:
df.loc[["human", "mouse", "pig"]]
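Label-based selection with `.loc` only works when the labels are in the index. The following is a minimal sketch on a toy table. The column names (`name`, `scientific_name`, `taxon_id`) are assumptions for illustration and may differ from the actual Bionty reference table.

```python
import pandas as pd

# Toy stand-in for a species reference table (column names are assumptions).
species = pd.DataFrame(
    {
        "name": ["human", "mouse", "pig", "chimpanzee"],
        "scientific_name": [
            "Homo sapiens",
            "Mus musculus",
            "Sus scrofa",
            "Pan troglodytes",
        ],
        "taxon_id": [9606, 10090, 9823, 9598],
    }
).set_index("name")  # index by name so that .loc can select by species name

# Select several rows at once by their index labels.
subset = species.loc[["human", "mouse", "pig"]]
print(subset)
```

If the table is indexed by a different field, call `set_index` on the column you want to select by before using `.loc`.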

## `.lookup()`: look up terms and records with autocompletion

Terms can be searched with autocompletion using a lookup object.

In [None]:
lookup = species_bionty.lookup()

Terms that are valid Python identifiers can be fetched directly via the dot `.` accessor:

In [None]:
lookup.chimpanzee

For strings that are not valid Python identifiers, use the bracket `[]` accessor, which also autocompletes:

In [None]:
lookup["white-tufted-ear marmoset"]

By default, the `name` field is used to generate lookup keys.

You can specify another field to look up:

In [None]:
lookup = species_bionty.lookup(species_bionty.scientific_name)

In [None]:
lookup.choloepus_hoffmanni
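To make the dot and bracket accessors more concrete, here is a minimal sketch of how a lookup object of this kind could be built. It is not Bionty's implementation: the `Lookup` class, the `to_identifier` helper, and the normalization rule are hypothetical, and the toy records only assume `name` and `scientific_name` fields.

```python
import re


def to_identifier(term):
    """Hypothetical rule: replace runs of non-word characters with '_'."""
    key = re.sub(r"\W+", "_", term.strip()).strip("_")
    # Identifiers cannot start with a digit, so prefix those with '_'.
    return f"_{key}" if key[:1].isdigit() else key


class Lookup:
    """Toy lookup: attribute access by normalized key, bracket access by raw term."""

    def __init__(self, records, field="name"):
        self._by_term = {record[field]: record for record in records}
        for term, record in self._by_term.items():
            # Attributes show up in tab completion in interactive shells.
            setattr(self, to_identifier(term), record)

    def __getitem__(self, term):
        return self._by_term[term]


records = [
    {"name": "chimpanzee", "scientific_name": "Pan troglodytes"},
    {"name": "white-tufted-ear marmoset", "scientific_name": "Callithrix jacchus"},
]

lookup = Lookup(records)  # keys generated from the `name` field
print(lookup.chimpanzee)
print(lookup["white-tufted-ear marmoset"])

by_scientific_name = Lookup(records, field="scientific_name")
print(by_scientific_name.Pan_troglodytes)
```

The sketch ignores collisions between normalized keys and Python keywords, which a real implementation would need to handle.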

## Gene

Next, let's take a look at genes, which follow the same design choices as `Species`.

The only difference is that the `Gene` class is initialized with a `species` parameter, so you retrieve only the gene entries of the specified species.

In [None]:
gene_bionty = bt.Gene(species="human")

In [None]:
gene_bionty

In [None]:
df = gene_bionty.df()

In [None]:
df.head()

In [None]:
gene_bionty_lookup = gene_bionty.lookup()

In [None]:
gene_bionty_lookup.TCF7

Convert between identifiers using plain pandas:

In [None]:
df.loc[df["symbol"].isin(["BRCA1", "BRCA2"])]
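Once the matching rows are selected, a mapping from one identifier to another can be built from two columns. The sketch below uses a toy table. The column names `symbol` and `ensembl_gene_id` are assumptions and may not match the actual reference table.

```python
import pandas as pd

# Toy stand-in for the gene reference table (column names are assumptions).
genes = pd.DataFrame(
    {
        "symbol": ["BRCA1", "BRCA2", "TP53"],
        "ensembl_gene_id": [
            "ENSG00000012048",
            "ENSG00000139618",
            "ENSG00000141510",
        ],
    }
)

# Keep only the rows for the symbols of interest.
selected = genes.loc[genes["symbol"].isin(["BRCA1", "BRCA2"])]

# Build a symbol -> Ensembl gene ID mapping from the two columns.
symbol_to_ensembl = dict(zip(selected["symbol"], selected["ensembl_gene_id"]))
print(symbol_to_ensembl)
```

Swapping the two columns in `zip` gives the reverse mapping, from Ensembl gene ID to symbol.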

The mouse reference is also available from Ensembl:

In [None]:
gene_bionty_mouse = bt.Gene(species="mouse")

In [None]:
df = gene_bionty_mouse.df()
df.head()

## Protein

The protein reference uses the UniProt ID as the standardized identifier.

In [None]:
protein_bionty = bt.Protein(species="human")

In [None]:
protein_bionty

In [None]:
protein_bionty_lookup = protein_bionty.lookup()

In [None]:
protein_bionty_lookup.ABC_transporter_domain_containing_protein

In [None]:
df = protein_bionty.df()
df.head()
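Because UniProt IDs serve as the standardized identifier, it can help to check that a string is shaped like a UniProt accession before using it as a key. The sketch below uses the accession pattern documented by UniProt. The helper name `is_uniprot_accession` is hypothetical and not part of Bionty, and the check only validates the format, not whether the accession exists.

```python
import re

# Accession format as documented by UniProt (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(identifier):
    """Return True if the whole string is shaped like a UniProt accession."""
    return UNIPROT_ACCESSION.fullmatch(identifier) is not None


for candidate in ["P04637", "A0A024R161", "BRCA1"]:
    print(candidate, is_uniprot_accession(candidate))
```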

## Cell marker

The cell marker ontology works similarly.

In [None]:
cell_marker_bionty = bt.CellMarker(species="human")

In [None]:
cell_marker_bionty

In [None]:
df = cell_marker_bionty.df()
df.head()

In [None]:
cell_marker_bionty_lookup = cell_marker_bionty.lookup()

In [None]:
cell_marker_bionty_lookup.CD45