# Ingesting data with features

In [None]:
from nbproject import header
from lamindb.do import ingest
from bionty import Gene, lookup
import scanpy as sc

header()

## Dataset

This notebook uses two datasets: `data1` is indexed by gene symbols, while `data2` has a column containing Ensembl IDs.

In [None]:
data1 = sc.datasets.pbmc68k_reduced()
data2 = sc.datasets.pbmc3k()

In [None]:
data1.var.head()

Note that the gene ID column name must match the database field name. You can look up the available field names in `lookup.gene_ids`.

In [None]:
data2.var.rename(columns={"gene_ids": lookup.gene_ids.ensembl_gene_id}, inplace=True)
data2.var.head()
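The renaming step above can be guarded with a small check that exactly one column carries a recognized gene ID field name. This is a minimal standalone sketch in plain Python, not part of lamindb or bionty: `KNOWN_GENE_ID_FIELDS` and `find_gene_id_column` are hypothetical names, and the set contains only the two field names used in this notebook, not the full list available in `lookup.gene_ids`.

```python
# Hypothetical helper for illustration; not a lamindb or bionty function.
# Only the two field names that appear in this notebook are listed here.
KNOWN_GENE_ID_FIELDS = {"ensembl_gene_id", "hgnc_symbol"}


def find_gene_id_column(columns):
    """Return the single column name that matches a known gene ID field."""
    matches = [c for c in columns if c in KNOWN_GENE_ID_FIELDS]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one gene ID column, found {matches!r} "
            f"in {list(columns)!r}"
        )
    return matches[0]


# Toy column list: "gene_ids" does not match any known field until renamed.
columns = ["gene_ids"]
renamed = ["ensembl_gene_id" if c == "gene_ids" else c for c in columns]
print(find_gene_id_column(renamed))
```

In a real session, the same function could be called on `list(data2.var.columns)` after the rename.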

## Curate features

For `data1`, we specify the feature model as a bionty `Gene` with `hgnc_symbol` as the id.

In [None]:
feature_model1 = Gene(id=lookup.gene_ids.hgnc_symbol)

In [None]:
ingest.add(data1, feature_model=feature_model1)

For `data2`, we'd like to ingest features based on the Ensembl IDs.

In [None]:
feature_model2 = Gene(id=lookup.gene_ids.ensembl_gene_id)

In [None]:
ingest.add(data2, feature_model=feature_model2)

In [None]:
ingest.status

`ingest.logs` stores information about the mapped features.

In [None]:
next(iter(ingest.logs.items()))
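To give a rough idea of what "mapped features" means, the sketch below compares a list of feature IDs against a reference set and reports which ones matched. It is an assumption-laden illustration in plain Python: `summarize_mapping` and the keys of its return value are hypothetical, and the actual structure of `ingest.logs` may look quite different.

```python
# Illustrative only: this does not reproduce lamindb's log format.
def summarize_mapping(feature_ids, reference_ids):
    """Split feature IDs into mapped and unmapped against a reference set."""
    reference = set(reference_ids)
    mapped = [f for f in feature_ids if f in reference]
    unmapped = [f for f in feature_ids if f not in reference]
    # Guard against division by zero for an empty feature list.
    frac = len(mapped) / len(feature_ids) if feature_ids else 0.0
    return {"mapped": mapped, "unmapped": unmapped, "frac_mapped": frac}


# Toy input; the last entry is deliberately not a valid Ensembl gene ID.
features = ["ENSG00000141510", "ENSG00000012048", "not_a_gene"]
reference = {"ENSG00000141510", "ENSG00000012048", "ENSG00000139618"}

summary = summarize_mapping(features, reference)
print(summary["unmapped"], round(summary["frac_mapped"], 2))
```

Keeping the unmapped IDs alongside the mapped fraction makes it easy to inspect problem features before committing.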

## Commit the ingestion

In [None]:
ingest.commit()