# Ingest data from bfx pipeline runs

This guide shows how to ingest an output file from a bioinformatics (bfx) pipeline run.

In [None]:
import lamindb as ln
import lnbfx

ln.nb.header()

The example data is a set of bioinformatics output files generated by [Cell Ranger](https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger), from which we ingest a single file.

Because an earlier part of the guide already ingested this file, we delete the existing dobject first.

In [None]:
dobject = ln.db.select.dobject(name="sample_1_R1").one()
ln.db.delete.dobject(dobject.id)

Now retrieve the example data again and point to the file:

In [None]:
bfx_run_output = ln.datasets.dir_scrnaseq_cellranger()
filepath = bfx_run_output / "fastq/sample_1_R1.fastq.gz"

filepath

We can either use the metadata of an existing bfx pipeline through [`lnbfx`](https://lamin.ai/docs/lnbfx) or create our own.

In [None]:
bfx_pipeline = lnbfx.lookup.pipeline.cell_ranger_v7_0_0

In [None]:
bfx_pipeline

Let us use these metadata values to insert a row in our pipeline table:

In [None]:
pipeline = ln.db.insert.pipeline(**bfx_pipeline)

In [None]:
pipeline
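The `**bfx_pipeline` syntax above unpacks a mapping into keyword arguments. Here is a minimal sketch of that mechanism in plain Python, without lamindb. The `Pipeline` dataclass, its fields, and the metadata values are hypothetical stand-ins, and the real lookup entry may carry different keys.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    # Hypothetical stand-in for one row of a pipeline table.
    id: str
    v: str
    name: str


# Hypothetical metadata; not the actual content of the lnbfx lookup entry.
metadata = {"id": "abc123", "v": "7.0.0", "name": "Cell Ranger"}

# Equivalent to Pipeline(id="abc123", v="7.0.0", name="Cell Ranger").
row = Pipeline(**metadata)
print(row.name, row.v)
```

Unpacking fails with a `TypeError` if the mapping contains a key that the target does not accept, so the metadata keys have to match the column names of the table.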

Next, create a `pipeline_run` that references this pipeline:

In [None]:
pipeline_run = ln.schema.core.pipeline_run(
    pipeline_id=pipeline.id, pipeline_v=pipeline.v, name="bfx_run_001"
)

In [None]:
pipeline_run

Create a test `biosample` and a test `biometa` object:

In [None]:
biosample = ln.db.insert.biosample(name="test_biosample")

In [None]:
biosample

In [None]:
biometa = ln.schema.wetlab.biometa(biosample_id=biosample.id)

In [None]:
biometa

Let us create an `Ingest` object to track ingestion from the `pipeline_run`:

In [None]:
ingest = ln.db.Ingest(dsource=pipeline_run)

The `biometa` table has a corresponding link table `dobject_biometa`, so we can link the `biometa` record to the dobject with the following call:

In [None]:
# TODO: debug tomorrow
# ingest.add(filepath).link(biometa);
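To make the link-table idea concrete, here is a minimal sketch using Python's built-in `sqlite3` rather than lamindb. The table names follow the text, but the columns (`dobject_id`, `biometa_id`) and the inserted values are assumptions for illustration, not the actual lamindb schema.

```python
import sqlite3

# Illustrative schema only: an assumption, not the real lamindb schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dobject (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE biometa (id INTEGER PRIMARY KEY, biosample_id INTEGER);
    CREATE TABLE dobject_biometa (
        dobject_id TEXT REFERENCES dobject(id),
        biometa_id INTEGER REFERENCES biometa(id),
        PRIMARY KEY (dobject_id, biometa_id)
    );
    """
)

# Hypothetical records standing in for the ingested file and its metadata.
conn.execute("INSERT INTO dobject VALUES ('d1', 'sample_1_R1')")
conn.execute("INSERT INTO biometa VALUES (1, 10)")

# Linking a dobject to a biometa record is a single row in the link table.
conn.execute("INSERT INTO dobject_biometa VALUES ('d1', 1)")

# Follow the link table to recover the metadata attached to each dobject.
linked = conn.execute(
    """
    SELECT d.name, b.biosample_id
    FROM dobject AS d
    JOIN dobject_biometa AS l ON l.dobject_id = d.id
    JOIN biometa AS b ON b.id = l.biometa_id
    """
).fetchall()
print(linked)
```

Keeping the association in its own table means neither `dobject` nor `biometa` needs a column pointing at the other, and one metadata record can be linked to several dobjects.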

Complete the ingestion:

In [None]:
# ingest.commit()

Select dobjects by linked metadata:

In [None]:
# ln.db.select.dobject(where=dict(biosample=dict(name="test_biosample"))).df()

In [None]:
# ln.db.select.dobject(where=dict(pipeline_run=dict(name="bfx_run_001"))).df()
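Conceptually, a nested filter such as `where=dict(biosample=dict(name=...))` amounts to a join from dobject through its linked metadata to the filtered table. The sketch below shows that traversal with Python's built-in `sqlite3`. The schema, the helper `select_dobject_names`, and the sample rows are all hypothetical, and lamindb may resolve such queries differently.

```python
import sqlite3


def select_dobject_names(conn, biosample_name):
    """Return names of dobjects linked, via biometa, to the named biosample."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM dobject AS d
        JOIN dobject_biometa AS l ON l.dobject_id = d.id
        JOIN biometa AS b ON b.id = l.biometa_id
        JOIN biosample AS s ON s.id = b.biosample_id
        WHERE s.name = ?
        ORDER BY d.name
        """,
        (biosample_name,),
    ).fetchall()
    return [name for (name,) in rows]


# Illustrative schema and data only: assumptions, not the real lamindb schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE biosample (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE biometa (id INTEGER PRIMARY KEY, biosample_id INTEGER);
    CREATE TABLE dobject (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE dobject_biometa (dobject_id TEXT, biometa_id INTEGER);
    INSERT INTO biosample VALUES (1, 'test_biosample'), (2, 'other_biosample');
    INSERT INTO biometa VALUES (1, 1), (2, 2);
    INSERT INTO dobject VALUES ('d1', 'sample_1_R1'), ('d2', 'sample_2_R1');
    INSERT INTO dobject_biometa VALUES ('d1', 1), ('d2', 2);
    """
)

matches = select_dobject_names(conn, "test_biosample")
print(matches)
```

Passing the name as a bound parameter (`?`) rather than formatting it into the SQL string keeps the query safe for arbitrary input.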