# Integrate redun & LaminDB for processing FASTA files

This use case builds on Rico Meinl's [GitHub repository](https://github.com/ricomnl/bioinformatics-pipeline-tutorial/tree/redun), which accompanies his [blog post](https://ricomnl.com/blog/bottom-up-bioinformatics-pipeline-extension-redun/).

It demonstrates how to use LaminDB with the workflow management tool [redun](https://github.com/insitro/redun).

## Motivation

While redun focuses on managing workflows for running data pipelines, LaminDB takes a bird's-eye view of R&D data.

redun offers features for scheduling, execution, and data lineage tracking of computational pipelines.

LaminDB, in comparison, tracks data lineage across computational pipelines, interactive analysis (notebooks), and UI-submitted data.

In addition, LaminDB offers features for querying R&D data for biological entities.

Here, we demonstrate integrating the fine-grained data lineage tracking of redun with LaminDB's overarching tracking.

## Setting up an instance

In [None]:
!lndb login testuser1@lamin.ai --password cEvcwMJFX4OwbsYVaMt2Os6GxxGgDUlBGILs2RyS

In [None]:
!lndb init --storage ./fasta

## Track the workflow as a pipeline

In [None]:
import lamindb as ln
import lamindb.schema as lns
from pathlib import Path
import redun_lamin_fasta

In [None]:
ln.nb.header()

Create a pipeline record:

In [None]:
pipeline = lns.Pipeline(
    name="lamin-redun-fasta",
    v=redun_lamin_fasta.__version__,
    reference="https://github.com/laminlabs/redun-lamin-fasta",
)

Add the record to the database:

In [None]:
ln.add(pipeline)

## Track the input files

Here, we emulate a source location for the input files. Each file is registered as a data object and linked to the present notebook.

In [None]:
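# Register every FASTA file in the storage directory as a data object
for filepath in Path("./fasta/").glob("*.fasta"):
    dobject = ln.DObject(filepath)  # wrap the file in a data object record
    ln.add(dobject)  # add the record to the database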
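
To check which files the loop above would pick up, the glob can be tried on its own first. The following is a minimal sketch using only the standard library. It involves no LaminDB calls, and the helper names (`list_fasta_inputs`, `write_toy_inputs`), the file names, and the toy sequences are all hypothetical, made up for illustration.

```python
from pathlib import Path
import tempfile


def list_fasta_inputs(source_dir):
    """Return the sorted .fasta files directly inside source_dir."""
    return sorted(Path(source_dir).glob("*.fasta"))


def write_toy_inputs(source_dir):
    """Write two toy FASTA files and one unrelated file (made-up content)."""
    source_dir = Path(source_dir)
    source_dir.mkdir(parents=True, exist_ok=True)
    (source_dir / "sample_a.fasta").write_text(">seq1 hypothetical record\nMKTAYIAKQR\n")
    (source_dir / "sample_b.fasta").write_text(">seq2 hypothetical record\nGAVLIPFMW\n")
    (source_dir / "notes.txt").write_text("not a FASTA file\n")


# Emulate a source location in a temporary directory and list what the glob matches
with tempfile.TemporaryDirectory() as tmp:
    write_toy_inputs(tmp)
    found = [path.name for path in list_fasta_inputs(tmp)]

print(found)  # ['sample_a.fasta', 'sample_b.fasta']
```

The pattern `*.fasta` matches only files directly inside the directory. Files in subdirectories, or with other extensions such as `.fa`, would not be registered by the loop above.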