# Integrate redun & LaminDB for processing fasta files

This use case builds on Rico Meinl's [GitHub repository](https://github.com/ricomnl/bioinformatics-pipeline-tutorial/tree/redun), which accompanies his [blog post](https://ricomnl.com/blog/bottom-up-bioinformatics-pipeline-extension-redun/).

It is based on the workflow management tool [redun](https://github.com/insitro/redun).

## Motivation

While redun focuses on managing workflows for running data pipelines, LaminDB takes a bird's-eye view of R&D data.

redun offers rich features for scheduling and executing computational pipelines and for tracking their data lineage.

LaminDB, in comparison, tracks data lineage across computational pipelines, interactive analysis (notebooks), and UI-submitted data.

More importantly, data lineage is only one aspect of navigating data in LaminDB.

LaminDB offers configurable, knowledge-coupled schema modules and rich features for querying R&D data across biological entities.

Here, we demonstrate how to integrate the fine-grained data lineage tracking of redun with LaminDB's overarching tracking.

## Setting up an instance

In [None]:
!lndb login testuser1@lamin.ai --password cEvcwMJFX4OwbsYVaMt2Os6GxxGgDUlBGILs2RyS

In [None]:
!lndb init --storage redun-lamin-fasta

## Tracking this workflow as a pipeline

In [None]:
import lamindb as ln
import lamindb.schema as lns
from pathlib import Path
import redun_lamin_fasta

In [None]:
ln.nb.header()

In [None]:
pipeline = lns.Pipeline(
    name="lamin-redun-fasta",
    v=redun_lamin_fasta.__version__,
    reference="https://github.com/laminlabs/redun-lamin-fasta",
)

In [None]:
pipeline

In [None]:
ln.add(pipeline)

## Tracking the input files

In [None]:
ingest = ln.Ingest()

In [None]:
for filepath in Path("./fasta/").glob("*.fasta"):
    ingest.add(filepath)

In [None]:
ingest.commit()
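
Before committing, it can be useful to check which files the glob pattern above would actually pick up. The following is a minimal sketch that uses only the standard library; `list_fasta_files` is a hypothetical helper introduced here for illustration and is not part of LaminDB or redun.

```python
from pathlib import Path


def list_fasta_files(directory):
    """Return the *.fasta files directly inside `directory`, sorted by path.

    Hypothetical helper (an assumption, not a LaminDB or redun function).
    """
    # Path.glob makes no ordering guarantee, so sort for a reproducible listing.
    return sorted(p for p in Path(directory).glob("*.fasta") if p.is_file())


if __name__ == "__main__":
    # Same directory and pattern as the ingest loop above.
    for path in list_fasta_files("./fasta/"):
        print(path.name, path.stat().st_size, "bytes")
```

Sorting the result makes the listing independent of filesystem ordering, which helps when comparing what was ingested across runs.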