# Nextflow

[Nextflow](https://www.nextflow.io/) is a workflow management system for running scientific workflows scalably, portably, and reproducibly across platforms.

Here, we run a demo of the microscopy pipeline [mcmicro](https://github.com/labsyspharm/mcmicro) to correct uneven illumination. See the [mcmicro website](https://mcmicro.org/) for details.

```{note}

Typically, you run a Nextflow workflow from the command line or from the Seqera Platform and then register its input and output data with a script.
The Seqera Platform supports [post-run scripts](https://docs.seqera.io/platform/23.4.0/launch/advanced#pre-and-post-run-scripts) that can automate this registration step.
```
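The run-then-register pattern described in the note can be sketched in plain Python. This is only a minimal illustration, not part of LaminDB or Nextflow: the helper `run_then_register` and the stand-in workflow command are hypothetical names introduced here, and the registration step is reduced to a callback so the example stays self-contained.

```python
import subprocess
import sys


def run_then_register(command, register):
    """Run `command` and call `register()` only if it exits successfully.

    `command` is an argument list as accepted by `subprocess.run`;
    `register` is any zero-argument callable (hypothetical registration step).
    """
    completed = subprocess.run(command, check=False)
    if completed.returncode != 0:
        # Do not register outputs of a failed workflow run.
        raise RuntimeError(f"workflow failed with exit code {completed.returncode}")
    return register()


# Stand-in for a `nextflow run ...` call: a Python process that exits with code 0.
fake_workflow = [sys.executable, "-c", "pass"]
result = run_then_register(fake_workflow, lambda: "registered")
print(result)
```

Keeping the registration behind a successful exit code avoids recording outputs of a run that did not complete.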

In [None]:
!lamin init --storage ./test-nextflow --name test-nextflow

In [None]:
import lamindb as ln

## Run mcmicro pipeline

The script below runs the mcmicro pipeline and tracks its input and output data:

In [None]:
import subprocess
import lamindb as ln
import yaml
import shutil
from pathlib import Path

transform = ln.Transform(
    name="MCMICRO",
    version="1.0.0",
    type="pipeline",
    reference="https://github.com/labsyspharm/mcmicro",
)
ln.context.track(transform=transform)
run = ln.context.run

mcmicro_input = ln.Artifact.using("laminlabs/lamindata").get(uid="iTLHluoQczqH6ZypgDxA")
input_dir = mcmicro_input.cache()
if not (dest := Path.cwd() / Path(input_dir.path).name).exists():
    shutil.copytree(input_dir.path, dest)

report = "exemplar-001-mcmicro-execution_report.html"
subprocess.run(
    [
        "nextflow",
        "run",
        "https://github.com/labsyspharm/mcmicro",
        "--in",
        dest,
        "--start-at",
        "illumination",
        "--stop-at",
        "registration",
        "-with-report",
        report,
    ],
    check=True,  # stop here if the pipeline fails instead of registering missing outputs
)

# the session ID is the sixth tab-separated column of the latest `nextflow log` entry
nextflow_id = subprocess.getoutput(
    "nextflow log | tail -n 1 | awk -F '\t' '{print $6}'"
)

ulabel = ln.ULabel(name="nextflow").save()
run.transform.ulabels.add(ulabel)

report_artifact = ln.Artifact(
    report, description=f"nextflow report of {nextflow_id}", visibility=0, run=False
).save()
run.report = report_artifact
run.reference = nextflow_id
run.reference_type = "nextflow_id"
run.save()

with open(f"{dest}/qc/params.yml") as params_file:
    qc_params = yaml.safe_load(params_file)
ln.Param(name="qc_params", dtype="dict").save()
run.params.add_values({"qc_params": qc_params})

# rename the registered image to a stable, predictable file name
registration_dir = dest / "registration"
(registration_dir / f"{dest.name}.ome.tif").rename(
    registration_dir / "exemplar-001.ome.tif"
)
output = ln.Artifact.from_dir(f"{dest}/registration")
ln.save(output)
ln.context.finish()
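
The script above shells out to `tail` and `awk` to read the session ID from `nextflow log`. As a hedged alternative, the same extraction can be sketched in pure Python. The function name `last_session_id` is hypothetical, the sample log text is made up for illustration, and the column layout (tab-separated, session ID in the sixth column) is the assumption the `awk` command already relies on.

```python
def last_session_id(log_text: str) -> str:
    """Return the session ID from the last entry of `nextflow log` output.

    Assumes tab-separated columns with the session ID in the sixth column,
    mirroring `awk -F '\t' '{print $6}'` applied to the last line.
    """
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty nextflow log output")
    fields = lines[-1].split("\t")
    if len(fields) < 6:
        raise ValueError(
            f"expected at least 6 tab-separated columns, got {len(fields)}"
        )
    return fields[5].strip()


# Made-up sample output for illustration only; not taken from a real run.
sample_log = (
    "TIMESTAMP\tDURATION\tRUN NAME\tSTATUS\tREVISION ID\tSESSION ID\tCOMMAND\n"
    "2024-01-01 10:00:00\t5m 2s\tfirst_run\tOK\tabc1234567\tsession-aaaa\tnextflow run ...\n"
    "2024-01-02 11:30:00\t4m 48s\tsecond_run\tOK\tabc1234567\tsession-bbbb\tnextflow run ...\n"
)
print(last_session_id(sample_log))  # → session-bbbb
```

In the script, the input would be `subprocess.getoutput("nextflow log")`. Parsing in Python makes a malformed or empty log raise an error rather than silently producing an empty `nextflow_id`.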

## Data lineage

View data lineage:

In [None]:
output = ln.Artifact.filter(key__icontains="exemplar-001.ome.tif").one()

In [None]:
output.view_lineage()

## View transforms and runs in LaminHub

[![hub](https://img.shields.io/badge/View%20in%20LaminHub-mediumseagreen)](https://lamin.ai/laminlabs/lamindata/transform/vMwsczN6lGZWRm8w/foyuuRRmEt7KYiaU8hPD)

<img src="https://lamin-site-assets.s3.amazonaws.com/.lamindb/FtEgTeQ9FXdbVWNnVTZ2.png" width="900px">

## View the database content

In [None]:
ln.view()

In [None]:
# clean up the test instance:
!rm -rf test-nextflow
!lamin delete --force test-nextflow