v0.3.0 — idiomatic, portable Nextflow data plane
Epic #18: the generated Nextflow now stages only what each process needs and
carries files as first-class per-file path items. The same pipeline runs
unchanged on AWS Batch + S3, on AWS HealthOmics, and locally, with a materially
leaner DAG.
Highlights
- De-bundled data plane (#13): payloads cross a process boundary as
tuple(sidecar, individual leaf paths), not one staged bundle dir — files are
first-class, per-file-staged, and de-duplicated by the backend.
- Emit-once routing (#14): a call/return that forwards a producer's outputs
verbatim routes that channel straight through, with no intervening
re-materialization.
- Fold BIND (#16): non-split and split stage calls resolve their bindings inline
in the stage task — no standalone BIND_* plumbing processes.
- Publish without the single-node funnel (#12): a sidecar-only LAYOUT computes
the outs/ layout and a compressed manifest, and a PUBLISH_LEAF fan-out
publishes each output leaf in parallel. The physical outs/ tree is unchanged.
- Compressed output manifest (#11): manifest.json.gz — a flat, versioned index
of published outputs a control plane ingests in one GetObject (no S3 LIST).
- Data-movement benchmark + gate (#17): make bench reports tasks / plumbing /
DAG edges / per-file transfer multiplier and guards against regressions.
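The compressed manifest idea in #11 can be sketched roughly as follows: a flat, versioned index serialized as gzip-compressed JSON, so a consumer can fetch and decode it in a single read. The field names (version, outputs, path, size) and the helper names write_manifest / read_manifest are assumptions made for illustration; these notes do not describe the actual manifest.json.gz schema.

```python
import gzip
import json


def write_manifest(entries, version=1):
    """Serialize a flat, versioned index to gzip-compressed JSON bytes.

    `entries` is a list of dicts; the keys used here are hypothetical.
    """
    doc = {
        "version": version,
        # Sort by path so the same set of outputs always yields the same index.
        "outputs": sorted(entries, key=lambda e: e["path"]),
    }
    return gzip.compress(json.dumps(doc, sort_keys=True).encode("utf-8"))


def read_manifest(blob):
    """Decode the bytes produced by write_manifest into (version, paths)."""
    doc = json.loads(gzip.decompress(blob).decode("utf-8"))
    return doc["version"], [e["path"] for e in doc["outputs"]]


if __name__ == "__main__":
    blob = write_manifest(
        [
            {"path": "outs/summary.json", "size": 120},
            {"path": "outs/matrix/part-0.bin", "size": 4096},
        ]
    )
    version, paths = read_manifest(blob)
    print(version, paths)
```

In a deployment like the one described above, the bytes passed to read_manifest would presumably come from one object fetch rather than from a listing of the outs/ prefix.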
DAG reduction (benchmark, before -> after)
- chain: tasks 13 -> 8, DAG edges 40 -> 22, per-file transfer multiplier 11 -> 6
- split: tasks 22 -> 20, DAG edges 28 -> 25, per-file transfer multiplier 21 -> 19
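To put the before/after figures above on a common scale, the following minimal sketch computes the relative reduction for each metric. The numbers are copied from the benchmark lines above; the names BENCH, reduction_pct, and summarize are hypothetical and are not part of make bench.

```python
# (before, after) pairs copied from the benchmark figures in these notes.
BENCH = {
    "chain": {
        "tasks": (13, 8),
        "DAG edges": (40, 22),
        "per-file transfer multiplier": (11, 6),
    },
    "split": {
        "tasks": (22, 20),
        "DAG edges": (28, 25),
        "per-file transfer multiplier": (21, 19),
    },
}


def reduction_pct(before, after):
    """Relative reduction from `before` to `after`, in percent, one decimal."""
    return round((before - after) / before * 100, 1)


def summarize(bench):
    """Map each pipeline and metric to its percentage reduction."""
    return {
        name: {metric: reduction_pct(b, a) for metric, (b, a) in metrics.items()}
        for name, metrics in bench.items()
    }


if __name__ == "__main__":
    for name, metrics in summarize(BENCH).items():
        for metric, pct in metrics.items():
            print(f"{name}: {metric} reduced by {pct}%")
```

By this calculation the chain case shrinks by roughly 38% to 45% per metric, while the split case shrinks by roughly 9% to 11%.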
Fidelity
- Byte-identical to real Martian (mrp) across the local differential suite and
golden e2e, and validated LIVE on AWS Batch + S3 (13/13) and AWS HealthOmics
(4/4), with manifest.json.gz set-equal to the published outs/ tree on both.
- New reusable parallel harnesses: test/e2e/aws_run.sh, test/e2e/aws_healthomics.sh.
No transpiler changes were required for either cloud backend.
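The "set-equal" check mentioned under Fidelity can be illustrated with a small sketch that compares the paths listed in a manifest against the files actually present under a local outs/ directory. This is an assumption-laden illustration, not the project's test harness: the function names are hypothetical, paths are assumed to be relative to the outs/ root, and whether the manifest itself is excluded from the comparison is not stated in these notes.

```python
import os


def published_files(outs_root):
    """Return relative, '/'-separated paths of every file under outs_root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(outs_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), outs_root)
            found.add(rel.replace(os.sep, "/"))
    return found


def manifest_diff(manifest_paths, outs_root):
    """Return (listed but missing on disk, on disk but not listed)."""
    listed = set(manifest_paths)
    actual = published_files(outs_root)
    return sorted(listed - actual), sorted(actual - listed)
```

Two empty lists from manifest_diff mean the manifest and the tree are set-equal under these assumptions.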