Wraps `dorado aligner` (minimap2) to align unaligned ONT BAMs (e.g. from dorado/basecaller) while preserving modification tags (MM/ML). CPU-only; no GPU required.

- main.nf: DORADO_ALIGNER process, process_high label, vendor container
- meta.yml: EDAM ontologies for BAM/FASTA/FAI/TSV inputs and outputs
- tests: stub + real test against GIAB HG002 unaligned BAM and hg38 slice (snapshot reduced to stable fields to avoid BAM header path drift)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Description
Adds `dorado/aligner`, a wrapper around Oxford Nanopore's `dorado aligner`, which uses minimap2 under the hood to align unaligned ONT BAMs (e.g. produced by `dorado/basecaller`) while preserving modified base tags (MM/ML) and other BAM auxiliary tags.
CPU-only (no GPU required): `dorado aligner` wraps minimap2, which is CPU-native. GPU is only needed for basecalling.
Why a separate module from `dorado/basecaller`?
Follows the nf-core convention of one tool-subcommand per module. This is also the approach recommended by @Kevin-Brockers / @dialvarezs in #11122 (dorado/basecaller review). Keeping basecall and align separate lets users:
- `minimap2/align` if preferred
Test data
Test paths depend on nf-core/test-datasets#1969 (unaligned HG002 GIAB 10-read BAM). The tests were verified locally against the same files before opening this PR; CI should pass once #1969 merges.
Verified real-test output:
Snapshot strategy
The real-test snapshot intentionally captures only the filename and dorado version (not the BAM MD5), because `dorado aligner` embeds absolute paths in the `@SQ UR:` and `@PG CL:` BAM header lines, and these vary between test environments. The stub-test snapshot uses the full `process.out` (stable because all files are empty, created with `touch`).
PR checklist
- `topic: versions`
- `process_high` (CPU/RAM scales with input)
- `docker.io/nanoporetech/dorado:shac8f356489fa8b44b31beba841b84d2879de2088e` (ONTPL license, not on bioconda; vendor container pattern matches nf-core/parabricks)
- `nf-core modules test dorado/aligner --profile singularity` ✅
- `nf-core modules lint dorado/aligner` ✅ (2 expected warnings re: vendor Docker Hub image, same as dorado/basecaller)
Conda and Docker profile tests not yet run (dorado not on bioconda; Docker not available on my dev host; expecting CI to validate).
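To illustrate the snapshot strategy described above, here is a minimal Python sketch of why a digest over the full header drifts between environments while a digest over path-free fields does not. It is not part of this module or its tests. The helper names (`normalize_header`, `header_digest`) and both example headers are hypothetical, and the header text is assumed to have been extracted already (for example with `samtools view -H`).

```python
import hashlib

# Header fields that, per the PR description, carry absolute paths.
VOLATILE_FIELDS = {"@SQ": ("UR:",), "@PG": ("CL:",)}


def normalize_header(header_text: str) -> str:
    """Drop path-bearing fields from SAM header text; keep everything else."""
    kept_lines = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        # Record types without volatile fields get an empty tuple,
        # for which str.startswith always returns False.
        prefixes = VOLATILE_FIELDS.get(fields[0], ())
        kept_lines.append("\t".join(f for f in fields if not f.startswith(prefixes)))
    return "\n".join(kept_lines)


def header_digest(header_text: str) -> str:
    """MD5 of the normalized header, stable across working directories."""
    return hashlib.md5(normalize_header(header_text).encode()).hexdigest()


def make_header(workdir: str) -> str:
    """Build a made-up header whose only environment-specific part is workdir."""
    return "\n".join([
        "@HD\tVN:1.6\tSO:unknown",
        f"@SQ\tSN:chr20\tLN:1000\tUR:{workdir}/ref.fa",
        f"@PG\tID:aligner\tPN:dorado\tCL:dorado aligner {workdir}/ref.fa reads.bam",
    ])


# Two hypothetical environments that differ only in their working directory.
HEADER_A = make_header("/home/alice/work")
HEADER_B = make_header("/tmp/ci/run42")

raw_a = hashlib.md5(HEADER_A.encode()).hexdigest()
raw_b = hashlib.md5(HEADER_B.encode()).hexdigest()
print("raw digests equal:       ", raw_a == raw_b)
print("normalized digests equal:", header_digest(HEADER_A) == header_digest(HEADER_B))
```

The PR takes the simpler route of snapshotting only the filename and version; normalizing the header as sketched here would be an alternative if header content ever needs to be asserted.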
🤖 Generated with Claude Code