feat(gedi): add gedi/indexgenome and gedi/price modules #11693
Merged
Adds two modules wrapping the GEDI / PRICE toolkit (`bioconda::gedi=1.0.6a`) for Ribo-seq translated-ORF discovery. PRICE (Erhard et al. 2018, doi:10.1038/nmeth.4631) calls translated ORFs from ribosome profiling data with near-cognate start codon detection.

- `gedi/indexgenome` wraps `gedi -e IndexGenome`, producing the `.oml` genome index directory consumed by PRICE.
- `gedi/price` wraps `bamlist2cit` + `gedi -e Price`, taking a cohort of Ribo-seq BAMs plus the genome index and emitting ORF predictions (`*.orfs.tsv` + `*.cit` + sidecars). One-shot across the cohort: PRICE is not per-sample.

Both modules use Wave-built community containers from `bioconda::gedi=1.0.6a`. The bioconda recipe was merged 2026-05-16; using Wave directly for now.

Source: nf-core/riboseq#174.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously hard-coded the output directory as `price_index`. Switching to
`${prefix}` (default `${meta.id}`, overridable via `task.ext.prefix`) lets
callers control the directory name and matches the nf-core convention for
publishable directory outputs.
The default `${meta.id}` keeps the directory keyed to the reference id, so
when `gedi/price` opens `${index}/${meta2.id}.oml`, the lookup still
resolves, provided the meta ids match (already the case in the test chain).
Snapshot regenerated: the index directory name in the output snapshot
changes from `price_index` to the test's `homo_sapiens_chr20` (its meta.id).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
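The path relationship described in this commit can be illustrated with a small Python sketch. It is not the module's Nextflow code: `resolve_prefix`, `index_oml_path`, and `price_lookup_path` are hypothetical helper names, and the layout (a `${prefix}` directory holding `${meta.id}.oml`) is assumed from the description in this PR.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_prefix(meta: dict, ext_prefix: Optional[str] = None) -> str:
    """Mirror the 'ext.prefix if set, else meta.id' default (hypothetical helper)."""
    return ext_prefix if ext_prefix else str(meta["id"])


def index_oml_path(meta: dict, ext_prefix: Optional[str] = None) -> PurePosixPath:
    """Assumed location of the .oml file: <prefix>/<meta.id>.oml."""
    return PurePosixPath(resolve_prefix(meta, ext_prefix)) / f"{meta['id']}.oml"


def price_lookup_path(index_dir: str, meta2: dict) -> PurePosixPath:
    """The path ${index}/${meta2.id}.oml that the PRICE step is described as opening."""
    return PurePosixPath(index_dir) / f"{meta2['id']}.oml"


if __name__ == "__main__":
    ref = {"id": "homo_sapiens_chr20"}
    written = index_oml_path(ref)
    index_dir = str(written.parent)
    # Matching ids: the lookup points at the file that was written.
    print(written, price_lookup_path(index_dir, ref) == written)
    # Mismatched ids: the lookup points somewhere else.
    print(price_lookup_path(index_dir, {"id": "other_ref"}) == written)
```

Under this reading, overriding the prefix changes only the directory name, so the lookup depends solely on the two meta ids agreeing.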
Replaces the stub-only PRICE test with an end-to-end test that runs PRICE on a minimal cohort of four Ribo-seq samples (chr19+chr22, protein-coding-only reference). The cohort produces 380 ORF calls; snapshot captures the orfs.tsv line count for stability validation. Fixtures published in nf-core/test-datasets PR nf-core#2061. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
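As a rough illustration of the line-count check mentioned above, the following Python sketch counts the lines of a TSV file. It is not the nf-test snapshot code used by the module; `count_lines` is a hypothetical helper, and the demo file contents are made up and do not reflect PRICE's real columns. Whether the snapshotted count includes a header row is not stated in this PR.

```python
import tempfile
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count lines in a text file, including a final line without a trailing newline."""
    with open(path, "r", encoding="utf-8") as fh:
        return sum(1 for _ in fh)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.orfs.tsv"
        # Made-up content: one header row and two records.
        demo.write_text("col_a\tcol_b\nx\t1\ny\t2\n", encoding="utf-8")
        print(count_lines(demo))
```

A line count is a coarser check than a checksum: it stays stable if field values shift slightly between runs, but still changes when the number of records does.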
…factor
The earlier `${prefix}` refactor (commit 0ca4c45) changed the index
output declaration from `path("price_index")` to `path("${prefix}")`,
but the meta.yml output entry still hard-coded `price_index` — causing
CI lint to flag `correct_meta_outputs: Module meta.yml does not match
main.nf`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
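The kind of mismatch described in this commit can be sketched in Python as a comparison between path literals in a `main.nf` output line and the names listed in `meta.yml`. This is only an illustration and is not how `nf-core modules lint` is implemented; `declared_paths` and `out_of_sync` are hypothetical helpers, and treating the `meta.yml` output entries as a flat list of names is an assumption.

```python
import re

# Matches the first quoted argument of path("...") or path('...').
PATH_LITERAL = re.compile(r"""path\(\s*(["'])(.+?)\1""")


def declared_paths(output_block: str) -> set:
    """Collect quoted path patterns from a main.nf output declaration."""
    return {m.group(2) for m in PATH_LITERAL.finditer(output_block)}


def out_of_sync(output_block: str, meta_yml_names: list) -> set:
    """Names present on only one side (symmetric difference)."""
    return declared_paths(output_block) ^ set(meta_yml_names)


if __name__ == "__main__":
    main_nf_line = 'tuple val(meta), path("${prefix}"), emit: index'
    # Stale meta.yml entry left over from before the refactor.
    print(out_of_sync(main_nf_line, ["price_index"]))
    # After updating meta.yml to match main.nf.
    print(out_of_sync(main_nf_line, ["${prefix}"]))
```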
… emit
After the `${prefix}` refactor (commit 0ca4c45) the index output line
was the only `tuple val(meta), path(...)` emit in the module, so the
52-space alignment padding it kept from when the path was `price_index`
no longer aligns with anything.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dules into gedi-add-modules
Two cross-cutting fixes from review of nf-core#11693:
- Licence was Apache-2.0 in both meta.yml files; the upstream repo erhard-lab/gedi is GPL-3.0. Corrected.
- "GEDI (Gene Expression Data Integration)" was unverified: the upstream README/wiki/paper don't expand the acronym that way. Replaced with the upstream one-liner phrasing.
PRICE meta.yml also adds the verified PRICE expansion (Probabilistic Inference of Codon Activities by an EM algorithm) from the GEDI wiki.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nf-core/test-datasets#2061 merged; fixtures now live on the modules branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
apeltzer approved these changes on May 19, 2026
manascripts pushed a commit to manascripts/modules that referenced this pull request on May 21, 2026
* feat(gedi): add gedi/indexgenome and gedi/price modules
* refactor(gedi/indexgenome): use ${prefix} for the index output directory
* test(gedi/price): add real test using minimised chr19+chr22 fixtures
* fix(gedi/indexgenome): update meta.yml output name after ${prefix} refactor
* style(gedi/indexgenome): collapse leftover alignment padding on index emit
* fix(gedi): correct licence (GPL-3.0) and Gedi description in meta.yml
* test(gedi/price): point fixtures at nf-core/test-datasets@modules

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new modules wrapping the GEDI / PRICE toolkit for Ribo-seq translated-ORF discovery (Erhard et al. 2018, doi:10.1038/nmeth.4631).

gedi/indexgenome

Wraps `gedi -e IndexGenome` to build the `.oml` GEDI genome index from a FASTA + GTF pair.

- Input: `tuple val(meta), path(fasta), path(gtf)`
- Output: `tuple val(meta), path("${prefix}")` (emit: `index`), a directory containing `${meta.id}.oml` + sidecars; the directory name defaults to `meta.id`, overridable via `task.ext.prefix`
- Output: `topic: versions`

gedi/price

Wraps `bamlist2cit` + `gedi -e Price`. Takes a cohort of Ribo-seq BAMs plus the GEDI genome index and emits PRICE's ORF predictions. PRICE estimates a shared codon-position model across all input BAMs, so the cohort is processed as a single invocation (not per-sample).

- Input: `tuple val(meta), path(bams, stageAs: 'bams/*'), path(bais, stageAs: 'bams/*')`, the Ribo-seq cohort
- Input: `tuple val(meta2), path(index)`, the directory from `gedi/indexgenome`
- Output: `orfs_tsv` plus optional `orfs_cit` / `orfs_metadata` / `codons_cit` / `model` / `signal` / `param` sidecars, and `topic: versions`

Containers

Wave-built community containers backed by `bioconda::gedi=1.0.6a`:

- `community.wave.seqera.io/library/gedi_indexgenome:cfca16738f306c86`
- `community.wave.seqera.io/library/gedi_price:2392624d5f803049`

Test data

`gedi/price` is exercised against a minimal Ribo-seq cohort (four samples, chr19+chr22, protein-coding-only reference) at `data/genomics/homo_sapiens/riboseq_expression/price/` (added via nf-core/test-datasets#2061). Every fixture file is < 4 MiB, ~11 MB total. PRICE produces 380 ORF calls; the test snapshot captures the `orfs.tsv` line count for stability.

Test plan

- `nf-core modules test --profile docker gedi/indexgenome`: green, two-pass stable
- `nf-core modules test --profile docker gedi/price`: green, two-pass stable
- `nf-core modules lint --dir . gedi/{indexgenome,price}`: 0 failures (warnings limited to Wave URL probe / tag-version cosmetic, same as other Wave-container modules)

Source: nf-core/riboseq#174.