Add mutdensity subgenic and fix bugs #401
- update naming
- update omega container
- add proper CPU definition in estimator
Pull request overview
This pull request introduces a modular workflow for enriching genomic panels with subgenic regions (exons/domains) and refactors the mutation density calculation and OMEGA analysis pipelines to use these enriched panels. The changes also upgrade the OMEGA tool to version 0.2.1, improve error handling for DNA to protein mapping failures, and standardize terminology from "hotspots" to "subgenic regions" throughout the codebase.
- Adds a new ENRICHPANELS workflow to centralize panel expansion logic with subgenic regions
- Updates all mutation density and OMEGA processes to use enriched panels instead of original consensus panels
- Upgrades OMEGA container image and improves version reporting for better reproducibility
Reviewed changes
Copilot reviewed 9 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| workflows/deepcsa.nf | Integrates the new ENRICHPANELS workflow and updates downstream processes (mutation density, OMEGA) to use enriched panels; removes direct DNA2PROTEINMAPPING call and delegates to ENRICHPANELS |
| subworkflows/local/enrichpanels/main.nf | New workflow that orchestrates DNA2PROTEINMAPPING and panel expansion for all panel types (all, prot, nonprot, synonymous, exons) |
| subworkflows/local/omega/main.nf | Removes inline panel expansion logic; now receives pre-enriched panels and subgenic region definitions from ENRICHPANELS workflow |
| subworkflows/local/createpanels/main.nf | Adds .first() calls to ensure single-file outputs for domains and postprocessed panels |
| modules/local/expand_regions/main.nf | Renames outputs from "hotspots" to "subgenic"; updates script to call add_subgenicregions.py instead of add_hotspots.py; fixes exons channel condition logic |
| modules/local/bbgtools/omega/preprocess/main.nf | Updates container to bbglab/omega:0.2.1; dynamically captures omega version at runtime |
| modules/local/bbgtools/omega/mutabilities/main.nf | Updates container to bbglab/omega:0.2.1; updates version reporting |
| modules/local/bbgtools/omega/estimator/main.nf | Updates container to bbglab/omega:0.2.1; changes label to cpu_medium; enables --cores parameter; removes deprecated impact groups; updates version reporting |
| conf/tmp_quick_fixes.config | Adds error handling for DNA2PROTEINMAPPING process with retry and ignore strategy |
| bin/omega_select_mutdensity.py | Filters out subgenic regions (containing "--") from gene mutation densities; removes else clause in error handling |
| bin/compute_mutdensity.py | Merges subgenic region annotations from enriched panels into mutation data for accurate mutation density calculation by region |
| bin/add_subgenicregions.py | Renames output files from "hotspots" to "subgenic"; improves panel name extraction logic |
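As a rough illustration of the filtering described for bin/omega_select_mutdensity.py, the sketch below drops any region whose name contains "--" so that only gene-level densities remain. The function name, the dictionary layout, and the example region names are assumptions made for this example; the actual script is not shown in this PR summary and may work on a table instead.

```python
# Hypothetical sketch: names and data layout are assumptions, not the actual
# contents of bin/omega_select_mutdensity.py.
SUBGENIC_SEPARATOR = "--"  # per the PR summary, subgenic region names contain "--"


def keep_gene_level(densities):
    """Return only the entries whose region name is a plain gene (no "--")."""
    return {
        region: value
        for region, value in densities.items()
        if SUBGENIC_SEPARATOR not in region
    }


# Made-up example values; region names are illustrative only.
densities = {
    "TP53": 1.8,
    "TP53--exon_5": 3.1,        # subgenic region, dropped
    "NOTCH1": 0.9,
    "NOTCH1--EGF_domain": 1.4,  # subgenic region, dropped
}
gene_level = keep_gene_level(densities)
print(gene_level)
```

Filtering on the separator keeps the gene-level selection independent of which kinds of subgenic regions (exons, domains) the enriched panel happens to contain.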
m-huertasp left a comment
I think this is quite a good job! I might be missing some context regarding where and why we are doing this, but I added some comments. If you need any further help, feel free to let me know. I might also need you to explain the PR in more detail. 😃
FedericaBrando left a comment
Given Marta's review, I took a quick look at the code and it all seems correct. Running a linter every now and then helps with the deprecation errors. You can also select the Nextflow version in the IDE to match the error and be sure that it doesn't break in prod.
Overall keep up the good work! 👊🏻
This PR restructures the calls to EXPANDREGIONS to obtain a mutation density value for subgenic regions defined from exons or domains.
It also fixes bugs for cases where DNA2PROTEINMAPPING failed and omega was then not executed, leading to other errors downstream.
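To make the idea of a per-region mutation density concrete, here is a minimal sketch that annotates mutations with the region label of the panel position they hit and then counts mutations per region. Everything in it is an assumption for illustration: the tuple layout of the panel, the region names, and the simplified density definition (mutations per megabase of panel positions, with no per-sample or depth normalization). The real computation in bin/compute_mutdensity.py may differ.

```python
from collections import Counter

# Hypothetical enriched panel: one row per position, labeled with a subgenic region.
enriched_panel = [
    ("chr17", 100, "TP53--exon_5"),
    ("chr17", 101, "TP53--exon_5"),
    ("chr17", 102, "TP53--exon_5"),
    ("chr17", 200, "TP53--exon_6"),
    ("chr17", 201, "TP53--exon_6"),
]
# Hypothetical mutations as (chrom, pos); the last one falls outside the panel.
mutations = [("chr17", 100), ("chr17", 100), ("chr17", 201), ("chr17", 999)]


def density_by_region(panel, muts):
    """Mutations per Mb of panel positions for each region (simplified definition)."""
    position_to_region = {(chrom, pos): region for chrom, pos, region in panel}
    region_size = Counter(region for _, _, region in panel)
    # Mutations at positions absent from the panel are skipped.
    region_muts = Counter(
        position_to_region[m] for m in muts if m in position_to_region
    )
    return {
        region: region_muts[region] / size * 1e6
        for region, size in region_size.items()
    }


result = density_by_region(enriched_panel, mutations)
print(result)
```

Joining on the panel position rather than on the gene name is what lets the same mutation table yield densities for exons or domains without changing the mutation calls themselves.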
AI summary
This pull request introduces a new workflow for enriching genomic panels with subgenic regions and refactors how panels are expanded and used throughout the pipeline. It also upgrades the omega tool version and improves reproducibility and flexibility in several processes. The changes streamline the workflow, making panel enrichment modular and easier to maintain, and ensure that downstream analyses use the correct, enriched panels.
Workflow and Pipeline Refactoring
- Added the ENRICHPANELS workflow (subworkflows/local/enrichpanels/main.nf) to handle the enrichment of consensus panels with subgenic regions, domains, and exons. This workflow centralizes and modularizes region expansion logic.
- Updated the main workflow (workflows/deepcsa.nf) to use the new ENRICHPANELS workflow for generating expanded panels and subgenic region JSONs, and updated downstream processes (e.g., mutation density, OMEGA analysis) to use these enriched panels instead of the original ones.
- Removed the EXPAND_REGIONS process from the OMEGA analysis subworkflow, delegating all panel expansion to the new enrichment workflow.
Process and Container Updates
- Updated the OMEGA processes (OMEGA_PREPROCESS, OMEGA_MUTABILITIES, OMEGA_ESTIMATOR) to use the new docker.io/bbglab/omega:0.2.1 container instead of the previous one, and improved version reporting by dynamically capturing the actual tool version at runtime.
Panel Expansion and Subgenic Regions
- Renamed "hotspots" to "subgenic" in the EXPAND_REGIONS process, including output file names and script logic, to better reflect the expanded panel content.
Error Handling and Configuration
- Added error handling for the DNA2PROTEINMAPPING process, allowing up to two retries before ignoring errors, which increases pipeline robustness.
Miscellaneous Improvements
These changes collectively improve the modularity, maintainability, and reliability of the pipeline, especially around the management and enrichment of gene panels.
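The "retry up to two times, then ignore" policy mentioned for DNA2PROTEINMAPPING can be modeled in plain Python as follows. This is only a sketch of the policy's semantics, not the Nextflow configuration in conf/tmp_quick_fixes.config; the function and task names are hypothetical.

```python
def run_with_retry_then_ignore(task, max_retries=2):
    """Run task(); on failure retry up to max_retries times, then ignore the error.

    Returns (result, attempts) on success, or (None, attempts) once the
    failure has been ignored.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            continue  # failed attempt: loop again if retries remain
    return None, attempts


# Hypothetical task that fails twice and succeeds on the third attempt.
calls = {"n": 0}


def flaky_mapping():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("mapping failed")
    return "mapped"


def always_fails():
    raise RuntimeError("mapping failed")


ok = run_with_retry_then_ignore(flaky_mapping)
ignored = run_with_retry_then_ignore(always_fails)
print(ok, ignored)
```

When a failure is ignored the step produces no result, so any consumer has to cope with the missing output. In this model that is the `None` return value.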