Add profile smoothing and contamination check #436
Merged
FerriolCalvet merged 6 commits into dev on Mar 23, 2026
FerriolCalvet (Member) commented on Mar 20, 2026
- Add an option to smooth the mutational profile when a sample has fewer than 200 mutations (the threshold is hardcoded because it was defined from entropy tests).
- Fix the flagging of omegas.
- Add the computation of the proportion of SNP sites mutated, to inform contamination in samples independently of the samples being analyzed in the same batch/cohort.
- Functionally working but not yet correct; pending an implementation change for WGS correction.
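As a rough illustration of the SNP-based indicator described above, the proportion could be computed as the share of known SNP sites that carry a mutation call in a given sample. This is a minimal sketch, not the code in `bin/check_contamination.py`: the function name and the `(chromosome, position)` site representation are hypothetical, and it applies no WGS correction.

```python
def snp_site_mutated_fraction(mutated_sites, snp_sites):
    """Return the fraction of known SNP sites that are mutated in one sample.

    Both arguments are iterables of (chromosome, position) tuples.
    This representation is an assumption made for the sketch.
    """
    snp_sites = set(snp_sites)
    if not snp_sites:
        # No SNP sites covered: report 0.0 rather than dividing by zero.
        return 0.0
    mutated_snp_sites = snp_sites & set(mutated_sites)
    return len(mutated_snp_sites) / len(snp_sites)


if __name__ == "__main__":
    snps = [("chr1", 100), ("chr2", 5), ("chr3", 7), ("chr4", 9)]
    calls = [("chr1", 100), ("chr1", 200)]
    print(snp_site_mutated_fraction(calls, snps))
```

Because the value depends only on one sample's calls and a fixed SNP site list, it can be computed without reference to the other samples in the batch.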
Contributor
Pull request overview
This PR introduces optional Bayesian smoothing for mutational profiles (using a cohort-derived prior profile), adjusts the omega-flagging and visualization logic, and adds a contamination indicator based on the proportion of mutated SNP sites.
Changes:
- Add a `profile_smoothing` parameter and plumb a cohort prior profile into `COMPUTE_PROFILE` to enable smoothing for low-mutation samples.
- Extend `mut_profile.py` with Bayesian update logic and new CLI flags (`--smoothed`, `--prior_profile`).
- Refactor contamination checking to include SNP-site mutation proportion output and revise omega-flagging annotation/plots.
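One common way to realize this kind of smoothing is a Dirichlet-multinomial update, where the cohort profile acts as a prior and the sample's counts pull the estimate away from it. The sketch below assumes that form. The function name and the `prior_strength` pseudo-count are hypothetical, and the actual update in `mut_profile.py` may differ; only the 200-mutation threshold comes from the PR description.

```python
MIN_MUTATIONS = 200  # threshold stated in the PR description


def smooth_profile(counts, prior_profile, prior_strength=100.0,
                   min_mutations=MIN_MUTATIONS):
    """Return a normalized mutational profile, smoothed for sparse samples.

    counts         -- per-context mutation counts for one sample
    prior_profile  -- cohort profile over the same contexts, summing to 1
    prior_strength -- hypothetical pseudo-count weight given to the prior
    """
    if len(counts) != len(prior_profile):
        raise ValueError("counts and prior_profile must have the same length")

    total = sum(counts)
    if total >= min_mutations and total > 0:
        # Enough mutations: use the plain empirical frequencies.
        return [c / total for c in counts]

    # Sparse sample: posterior mean of a Dirichlet prior with
    # parameters prior_strength * prior_profile.
    denom = total + prior_strength
    return [(c + prior_strength * p) / denom
            for c, p in zip(counts, prior_profile)]


if __name__ == "__main__":
    flat_prior = [0.25, 0.25, 0.25, 0.25]
    print(smooth_profile([3, 1, 0, 0], flat_prior, prior_strength=4.0))
    print(smooth_profile([150, 50, 0, 0], flat_prior))
```

With this form, contexts that have zero counts in a sparse sample still receive non-zero probability from the prior, and the prior's influence shrinks as the sample's mutation count grows.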
Reviewed changes
Copilot reviewed 5 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| subworkflows/local/mutationprofile/main.nf | Adds cohort prior computation and passes it into profile computation when smoothing is enabled. |
| nextflow_schema.json | Adds profile_smoothing parameter to the schema. |
| nextflow.config | Adds default params.profile_smoothing = false. |
| modules/local/compute_profile/main.nf | Extends module inputs and wires new smoothing args into the mut_profile.py profile call. |
| conf/modules.config | Adds config hook intended to toggle smoothing based on params.profile_smoothing. |
| bin/mut_profile.py | Implements Bayesian smoothing and exposes CLI options to activate it. |
| bin/check_contamination.py | Refactors loading and adds SNP-based contamination summary output. |
| bin/annotate_omega_failing.py | Splits flagged tables by criteria and updates annotation/plotting functions accordingly. |
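The two CLI flags named in the overview could be wired together roughly as follows. This is a sketch under assumptions: it uses `argparse`, treats `--smoothed` as a boolean switch and `--prior_profile` as a file path, and requires the prior whenever smoothing is requested. The real `mut_profile.py` may use a different CLI library or different semantics.

```python
import argparse


def build_parser():
    """Build a parser exposing only the two flags named in the PR overview."""
    parser = argparse.ArgumentParser(
        description="Sketch of a profile CLI with optional smoothing")
    # Assumed to be a boolean switch that turns smoothing on.
    parser.add_argument("--smoothed", action="store_true",
                        help="smooth the profile using a cohort prior")
    # Assumed to be a path to the cohort prior profile file.
    parser.add_argument("--prior_profile", default=None,
                        help="path to the cohort prior profile")
    return parser


def parse_args(argv):
    """Parse argv and reject --smoothed without a prior profile."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.smoothed and args.prior_profile is None:
        parser.error("--smoothed requires --prior_profile")
    return args


if __name__ == "__main__":
    print(parse_args(["--smoothed", "--prior_profile", "cohort_profile.tsv"]))
```

Keeping smoothing off unless the switch is passed matches the `params.profile_smoothing = false` default listed in the table above.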