New module: Freyja #1903
multiqc/utils/search_patterns.yaml
Outdated
@@ -345,6 +345,10 @@ flash/hist:
flexbar:
  contents: "Flexbar - flexible barcode and adapter removal"
  shared: true
freyja:
  fn: "*.tsv"
  contents: "summarized"
Feels not very specific. Do you think more conditions can be added? Is it fair to expect that files are named `*demix_outs.tsv`?
I guess not...
Perhaps we could check more contents lines? Like require `lineages` and `abundances`?
Yeah, definitely, if it's better to add those checks. I thought it would be enough because of the number of lines we specify.
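To make the suggestion concrete, here is a minimal sketch of a stricter content check. It is plain Python, not MultiQC's search-pattern machinery: the function name `looks_like_freyja`, the `num_lines=6` default, and the exact marker list are assumptions for illustration, based only on the strings mentioned in this thread.

```python
from itertools import islice
from pathlib import Path

# Marker strings taken from the review comments above; treating all three as
# required is an assumption, not the pattern that was merged.
REQUIRED_MARKERS = ("summarized", "lineages", "abundances")


def looks_like_freyja(path, markers=REQUIRED_MARKERS, num_lines=6):
    """Return True if every marker appears within the first `num_lines` lines."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        # Only read the head of the file, so large unrelated TSVs stay cheap.
        head = "".join(islice(handle, num_lines))
    return all(marker in head for marker in markers)
```

Requiring several markers rather than one makes it much less likely that an unrelated `*.tsv` file containing the word "summarized" is picked up by mistake.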
multiqc/modules/freyja/freyja.py
Outdated
# Percentages don't always add up to 1, show a warning if this is the case
if sum(d.values()) != 1:
    log.warning(f"Freyja {s_name}: percentages don't sum to 1")
How often does this happen? In all 3 test examples, the percentages do not sum to 1. If it happens frequently, we probably shouldn't be polluting the logs with these warnings.
I'm guessing it will always be the case unless amplicon or any other targeted sequencing approach is used.
Sorry, merged before I remembered that we didn't fully address this issue. I think I will remove this warning in a separate PR, or make it a debug message.
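For reference, a rough sketch of the debug-level variant discussed in this thread. The helper name `check_abundance_sum` and the `rel_tol` default are hypothetical, and `d` is assumed to map lineage names to abundances, as in the hunk above. It also swaps the exact `!= 1` comparison for `math.isclose`, since an exact comparison of summed floats can be tripped by rounding alone.

```python
import logging
import math

log = logging.getLogger(__name__)


def check_abundance_sum(s_name, d, rel_tol=1e-6):
    """Log at debug level when the abundances in `d` do not sum to ~1.

    Returns the computed total so callers can inspect it.
    """
    total = sum(d.values())
    # Tolerant comparison: avoids false alarms from floating-point rounding.
    if not math.isclose(total, 1.0, rel_tol=rel_tol):
        log.debug(f"Freyja {s_name}: percentages sum to {total:.4f}, not 1")
    return total
```

At debug level the message is still available when troubleshooting, but it no longer shows up in normal runs where sums below 1 are expected.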
multiqc/modules/freyja/freyja.py
Outdated
headers["Top_lineage_freyja"] = {
    "title": "Top lineage (Freyja)",
    "description": "The most abundant lineage in the sample",
    "scale": "RdYlGn-rev",  # Not sure if this is the best scale
I think one of the categorical scales should be good here (like `Set2`, etc., see: https://multiqc.info/docs/development/plots/#table-colour-scales), though I can't make it work 🤔
Ah, it's not working because the color is applied to the bar, which has a width of zero here.
Okay, I pushed a change to use categorical colours instead, let me know if it looks good:
[image: looks good]
And would love @ewels's opinion on this.
LGTM!
Thanks a lot for contributing this module! I added a few changes to the branch. I don't have a strong opinion on this, but one slightly unusual thing for MultiQC is the use of a non-numerical column in the general stats (i.e., the most abundant lineage here). I'm interested in @ewels's take on it: whether we want it, and what the best color scale would be. The barplot pretty much shows the same information, but we want something to be in the general stats anyway, right? I added the use of a categorical scale to this PR, but I don't like that the generated colors in the result don't match the colors used in the barplot; I'm going to look into whether I can make them match.
multiqc/modules/freyja/freyja.py
Outdated
> **Note**: Lineage designation is based on the used WHO nomenclature, which could vary over time.
""",
plot=bargraph.plot(self.freyja_data, None, pconfig),
Sorting order was simple (pushed a change). However, making sure the barchart colors are the same as the general stats column colors is not obvious: for the column, more pastel colors look better, whereas on the barchart, the bright default Highcharts colors look the best.
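If matching colours were wanted despite that trade-off, one option is to build a single lineage-to-colour mapping and reuse it in both places. The sketch below is only an illustration: `assign_lineage_colours` and `PALETTE` are hypothetical names, the palette is an arbitrary example, the input is assumed to be shaped like `{sample: {lineage: abundance}}`, and how such a mapping would be passed to the table or the bar plot is not covered here.

```python
# Illustrative palette; any list of hex colours would do.
PALETTE = ["#66c2a5", "#fc8d62", "#8da0cb", "#e78ac3", "#a6d854", "#ffd92f"]


def assign_lineage_colours(data, palette=PALETTE):
    """Map each lineage to one colour, ordered by total abundance across samples.

    `data` is assumed to look like {sample_name: {lineage: abundance}}.
    """
    totals = {}
    for lineages in data.values():
        for lineage, abundance in lineages.items():
            totals[lineage] = totals.get(lineage, 0.0) + abundance
    # Most abundant lineages first; ties broken by name for a stable result.
    ordered = sorted(totals, key=lambda name: (-totals[name], name))
    # Cycle through the palette if there are more lineages than colours.
    return {name: palette[i % len(palette)] for i, name in enumerate(ordered)}
```

Because the ordering depends only on the data, the same lineage gets the same colour wherever the mapping is reused.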
Stacked on top of #2017; do not merge until that PR is merged.
… than 1 (as it is almost always going to be the case), see #1903
Address issue #1902
The nf-core/viralrecon pipeline is being extended with the Freyja module to analyse mixed SARS-CoV-2 samples (i.e., wastewater samples).
PR: nf-core/viralrecon#375
- `CHANGELOG.md` has been updated
- … `--lint` flag)
- `docs/README.md` is updated with link to below
- `docs/modulename.md` is created
- … `self.add_section`)