Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make NanoStat module recognize more report flavors #1594

Merged
merged 12 commits into from
Jan 15, 2022
Merged

Make NanoStat module recognize more report flavors #1594

merged 12 commits into from
Jan 15, 2022

Conversation

MillironX
Copy link
Contributor

  • This comment contains a description of changes (with reason)
  • CHANGELOG.md has been updated

As noted in #1547, the NanoStat module is currently ignoring NanoStat reports generated from fasta and fastq files. Fix that by

  1. Broadening the modules search pattern to include fasta NanoStat reports
  2. Adding logic to separate quality metrics from alignment and summary metrics
  3. Bypassing the quality plot if adding a fasta NanoStat report

I have tested this PR with every test file contained in MultiQC/test-data#221, and received output from every file.

Use the string "General summary:" followed by 9 spaces in the first line
to distinguish a NanoStat report. Those 9 spaces probably aren't supposed
to be there, but they are found in all of the test files, and Linus says
it's a feature and not a bug once people start depending on it.

Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
The [NanoStat source code](https://github.com/wdecoster/nanostat/blob/f50c7530f072e9e2b3c830334d806a619dc37c68/nanostat/NanoStat.py#L42-L55)
reveals that the number of space characters after the first line is
actually variable based upon the output of the next few lines. Update the
search pattern with a regular expression that can handle such variability.

Fixes: 38899d7 (Update NanoStat search pattern)
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
@MillironX
Copy link
Contributor Author

Looking through the CI logs, it looks like different flavors of NanoStat report cannot be parsed within the same run using the current logic. I will look into workarounds for that.

Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
Signed-off-by: Thomas A. Christensen II <25492070+MillironX@users.noreply.github.com>
ewels added a commit to MultiQC/test-data that referenced this pull request Jan 15, 2022
Check that we can handle bad files as search pattern is quite general now.
See MultiQC/MultiQC#1594
Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Tested locally and I'm happy. Thanks very much!

@ewels ewels merged commit 9e860be into MultiQC:master Jan 15, 2022
vladsavelyev pushed a commit to vladsavelyev/MultiQC_TestData that referenced this pull request Apr 16, 2022
Check that we can handle bad files as search pattern is quite general now.
See MultiQC/MultiQC#1594
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants