Kaiju: fix "division by zero" #2179

Merged 3 commits on Nov 17, 2023
2 changes: 1 addition & 1 deletion .github/workflows/changelog.py
@@ -374,7 +374,7 @@ def _skip_existing_entry_for_this_pr(line, same_section=True):
 updated_lines.append("\n")
 _updated_lines = [_l for _l in section_lines + new_lines if _l.strip()]
 if section == "### Module updates":
-    _updated_lines = sorted(_updated_lines)
+    _updated_lines = sorted(_updated_lines, key=lambda x: x.lower())
 updated_lines.extend(_updated_lines)
 updated_lines.append("\n")
 if new_lines:
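For reviewers, a minimal sketch of what the new sort key changes. The entries below are shortened, hypothetical stand-ins for changelog lines, not the real text. Python's default string sort compares code points, so uppercase names sort before lowercase ones; lowercasing the key gives a case-insensitive order, which appears to be why the **Pangolin** entry moves in `CHANGELOG.md`.

```python
# Hypothetical, shortened changelog entries (illustration only).
entries = [
    "- **Pangolin**: update for v4",
    "- **fastp**: add version parsing",
    "- **Kaiju**: fix division by zero",
    "- **Nanostat**: account for tabs and spaces",
]

# Default sort compares code points: uppercase letters come before lowercase,
# so "fastp" ends up after every capitalised module name.
case_sensitive = sorted(entries)

# Lowercasing the key makes the comparison case-insensitive.
case_insensitive = sorted(entries, key=lambda x: x.lower())

for line in case_insensitive:
    print(line)
```

With the lowercased key, `fastp` sorts first and `Pangolin` last among these four entries.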
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -29,10 +29,11 @@ Highlights:

 ### Module updates

-- **Pangolin**: update for v4: add QC Note , update tool versions columns ([#2157](https://github.com/ewels/MultiQC/pull/2157))
 - **fastp**: add version parsing ([#2159](https://github.com/ewels/MultiQC/pull/2159))
 - **fastp**: correctly parse sample name from --in1/--in2 command. Prefer file name if not `fastp.json`; fallback to file name when error ([#2139](https://github.com/ewels/MultiQC/pull/2139))
+- **Kaiju**: fix "division by zero" ([#2179](https://github.com/ewels/MultiQC/pull/2179))
 - **Nanostat**: account for both tab and spaces in v1.41+ search pattern ([#2155](https://github.com/ewels/MultiQC/pull/2155))
+- **Pangolin**: update for v4: add QC Note , update tool versions columns ([#2157](https://github.com/ewels/MultiQC/pull/2157))
 - **Picard**: Generalize to directly support Sentieon and Parabricks outputs ([#2110](https://github.com/ewels/MultiQC/pull/2110))
 - **Sentieon**: Removed the module in favour of directly supporting parsing by the **Picard** module. Note that any code that relies on the module name needs to be updated, e.g. `-m sentieon` will no longer work, the exported plot and data files will be prefixed as `picard` instead of `sentieon`, etc. Note that the Sentieon module used to fetch the sample names from the file names by default, and now it follows the Picard module's logic, and prioritizes the commands recorded in the logs. To override, use the `use_filename_as_sample_name` config flag ([#2110](https://github.com/ewels/MultiQC/pull/2110))

23 changes: 11 additions & 12 deletions multiqc/modules/kaiju/kaiju.py
@@ -170,23 +170,22 @@ def kaiju_stats_table(self):
         }
         tdata = {}
         for s_name, d in self.kaiju_sample_total_readcounts.items():
-            tdata[s_name] = {}
-            tdata[s_name]["% Unclassified"] = (
-                self.kaiju_sample_unclassified[s_name] * 100 / self.kaiju_sample_total_readcounts[s_name]
-            )
+            tdata[s_name] = {
+                # Default values in case if data is not available for this sample at this rank
+                "assigned": 0,
+                "% Assigned": 0,
+            }
+            total = self.kaiju_sample_total_readcounts[s_name]
+            if total == 0:
+                continue
+            tdata[s_name]["% Unclassified"] = self.kaiju_sample_unclassified[s_name] * 100 / total
             if s_name in self.kaiju_data[general_taxo_rank.lower()]:
                 tdata[s_name]["assigned"] = (
-                    self.kaiju_sample_total_readcounts[s_name]
+                    total
                     - self.kaiju_sample_unclassified[s_name]
                     - self.kaiju_data[general_taxo_rank.lower()][s_name]["cannot be assigned"]["count"]
                 )
-                tdata[s_name]["% Assigned"] = (
-                    tdata[s_name]["assigned"] * 100 / self.kaiju_sample_total_readcounts[s_name]
-                )
-            else:
-                # don't have the value for this samples at this rank
-                tdata[s_name]["assigned"] = 0
-                tdata[s_name]["% Assigned"] = 0
+                tdata[s_name]["% Assigned"] = tdata[s_name]["assigned"] * 100 / total
         self.general_stats_addcols(tdata, headers)

     def top_five_barplot(self):
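The guard in the Kaiju hunk can be tried in isolation with a small standalone sketch. The function `percent_stats` and its parameters are hypothetical names introduced here for illustration; they are not part of the MultiQC module. It assumes the same behaviour as the diff: defaults are set first, a sample with zero total reads keeps only those defaults, and percentages are computed only when the total is non-zero.

```python
def percent_stats(total_reads, unclassified, cannot_be_assigned=None):
    """Hypothetical helper mirroring the guarded logic in the diff.

    cannot_be_assigned is None when no data exists for the sample at this rank.
    """
    # Defaults, used when the sample has no reads or no data at this rank.
    row = {"assigned": 0, "% Assigned": 0}

    # Guard: a sample with zero total reads would otherwise raise ZeroDivisionError.
    if total_reads == 0:
        return row

    row["% Unclassified"] = unclassified * 100 / total_reads
    if cannot_be_assigned is not None:
        row["assigned"] = total_reads - unclassified - cannot_be_assigned
        row["% Assigned"] = row["assigned"] * 100 / total_reads
    return row


empty_sample = percent_stats(0, 0)
normal_sample = percent_stats(200, 50, cannot_be_assigned=30)
print(empty_sample)
print(normal_sample)
```

Setting the defaults before the guard is what lets the `else` branch disappear in the diff: every early exit already leaves a complete row.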