hifiasm module broke #2267

Closed
4 tasks done
mbeavitt opened this issue Jan 19, 2024 · 4 comments
Labels
bug: module Bug in a MultiQC module

Comments

@mbeavitt
Contributor

Description of bug

MultiQC v1.19 fails to parse this hifiasm stderr log: the hifiasm module raises an IndexError (see the log below).

File that triggers the error

Methanocorpusculum_labreanum_test.stderr.log

MultiQC Error log

/// MultiQC 🔍 | v1.19
  
  |           multiqc | Search path : /home/tubergene/nf-core-bhasm/work/d9/a7e07c700f58c4fa5b2fbfe078b54a
  |         searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 15/15  
  |    custom_content | software_versions: Found 1 sample (html)
  |    custom_content | nf-core-bhasm-methods-description: Found 1 sample (html)
  |    custom_content | nf-core-bhasm-summary: Found 1 sample (html)
  ╭──────────────── Oops! The 'hifiasm' MultiQC module broke... ─────────────────╮
  │ Please copy this log and report it at                                        │
  │ https://github.com/ewels/MultiQC/issues                                      │
  │ Please attach a file that triggers the error. The last file found was:       │
  │ ./5/Methanocorpusculum_labreanum_test.stderr.log                             │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/usr/local/lib/python3.11/site-packages/multiqc/multiqc.py", line 72 │
  │     output = mod()                                                           │
  │              ^^^^^                                                           │
  │   File "/usr/local/lib/python3.11/site-packages/multiqc/modules/hifiasm/hifi │
  │     self.parse_hifiasm_log_files()                                           │
  │   File "/usr/local/lib/python3.11/site-packages/multiqc/modules/hifiasm/hifi │
  │     data = self.extract_kmer_graph(f["f"])                                   │
  │            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^                                   │
  │   File "/usr/local/lib/python3.11/site-packages/multiqc/modules/hifiasm/hifi │
  │     count = int(spline[3])                                                   │
  │                 ~~~~~~^^^                                                    │
  │ IndexError: list index out of range                                          │
  │                                                                              │
  ╰──────────────────────────────────────────────────────────────────────────────╯
  |          nanostat | Found 1 reports
  |           multiqc | Report      : multiqc_report.html
  |           multiqc | Data        : multiqc_data
  |           multiqc | Plots       : multiqc_plots
  |           multiqc | MultiQC complete

Before submitting

  • I have read the troubleshooting documentation.
  • I am using the latest release of MultiQC.
  • I have included a full MultiQC log, not truncated.
  • I have attached an input file (.zip if necessary) that triggers the error.
@mbeavitt
Contributor Author

Please ignore the weird flags in the initial command. I know bacteria are not tetraploid, but this is test data for a larger pipeline :)

@mbeavitt
Contributor Author

mbeavitt commented Jan 19, 2024

Aha, found the issue. I think it's because I'm using a small PacBio input file, so some histogram lines in the log have no asterisks. Those lines have one field fewer after splitting, so spline[3] raises the IndexError. The extract_kmer_graph function in the hifiasm module can be updated to:

def extract_kmer_graph(fin):
    """Extract the kmer graph from file in"""
    data = dict()

    found_histogram = False

    for line in fin:
        if line.startswith("[M::ha_hist_line]"):
            found_histogram = True
            spline = line.strip().split()
            # Occurrence of kmer
            occurrence = spline[1][:-1]
            # Special case
            if occurrence == "rest":
                continue
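            # Count of the occurrence
            if "*" in spline[2]:
                count = int(spline[3])
            else:
                # No asterisk bar on this line, so the count is the third field
                count = int(spline[2])
            data[int(occurrence)] = count
        # If we are no longer in the histogram
        elif found_histogram:
            return data

    # The file ended while still inside the histogram (or had none)
    return data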

Before:

            count = int(spline[3])

After:

            if "*" in spline[2]:
                count = int(spline[3])
            else:
                count = int(spline[2])
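
To see the difference between the two line shapes, here is a minimal standalone sketch of the same logic. It is not the module code itself: it takes a plain list of lines instead of a file handle, and the sample lines are made up to match what the parser expects (with and without an asterisk bar), not copied from the attached log.

```python
def extract_kmer_graph(lines):
    """Parse hifiasm k-mer histogram lines into {occurrence: count}."""
    data = {}
    found_histogram = False

    for line in lines:
        if line.startswith("[M::ha_hist_line]"):
            found_histogram = True
            spline = line.strip().split()
            # Occurrence of kmer, with the trailing ":" removed
            occurrence = spline[1][:-1]
            # Special case: the "rest" bucket is skipped
            if occurrence == "rest":
                continue
            if "*" in spline[2]:
                # Bar present: [tag, "N:", "****", count]
                count = int(spline[3])
            else:
                # No bar: [tag, "N:", count]
                count = int(spline[2])
            data[int(occurrence)] = count
        elif found_histogram:
            # First non-histogram line after the histogram ends parsing
            break

    return data


# Hypothetical sample lines, for illustration only
sample_lines = [
    "some earlier log line\n",
    "[M::ha_hist_line]     1: ********> 1688\n",
    "[M::ha_hist_line]     2: *** 121\n",
    "[M::ha_hist_line]     4: 3\n",
    "[M::ha_hist_line]  rest: 0\n",
    "some later log line\n",
]

parsed = extract_kmer_graph(sample_lines)
print(parsed)
```

The line for occurrence 4 has no bar, so it splits into three fields and only the else branch can read its count.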

@mbeavitt
Contributor Author

I made a PR:

#2268

@vladsavelyev vladsavelyev added this to the MultiQC v1.20 milestone Jan 19, 2024
@vladsavelyev vladsavelyev added the bug: module Bug in a MultiQC module label Jan 19, 2024
@vladsavelyev
Member

Thanks a lot @mbeavitt for a detailed bug report, and even for contributing a fix 🙏

vladsavelyev added a commit to MultiQC/test-data that referenced this issue Jan 19, 2024