UMI-Tools: support extract command #2284

Closed
2 of 4 tasks
mschubert opened this issue Jan 26, 2024 · 3 comments · Fixed by #2296

Comments

@mschubert

mschubert commented Jan 26, 2024

Description of bug

The regexes provided in the UMI-Tools module

regexes = {
    "total_umis": r"INFO total_umis (\d+)",
    "unique_umis": r"INFO #umis (\d+)",
    "input_reads": r"INFO Reads: Input Reads: (\d+)",
    "output_reads": r"INFO Number of reads out: (\d+)",
    "positions_deduplicated": r"INFO Total number of positions deduplicated: (\d+)",
    "mean_umi_per_pos": r"INFO Mean number of unique UMIs per position: ([\d\.]+)",
    "max_umi_per_pos": r"INFO Max. number of unique UMIs per position: (\d+)",
    "version": r"# UMI-tools version: ([\d\.]+)",
}

do not match the UMI-Tools log file output when using --extract_method regex (exact command via nf-core/rnaseq here). The corresponding log file looks like the following (full file attached):

2024-01-25 00:20:39,419 INFO Parsed 10000000 reads
2024-01-25 00:20:40,180 INFO Input Reads: 10077011
2024-01-25 00:20:40,180 INFO regex does not match read1: 9215286
2024-01-25 00:20:40,180 INFO regex matches read1: 861725
2024-01-25 00:20:40,180 INFO regex does not match read2: 752463
2024-01-25 00:20:40,180 INFO regex matches read2: 109262
2024-01-25 00:20:40,180 INFO Reads output: 109262
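
For illustration, here is a minimal sketch of patterns that would match the extract log lines above. The names `extract_regexes`, `parse_extract_log` and `sample_log` are hypothetical and are not taken from the MultiQC module; only the log line wording comes from the excerpt.

```python
import re

# Hypothetical patterns written against the extract log excerpt above.
# The keys are illustrative names, not the ones MultiQC uses.
extract_regexes = {
    "input_reads": r"INFO Input Reads: (\d+)",
    "read1_mismatch": r"INFO regex does not match read1: (\d+)",
    "read1_match": r"INFO regex matches read1: (\d+)",
    "read2_mismatch": r"INFO regex does not match read2: (\d+)",
    "read2_match": r"INFO regex matches read2: (\d+)",
    "output_reads": r"INFO Reads output: (\d+)",
}


def parse_extract_log(text):
    """Return integer counts for every pattern that is found in the log text."""
    data = {}
    for key, pattern in extract_regexes.items():
        match = re.search(pattern, text)
        if match:
            data[key] = int(match.group(1))
    return data


# The log lines quoted in the issue, used as sample input.
sample_log = """\
2024-01-25 00:20:39,419 INFO Parsed 10000000 reads
2024-01-25 00:20:40,180 INFO Input Reads: 10077011
2024-01-25 00:20:40,180 INFO regex does not match read1: 9215286
2024-01-25 00:20:40,180 INFO regex matches read1: 861725
2024-01-25 00:20:40,180 INFO regex does not match read2: 752463
2024-01-25 00:20:40,180 INFO regex matches read2: 109262
2024-01-25 00:20:40,180 INFO Reads output: 109262
"""

counts = parse_extract_log(sample_log)
```

Note that the dedup pattern `INFO Reads: Input Reads: (\d+)` cannot match here, because the extract log writes `INFO Input Reads:` without the `Reads:` prefix.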

MultiQC version: 1.14 (but regexes are the same in latest)
UMI-Tools version: 1.1.4 (latest)

File that triggers the error

RMGI_wt_250_48h-3.umi_extract.log

MultiQC Error log

No error, but UMI-Tools panel is absent.

Before submitting

  • I have read the troubleshooting documentation.
  • I am using the latest release of MultiQC.
  • I have included a full MultiQC log, not truncated.
  • I have attached an input file (.zip if necessary) that triggers the error.
@vladsavelyev
Member

Thanks for the issue! Indeed, the UMI-Tools MultiQC module only supports the dedup command.

I added some basic support for extract now with #2296, let me know if you have ideas of any better visualisations for it apart from a simple table.

vladsavelyev added a commit to MultiQC/test-data that referenced this issue Feb 8, 2024
@vladsavelyev vladsavelyev added this to the MultiQC v1.20 milestone Feb 8, 2024
@vladsavelyev vladsavelyev changed the title UMI-Tools module does not capture regex logging UMI-Tools: support extract command Feb 8, 2024
@mschubert
Author

mschubert commented Feb 8, 2024

This is great, thanks!

For the vis, I'd probably suggest a stacked bar graph with one color per category: (1) mismatched read1, (2) matched read1 but mismatched read2, (3) matched both reads. The total of these is the input read number.
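
A rough sketch of how those three categories could be derived from the counts in the log. The function name `extract_bar_categories` and the dictionary keys are hypothetical. It assumes read2 is only tested when read1 matched, which is consistent with the numbers in the attached log (752463 + 109262 = 861725).

```python
def extract_bar_categories(counts):
    """Split the input reads into three mutually exclusive categories.

    Assumes read2 is only checked for reads whose read1 matched, so the
    read2 counts partition the read1 matches.
    """
    return {
        "read1 mismatch": counts["read1_mismatch"],
        "read1 match, read2 mismatch": counts["read2_mismatch"],
        "both reads match": counts["read2_match"],
    }


# Counts taken from the log excerpt in the issue description.
log_counts = {
    "input_reads": 10077011,
    "read1_mismatch": 9215286,
    "read1_match": 861725,
    "read2_mismatch": 752463,
    "read2_match": 109262,
}

categories = extract_bar_categories(log_counts)

# Sanity check: the stacked segments add up to the input read number.
assert sum(categories.values()) == log_counts["input_reads"]
```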

@vladsavelyev
Member

Should be done with #2296
