Adding Element Biosciences AVITI Bases2fastq support to multiqc #1990

blajoie · 2023-08-17T19:21:14Z

blajoie · 2023-08-17T22:36:03Z

Example MultiQC reports cc @YuheCheng62

Cloudbreak-DVT-ecoli-wgs-2x150
12 runs, 1152 samples total. Each run is a 96plex of Ecoli WGS libraries, sequenced on AVITI.
https://element-public-data.s3.us-west-2.amazonaws.com/multiqc/Cloudbreak-DVT-ecoli-wgs-2x150.html

Cloudbreak-DVT-human-wgs-2x150
10 runs, 20 samples total. Each run is a 2plex of Human WGS libraries, sequenced on AVITI.
https://element-public-data.s3.us-west-2.amazonaws.com/multiqc/Cloudbreak-DVT-human-wgs-2x150.html

Cloudbreak-DVT-human-rnaseq-2x75
10 runs, 160 samples total. Each run is a 16plex of Human UHHR Rna-Seq libraries, sequenced on AVITI.
https://element-public-data.s3.us-west-2.amazonaws.com/multiqc/Cloudbreak-DVT-human-rnaseq-2x75.html

blajoie · 2023-08-17T22:37:22Z

PR into MultiQC_TestData for supporting test data. cc @YuheCheng62
MultiQC/test-data#263

ewels

Very fast first-pass review to check some of the common gotchas that come up in PRs. A few tweaks to change here, I'll come back for a more thorough review soon. (Haven't even tried running it yet).

docs/modules/base2fastq.md

multiqc/utils/search_patterns.yaml

setup.py

multiqc/modules/bases2fastq/bases2fastq.py

ewels · 2023-08-18T12:28:44Z

ok I still have 4 minutes before my next meeting so tried generating a report very quickly with the test data. Speed notes:

Got errors about making Matplotlib figures. Haven't looked into why yet.
Colours in tables are all defaults. Colour matters.
Please add a column or two to general stats for comparison against results from other tools. Yield maybe?
Run name - set Scale to None for text cells, it'll fix text wrapping.
Please write more help text. Explain the why of the plot, what to look for, implications, how to spot bad samples etc. Current help text is more suitable as the description field.

vladsavelyev · 2023-08-18T13:49:54Z

multiqc/modules/bases2fastq/bases2fastq.py

+        self.groupDict = dict()
+        self.groupLookupDict = dict()


We should stick to the lower_case_with_underscores naming convention for variables and functions.

(and also note to self - this is something that can be checked automatically with linters)
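
As a rough illustration of the rename being asked for, the sketch below maps the camelCase names from the diff onto the lower_case_with_underscores style. The helper `to_snake_case` is hypothetical and not part of MultiQC; in practice a linter plugin such as pep8-naming would flag these names rather than a script rewriting them.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to lower_case_with_underscores."""
    # Insert an underscore wherever a lowercase letter or digit
    # is immediately followed by an uppercase letter, then lowercase it all.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


# Names taken from the diff hunks quoted in this review
for old_name in ("groupDict", "groupLookupDict", "plotContent", "runStats"):
    print(f"{old_name} -> {to_snake_case(old_name)}")
```

Names that are already in snake_case pass through unchanged, so the helper is safe to run over a mixed list.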

multiqc/modules/bases2fastq/bases2fastq.py

vladsavelyev · 2023-08-18T14:03:42Z

multiqc/modules/bases2fastq/plot_runs.py

+    plotContent = dict()
+    for s_name in runData.keys():
+        runStats = dict()
+        runStats.update({"#Polonies": runData[s_name]["NumPolonies"]})


#Polonies should be scaled and rounded the same way as read or base counts in the general stats table. See how it's done in the bcl2fastq module as a reference.

Same for other metrics. Please check whether similar metrics are already reported in other modules, and try to use the same naming style, value formatting, and color scheme. E.g. bcl2fastq has Q30, yield, and mean base quality. FastQC can be a reference for other metrics and plots.
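
A minimal sketch of what that could look like for the polony count, assuming the column should follow the same "counts in millions" convention used for read counts. The three constants are stand-ins for MultiQC's configurable read-count settings (which, as far as I recall, live on `config` as `read_count_multiplier`, `read_count_prefix` and `read_count_desc`); `polonies_header` is a hypothetical helper, and the colour scale is only an example.

```python
# Stand-ins for MultiQC's configurable read-count settings (assumed: millions)
READ_COUNT_MULTIPLIER = 0.000001
READ_COUNT_PREFIX = "M"
READ_COUNT_DESC = "millions"


def polonies_header() -> dict:
    """Build a general-stats style header dict for a polony count column."""
    return {
        "title": f"{READ_COUNT_PREFIX} Polonies",
        "description": f"Number of polonies ({READ_COUNT_DESC})",
        "min": 0,
        "scale": "Blues",  # example colour scale, not taken from the PR
        "shared_key": "read_count",  # share the axis scaling with other read-count columns
        # Scale the raw count so the table shows a short number instead of the full integer
        "modify": lambda x: x * READ_COUNT_MULTIPLIER,
    }


header = polonies_header()
raw_polonies = 1_500_000
print(f"{header['modify'](raw_polonies):.1f} {header['title']}")
```

Keeping the multiplier and prefix in one place means the column follows whatever unit the user has configured for read counts elsewhere in the report.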

vladsavelyev · 2023-08-18T14:10:26Z

multiqc/modules/bases2fastq/plot_samples.py

+    return html, plotName, anchor, description, helptext, plotContent
+
+
+def plot_per_cycle_N_content(sampleData, groupLookupDict, colorDict):


Can the FastQC code for that plot be adapted here?
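
For context, a per-cycle N content plot of the FastQC kind is a line graph with one line per sample, and MultiQC line graphs take data shaped as `{sample: {x: y}}`. The sketch below only shows how such a dict could be built; the input layout (a list of per-cycle base counts) and the function `per_cycle_n_percent` are assumptions for illustration, not the actual Bases2Fastq JSON structure or the module's code.

```python
def per_cycle_n_percent(cycle_counts):
    """Return {cycle: percent N} from per-cycle base counts.

    cycle_counts is a list with one dict per cycle, in cycle order,
    e.g. {"A": 25, "C": 25, "G": 25, "T": 24, "N": 1} (hypothetical layout).
    """
    result = {}
    for cycle, counts in enumerate(cycle_counts, start=1):
        total = sum(counts.values())
        # Guard against empty cycles to avoid dividing by zero
        result[cycle] = 100.0 * counts.get("N", 0) / total if total else 0.0
    return result


# Toy input: one sample, two cycles
samples = {
    "sample_1": [
        {"A": 25, "C": 25, "G": 25, "T": 24, "N": 1},
        {"A": 25, "C": 25, "G": 25, "T": 25, "N": 0},
    ]
}

# One line per sample, keyed by cycle number
plot_data = {s_name: per_cycle_n_percent(counts) for s_name, counts in samples.items()}
print(plot_data)
```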

vladsavelyev · 2023-08-18T14:16:01Z

I tried to open the first example that you shared (https://element-public-data.s3.us-west-2.amazonaws.com/multiqc/Cloudbreak-DVT-ecoli-wgs-2x150.html) and it froze my browser :( I think the culprits are the sample-level plots, but I can't work out exactly what the rendering bottleneck is. It would be good to look into this more, because even the test sample data makes the rendering veeery slow. We might want to revert to static plots here. But that's something to get back to when we replace the plotting library.

Overall, the module is a great addition, but I would ask you to adjust the code style (in particular, Python's standard is to use lower_case_underscore naming for variables and functions), and to look at similar modules like fastqc and bcl2fastq to reuse and adapt the existing code for similar plots and metrics.

CHANGELOG.md

docs/README.md

docs/modules/bases2fastq.md

vladsavelyev

A few change suggestions, and make sure to update the branch from master. Otherwise, good with me!

Co-authored-by: Vlad Savelyev <vladislav.sav@gmail.com>

vladsavelyev

Awesome stuff!

vladsavelyev · 2023-09-05T09:31:19Z

multiqc/modules/bases2fastq/plot_runs.py

+        "format": "{d}",
+        "description": "The (total) number of polonies calculated for the run",
+        "min": 0,
+        "scale": "RdYlGn",


Just repeating some code comments that might be lost now :)

#Polonies should be scaled and rounded the same way as read or base counts in the general stats table. See how it's done in the bcl2fastq module as a reference.

Same for other metrics. Could you check whether similar metrics are already reported in other modules, and try to use the same naming style, value formatting, and color scheme? E.g. bcl2fastq has Q30, yield, and mean base quality. FastQC can be a reference for other metrics and plots.

vladsavelyev · 2023-09-05T09:32:11Z

multiqc/modules/bases2fastq/plot_samples.py

+    return html, plot_name, anchor, description, helptext, plot_content
+
+
+def plot_per_cycle_N_content(sample_data, group_lookup_dict, color_dict):


Would be great to adapt the FastQC code for that plot

vladsavelyev

Minor changes requested - see the code comments above!

ewels · 2023-09-14T22:56:07Z

ok, the remaining changes are very minor. I just tried to fix the merge conflict but couldn't push it back - because the fork is under Elembio (and not a user account), GitHub is a bit more strict about allowing us to push into the PR directly.

To keep the momentum, I've created a new branch that we can work in and an associated PR: #2044

Hopefully we can do these final minor changes and get this merged soon 😄

multiqc/modules/bases2fastq/bases2fastq.py

ewels · 2023-09-22T09:01:06Z

Locking the conversation here so that we remember to move over to #2044

ewels · 2023-09-22T09:01:16Z

Actually I think I will just close this PR for clarity.

yuhe.cheng62 and others added 17 commits August 15, 2023 13:28

Add base2fastq support

5dd6a02

change search pattern and README

b061eeb

adding support for runname + analysisid UUID

d425d5c

fix bugs when R1 length and R2 length are different

6b8d4d7

embed analysis id into workflow name

1ef0522

filter samples with < 10k polonies, improve root-root-analayis-id han…

e2c51a4

…dling

add error handling for too large RunStats.json

4800895

improve b2f search patterns, handle missing json due to size

6366f88

use sampleid to map to a unique analysis of b2f

a5cd5dc

formatted, change warning info format

4388321

fix root

85a9cad

adjust r1/r2 split for when user is including runs with differing rea…

28b3652

…d lengths

Merge branch 'master' of github.com:ewels/MultiQC into add-b2f-suppor…

47d6f68

…t-new

add seaborn to setup

1c48ccc

update changelog

416faa1

Resolve warnings in lint

5ab7100

Add read length check. Add data source and data file

cac9530

YuheCheng62 force-pushed the add-b2f-support-new branch from af20cbb to cac9530 Compare August 17, 2023 22:17

pre-commit

ca2d356

YuheCheng62 force-pushed the add-b2f-support-new branch from a08a9dc to ca2d356 Compare August 17, 2023 22:25

Merge branch 'master' into add-b2f-support-new

a772859

ewels requested changes Aug 18, 2023

vladsavelyev reviewed Aug 18, 2023

multiqc/modules/bases2fastq/bases2fastq.py (outdated)

vladsavelyev reviewed Aug 18, 2023

YuheCheng62 added 3 commits August 24, 2023 11:56

Merge branch 'master' into add-b2f-support-new

a0bc746

Merge branch 'master' into add-b2f-support-new

82209f6

Merge branch 'master' into add-b2f-support-new

b7f9e13

vladsavelyev reviewed Aug 28, 2023

CHANGELOG.md (outdated)

vladsavelyev reviewed Aug 28, 2023

docs/README.md (outdated)

vladsavelyev reviewed Aug 28, 2023

docs/modules/bases2fastq.md (outdated)

vladsavelyev self-requested a review August 28, 2023 12:31

vladsavelyev requested changes Aug 28, 2023

YuheCheng62 and others added 5 commits August 28, 2023 10:46

Update CHANGELOG.md

94496c2

Co-authored-by: Vlad Savelyev <vladislav.sav@gmail.com>

Update docs/README.md

3c42985

Co-authored-by: Vlad Savelyev <vladislav.sav@gmail.com>

Update docs/modules/bases2fastq.md

f7e69af

Co-authored-by: Vlad Savelyev <vladislav.sav@gmail.com>

Update change log to add bases2fastq module

e7da35e

Merge branch 'master' into add-b2f-support-new

5e0134a

vladsavelyev self-requested a review August 28, 2023 22:12

vladsavelyev approved these changes Aug 28, 2023

vladsavelyev added the awaits-review label Aug 28, 2023

ewels removed the awaits-review label Sep 1, 2023

vladsavelyev self-requested a review September 1, 2023 11:45

vladsavelyev reviewed Sep 5, 2023

vladsavelyev self-requested a review September 5, 2023 09:32

vladsavelyev requested changes Sep 5, 2023

ewels added this to the MultiQC v1.16 milestone Sep 14, 2023

ewels mentioned this pull request Sep 14, 2023

Elembio bases2fastq #2044

Open

vladsavelyev reviewed Sep 22, 2023

multiqc/modules/bases2fastq/bases2fastq.py

MultiQC locked as off-topic and limited conversation to collaborators Sep 22, 2023

ewels closed this Sep 22, 2023
