Replace beeswarm plot with different visualisation #2066
Comments
Yes, this is a known issue. Generally, the idea is to switch to static image plots when sample numbers are very high. This caps the maximum report file size and also the JavaScript run time (e.g. so that it doesn't hang the browser). However, there is no flat-image plot for beeswarm plots (or heatmaps) yet. This should be resolved when we replace the plotting library in the near future. We will also look into replacing beeswarm plots with a better plot type, e.g. violin or similar.
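To make the fallback logic described in this comment concrete, here is a minimal sketch in Python. It is not MultiQC's actual code: the function name, the `flat_threshold` default of 100 and the `has_flat_renderer` flag are all assumptions made for illustration. It shows why a plot type without a flat-image renderer still ends up interactive, however many samples there are.

```python
def choose_render_mode(
    n_samples: int,
    flat_threshold: int = 100,       # hypothetical cut-off, not a documented default
    has_flat_renderer: bool = True,  # hypothetical flag: does this plot type have a static version?
) -> str:
    """Return "flat" for large sample counts when a static renderer exists, else "interactive"."""
    if n_samples > flat_threshold and has_flat_renderer:
        return "flat"
    # Either the dataset is small, or there is no static fallback for this plot type.
    return "interactive"


# A line plot with many samples can fall back to a static image:
print(choose_render_mode(375676))
# A plot type without a flat renderer stays interactive regardless of size:
print(choose_render_mode(375676, has_flat_renderer=False))
```

Under these assumptions, the second call is the situation reported in this issue: the sample count is far above any sensible threshold, but with no static alternative the browser still has to draw every point.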
I think I'll close this for now as there's nothing specific that we'll do for the pangolin module. This will be a good test case for #1789 though!
Actually I changed my mind. We didn't have an issue specifically for switching out the beeswarm plot yet. So I've changed the title and commandeered the issue 😅
@vladsavelyev - see above for a nice example dataset to try out the new plot with huge sample numbers.
Note to self: I needed to convert the downloaded file from TSV to CSV, then tell MultiQC not to ignore the large file. With the (unzipped) file, this worked for me:
multiqc -f . --cl-config "log_filesize_limit: 2000000000"
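For anyone repeating the TSV-to-CSV step, a minimal sketch using only Python's standard `csv` module is given below. The function name and the file names in the usage comment are hypothetical, and the sketch assumes a plain tab-separated, UTF-8 file. It streams row by row, so the whole table is not loaded into memory.

```python
import csv
from pathlib import Path


def tsv_to_csv(src: Path, dst: Path) -> int:
    """Copy a tab-separated file to a comma-separated one; return the number of rows written."""
    rows = 0
    # newline="" lets the csv module handle line endings itself.
    with src.open(newline="", encoding="utf-8") as fin, \
            dst.open("w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout)  # quotes fields that contain commas
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows


# Example usage (hypothetical file names):
# tsv_to_csv(Path("pangolin.tabular"), Path("pangolin.csv"))
```

The converted file can then be placed in the directory that the `multiqc` command above is run on.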
This is effectively addressed by adding the violin plot in #2292. (It is still not ideal that the long tick labels are getting cropped; that is something to fix in the future.)
Description of bug
@wm75 informed me about this very large Pangolin dataset posted on Galaxy (https://usegalaxy.eu/published/history?id=5ee10825304a885f), and I figured it would be interesting to test it with MultiQC. Specifically this dataset: https://usegalaxy.eu/api/datasets/4838ba20a6d867654919ea0761c5ed4d/display?to_ext=tabular which translates to a ~80 MB CSV. It contains 375676 samples!
Running it was no big issue: it took about 2 min and used a maximum of ~5 GB of memory on my MacBook.
When trying to view the report, however, the browser first hangs (see image below).
After a while it loads, but scrolling is quite laggy with the large number of samples. The table is also converted to a beeswarm plot, which I am not sure is very informative; at the very least, some columns should probably be removed. See below.
File that triggers the error
No response