New JSON format on MultiQC 1.20 #11

Open
fgvieira opened this issue Mar 31, 2024 · 5 comments · May be fixed by #13
Comments

@fgvieira

Does TidyMultiqc support the new JSON format introduced in MultiQC version 1.20 (MultiQC/MegaQC#519)?

@multimeric
Owner

No, but thanks for the reminder. I'll take a look.

@multimeric
Owner

It seems that the format is fundamentally the same. The only difference is that MultiQC now uses Plotly, which I guess has changed the plot data format. For example, .report_plot_data.qualimap_coverage_histogram.datasets differs between the two formats.

Likely I will just have to add some more plot parsers and possibly update the vignette. However, the default functionality works as is, especially if you don't want to extract plot data.

@multimeric linked a pull request on Apr 1, 2024 that will close this issue
@multimeric
Owner

Hi @fgvieira, can you please test if my branch works for your use case? You can test it using remotes::install_github("multimeric/TidyMultiqc", ref="multiqc_1.2").

@fgvieira
Author

fgvieira commented Apr 2, 2024

Thanks for the super fast reply!

Parsing the general and raw sections seems to work fine:

df <- load_multiqc("multiqc_data.json",
  sections=c("general", "raw"),
  find_metadata = function(sample, parsed) {
    parsed[c(
      "config_creation_date",
      "config_version",
      "config_output_dir"
    )]
  }
)

But when parsing the plot section as well:

df <- load_multiqc("multiqc_data.json",
  sections=c("general", "raw", "plot"),
  plots=list_plots("multiqc_data.json")$id,
  find_metadata = function(sample, parsed) {
    parsed[c(
      "config_creation_date",
      "config_version",
      "config_output_dir"
    )]
  }
)

I get an error that seems to be related to modules that were run multiple times.

Example data:
multiqc_data.json.zip
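
One way to narrow down which plot triggers the error is to request the plots one at a time and record the failures. The sketch below is only an illustration: find_failing_plots and fake_loader are hypothetical names, not part of TidyMultiqc, and the stand-in loader exists only so the example runs without the package or a report file.

```r
# Hypothetical helper (not part of TidyMultiqc): call a loader once per plot ID
# and collect the error message for every ID that fails.
find_failing_plots <- function(plot_ids, load_one) {
  failures <- list()
  for (id in plot_ids) {
    msg <- tryCatch(
      {
        load_one(id)
        NULL  # success: nothing to record
      },
      error = function(e) conditionMessage(e)
    )
    if (!is.null(msg)) {
      failures[[id]] <- msg
    }
  }
  failures
}

# Stand-in loader so the sketch runs on its own; it fails for one made-up ID.
fake_loader <- function(id) {
  if (id == "bad_plot") stop("parser failed")
  invisible(TRUE)
}

failures <- find_failing_plots(c("good_plot", "bad_plot"), fake_loader)
print(failures)

# With a real report, the loader could wrap the call used in this thread, e.g.
# function(id) load_multiqc("multiqc_data.json", sections = "plot", plots = id)
```

The names of the returned list are the plot IDs that could not be parsed, which may make it easier to report a specific plot.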

@multimeric
Owner

I really don't recommend trying to grab all the data like this. The general data is usually more sensible than the raw data, so only use the latter if it is absolutely essential. As for plots, it's not trivial to implement a parser for each plot type, so I've only done so for a small subset. If there is a specific plot that you think you need for your analysis, feel free to open an issue about it. However, requesting all plot data doesn't really make sense to me. If you want all the data, or you want to answer a very specific question about the data, I would suggest loading the JSON file yourself in R.
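
As a rough sketch of what loading the JSON file yourself could look like, the example below uses the jsonlite package. The toy report written to a temporary file is an assumption made so the example runs on its own; only the report_plot_data, qualimap_coverage_histogram, datasets and config_version names come from this thread, and the nested layout of a real MultiQC file will be larger and may differ.

```r
library(jsonlite)

# Toy stand-in for multiqc_data.json (assumed layout, for illustration only).
toy_json <- '{
  "config_version": "1.20",
  "report_plot_data": {
    "qualimap_coverage_histogram": {
      "datasets": [[{"name": "sample_1", "data": [[0, 5], [1, 12]]}]]
    }
  }
}'
path <- tempfile(fileext = ".json")
writeLines(toy_json, path)

# simplifyVector = FALSE keeps the report as plain nested lists, which is
# easier to explore than jsonlite's automatic data frame conversion.
report <- fromJSON(path, simplifyVector = FALSE)

# List the plot IDs present in the report, then inspect one plot's structure.
plot_ids <- names(report$report_plot_data)
print(plot_ids)
str(report$report_plot_data$qualimap_coverage_histogram$datasets, max.level = 2)
```

For a real report, replace path with the location of multiqc_data.json and use str() to see how the plot of interest is laid out before extracting values from it.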
