Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report volcano plot will now also plot points with pval 0 #229

Closed
wants to merge 1 commit into from

Conversation

WackerO
Copy link
Collaborator

@WackerO WackerO commented Jan 30, 2024

In response to #227

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/differentialabundance branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@WackerO
Copy link
Collaborator Author

WackerO commented Jan 30, 2024

E.g. see this report: SRP254919_zero.html.zip

Copy link

github-actions bot commented Jan 30, 2024

nf-core lint overall result: Failed ❌

Posted for pipeline commit ce334fc

+| ✅ 244 tests passed       |+
#| ❔   1 tests were ignored |#
-| ❌  30 tests failed       |-

❌ Test failures:

  • files_exist - File must be removed: lib/nfcore_external_java_deps.jar
  • nextflow_config - Config default value incorrect: params.affy_cel_files_archive is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.querygse is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.affy_cdfname is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.filtering_min_abundance is set as 1.0 in nextflow_schema.json but is 1 in nextflow.config.
  • nextflow_config - Config default value incorrect: params.filtering_min_samples is set as 1.0 in nextflow_schema.json but is 1 in nextflow.config.
  • nextflow_config - Config default value incorrect: params.differential_min_fold_change is set as 2.0 in nextflow_schema.json but is 2 in nextflow.config.
  • nextflow_config - Config default value incorrect: params.differential_max_pval is set as 1.0 in nextflow_schema.json but is 1 in nextflow.config.
  • nextflow_config - Config default value incorrect: params.limma_spacing is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.limma_block is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.limma_correlation is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.limma_p_value is set as 1.0 in nextflow_schema.json but is 1 in nextflow.config.
  • nextflow_config - Config default value incorrect: params.shinyngs_shinyapps_account is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.shinyngs_shinyapps_app_name is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.gene_sets_files is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.logo_file is set as docs/images/nf-core-differentialabundance_logo_light.png in nextflow_schema.json but is /home/runner/work/differentialabundance/differentialabundance/docs/images/nf-core-differentialabundance_logo_light.png in nextflow.config.
  • nextflow_config - Config default value incorrect: params.css_file is set as assets/nf-core_style.css in nextflow_schema.json but is /home/runner/work/differentialabundance/differentialabundance/assets/nf-core_style.css in nextflow.config.
  • nextflow_config - Config default value incorrect: params.citations_file is set as CITATIONS.md in nextflow_schema.json but is /home/runner/work/differentialabundance/differentialabundance/CITATIONS.md in nextflow.config.
  • nextflow_config - Config default value incorrect: params.report_title is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.report_author is set as None in nextflow_schema.json but is null in nextflow.config.
  • nextflow_config - Config default value incorrect: params.report_description is set as None in nextflow_schema.json but is null in nextflow.config.
  • files_unchanged - .github/workflows/branch.yml does not match the template
  • files_unchanged - .github/workflows/linting_comment.yml does not match the template
  • files_unchanged - .github/workflows/linting.yml does not match the template
  • files_unchanged - assets/email_template.html does not match the template
  • files_unchanged - assets/email_template.txt does not match the template
  • files_unchanged - assets/nf-core-differentialabundance_logo_light.png does not match the template
  • files_unchanged - docs/images/nf-core-differentialabundance_logo_light.png does not match the template
  • files_unchanged - docs/images/nf-core-differentialabundance_logo_dark.png does not match the template
  • files_unchanged - pyproject.toml does not match the template

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 2.12
  • Run at 2024-01-30 15:09:20

@kdivilov
Copy link

I'm not sure if the example report was to only show how the new p-value=0 threshold on the volcano plot would look like or to show a complete working example. If the former, then it looks great. If the latter, there are a few issues:

  1. RP23-4H14.1 is not in sig_differential for both p_value_types.
  2. RP23-4H14.1 is not given de_pval_fc_label for the Adjusted p_value_types.
  3. All genes with de_pval_fc_label (Jchain, Igkc, Ighg2b, Igha) except Pou2af1 have p values = 0, so their y-values on the volcano plot should be either ceiling(max(abs(-log10(0.0003))) * 1.1) or ceiling(max(abs(-log10(0.0489))) * 1.1) for the Unadjusted and Adjusted p_value_types, respectively in the "hND6 versus mCherry in treatment" volcano plot (similar issue with the other volcano plot).

After reviewing the code, I couldn't find the reason for these issues, but I'm also not familiar with this codebase.

@WackerO
Copy link
Collaborator Author

WackerO commented Feb 1, 2024

@kdivilov It was mainly there for showing what the plot looks like, but the example is also working actually. RP23-4H14.1 is not DE according to the tables, why would you expect it to be marked as DE? I just had a look at the DE table, its entry is

gene_id	gene_name	gene_biotype	pvalue	padj	log2FoldChange	differential_status	y_data
ENSMUSG00000107358	RP23-4H14.1	TEC	0.05018516	0.9026969	-0.06082605	Not significant	0.0444580489717465

Also, sorry for the confusion, the tables below the plots are rounded. The entries where it says padj 0 aren't actually 0, but a very small number. I've attached the report again, this time just without rounding:

SRP254919.html.zip

@pinin4fjords
Copy link
Member

I'm not 100% sold on this. The data are fundamentally outside the axes, and inventing a p value ceiling such that we can show such points as a smear at the extremes of the plot doesn't add much to my mind.

The tables are there to see the gene lists. @WackerO we might want to refine the rounding logic you added to the reporting to use scientific notation rather than marking non-0 p values as 0.

@WackerO
Copy link
Collaborator Author

WackerO commented Feb 7, 2024

I'm not 100% sold on this. The data are fundamentally outside the axes, and inventing a p value ceiling such that we can show such points as a smear at the extremes of the plot doesn't add much to my mind.

The tables are there to see the gene lists. @WackerO we might want to refine the rounding logic you added to the reporting to use scientific notation rather than marking non-0 p values as 0.

I can definitely change the rounding so that the table is clearer, but I would like to argue in favor of the ceiling. This is something that is for example also done by gprofiler (clicking on random example, thenrun query will produce a plot that's capped at 16).
I think the plots are there to visualize the data that are listed in the tables, and if some entries are frequently missing, IMO the visualization is lacking (especially if those missing points are the ones with the smallest p value).

Do you have a specific reason why you don't want these points added to the plot? @pinin4fjords

@pinin4fjords
Copy link
Member

Because such points can't be represented properly, they are by definition outside the plot boundaries.

You can't just define an arbitrary ceiling- another point could come along with a lower p value than the ceiling. So you'd probably have to set a ceiling equal to the smallest non-zero p value in the data (or, say half the lowest p value). Then you end up distorting the plot for the points you can represent.

I'd much rather just show a warning that there are points with p values of 0 that can't be shown- and the user can go find those in the tables.

@WackerO WackerO mentioned this pull request Feb 8, 2024
10 tasks
@WackerO
Copy link
Collaborator Author

WackerO commented Feb 27, 2024

#232 solves this, closing this PR

@WackerO WackerO closed this Feb 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants