Non bonferroni comparisons corrections have multiple significances? #30

Closed
soorajachar opened this issue Sep 6, 2021 · 9 comments · Fixed by #31

Comments

@soorajachar

Hello, thank you for this updated version of statannot; it's been very useful. I am currently having an issue with non-Bonferroni multiple comparisons corrections: they seem to give me two significance labels per comparison, like this (with Holm-Bonferroni; the "* (ns)" label is what seems strange to me):

[Screenshot: plot annotated with the Holm-Bonferroni correction]

versus the Bonferroni correction, which gives me only a single significance label for each comparison:

[Screenshot: plot annotated with the Bonferroni correction]

Strangely, I get this warning when using Holm-Bonferroni, but not with Bonferroni alone:

/opt/anaconda3/lib/python3.7/site-packages/statannotations/_Plotter.py:338: UserWarning: Invalid x-position found. Are the same parameters passed to seaborn and statannotations calls? or are there few data points?
  "Invalid x-position found. Are the same parameters passed "

Thanks in advance.

@trevismd
Owner

trevismd commented Sep 7, 2021

Hello @soorajachar,
Thank you for your comment and your questions.

About the annotation produced

  • Preamble: If you also plot the graph without any correction method (the default), you will see that the stars are the same as with the "HB" correction but different from those with "Bonferroni". If you choose the full or simple text_format, you'll see that the reported p-values are actually different.
  • With certain correction methods, such as "Bonferroni", it is more common to report "corrected" p-values. E.g., with two comparisons, a raw p-value of 0.03 (*) becomes 0.06 and is therefore annotated as "ns".
  • With other methods such as HB, we chose to keep reporting the "original" or "raw" p-value, but add a "(ns)" tag when the raw p-value is below the initial significance threshold yet should not be considered significant because of the multiple comparisons correction. I.e., we keep reporting and translating the original 0.03 p-value (hence the *), and add "(ns)".
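The difference between the two reporting styles can be sketched in plain Python. This is only an illustration, not the library's code: the helper names (`stars`, `bonferroni_labels`, `holm_labels`) are hypothetical, and the star thresholds are assumed to be the usual 0.05 / 0.01 / 0.001 / 0.0001 cut-offs.

```python
def stars(p):
    """Map a p-value to star notation (thresholds are an assumption)."""
    for threshold, label in [(1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (0.05, "*")]:
        if p <= threshold:
            return label
    return "ns"


def bonferroni_labels(pvalues):
    """Annotate the corrected p-value: raw p times the number of tests, capped at 1."""
    m = len(pvalues)
    return [stars(min(1.0, p * m)) for p in pvalues]


def holm_labels(pvalues, alpha=0.05):
    """Annotate the raw p-value, adding "(ns)" when Holm-Bonferroni does not reject."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = [False] * m
    # Step-down procedure: compare the k-th smallest p-value with alpha / (m - k)
    # and stop at the first one that fails.
    for rank, i in enumerate(order):
        if pvalues[i] > alpha / (m - rank):
            break
        rejected[i] = True
    labels = []
    for p, is_rejected in zip(pvalues, rejected):
        label = stars(p)
        if label != "ns" and not is_rejected:
            label += " (ns)"
        labels.append(label)
    return labels


raw = [0.004, 0.03, 0.2]
print(bonferroni_labels(raw))  # ['*', 'ns', 'ns']
print(holm_labels(raw))        # ['**', '* (ns)', 'ns']
```

In this sketch the second comparison has the same raw p-value in both cases; only the way the correction is reported differs.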

I hope this is now clearer. Please tell me if it is not, if there is a possible bug, or if you'd prefer things to be handled differently in some way.

About the warning message

This warning is not linked to the correction method. It seems that you are overlaying the bars with data points, but each bar only has 3 corresponding points. This is what is referred to in "or are there few data points?". You can see that the three points are not in the same x position in both charts, and they would not always be in the same place on each plot if you were to re-run them.
Usually, this does not cause any problem (and the warning can be ignored), but in some cases, I suspect it could lead to weird-looking brackets spanning several bars. In these cases, re-running the plot can do the trick.
The warning is there to make sure the plots are checked in some cases, such as 'batch' processing.
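For batch processing, one possible way to act on this is to record warnings per plot and flag the affected ones for a visual check. The sketch below uses only the standard `warnings` module; `draw_plot` is a hypothetical stand-in for a real seaborn + statannotations call, and matching on the message text is an assumption based on the warning quoted above.

```python
import warnings


def draw_plot(name):
    """Hypothetical stand-in for one seaborn + statannotations plotting call."""
    if name == "few_points":
        warnings.warn("Invalid x-position found.", UserWarning)
    return name


def plots_to_review(names):
    """Run every plot and collect those that emitted the x-position warning."""
    flagged = []
    for name in names:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")  # make sure repeats are not suppressed
            draw_plot(name)
        if any("Invalid x-position" in str(w.message) for w in caught):
            flagged.append(name)
    return flagged


print(plots_to_review(["ok", "few_points"]))  # ['few_points']
```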

@soorajachar
Author

soorajachar commented Sep 7, 2021

Thank you, I believe that makes things clearer. To clarify one point, though: does this mean that with type I corrections (like Holm-Bonferroni, Benjamini-Hochberg, etc.) the corrected p-value will always be either significant "*" or not significant "ns"? There will no longer be levels of significance (**, ***, etc.)? If so, it may be good to add a non-default option to the Annotator class to show only the corrected significance for these type I corrections (either * or ns) and discard the original significance, as it can get a bit crowded to show both at once with many comparisons, and it is also a bit difficult to explain.

@trevismd
Owner

trevismd commented Sep 8, 2021

No, for these types of corrections, all the thresholds configured (and shown on the star notation legend) are still used, so you could also have ** (ns), *** (ns), and so on.

Really, these methods only add the "(ns)" tag where appropriate.

Isn't it fairly easy to explain like this?

The idea is to keep displaying the raw p-value. I guess there could be an option to report a corrected one and have the same behaviour as with "Bonferroni", although to me we would lose information in favor of a little more clarity.

@soorajachar
Author

Right, I meant that the Type I corrections would never have anything besides an (ns) as the corrected threshold (there would never be a "** (*)", for example). I agree that the (ns) is useful to have, and I think it should be the default option, but it can hurt readability when conducting exploratory analysis with a lot of comparisons, so I think it would still be useful to have an option to turn off the original p-values and display just the ns when doing a Type I correction.

@trevismd
Owner

It's a good idea, thank you for sharing your thoughts!

I think I can come up with an implementation fairly quickly. I hope you'll give some feedback then too ;-)

@soorajachar
Author

No problem, thanks for listening to my feedback. Let me know if I can be of any help.

@trevismd
Owner

> No problem, thanks for listening to my feedback. Let me know if I can be of any help.

Thanks! Would you have the time to review my proposal in #31, or just see if it works for you?

@soorajachar
Author

Looks good to me; I think adding the option for custom formatting of the "ns" string is an especially good idea as different journals have different formatting standards for reporting multiple comparisons corrections.
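To make the options discussed in this thread concrete, here is a minimal sketch of how such formatting could behave. It is not the implementation from #31 and not the statannotations API: `format_annotation`, its `template` and `ns_label` parameters, and the "replace" keyword are all hypothetical names chosen for illustration.

```python
def format_annotation(raw_stars, significant, template="{star} ({suffix})", ns_label="ns"):
    """Hypothetical formatter for a comparison whose raw p-value earned stars.

    raw_stars:   star string for the raw p-value, e.g. "*" or "**"
    significant: True if the comparison is still significant after correction
    template:    "replace" to show only the corrected result, otherwise a
                 format string with {star} and {suffix} placeholders
    ns_label:    text used for "not significant" (journals differ on this)
    """
    if significant:
        return raw_stars
    if template == "replace":
        return ns_label
    return template.format(star=raw_stars, suffix=ns_label)


print(format_annotation("*", False))                      # * (ns)
print(format_annotation("*", False, template="replace"))  # ns
print(format_annotation("**", False, template="{star} [{suffix}]", ns_label="n.s."))  # ** [n.s.]
```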

@liviu-

liviu- commented Mar 15, 2024

> The idea is to keep displaying the raw p-value. I guess there could be an option to report a corrected one and have the same behaviour as with "Bonferroni", although to me we would lose information in favor of a little more clarity.

Is it not possible to show both the raw and the adjusted p-values, at least in the verbose output (not in the figure)? The corrected p-value is useful to report as well.
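As an illustration of what such output could contain, Holm-Bonferroni adjusted p-values can be computed next to the raw ones with the standard step-down formula. This is a standalone sketch and not part of statannotations; `holm_adjusted` is a hypothetical helper name.

```python
def holm_adjusted(pvalues):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k), capped at 1, and
        # forced to be at least as large as the previous adjusted value.
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted


raw = [0.03, 0.004, 0.2]
for p, adj in zip(raw, holm_adjusted(raw)):
    print(f"raw p = {p:.3g}, Holm-adjusted p = {adj:.3g}")
```

For these inputs the adjusted values are approximately 0.06, 0.012, and 0.2, so only the second comparison stays below 0.05.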
