Boxplot percentiles for whiskers #10357
Comments
Ping @phobson. Oddly, I get different results than you. Can you check your actual outcome? I get:
This tells me that Matplotlib takes the whisker position from the data (i.e., it uses the last value at or below the 95th percentile), whereas numpy interpolates to where the 95th percentile would fall. I don't quite get which algorithm numpy uses for that; I assume just linear interpolation?
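To make the difference concrete, here is a minimal pure-Python sketch. It assumes linear interpolation between the two nearest order statistics, which is the documented default of np.percentile. The helper names and the ten-point data set are made up for illustration; they are not the data from this report.

```python
import math


def linear_percentile(data, q):
    """q-th percentile by linear interpolation between neighbouring
    sorted values (mirrors the default behaviour of np.percentile)."""
    xs = sorted(data)
    pos = (len(xs) - 1) * q / 100.0   # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def largest_datum_at_or_below(data, limit):
    """Most extreme actual data point that does not exceed `limit`."""
    return max(x for x in data if x <= limit)


# Hypothetical sample, not the data attached to this issue.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

p95 = linear_percentile(data, 95)                # interpolated: about 9.55
whisker = largest_datum_at_or_below(data, p95)   # snapped to the data: 9

print(p95, whisker)
```

The interpolated percentile lands between two observations, while the whisker is pulled back to the nearest observation inside that limit, which would explain the two numbers disagreeing.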
You are right, I get the same
I might have changed the set of values before copying them here. Apologies.
I'll mark this as API consistency, but it's really about consistency with numpy and/or a documentation issue (where we need to be explicit that we are not consistent with numpy). But I don't commonly use whisker plots, so I won't personally comment on what the "right" thing to do is here...
This is a nuance of boxplots themselves: you don't show any values that you don't actually have. When you provide percentiles as the whis values, we do compute them with np.percentile:

    loval = np.percentile(x, whis[0])
    hival = np.percentile(x, whis[1])

But then we go through the same compression process to find the most extreme data point within those ranges, e.g.,

    # get high extreme
    wiskhi = np.compress(x <= hival, x)
    if len(wiskhi) == 0 or np.max(wiskhi) < q3:
        stats['whishi'] = q3
    else:
        stats['whishi'] = np.max(wiskhi)

The point of all that is that the fences always represent an actual value in the dataset. I'd be willing to entertain the idea that we should special-case percentile-based whiskers.
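The whisker logic described in this comment can be sketched as a standalone function. This is only an illustration, not Matplotlib's implementation: the function name percentile_whiskers and the snap_to_data flag are hypothetical, the flag being one possible shape for the special case floated above, and the sample data are made up.

```python
import numpy as np


def percentile_whiskers(x, whis=(5, 95), snap_to_data=True):
    """Return (low, high) whisker positions for percentile-based whiskers.

    snap_to_data=True follows the behaviour described in the thread:
    each whisker is moved to the most extreme data point inside the
    percentile limits. snap_to_data=False (hypothetical) returns the
    interpolated percentiles unchanged.
    """
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    loval, hival = np.percentile(x, whis)
    if not snap_to_data:
        return float(loval), float(hival)

    inside_hi = x[x <= hival]   # same selection as np.compress(x <= hival, x)
    inside_lo = x[x >= loval]
    # Fall back to the quartiles when no data point lies beyond the box.
    whishi = q3 if inside_hi.size == 0 or inside_hi.max() < q3 else inside_hi.max()
    whislo = q1 if inside_lo.size == 0 or inside_lo.min() > q1 else inside_lo.min()
    return float(whislo), float(whishi)


# Hypothetical sample, not the data attached to this issue.
sample = np.arange(1, 11)
print(percentile_whiskers(sample))                      # whiskers on data points
print(percentile_whiskers(sample, snap_to_data=False))  # raw percentiles
```

With snapping enabled the whiskers coincide with observations, so they will generally sit inside the values that np.percentile reports for the same percentiles.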
I suspect that compression should not be used when using percentiles. Does someone have experience with different whisker usages across fields? Or alternatively, does someone have access to
@timhoffm Well, something like Sci-Hub might be an option? I am likely to have access to those papers, but I guess that posting them directly here (or even on the devel mailing list) would pose some copyright issues :/...
@timhoffm I have those papers. How do I share them?
This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help!
Bug report
I'm plotting a boxplot with the data attached and found that when setting the whiskers to the 5th and 95th percentiles, they are different from the numpy calculation of the same percentiles.
Code for reproduction
Actual outcome
Expected outcome
Matplotlib version
Matplotlib backend (print(matplotlib.get_backend())): Qt5Agg
Installed using the Manjaro (Linux distro) package manager.