New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor hist for less numerical errors #22773
Conversation
Is the numerical problem the diff? Would it make sense to just convert the numpy bin edges to float64 before the diff? |
Hard to say. But the problem is that one does quite a bit of computations and at some stage there are rounding errors that leads to that there are overlaps or gaps between edges. So postponing diff will reduce the risk that this happens (on the other hand, one may get cancellations as a result, but I do not think that will happen more now since the only things we add here are about the same order of magnitude).
Yes, or even float32, but as argued in the issue, one tend to use float16 for memory limited environments, so not clear if one can afford it. Here, I am primarily trying to see the effect of it. As we do not deal with all involved computations here, some are also in |
It seems like we do not have any test images that are negatively affected by this at least... But it may indeed not be the best solution to the problem. |
Ahh, but even if the data to hist is |
I think you just want another type catch here (I guess I'm not sure the difference between diff --git a/lib/matplotlib/axes/_axes.py b/lib/matplotlib/axes/_axes.py
index f1ec9406ea..88d90294a3 100644
--- a/lib/matplotlib/axes/_axes.py
+++ b/lib/matplotlib/axes/_axes.py
@@ -6614,6 +6614,7 @@ such objects
m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
tops.append(m)
tops = np.array(tops, float) # causes problems later if it's an int
+ bins = np.array(bins, float) # causes problems is float16!
if stacked:
tops = tops.cumsum(axis=0)
# If a stacked density plot, normalize so the area of all the |
Numpy accepts builtin python types and maps them to numpy types: https://numpy.org/doc/stable/reference/arrays.dtypes.html#specifying-and-constructing-data-types The mapping can be platform specific. E.g. |
PR Summary
Should help with #22622
Idea is to do computation on the edges rather than the widths and then do diff on the result. This may be numerically better (or not...). Or rather, it is probably numerically worse, but will give visually better results...
Probably the alternative approach of providing a flag to
bar
/barh
, making sure that adjacent bars are actually exactly adjacent may be a better approach, but I wanted to see what comes out of this first...PR Checklist
Tests and Styling
pytest
passes).flake8-docstrings
and runflake8 --docstring-convention=all
).Documentation
doc/users/next_whats_new/
(follow instructions in README.rst there).doc/api/next_api_changes/
(follow instructions in README.rst there).