Description
I'm trying to bin a two-dimensional histogram using the df.count method. I want the histogram to be bounded by the min/max points of each axis, so that it stretches across the whole chart. I expect every edge row and column to contain at least one non-zero bin. Instead, I get histograms with multiple contiguous all-zero rows or columns along the border.
How do I generate a histogram of two columns where each edge contains the bounding min or max value for the row / column?
Here is an example of a histogram that I generated which is not bounded by non-zero bins along the edges. The top, bottom, and right edges of this histogram have a lot of empty area:
The bin values match what is rendered in the chart.
You probably have some outliers in your data. Also, in Vaex, the histogram bins are half-open: [min, max). A dirty way to include the last value in the last bin is to pass limits=[[xmin, xmax+eps], [ymin, ymax+eps], ...], where eps=1e-10, or ideally (1e-16/(xmin-xmax)). Does that make sense?
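To illustrate the half-open behaviour described above, here is a minimal pure-Python sketch. It is not Vaex's implementation: `half_open_counts` is a hypothetical helper written only for this example, and the data values are made up. It shows that a value equal to the upper limit is dropped when the limits are exactly [min, max], and is counted once the upper limit is nudged by eps.

```python
def half_open_counts(values, lo, hi, n_bins):
    """Count values into n_bins equal-width bins covering [lo, hi).

    Hypothetical helper for illustration; it only mimics half-open binning.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:  # the upper limit itself is excluded
            # min() guards against a rounding error pushing the index to n_bins
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts


data = [0.0, 1.5, 2.5, 4.0]  # made-up sample; 4.0 is the maximum
xmin, xmax = min(data), max(data)

# Limits exactly at [min, max): the maximum falls outside every bin.
closed = half_open_counts(data, xmin, xmax, 4)

# Upper limit widened by a small eps, as suggested in the comment above.
eps = 1e-10
widened = half_open_counts(data, xmin, xmax + eps, 4)

print(closed)   # last bin is empty
print(widened)  # last bin now holds the maximum
```

Note that widening the range also shifts every bin edge slightly, so a value sitting exactly on an interior edge could move to the neighbouring bin, which is one reason the trick is "dirty".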
I think I understand. Let me clarify: by half-open, do you mean that the bins go up to, but don't include, the max value? So I should add an eps to my max values to include the max point?
Also, should that value be (1e-16/(xmax - xmin)) or (1e-16/(xmin - xmax))?
Thank you so much. I tried the formula provided and it looks like for one of my axes eps is too small. It gets rounded off. When I tried eps=1e-10 it works. Again, I appreciate you pointing me in the right direction and your quick response!
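The rounding issue mentioned here can be reproduced with plain floats. The sketch below uses made-up limits (`xmin = 0.0`, `xmax = 100.0`) and assumes Python 3.9+ for `math.nextafter` and `math.ulp`. It shows why an eps far below the floating-point spacing at `xmax` has no effect, and that the `xmin - xmax` form of the formula is negative whenever `xmax > xmin`.

```python
import math

xmin, xmax = 0.0, 100.0  # made-up limits for illustration

eps_formula = 1e-16 / (xmax - xmin)   # 1e-18, positive but tiny
eps_negative = 1e-16 / (xmin - xmax)  # negative: would shrink the upper limit

# The gap between 100.0 and the next representable double is about 1.4e-14,
# so adding 1e-18 is rounded away and the limit does not move.
spacing = math.ulp(xmax)
lost = (xmax + eps_formula == xmax)

# A fixed eps of 1e-10 is larger than the spacing here, so it survives.
kept = (xmax + 1e-10 > xmax)

# The smallest possible nudge: the next double above xmax.
ulp_upper = math.nextafter(xmax, math.inf)

print(spacing, lost, kept, ulp_upper > xmax)
```

A fixed eps such as 1e-10 would itself be rounded away for very large limits (for example around 1e9), so scaling the nudge to the magnitude of `xmax` may be more robust. Whether a one-ULP nudge from `math.nextafter` is enough to land the maximum in the last bin depends on how the library computes bin indices, which is not verified here for Vaex.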
In my code, I first get the limits:
then I get and return the bins: