Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix hist limit issue for step* for both linear and log scale #1075

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
48 changes: 33 additions & 15 deletions lib/matplotlib/axes.py
Expand Up @@ -7920,7 +7920,7 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
_barfunc = self.barh
else: # orientation == 'vertical'
_barfunc = self.bar

for m, c in zip(n, color):
patch = _barfunc(bins[:-1]+boffset, m, width, bottom,
align='center', log=log,
Expand All @@ -7938,15 +7938,37 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,

x[0::2], x[1::2] = bins, bins

minimum = min(bins)
minimum = np.min(n)

if align == 'left' or align == 'center':
x -= 0.5*(bins[1]-bins[0])
elif align == 'right':
x += 0.5*(bins[1]-bins[0])

if log:
y[0],y[-1] = minimum, minimum
#in the case of log scale there are three cases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap these comment lines to less than 75 characters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

75? I thought we did 80?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I was wrong, in PEP8, docstrings are recommended to be at most 72 characters.

#to consider for the y axis limit
#1) has zero or negative bin but other bins may be 0<n<1
# due to weighting or normalization. We want
# (positive minimum)*0.1 to be the minimum
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you mean np.any() here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strange, my line comments got applied to the wrong lines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you have since fixed the np.any() I thought you needed.

#2) has zero/negative bin but the rest are greater than 1.
# this one we want minimum at 0.1 (normal cases)
#3) all zero/negative bin .... no idea how to display
# this anyway so put minimum at 0.1
ndata = np.array(n)
if np.any(ndata<=0):
warnings.warn(
'Histogram bins with zero or negative count '+
'are not displayed in log scale.')

if np.any(ndata>0):
if np.all(ndata[ndata!=0]>=1):
minimum = 0.1 # case 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be log_base ** -1? (i.e. 0.1 only makes sense when the log base is 10).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but how do I access the base of log scale axis?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I missed the fact that hist doesn't take a base as argument. hist is a bit of an anomaly -- it takes a 'log' argument, whereas most other things (except bar*) work by requiring a subsequent call to set_xscale or set_yscale, which of course take an optional "base". Making this also work that way would be a little bit of work, because this minimum adjustment stuff you've added here would have to be moved to the draw phase. Ideally, that's where I'd like to see this go, but I don't know if that's worth abandoning this fix, and we've have to maintain backward compatiblity anyway -- this "log" kwarg goes back at least 5 years, as far as I can tell.

We could either add an additional keyword "log_base", or extend "log" so it can be "True" (meaning base 10 as it does now), or a number, meaning a base. I would also recommend updating bar* to be in line with that.

However, I'm aware that none of this is premised on any real world use case -- just my own desire for consistency across uses of log scale, so I could be persuaded to leave this as-is by further discussion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be kind of hard to set the minimum correctly if log base can be changed between to calls. Consider this

ax.hist( x1, log=10)
ax.hist(x2, log=2)

The first hist call would have no idea that eventually the base of log scale will be changed to 2.

But, I do not think it would be hard for use to do another call to set_ylim though. The histogram will be painted all the way down (see line 7892) anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes -- that, in a nutshell, is why I'm not a fan of "log" being a kwarg to this method. You may want to hold of on implementing any sort of fix for this issue until we discuss it -- it may be best to just leave as-is.

@efiring, @WeatherGod, @pelson: any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w.r.t. not being able to access the axes's log base, we are assuming base 10 throughout this part of the code, and we have always done so before. We could always look to future PRs to address the handling of arbitrary log bases. I see this more as a good bug fix to an outstanding issue.

I guess it would be smart, though, for future developers, to somehow comment the spots that are assuming base 10 so that they know to change it if/when we extend the interface in the future to accept arbitary log bases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with @WeatherGod here: let's leave this assuming base 10 for now.

else:
minimum = np.min(ndata[ndata>0])*0.1 #case 1)
else:
minimum = 0.1 #case 3)

if orientation == 'horizontal':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set up your editor to strip trailing white spaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you have since fixed the training white spaces.

self.set_xscale('log')
else: # orientation == 'vertical'
Expand All @@ -7957,7 +7979,11 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
for m, c in zip(n, color):
y[1:-1:2], y[2::2] = m, m
if log:
y[y<minimum]=minimum
tiniest = np.finfo(np.float).tiny
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need some of this logic here in the case of multiple calls to hist() or mixed plotting. If there are existing bounds, they should be used (at least, that is what this logic attempts to do. It might need some serious updating, though).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the part I'm not sure about the inner working. This logic seems to start around line 7999.

So, as far as I understand how limit work for multiple plots. The existing desired range is store in self.dataLim.intervalx/y, right? So, all i have to do is make a super range that contains the new limit and old limit, right?

#make sure they don't show up in display
y[y<minimum]= tiniest
#making sure it is filled all the way down the axis
y[0],y[-1] = tiniest, tiniest
if orientation == 'horizontal':
x,y = y,x

Expand All @@ -7970,22 +7996,14 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,

# adopted from adjust_x/ylim part of the bar method
if orientation == 'horizontal':
xmin0 = max(_saved_bounds[0]*0.9, minimum)
xmin = minimum
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments above wrt multiple calls to hist() goes here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of the part I don't fully understand. What function does 0.9 serve and where should I get the old bound. What's the difference between self.datalim.bound and self.datalim.intervalx/y?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time for a full explanation as I am about to head off on
vacation, but perhaps I can provide some clues for you. I think this code
part has been neglected. Other functions like plot() and scatter() utilize
pads for the data limits so that the axes wouldn't be extremely tight
around the data being plotted. This value used to be a hard-coded 0.05 on
each endside (for a total of 0.1). In plot() and scatter() (and possibly
bar()) we now handle this a lot more intelligently, but I think hist() was
left behind in this regard.

Note that data limits don't necessarially mean the same as view limits. I
am not sure what the distinction is between data limits and data interval
is, though.

I hope this helps!
Ben Root

xmax = self.dataLim.intervalx[1]
for m in n:
xmin = np.amin(m[m!=0]) # filter out the 0 height bins
xmin = max(xmin*0.9, minimum)
xmin = min(xmin0, xmin)
self.dataLim.intervalx = (xmin, xmax)
elif orientation == 'vertical':
ymin0 = max(_saved_bounds[1]*0.9, minimum)
ymin = minimum
ymax = self.dataLim.intervaly[1]
for m in n:
ymin = np.amin(m[m!=0]) # filter out the 0 height bins
ymin = max(ymin*0.9, minimum)
ymin = min(ymin0, ymin)
self.dataLim.intervaly = (ymin, ymax)

if label is None:
labels = [None]
elif is_string_like(label):
Expand Down
Binary file not shown.
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.