Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

fix hist limit issue for step* for both linear and log scale #1075

Closed
wants to merge 5 commits into from

6 participants

@piti118

Fix two closely related issue:
#1066
and
#196

lib/matplotlib/axes.py
@@ -7946,6 +7946,22 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
x += 0.5*(bins[1]-bins[0])
if log:
+ #in the case of log scale there are three cases to consider for the y axis limit
+ #1) has zero or negative bin but other bins may be 0<n<1 due to weighting. we want positive minimum*0.1 to be the minimum
@WeatherGod Collaborator

Please keep within 80 characters line width as per PEP8.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@mdboom mdboom commented on the diff
lib/matplotlib/axes.py
((7 lines not shown))
+ #1) has zero or negative bin but other bins may be 0<n<1
+ # due to weighting or normalization. We want
+ # (positive minimum)*0.1 to be the minimum
+ #2) has zero/negative bin but the rest are greater than 1.
+ # this one we want minimum at 0.1 (normal cases)
+ #3) all zero/negative bin .... no idea how to display this
+ # anyway so put minimum at 0.1
+ ndata = np.array(n)
+ if np.any(ndata<=0):
+ warnings.warn(
+ 'Histogram bins with zero or negative count '+
+ 'are not displayed in log scale.')
+
+ if np.any(ndata>0):
+ if np.all(ndata[ndata!=0]>=1):
+ minimum = 0.1 # case 2)
@mdboom Owner
mdboom added a note

Shouldn't this be log_base ** -1? (i.e. 0.1 only makes sense when the log base is 10).

@piti118
piti118 added a note

Yes, but how do I access the base of log scale axis?

@mdboom Owner
mdboom added a note

Ah, I see. I missed the fact that hist doesn't take a base as argument. hist is a bit of an anomaly -- it takes a 'log' argument, whereas most other things (except bar*) work by requiring a subsequent call to set_xscale or set_yscale, which of course take an optional "base". Making this also work that way would be a little bit of work, because this minimum adjustment stuff you've added here would have to be moved to the draw phase. Ideally, that's where I'd like to see this go, but I don't know if that's worth abandoning this fix, and we've have to maintain backward compatiblity anyway -- this "log" kwarg goes back at least 5 years, as far as I can tell.

We could either add an additional keyword "log_base", or extend "log" so it can be "True" (meaning base 10 as it does now), or a number, meaning a base. I would also recommend updating bar* to be in line with that.

However, I'm aware that none of this is premised on any real world use case -- just my own desire for consistency across uses of log scale, so I could be persuaded to leave this as-is by further discussion.

@piti118
piti118 added a note

This might be kind of hard to set the minimum correctly if log base can be changed between to calls. Consider this

ax.hist( x1, log=10)
ax.hist(x2, log=2)

The first hist call would have no idea that eventually the base of log scale will be changed to 2.

But, I do not think it would be hard for use to do another call to set_ylim though. The histogram will be painted all the way down (see line 7892) anyway.

@mdboom Owner
mdboom added a note

Yes -- that, in a nutshell, is why I'm not a fan of "log" being a kwarg to this method. You may want to hold of on implementing any sort of fix for this issue until we discuss it -- it may be best to just leave as-is.

@efiring, @WeatherGod, @pelson: any thoughts?

@WeatherGod Collaborator

w.r.t. not being able to access the axes's log base, we are assuming base 10 throughout this part of the code, and we have always done so before. We could always look to future PRs to address the handling of arbitrary log bases. I see this more as a good bug fix to an outstanding issue.

I guess it would be smart, though, for future developers, to somehow comment the spots that are assuming base 10 so that they know to change it if/when we extend the interface in the future to accept arbitary log bases.

@mdboom Owner
mdboom added a note

I tend to agree with @WeatherGod here: let's leave this assuming base 10 for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@WeatherGod
Collaborator

log in hist has always been an odd duck. It tries to be like the bar* functions, but not quite. Note, however, that unlike the bar* functions, a log kwarg in hist actually has a purpose besides convenience. While one could pre-transform their data for the bar* functions, one can't with hist because what is plotted is the result of an internal aggregation calculation.

There is another PR that I am waiting for a few last polishings that also addresses an issue for log scaling of the results for hist2d(). Might be a good idea to make sure both functions support the same "log" kwarg inputs.

@WeatherGod WeatherGod commented on the diff
lib/matplotlib/axes.py
@@ -7946,7 +7946,29 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
x += 0.5*(bins[1]-bins[0])
if log:
- y[0],y[-1] = minimum, minimum
+ #in the case of log scale there are three cases
@WeatherGod Collaborator

Wrap these comment lines to less than 75 characters

@pelson Collaborator
pelson added a note

75? I thought we did 80?

@WeatherGod Collaborator

Actually, I was wrong, in PEP8, docstrings are recommended to be at most 72 characters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
lib/matplotlib/axes.py
@@ -7946,7 +7946,29 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
x += 0.5*(bins[1]-bins[0])
if log:
- y[0],y[-1] = minimum, minimum
+ #in the case of log scale there are three cases
+ #to consider for the y axis limit
+ #1) has zero or negative bin but other bins may be 0<n<1
+ # due to weighting or normalization. We want
+ # (positive minimum)*0.1 to be the minimum
@WeatherGod Collaborator

Don't you mean np.any() here?

@piti118
piti118 added a note

What do you mean?

@WeatherGod Collaborator

strange, my line comments got applied to the wrong lines.

@WeatherGod Collaborator

Looks like you have since fixed the np.any() I thought you needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@WeatherGod WeatherGod commented on the diff
lib/matplotlib/axes.py
((12 lines not shown))
+ #3) all zero/negative bin .... no idea how to display this
+ # anyway so put minimum at 0.1
+ ndata = np.array(n)
+ if np.any(ndata<=0):
+ warnings.warn(
+ 'Histogram bins with zero or negative count '+
+ 'are not displayed in log scale.')
+
+ if np.any(ndata>0):
+ if np.all(ndata[ndata!=0]>=1):
+ minimum = 0.1 # case 2)
+ else:
+ minimum = np.min(ndata[ndata>0])*0.1 #case 1)
+ else:
+ minimum = 0.1 #case 3)
+
if orientation == 'horizontal':
@WeatherGod Collaborator

Set up your editor to strip trailing white spaces.

@WeatherGod Collaborator

Looks like you have since fixed the training white spaces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@WeatherGod WeatherGod commented on the diff
lib/matplotlib/axes.py
@@ -7957,7 +7979,11 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
for m, c in zip(n, color):
y[1:-1:2], y[2::2] = m, m
if log:
- y[y<minimum]=minimum
+ tiniest = np.finfo(np.float).tiny
@WeatherGod Collaborator

we need some of this logic here in the case of multiple calls to hist() or mixed plotting. If there are existing bounds, they should be used (at least, that is what this logic attempts to do. It might need some serious updating, though).

@piti118
piti118 added a note

Yes, that's the part I'm not sure about the inner working. This logic seems to start around line 7999.

So, as far as I understand how limit work for multiple plots. The existing desired range is store in self.dataLim.intervalx/y, right? So, all i have to do is make a super range that contains the new limit and old limit, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@WeatherGod WeatherGod commented on the diff
lib/matplotlib/tests/test_axes.py
((35 lines not shown))
+ ax.hist(d,bins=40,weights=w,histtype='stepfilled')
+
+
+@image_comparison(baseline_images=['hist_step_nw_log'])
+def test_hist_step_nw_log():
+ #test with negative weight
+ np.random.seed(0)
+ ndata = 10000
+ d= np.random.randn(ndata)+500
+ w= np.random.randn(ndata)
+ fig = plt.figure()
+ ax = fig.add_subplot(111)
+ ax.hist(d,bins=40,weights=w,histtype='stepfilled',log=True)
+
+
+@image_comparison(baseline_images=['hist2d'])
@WeatherGod Collaborator

I think you accidentally added these hist2d tests to this PR. Would be more appropriate over in the hist2d PR... with images.

@piti118
piti118 added a note

hmmm bad rebasing I think. It's actually in master or official repo. If I make a commit to remove them will it be gone? #826

@WeatherGod Collaborator
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@WeatherGod
Collaborator

Great job with the additional tests. Only one request here about them. Think we could have corresponding tests for the horizontal case?

@WeatherGod WeatherGod commented on the diff
lib/matplotlib/axes.py
@@ -7970,22 +7996,14 @@ def hist(self, x, bins=10, range=None, normed=False, weights=None,
# adopted from adjust_x/ylim part of the bar method
if orientation == 'horizontal':
- xmin0 = max(_saved_bounds[0]*0.9, minimum)
+ xmin = minimum
@WeatherGod Collaborator

My comments above wrt multiple calls to hist() goes here.

@piti118
piti118 added a note

This is kind of the part I don't fully understand. What function does 0.9 serve and where should I get the old bound. What's the difference between self.datalim.bound and self.datalim.intervalx/y?

@WeatherGod Collaborator
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@lpsinger

Does this pull request need any more work? histtype='stepfilled' also produces pretty unsightly results on master.

@piti118

I think it needs testing on multiple plots being drawn together. I'm not 100% sure what _save_bounds does.

@dmcdougall
Collaborator

I just saw this. #1684 was merged and #196 was closed. Does this PR also need to be merged? I.e., does it fix a separate issue not addressed in #1684?

@dmcdougall dmcdougall was assigned
@piti118

They seems to fix the same thing. I'll close this.

@piti118 piti118 closed this
@dmcdougall
Collaborator

@piti118 Judging by the title of this PR, this patch seems to address behaviour that can be seen when using a linear scale rather than a log scale. PR #1684 only addresses an issue with log scaling. Therefore it might be the case that this may need to be reopened.

Thanks for taking the time and the effort to write a patch for the codebase, @piti118.

Again, if you see unexpected behaviour with linear scaling then please feel free to re-open this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Aug 12, 2012
  1. @piti118
  2. @piti118

    fix case #2)

    piti118 authored
Commits on Aug 14, 2012
  1. @piti118
  2. @piti118
Commits on Oct 30, 2012
  1. @piti118

    update axes comment

    piti118 authored
Something went wrong with that request. Please try again.