Axes.hist(...log=True) mishandles y-axis minimum value #196

Closed
ddale opened this Issue Jun 20, 2011 · 3 comments

Contributor

ddale commented Jun 20, 2011

Original report at SourceForge, opened Thu Feb 24 15:32:30 2011

The pyplot.hist(..., log=True) method with histtype='step' (or 'stepfilled') does not properly handle bins with zero or negative contents.

Instead of setting all invalid bin contents equal to the minimum valid bin contents, it sets them equal to the minimum bin edge. This mixes up data, using x-axis data to set the limits of the y axis. The following code demonstrates the bug in the upper and lower panels, while the middle one avoids it. The plot it produces is attached. Notice the weird diagonal filling at the corners and the incomplete filling of the main histogram. I chose histtype='stepfilled', because the problem is more visible this way, but it appears in 'step', too. I'm using matplotlib 1.0.1 on Darwin (Mac port library py26-matplotlib @1.0.1_2).

This bug is related to 3032853, but it is different. (In fact, it's possible this problem arose when 3032853 was being worked on last summer.)

The problem is in line 7747 of Axes.py, where minimum = min(bins) is inappropriate. I patched up my own copy by replacing that line with these two:

        ndata = np.array(n)
        minimum = (ndata[ndata>0].min())*0.1

...but the choice of setting invalid contents to 0.1 times the lowest valid contents is pretty arbitrary. It certainly won't be a round number if the contents are normalized. Perhaps insiders have a better idea what to do here.
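To make the difference concrete, here is a minimal sketch of the two candidate baselines computed outside matplotlib. The helper name `log_baseline` and its `factor` argument are hypothetical, made up for illustration; only the two-line patch above comes from my modified copy, and the 0.1 factor remains as arbitrary as noted.

```python
import numpy as np


def log_baseline(counts, factor=0.1):
    """Hypothetical helper: y-value to substitute for empty bins on a log axis.

    Mirrors the two-line patch above: take the smallest positive bin
    content and scale it by an arbitrary factor (0.1 here).
    """
    counts = np.asarray(counts, dtype=float)
    positive = counts[counts > 0]
    if positive.size == 0:
        raise ValueError("no positive bin contents to derive a baseline from")
    return positive.min() * factor


# Same kind of data as in the demonstration script: minimum shifted to -2
rng = np.random.default_rng(0)
data = rng.standard_normal(2000)
data += -2.0 - data.min()

counts, edges = np.histogram(data, bins=100)

# What min(bins) yields: an x-axis quantity (about -2), invalid on a log y-axis
buggy_minimum = edges.min()

# What the patch yields: a positive value derived from the bin contents
patched_minimum = log_baseline(counts)

print("min(bins):        ", buggy_minimum)
print("patched baseline: ", patched_minimum)
```

The first value depends only on where the data sit along x, which is why shifting the data changes how the filled histogram is drawn; the second depends only on the bin contents.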

import numpy
import matplotlib.pyplot as plt

# Make random data; shift to ensure -2 as its minimum value

data = numpy.random.standard_normal(2000)
data += -2.0 - data.min()

# Two more sets where the minimum is guaranteed to be +0.1 and +30.1

data_pos = data + 2.1
data_big = data_pos + 30

plt.clf()
ax=plt.subplot(311)
plt.hist(data, 100, histtype='stepfilled', log=True)
plt.text(.1,.9,'data.min=%.3f'%data.min(),
         transform = ax.transAxes)
plt.title("There's a bug in step/stepfilled hist with log and data.min()<=0")

ax=plt.subplot(312)
plt.hist(data_pos, 100, histtype='stepfilled', log=True)
plt.title("Bug is avoided when data.min() is positive")
plt.text(.1,.9,'data.min=%.3f'%data_pos.min(),
         transform = ax.transAxes)

ax=plt.subplot(313)
plt.hist(data_big, 100, histtype='stepfilled', log=True)
plt.title("But there's also a problem if data.min() is too big!")
plt.text(.1,.8,'data.min=%.3f'%data_big.min(),
         transform = ax.transAxes)

SourceForge History

  • On Wed Mar 23 21:26:12 2011, by pivanov314: assigned_to: 100
  • On Thu Feb 24 15:32:30 2011, by joe_fowler: File Added: 402625: hist_log_stepfilled_bug.png
Contributor

keflavich commented Jan 27, 2012

Any word on whether this will get fixed? log histograms are really nice...

I confirm I have the same problem, and it's pretty annoying... will anybody fix it?

Member

dmcdougall commented Jan 27, 2013

Resolved in #1684.

@dmcdougall dmcdougall closed this Jan 27, 2013
