Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.3.0: type of hist return value changed #2293

Closed
jgehrcke opened this issue Aug 12, 2013 · 5 comments
Closed

1.3.0: type of hist return value changed #2293

jgehrcke opened this issue Aug 12, 2013 · 5 comments
Milestone

Comments

@jgehrcke
Copy link
Contributor

As documented, hist returns tuple : (n, bins, patches) .... In previous versions of matplotlib, n contained integer values. As of 1.3.0, the data type has changed to float. This change is not documented. It was introduced with commit 6917f4f in the 1.3.x branch and is supposed to fix some internal problem. For 1.2.x, this problem was fixed in a different way, see commit neggert@932dd88. I don't know if the above-mentioned change in 1.2.x also changed the return value data type.

In our case, the data type change has broken application code. Although the data type is not specified in the documentation, we think this change of behavior can be considered a bug and at the least needs to be noted in the changelog. For the future, the right way would be to specify the data type in the documentation.

For convenience, parts of the code in question:

m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
m = m.astype(float) # causes problems later if it's an int
if mlast is None:
    mlast = np.zeros(len(bins)-1, m.dtype)
if normed and not stacked:
    db = np.diff(bins)
    m = (m.astype(float) / db) / m.sum()
if stacked:
    m += mlast
    mlast[:] = m
n.append(m)

Multiple occurrences of m.astype(float) as well as the m.dtype in mlast = np.zeros(len(bins)-1, m.dtype) suggest that there is unreasonable redundancy introduced by the code related to the "stacked" feature.

Thanks for consideration!

@mdboom
Copy link
Member

mdboom commented Aug 12, 2013

I agree about the redundancy -- it probably comes about by an accident of how tricky this was to get right.

We can certainly document the return type, but I don't know if it's reasonable/possible to return it to an int.

@neggert: Any thoughts on this?

@neggert
Copy link
Contributor

neggert commented Aug 12, 2013

The problem stems from an undocumented "feature" in numpy.histogram. It returns the histogram as ints if you don't normalize or use weights, but floats if you do. I took the route of just forcing everything to be a float so that matplotlib doesn't carry over this undocumented numpy "feature". This was also the most straightforward way to avoid problems with int division.

We could probably put it back to the way it was with a liberal sprinkling of astype(float). Some of those are already there, which is currently redundant. However, I, at least, would prefer to get a consistent return type.

@jgehrcke
Copy link
Contributor Author

A (documented) consistent return type would be the long-term goal. In the short term, we should get the documentation right about the current behavior. Do you consider the int -> float change to be API-breaking or not (should it be listed under http://matplotlib.org/api/api_changes.html)?

@mdboom
Copy link
Member

mdboom commented Sep 17, 2013

I think this is fine to just throw in API changes. @neggert: Would you mind adding an entry to http://matplotlib.org/api/api_changes.html? I think this belongs on 1.3.x, since the (accidental and minor) API change is already there, it just wasn't documented.

@mdboom
Copy link
Member

mdboom commented Sep 27, 2013

Closing this, as #2430 is merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants