numpy.dot has undocumented dtype behavior, resulting in different values than (a * b).sum() #12344

Open
itamarst opened this issue Nov 6, 2018 · 4 comments


@itamarst

itamarst commented Nov 6, 2018

I would expect (a * b).sum() and np.dot(a, b) to be the same for 1D arrays. However, the former upcasts the dtype and the latter doesn't, so they give different results. This may be expected behavior, but if so, the np.dot docs should mention the dtype behavior.

Reproducing code example:

In [36]: a = np.array([254, 2, 200], dtype=np.uint8)

In [37]: b = np.array([True, False, True], dtype=np.bool)

In [38]: (a * b).sum()
Out[38]: 454

In [39]: np.dot(a, b)
Out[39]: 198

In [40]: np.dot(a, b).dtype
Out[40]: dtype('uint8')

Numpy/Python version information:

Numpy 1.15.0, Python 3.6.5

@sturlamolden
Contributor

Integer overflow: 198 + 256 = 454.

Should np.dot check for overflow before outputting an integer?
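
The arithmetic behind the wraparound can be reproduced without NumPy. This is only a sketch of the 8-bit modular arithmetic using the values from the report; `UINT8_MAX`, `exact`, and `wrapped` are names chosen for illustration, and the range test at the end is a hypothetical check, not something np.dot performs.

```python
# Values from the report; the boolean mask acts as 1/0 in the product.
a = [254, 2, 200]
b = [True, False, True]

UINT8_MAX = 2**8 - 1  # largest value an 8-bit unsigned integer can hold

# Python ints have arbitrary precision, so this is the exact dot product.
exact = sum(x * int(y) for x, y in zip(a, b))

# An 8-bit unsigned accumulator keeps only the result modulo 256.
wrapped = exact % (UINT8_MAX + 1)

print(exact, wrapped)       # 454 198
print(exact > UINT8_MAX)    # True: the exact result does not fit in uint8
```

A check like the last line needs the exact value first, which means accumulating in a wider type anyway.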

@mattip
Member

mattip commented Dec 23, 2018

The current design is to choose the output dtype without considering overflow. Overflow issues seem to trip up users frequently; perhaps an overall vision of how to deal with them is needed.
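
To illustrate the point about dtype selection, here is a small sketch using the arrays from the report. It assumes a reasonably recent NumPy (`np.bool_` is used instead of `np.bool` so it runs across versions). Widening one operand with `astype` is just one possible workaround, not a recommendation from the maintainers.

```python
import numpy as np

a = np.array([254, 2, 200], dtype=np.uint8)
b = np.array([True, False, True], dtype=np.bool_)

# Promotion looks only at the dtypes, never at the values involved.
promoted = np.result_type(a.dtype, b.dtype)
print(promoted)               # uint8

# np.dot returns that promoted dtype, so the 8-bit result wraps around.
narrow = np.dot(a, b)
print(narrow, narrow.dtype)   # 198 uint8

# Widening one operand first gives the exact value.
wide = np.dot(a.astype(np.int64), b)
print(wide)                   # 454
```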

@itamarst
Author

As mentioned above, a minimal approach would be to document the expected behavior better.

@eric-wieser
Member

If anything, I'd consider the behavior of sum to be more surprising here, and in need of more documentation.
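
The behavior of sum referred to here can be seen by splitting the expression from the report into its two steps. This is a sketch assuming NumPy's documented default, where sum accumulates small integer dtypes in a platform-sized integer unless `dtype=` is passed. The exact wide type depends on the platform.

```python
import numpy as np

a = np.array([254, 2, 200], dtype=np.uint8)
b = np.array([True, False, True], dtype=np.bool_)

# The elementwise product is not widened; it stays uint8.
product = a * b
print(product.dtype)          # uint8

# sum() is the step that widens: it accumulates in a larger unsigned
# integer type (platform dependent), so the total does not wrap.
total = product.sum()
print(total)                  # 454

# Forcing an 8-bit accumulator makes sum wrap the same way np.dot does.
narrow_total = product.sum(dtype=np.uint8)
print(narrow_total)           # 198
```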
