Avoid np.prod in make_shared_array #2621
Conversation
```diff
  dtype = np.dtype(dtype)
- nbytes = int(np.prod(shape) * dtype.itemsize)
+ nbytes = prod(shape) * dtype.itemsize
```
This still needs to be wrapped in an `int`, no?
`itemsize` is an int and the shape elements are ints, so that's unnecessary. Am I wrong?
When I check, they come up as ints, but given that the tests are failing only on Windows and all you did was remove the `int()` around the operation, maybe something was being converted... Not sure.
The thing that converts is the `np.prod`, because of an overflow. Let me show you:

```python
In [1]: a_billion_int = 1_000_000_000

In [2]: np.prod((a_billion_int, a_billion_int, a_billion_int, a_billion_int))
Out[2]: -5527149226598858752
```
Yeah, the only thing I can think of is that fork (the multiprocessing start method) does not exist on Windows, which implies some re-instantiation, with some attributes estimated instead of accessed... But I am out of luck.
Where else is a possible point of overflow?
Windows and Linux NumPy handle overflow differently. See this. Based on the discussion, the behavior will likely be changed in NumPy 2.0, but in short:

```python
np.full((2, 2), np.iinfo(np.int32).max, dtype=np.int32).sum(axis=0)
```

shows similar behavior: `array([4294967294, 4294967294])` on Ubuntu and `array([-2, -2])` on Windows.
However, this function returns an array instead of a scalar, so we can inspect the dtype. On Ubuntu the dtype is int64, but on Windows the dtype is int32.
So maybe this was due to the difference in how certain cases of overflow occur.
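The same dtype inspection works on `np.prod` itself, since it returns a NumPy scalar rather than a Python int. A quick illustration (not code from this PR):

```python
import numpy as np

# np.prod returns a NumPy integer scalar, so its accumulator dtype can be
# inspected; it is the platform default integer (int32 on older Windows
# builds, int64 elsewhere).
result = np.prod((2, 3, 4))
print(type(result))       # a numpy integer scalar, not a Python int
print(result.dtype)       # int32 or int64, depending on the platform
print(type(int(result)))  # <class 'int'>: int() restores a Python integer
```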
Thanks a bunch for sharing the issue!
So numpy 2.0 should align this.
Anything that makes debugging easier across operating systems. Much less surface area if one fix works for all of them. :) (Although the final message didn't sound completely confident...)
```diff
  dtype = np.dtype(dtype)
  nbytes = int(np.prod(shape) * dtype.itemsize)
+ shape = (int(x) for x in shape)  # We need to be sure that shape comes in int and not number scalars
```

```suggestion
shape = tuple(int(x) for x in shape)  # We need to be sure that shape comes in int and not numpy scalars
```
Yes, thanks, I realized it just now.
You know, this generator notation has fucked me up a couple of times, check this out:

```python
is_one_equal_to_0 = np.all((1 == 0 for _ in range(10)))
if is_one_equal_to_0:
    print("This is great, 1 is equal to 0")
```

Syntactic sugar is nice until it is not.
That's a cool example. The tuple/generator overlap for `()`, and (even I sometimes forget) `{}` meaning set in addition to dictionary. Those mess with me. But yeah, the accidental generator happens way too often.
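For completeness, here is why the generator version above misbehaves and what fixes it (a small illustration of NumPy's observed behavior, not code from this PR):

```python
import numpy as np

# np.all does not iterate a generator: np.asarray wraps it in a 0-d object
# array, and a generator object is always truthy, so the check passes.
gen_result = np.all(1 == 0 for _ in range(10))
print(bool(gen_result))   # True -- the gotcha shown above

# Materialising the elements first gives the intended element-wise check.
list_result = np.all([1 == 0 for _ in range(10)])
print(bool(list_result))  # False
```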
Well played!
np.prod produces a numpy scalar (not a Python integer or float!).
Numpy scalars behave like fixed-width numpy values and can overflow. Example:
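For instance (the same reproduction as in the review thread; the exact wrapped value depends on the platform's accumulator dtype):

```python
import numpy as np

a_billion_int = 1_000_000_000

# The true product is 10**36, which cannot fit in a 64-bit integer, so the
# fixed-width accumulator used by np.prod wraps around silently.
product = np.prod((a_billion_int,) * 4)
print(product == 10**36)  # False: the result overflowed
```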
Python integers, on the other hand, do not overflow.
This PR changes the size computation so that overflow errors do not happen.
This error appeared for a user here:
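A minimal sketch of the fixed computation, combining the two changes discussed above (`shared_nbytes` is a hypothetical helper name, not the PR's exact code):

```python
import math

import numpy as np

def shared_nbytes(shape, dtype):
    # Coerce shape elements to Python ints: they may arrive as numpy scalars,
    # and multiplying numpy scalars can overflow just like np.prod does.
    shape = tuple(int(x) for x in shape)
    dtype = np.dtype(dtype)
    # math.prod of Python ints uses arbitrary precision: no overflow possible.
    return math.prod(shape) * dtype.itemsize

print(shared_nbytes((1_000_000_000,) * 4, np.float64))  # exactly 8 * 10**36
```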
#1871 (comment)