np.nan appears multiple times in sets #9358
Comments
nan != nan as per IEEE
|
OK, but why |
Because:
>>> x = [float('nan')] * 5  # np.nan is just defined as `float('nan')`
>>> x[0] is x[1]
True
>>> set(x)
{nan}
>>> y = [float('nan') for i in range(5)]  # this is essentially what np.array ends up doing
>>> y[0] is y[1]
False
>>> set(y)
{nan, nan, nan, nan, nan}
Also, the above clearly demonstrates that this is not under NumPy's control; it is Python itself.
|
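For anyone who wants to reproduce the session above without NumPy, here is a minimal sketch in plain Python. The variable names (`same_object`, `distinct_objects`, and so on) are made up for illustration. It relies on the fact that built-in containers check object identity before falling back to `==`, while NaN never compares equal to itself.

```python
nan = float("nan")

# NaN never compares equal to anything, including itself (IEEE 754).
nan_equals_itself = (nan == nan)

# Five references to the *same* NaN object: the identity check succeeds,
# so the set keeps a single element.
same_object = [nan] * 5
len_same = len(set(same_object))

# Five *distinct* NaN objects: identity fails and == fails,
# so every one of them is stored as a separate element.
distinct_objects = [float("nan") for _ in range(5)]
len_distinct = len(set(distinct_objects))

# The same rule applies to membership tests on lists.
found_same = nan in same_object
found_other = float("nan") in same_object

print(nan_equals_itself, len_same, len_distinct, found_same, found_other)
```

The membership tests at the end show the same effect outside of sets: the original object is found, a freshly created NaN is not.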
OK, sorry for reporting this here. Sorry for the loss of time. |
In principle, this could be fixed in cpython by |
I doubt you can do that. In principle, I think you could also raise an error when you try to hash a NaN, but I am not sure that would be any better. Anyway, it is an old issue, which I believe the Python developers have probably discussed more than once. Any serious new discussion would probably have to be on python-ideas. @Naereen the difference is that Python optimizes the equality check by first doing an |
@seberg OK I get it. Thanks for your (incredibly) quick response guys! |
We can add this to the list of reasons why I 'low-key' hate Python sometimes. Here is my current workaround:
import math
# Store one string placeholder for every NaN so the set keeps a single entry.
if math.isnan(row[3]):
    mySet.add('nan')
else:
    mySet.add(row[3])
|
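A possible variation on this workaround, sketched under the assumption that the values are plain Python floats: map every NaN to one shared NaN object instead of the string 'nan', so the set stays numeric. The names `_NAN` and `canonical` are hypothetical helpers invented for this example, not part of Python or NumPy.

```python
import math

# Hypothetical shared sentinel: one NaN object reused for every NaN value.
_NAN = float("nan")

def canonical(value):
    """Return the shared NaN object for any float NaN, else the value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return _NAN
    return value

values = [float("nan"), 1.0, float("nan"), 2.0, 1.0]

# Every NaN is replaced by the same object, so the set's identity check
# collapses them into a single element.
unique = {canonical(v) for v in values}
nan_count = sum(1 for v in unique if math.isnan(v))

print(len(unique), nan_count)
```

This works only because sets compare by identity first; it does not make NaN equal to itself anywhere else.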
You can even use |
Might be worth reporting this upstream to Python. NVM, this seems to be a case of np.nan not being the same as python nan.
|
@charris I think this is one of the things python-ideas probably discusses every 2 years, and nobody really cares enough, or just accepts it as "well you work with NaN expect strangeness". I expect there is a python bug open somewhere. Unless we want to go probably as far as digging up some discussions and writing a PEP with a solution (whatever that is), I doubt it can go anywhere. And there is probably no good solution, since disabling |
Hi,
Here is a weird example, on Ubuntu 17.04 with Python 3.5.3 and Numpy 1.13.0:
I don't understand how `{nan, nan, nan, nan, nan}` is even possible. The array was constructed with `np.nan` replicated 5 times (and the same happens with `np.full(5, np.nan)`).
I searched on StackOverflow and in the documentation, but cannot find any explanation for this weird behavior.
Thanks in advance if I'm just misunderstanding something!
But in case it's a bug, I can try to help (if a maintainer points me in the right direction).