Possible excessive hash collisions from hash of nan

A potential performance issue relating to hash collisions from distinct NaNs was recently [reported and fixed](https://bugs.python.org/issue43475) in core Python. In brief, the fact that all NaNs hash to `0` makes it easy to accidentally invoke quadratic-time behaviour from NaNs going into a set or dictionary. The fix involved changing the hash of a `float` NaN to match the usual object hash.

The core issue affects NumPy, too (and in fact one of the examples given in the Python bug report to demonstrate the issue used `np.float64` rather than `float`); NumPy may want to consider changing the hash of floating-point NaNs, both to address the potential performance issue, and to remain compatible with  Python.

### Reproducing code example:

The following code takes several hours to execute on my machine:

```python
>>> import collections, numpy as np
>>> x = np.full(1000000, np.nan)
>>> c = collections.Counter(x)
```

The cause is that (a) the `float64` NaN objects realised from the array all have distinct `id`, and so are considered unequal for the purposes of dict and set membership, together with (b) all of those NaNs have the same hash value of `0`. Then we get a perfect storm of hash collisions while building the dictionary underlying the `Counter` object.

### NumPy/Python version information:

```python
>>> import sys, numpy; print(numpy.__version__, sys.version)
1.20.2 3.9.4 (default, Apr  6 2021, 11:23:37) 
[Clang 11.0.3 (clang-1103.0.32.62)]
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Possible excessive hash collisions from hash of nan #18833

Reproducing code example:

NumPy/Python version information:

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Possible excessive hash collisions from hash of nan #18833

Description

Reproducing code example:

NumPy/Python version information:

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions