
Inconsistencies in nansum for float32 dtype compared to numpy #193

Open
@agoodm

Description

Consider this simple example:

In [1]: import numpy as np

In [2]: import bottleneck as bn

In [3]: data = 2e5*np.random.rand(int(4e7)).astype('float32')

In [4]: np.nansum(data)
Out[4]: 4000034300000.0

In [5]: bn.nansum(data)
Out[5]: 3719060258816.0

Looks like rounding errors in the computation are compounding due to loss of precision, since the problem becomes much less apparent for smaller datasets. Repeating the above for the float64 dtype gives me much more consistent results.

In [6]: bn.nansum(data.astype('float64'))
Out[6]: 4000035580557.9033

In [7]: np.nansum(data.astype('float64'))
Out[7]: 4000035580557.979
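
For what it's worth, the gap can be reproduced in pure numpy by forcing a naive left-to-right float32 accumulation: np.cumsum is inherently sequential and rounds every partial sum to the array dtype, so its last element is exactly that running sum. This is only a sketch of what I assume bottleneck's inner loop does, not its actual code:

    import numpy as np

    data = (2e5*np.random.rand(int(4e7))).astype('float32')  # same data as In [3]

    # cumsum rounds every partial sum to float32, so its last element
    # is the naive single-precision left-to-right running sum
    naive = data.cumsum()[-1]

    print(naive)                              # drifts low, like bn.nansum above
    print(np.nansum(data))                    # numpy sums pairwise: far smaller error
    print(np.nansum(data, dtype=np.float64))  # double-precision reference

With no NaNs in the input, np.nansum behaves like np.sum, which uses pairwise summation for contiguous float arrays; that is why numpy's error stays small even in float32.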

I tested this example with bottleneck 1.1.0 and 1.2.1.
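
In case it helps, compensated (Kahan) summation is a standard way to keep a single-precision accumulator accurate without promoting the whole sum to float64. A minimal sketch (the function is mine and purely illustrative, and the Python loop is slow; a C implementation would pay one extra add per element):

    import numpy as np

    def kahan_nansum_f32(a):
        # Kahan summation: `comp` carries the low-order bits lost when each
        # addition is rounded to float32 and feeds them back into the next add.
        total = np.float32(0.0)
        comp = np.float32(0.0)
        for x in a:
            if np.isnan(x):
                continue
            y = x - comp
            t = np.float32(total + y)
            comp = np.float32((t - total) - y)
            total = t
        return total

    small = (2e5*np.random.rand(int(1e6))).astype('float32')
    print(kahan_nansum_f32(small))             # stays close to the reference
    print(np.nansum(small, dtype=np.float64))  # float64 reference

Either pairwise or compensated summation would close most of the float32 gap; alternatively, I imagine bottleneck could accumulate in float64 internally, which is effectively what casting the input (In [6]) demonstrates.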
