# Floating-point arithmetic

I came across this topic in
[a review comment](https://github.com/scikit-learn/scikit-learn/pull/31172#discussion_r2398607019)
on a scikit-learn PR. For background on the general field, see the [Wikipedia
article](https://en.wikipedia.org/wiki/Floating-point_arithmetic) on floating-point
arithmetic.

### On using `==`
With floats, comparing with `==` is dangerous when the value we compare against (here
zero) can be hit by accident or missed by a tiny rounding error, which is the case most
of the time.

It is safe when that value can only arise intentionally.

In [10]:
# example
import numpy as np

x = np.sum([0.1, 0.2, -0.3])

print(x == 0.0) # False: due to imprecision, x is not exactly 0
print(np.isclose(x, 0.0)) # True

False
True
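
When the reference value is 0, the relative tolerance of `np.isclose` contributes nothing, so the result depends entirely on the absolute tolerance `atol` (default `1e-8`). The following is a minimal sketch of that behaviour; the variable names and the chosen tolerances are only illustrative assumptions.

```python
import numpy as np

x = np.sum([0.1, 0.2, -0.3])  # a tiny positive rounding error, not exactly 0

# np.isclose checks |a - b| <= atol + rtol * |b|. With b == 0 the rtol term
# vanishes, so only atol decides the outcome.
default_close = np.isclose(x, 0.0)                  # default atol=1e-8
strict_close = np.isclose(x, 0.0, atol=1e-20)       # atol smaller than the error
rel_only = np.isclose(x, 0.0, rtol=1e-5, atol=0.0)  # relative tolerance alone

print(default_close, strict_close, rel_only)
```

So a sensible `atol` has to be picked with the magnitude of the involved numbers in mind; the default is not universally appropriate.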


In [None]:
# But if we know that a sum consists only of positive numbers, 0 can only be
# reached when all the numbers are so small that they underflow:
x = np.sum(np.array([5e-46, 5e-46], dtype=np.float32))

print(x == 0.0) # this is fine: each value already underflows to 0 in float32
print(np.isclose(x, 0.0)) # np.isclose doesn't hurt though

True
True


In [None]:
# The border for underflow lies between these two values
# (the smallest positive float32 is about 1.4e-45):
print(np.float32(5e-46))
print(np.float32(5e-45))

0.0
6e-45
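
Instead of probing for the border by hand, the limits can be read from `np.finfo`. This is a sketch that assumes a NumPy version recent enough to provide the `smallest_subnormal` attribute (1.22 or later, as far as I know); the helper variable `half` is just for illustration.

```python
import numpy as np

info = np.finfo(np.float32)
print(info.smallest_subnormal)  # smallest positive float32, about 1.4e-45
print(info.tiny)                # smallest positive *normal* float32, about 1.18e-38

# On conversion to float32, values below half of the smallest subnormal
# round to 0, values just above that half round up to the smallest subnormal.
half = float(info.smallest_subnormal) / 2
rounds_to_zero = np.float32(half * 0.9) == 0.0
rounds_up = np.float32(half * 1.1) == info.smallest_subnormal
print(rounds_to_zero, rounds_up)
```

This explains the two values above: `5e-46` is below half of the smallest subnormal and becomes `0.0`, while `5e-45` is rounded to a nearby subnormal number.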


In [None]:
# We still have a problem, though, when we deal with very large numbers.
# In this case, a 0 can be constructed by accident:
x = np.sum([1e20, 1.0, -1e20])
x # the 1.0 gets lost due to imprecision

np.float64(0.0)

In [None]:
# neither of these helps then ¯\_(ツ)_/¯
print(x == 0.0)
print(np.isclose(x, 0.0))

# information that is lost is lost

True
True
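
Once the sum has been computed, the lost `1.0` cannot be recovered. What can help, if the individual values are still available, is a summation that avoids the loss in the first place. The following sketch uses `math.fsum` from the Python standard library as one such option; this is my own addition, not something suggested in the linked PR discussion.

```python
import math
import numpy as np

values = [1e20, 1.0, -1e20]

naive = np.sum(values)     # the 1.0 is absorbed by 1e20 before the subtraction
exact = math.fsum(values)  # tracks exact partial sums and rounds once at the end

print(naive, exact)
```

Here `math.fsum` returns `1.0`, at the price of working on Python floats instead of staying inside NumPy.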
