Float32 accuracy issue #14497

Macfly · 2024-02-14T23:13:10Z

Checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
pl.DataFrame({'data': 799.32}, schema={'data': pl.Float32})['data'][0]

Log output

No response

Issue description

It prints 799.3200073242188.

Expected behavior

Expected 799.32 (as with the type pl.Float64).

Installed versions

--------Version info---------
Polars:               0.20.7
Index type:           UInt32
Platform:             Linux-6.1.75-99.163.amzn2023.x86_64-x86_64-with-glibc2.34
Python:               3.10.13 (main, Sep 21 2023, 10:13:35) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2023.10.0
gevent:               <not installed>
hvplot:               0.9.2
matplotlib:           3.8.2
numpy:                1.26.4
openpyxl:             3.1.2
pandas:               2.1.2
pyarrow:              15.0.0
pydantic:             2.4.2
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             0.8.2
xlsxwriter:           3.1.9


ritchie46 · 2024-02-14T23:38:13Z

Maybe consider opening an issue at IEEE_754? ;)

mcrumiller · 2024-02-14T23:51:16Z

@Macfly to expound a bit on Ritchie's answer, numbers are stored in the computer in binary format with a limited number of bits, and so we cannot represent all numbers exactly. If you take a look at https://www.omnicalculator.com/other/floating-point and select Number to Floating Point and enter 799.32, you'll see that the number stored as a 32-bit float actually represents the number 799.32000732421875. If you set the number of bits to 64, you get a whole lot more precision, but it's still not exact: 799.32000000000005.
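The same numbers can be checked without the online calculator. The sketch below uses only the Python standard library; `to_float32` is a hypothetical helper name introduced here for illustration, and it assumes that `struct.pack` with the `f` format rounds to the nearest 32-bit float.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float.

    The result is returned as a Python float again, so it holds the
    float32 value exactly (every float32 is representable as a float64).
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 799.32
as_f32 = to_float32(value)

# Decimal(float) shows the exact binary value with no rounding for display.
print(Decimal(as_f32))  # 799.32000732421875
print(Decimal(value))   # close to 799.32, but not exactly 799.32
```

Neither width stores 799.32 exactly; the 64-bit error is just small enough that the usual shortest-repr printing hides it.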

Macfly · 2024-02-15T01:35:35Z

I understand floating-point accuracy, but why do NumPy and pandas display it as 799.32 with float32?
Is there something special about Polars that makes it always show the exact floating-point representation?

nameexhaustion · 2024-02-15T06:11:12Z

NumPy does the same thing, actually:

>>> pl.Series([799.32], dtype=pl.Float32).item()
799.3200073242188
>>> np.float32(799.32).item()
799.3200073242188
>>> np.float32(799.32).astype(np.float64)
799.3200073242188

When you do Series([799.32], dtype=pl.Float32)[0], you are actually performing these operations:

>>> np.float64(799.32).astype(np.float32).astype(np.float64).item()
799.3200073242188

You start with a 64-bit Python float, which Polars casts to a 32-bit float and then converts back to a 64-bit Python float when you do Series[0].
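As a rough illustration of why the two displays differ, here is a standard-library sketch. It rests on an assumption: that a float32-aware printer picks the shortest decimal string that still maps back to the same 32-bit value, whereas Python's float repr picks the shortest string that round-trips as a 64-bit value. `to_float32` and `shortest_float32_repr` are hypothetical helpers, not NumPy or Polars functions.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def shortest_float32_repr(x: float) -> str:
    """Shortest %g-style string that converts back to the same float32 value."""
    target = to_float32(x)
    for digits in range(1, 18):
        text = f"{target:.{digits}g}"
        if to_float32(float(text)) == target:
            return text
    return repr(target)


# float64 -> float32 -> float64, the same chain as Series(...)[0] above.
widened = to_float32(799.32)

print(repr(widened))                  # 799.3200073242188 (shortest float64 round trip)
print(shortest_float32_repr(799.32))  # 799.32 (shortest float32 round trip)
```

The stored bits are identical in both cases; only the number of digits needed to identify the value uniquely depends on whether it is treated as a 32-bit or a 64-bit float.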

Macfly added the bug, needs triage, and python labels · Feb 14, 2024

ritchie46 added the invalid label and removed the bug label · Feb 14, 2024

stinodego removed the needs triage label · Feb 15, 2024

stinodego closed this as not planned · Feb 15, 2024
