
Wrong results from floordiv #14596

Closed
knl opened this issue Feb 19, 2024 · 9 comments
Labels: bug (Something isn't working) · P-high (Priority: high) · python (Related to Python Polars)


knl commented Feb 19, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = (
    pl.from_dict({'ts': [1391332687909301889, 1391332687909454210, 1391332687909591535]})
    .with_columns(
        pl.col('ts').cast(pl.UInt64)
    )
    .with_columns(
        pl.col('ts').floordiv(1_000_000_000. / 65536).cast(pl.UInt64).alias('jiffy'),
        pl.col('ts').map_elements(lambda v: v//(1_000_000_000. / 65536)).cast(pl.UInt64).alias('jiffy2'),
        (pl.col('ts') // (1_000_000_000. / 65536)).cast(pl.UInt64).alias('jiffy3'),
    )
    .with_columns(
        (pl.col('jiffy') * (1_000_000_000. / 65536)).cast(pl.UInt64).alias('round_ts')
    )
    .with_columns(
        (pl.col('ts') - pl.col('round_ts')).reinterpret(signed=True).alias('difference')
    )
)

print(df)

Log output

tst.py:10: PolarsInefficientMapWarning:
Expr.map_elements is significantly slower than the native expressions API.
Only use if you absolutely CANNOT implement your logic otherwise.
Replace this expression...
  - pl.col("ts").map_elements(lambda v: ...)
with this one instead:
  + pl.col("ts") // 15258.7890625

  pl.col('ts').map_elements(lambda v: v//(1_000_000_000. / 65536)).cast(pl.UInt64).alias('jiffy2'),

Issue description

I'm trying to convert StratusVOS jiffies (there are 65536 jiffies in a second) to nanoseconds and vice versa. In Python, I can do the following:

JIFFIES_IN_ONE_SECOND = 65536
1391332687909454210//(1e9/JIFFIES_IN_ONE_SECOND)

and I get the correct result, 91182379034833. However, the Polars example above produces wrong values in the last two rows.

shape: (3, 6)
┌─────────────────────┬────────────────┬────────────────┬────────────────┬─────────────────────┬────────────┐
│ ts                  ┆ jiffy          ┆ jiffy2         ┆ jiffy3         ┆ round_ts            ┆ difference │
│ ---                 ┆ ---            ┆ ---            ┆ ---            ┆ ---                 ┆ ---        │
│ u64                 ┆ u64            ┆ u64            ┆ u64            ┆ u64                 ┆ i64        │
╞═════════════════════╪════════════════╪════════════════╪════════════════╪═════════════════════╪════════════╡
│ 1391332687909301889 ┆ 91182379034824 ┆ 91182379034824 ┆ 91182379034824 ┆ 1391332687909301760 ┆ 129        │
│ 1391332687909454210 ┆ 91182379034834 ┆ 91182379034833 ┆ 91182379034834 ┆ 1391332687909454336 ┆ -126       │
│ 1391332687909591535 ┆ 91182379034843 ┆ 91182379034842 ┆ 91182379034843 ┆ 1391332687909591552 ┆ -17        │
└─────────────────────┴────────────────┴────────────────┴────────────────┴─────────────────────┴────────────┘

As you can see, the jiffy in the second row is 91182379034834, not 91182379034833. The difference column shows that the nanosecond value of that jiffy is higher than the starting timestamp, meaning that floordiv did not return the integral part of the division but instead rounded the result.

In addition, the replacement suggested by the warning does not yield the right result either.

Expected behavior

The correct values are the ones in the jiffy2 column.

Installed versions

--------Version info---------
Polars:               0.20.10
Index type:           UInt32
Platform:             macOS-14.3-arm64-arm-64bit
Python:               3.10.12 (main, Sep  7 2023, 23:57:54) [Clang 11.1.0 ]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          2.2.1
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2023.4.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           <not installed>
numpy:                1.25.1
openpyxl:             <not installed>
pandas:               1.5.3
pyarrow:              13.0.0
pydantic:             1.10.13
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           2.0.23
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@knl knl added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Feb 19, 2024
@MarcoGorelli MarcoGorelli added P-high Priority: high and removed needs triage Awaiting prioritization by a maintainer labels Feb 19, 2024
@MarcoGorelli (Collaborator)

Thanks for the report

Labelling as high priority, since silently incorrect results can break analyses.

@ritchie46 (Member)

@orlp can you take a look?

@orlp orlp self-assigned this Feb 20, 2024
orlp commented Feb 20, 2024

First, I note that 1e9 / JIFFIES_IN_ONE_SECOND is exactly the value 15258.7890625 (this can be represented by an f64 without loss of precision).

Unfortunately, 1391332687909454210 cannot be exactly represented by an f64. The nearest representable f64 is 1391332687909454336, which Polars converts to before doing a float-float flooring division (and I believe Python does the same).
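Both representability claims are easy to check from a Python prompt. A small sketch using only the standard library (Fraction compares the stored float against the exact rational):

```python
from fractions import Fraction

JIFFIES_IN_ONE_SECOND = 65536

# 10**9 / 2**16 reduces to 5**9 / 2**7 = 15258.7890625, which an f64 stores exactly.
y = 1e9 / JIFFIES_IN_ONE_SECOND
print(Fraction(y) == Fraction(10**9, JIFFIES_IN_ONE_SECOND))  # True

# The timestamp, however, is not exactly representable as an f64:
x = 1391332687909454210
print(int(float(x)))  # 1391332687909454336, the nearest representable value
```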

So in reality you are calculating 1391332687909454336 / 15258.7890625. If we put this into WolframAlpha we find the full correct answer:
[WolframAlpha screenshot: 1391332687909454336 / 15258.7890625 = 91182379034833 + 1951883/1953125]

Now, in Polars we compute floating point flooring division as (x / y).floor(). Unfortunately the closest floating point number to 91182379034833 + 1951883/1953125 is 91182379034834, after which the floor has no effect. This can also be seen in Python:

>>> 1391332687909454210 / (1e9 / JIFFIES_IN_ONE_SECOND)
91182379034834.0
>>> 1391332687909454210 // (1e9 / JIFFIES_IN_ONE_SECOND)
91182379034833.0

Python computes floating-point flooring division as (x - x % y) / y, followed by a series of fix-ups for various corner cases. This is more accurate overall, but I'm not entirely convinced it's worth the performance penalty.
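To make the difference concrete, here is a sketch (not the actual Polars or CPython source) comparing divide-then-floor with a simplified version of CPython's remainder-based approach, valid for positive operands only:

```python
import math

def divide_then_floor(x, y):
    # Polars-style: round the quotient to the nearest f64, then floor it.
    return math.floor(x / y)

def remainder_based_floordiv(x, y):
    # Simplified CPython-style (positive operands, corner cases omitted):
    # fmod computes the remainder exactly, so (x - mod) / y lands near the
    # true floored quotient; a final fix-up snaps over any rounding error.
    mod = math.fmod(x, y)
    div = (x - mod) / y
    floordiv = float(math.floor(div))
    if div - floordiv > 0.5:
        floordiv += 1.0
    return floordiv

x = 1391332687909454210       # becomes 1391332687909454336.0 as a float
y = 1e9 / 65536               # exactly 15258.7890625

print(divide_then_floor(x, y))         # 91182379034834
print(remainder_based_floordiv(x, y))  # 91182379034833.0, matching Python's //
```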

orlp commented Feb 20, 2024

And furthermore, we actually optimize (x / c).floor() where c is a constant to (x * (1/c)).floor(). This is an optimization I'm particularly unwilling to let go of.

import timeit
import polars as pl
import numpy as np

a = np.random.rand(10**7)
pa = pl.Series(a)
print(timeit.timeit(lambda: a // 42.0, number=100))   # 2.65
print(timeit.timeit(lambda: pa // 42.0, number=100))  # 0.83

@mcrumiller (Contributor)

we actually optimize (x / c).floor() where c is a constant to (x * (1/c)).floor()

Does 1/c have a fast path in floating-point arithmetic? Curious how two ops are faster than one otherwise. Edit: I see, precompute 1/c once and then it's a single multiplication for all elements.
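Right: the reciprocal is rounded once up front, and each element then costs one multiply instead of one much slower divide. That rounded 1/c is also exactly where the results can diverge from true division, as a quick brute-force sketch shows (49.0 is just an arbitrary non-power-of-two constant):

```python
c = 49.0          # 1/49 is not exactly representable as an f64
inv_c = 1.0 / c   # the reciprocal is rounded once, up front

# For some inputs the precomputed-reciprocal product differs from division
# in the last bit, which floor() can then magnify into an off-by-one.
diverging = [x for x in range(1, 100_000) if float(x) / c != float(x) * inv_c]
print(len(diverging) > 0)  # True: the two strategies are not bit-identical
```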

orlp commented Feb 20, 2024

After some more internal discussion we've decided that we want to keep the current behavior, where x // y for floats in Polars is defined as (x / y).floor(), and when y is a scalar constant as (x * (1/y)).floor(), even if this subtly diverges from what Python does. The primary rationale is simple: speed. What we do is much faster (up to 25x on one AMD machine, compared to numpy). It is unfortunate that it doesn't match Python or numpy's behavior, but I think it's worth it.

Python implements (as far as I can see) true flooring division in a single logical step, whereas we first do a division followed by a flooring operation. This can be subtly different. Note that in general introducing hard discontinuous points with floating point will almost always result in surprising behavior. For example in Python:

>>> 6.0 / 0.1
60.0
>>> 6.0 // 0.1
59.0

This is because 0.1 isn't exactly representable; its actual stored value is 0.1000000000000000055511151231257827021181583404541015625. If you perform the flooring division in one fused step, the technically 'correct' result is therefore ever so slightly smaller than 60, and it rounds down to 59. With an intermediate step, where you first divide and then floor, you get 60.0, as in Polars:

>>> pl.Series([6.0]) // 0.1
shape: (1,)
Series: '' [f64]
[
        60.0
]
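Both behaviors can be reproduced from a plain Python prompt; decimal.Decimal exposes the exact value the f64 actually stores for 0.1:

```python
from decimal import Decimal
import math

# The f64 nearest to 0.1 is very slightly larger than one tenth:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# Fused flooring division sees a true quotient just under 60:
print(6.0 // 0.1)             # 59.0
# Dividing first rounds the quotient to exactly 60.0, and floor keeps it:
print(math.floor(6.0 / 0.1))  # 60
```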

In general in Polars we have the following guideline for floating point operations:

Polars always attempts to provide reasonably accurate results for floating point computations but does not provide guarantees on the error unless mentioned otherwise. Generally speaking 100% accurate results are infeasibly expensive to acquire (requiring much larger internal representations than 64-bit floats), and thus some error is always to be expected.

I think (x / y).floor() is a reasonable implementation of flooring division; it just happens that, by the very nature of a discontinuous operation like floor, any tiny error can be immensely magnified. But one also has to ask: what is the error on the input of the operation? Usually one does not read in 100% accurate floating-point numbers, do one operation, and expect to write out 100% accurate floating-point numbers. There is generally a data pipeline that introduces some floating-point error along the way, which reduces the value added by a fused divide-and-floor operation.

Take your motivating example. You start with the integer value 1391332687909454210 and cast it to a float. But this conversion in itself already introduces error:

>>> 1391332687909454210 - int(float(1391332687909454210))
-126

For your specific example of converting integer nanosecond timestamps to a whole number of 'jiffies', I'd suggest the following, using entirely integer arithmetic and making sure not to overflow, since your example values are very close to the limits of a 64-bit signed integer:

JIFFIES_PER_SECOND = 65536
NANOSECONDS_PER_SECOND = 10**9

# Logical calculation: (pl.col.ts / NANOSECONDS_PER_SECOND * JIFFIES_PER_SECOND).floor()
# Calculation without risk of overflow, or using floating point:
whole_seconds = pl.col.ts // NANOSECONDS_PER_SECOND
rest_nanosec = pl.col.ts % NANOSECONDS_PER_SECOND
jiffies = whole_seconds * JIFFIES_PER_SECOND + rest_nanosec * JIFFIES_PER_SECOND // NANOSECONDS_PER_SECOND
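As a sanity check, here is the same split written with plain Python integers (not Polars), applied to the timestamps from the report; the results match the jiffy2 column and exact big-integer arithmetic:

```python
JIFFIES_PER_SECOND = 65536
NANOSECONDS_PER_SECOND = 10**9

def ts_to_jiffies(ts: int) -> int:
    # Split into whole seconds plus a sub-second remainder so that no
    # intermediate product exceeds 64 bits.
    whole_seconds = ts // NANOSECONDS_PER_SECOND
    rest_nanosec = ts % NANOSECONDS_PER_SECOND
    return (whole_seconds * JIFFIES_PER_SECOND
            + rest_nanosec * JIFFIES_PER_SECOND // NANOSECONDS_PER_SECOND)

for ts in (1391332687909301889, 1391332687909454210, 1391332687909591535):
    # Python ints are arbitrary precision, so ts * 65536 // 10**9 is exact.
    assert ts_to_jiffies(ts) == ts * JIFFIES_PER_SECOND // NANOSECONDS_PER_SECOND
    print(ts, ts_to_jiffies(ts))
# 1391332687909301889 91182379034824
# 1391332687909454210 91182379034833
# 1391332687909591535 91182379034842
```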

@orlp orlp closed this as not planned Won't fix, can't repro, duplicate, stale Feb 20, 2024

knl commented Feb 20, 2024

I understand your point and I see how it matches the rest of the rationale for handling floats.

However, I would still kindly ask for a change in the documentation for the function, as it says:

Method equivalent of integer division operator expr // other.

Clearly, they are not equivalent, as the two give different results. I was already computing correct jiffies in Python using //, and was quite surprised that floordiv diverges.

Thanks for the suggestion; I actually started using it after discovering the issue. The way I got there was to convert whole seconds to jiffies and then handle the remainder, as that doesn't overflow.

orlp commented Feb 20, 2024

However, I would still kindly ask for a change in the documentation for the function

@knl I will open an issue to improve the documentation of the floor division.

I was already computing correct jiffies in python using //, and was quite surprised that floordiv diverges.

Well, if you were using n // (1e9 / JIFFIES_IN_ONE_SECOND) in Python, I'm afraid that also wasn't correct. Take for example 4611686021 seconds:

>>> 4611686021 * 65536  # Clearly correct.
302231455072256
>>> 4611686021000000000 // (1e9 / JIFFIES_IN_ONE_SECOND)  # Oops.
302231455072255.0

If you want exact integer results for exact integer inputs, you're better off avoiding floating-point arithmetic unless you're very careful about error bounds.
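The same trick works in the reverse direction (the "vice versa" from the original report). A sketch, again in plain integer arithmetic, that truncates any sub-jiffy fraction of a nanosecond:

```python
JIFFIES_PER_SECOND = 65536
NANOSECONDS_PER_SECOND = 10**9

def jiffies_to_ns(jiffies: int) -> int:
    # Whole seconds first, then the sub-second remainder, to stay in range.
    whole_seconds = jiffies // JIFFIES_PER_SECOND
    rest_jiffies = jiffies % JIFFIES_PER_SECOND
    return (whole_seconds * NANOSECONDS_PER_SECOND
            + rest_jiffies * NANOSECONDS_PER_SECOND // JIFFIES_PER_SECOND)

# 4611686021 seconds worth of jiffies converts back exactly:
print(jiffies_to_ns(302231455072256))  # 4611686021000000000
```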

knl commented Feb 20, 2024

@knl I will open an issue to improve the documentation of the floor division.

thanks a lot @orlp. And many thanks for the explanation of the issue, it’s quite educational!

Well, if you were using n // (1e9 / JIFFIES_IN_ONE_SECOND) in Python, I'm afraid that also wasn't correct.

If you want exact integer results for exact integer inputs, you're better off avoiding floating-point arithmetic unless you're very careful about error bounds.

hah, true! I'm avoiding floats as much as possible, exactly for the reasons you mention. Dealing with these timestamps produces a lot of surprises when there are silent conversions to floats. It was painful in pandas to do a simple shift, as that would turn the missing element into null, converting the whole column to float and rounding values to the nearest representable floats. One of the things I like about Polars is that integers are explicit, and sneaking in null values doesn't produce silent conversions to floats.
