Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Element-wise multiplication & division of pl.Decimal silently fail and produce inconsistent results #17046

Open
2 tasks done
scur-iolus opened this issue Jun 18, 2024 · 1 comment
Labels
A-dtype-decimal Area: decimal data type A-ops Area: operations bug Something isn't working P-medium Priority: medium python Related to Python Polars

Comments

@scur-iolus
Copy link

scur-iolus commented Jun 18, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars (1.0.0-beta.1 to date).

Reproducible example

The following test passes with float but fails with Decimal as a datatype:

from decimal import Decimal as Dec

import polars as pl
import pytest


@pytest.mark.parametrize("datatype", [float, Dec])
def test_element_wise_multiplication_n_division(datatype) -> None:
    df = pl.DataFrame(
        [
            {
                "a": datatype(f"1.{'0' * 20}"),
                "b": datatype(f"1.{'0' * 20}"),
            }
        ]
    )
    df = df.with_columns(c=pl.col("a") * pl.col("b"))
    df = df.with_columns(d=pl.col("a") / pl.col("b"))
    # next line fails: I got a Decimal('0.0131811...') probably due to an overflow?
    assert df[0, "c"] == datatype("1")
    # next line fails but differently: the value is null
    assert df[0, "d"] == datatype("1")

Issue description

Some very basic element-wise operations between Decimal values fail, probably because of an overflow, without explicit warning nor error message.

This is a very nasty bug: in an element-wise multiplication, the output values may be plausible, but turn out to be completely wrong, making it really hard to debug. Here, I've tried to reduce the problem to its most transparent form (multiplying 1 by 1 must equal 1, not 0.013...), but I'm sure other people could tear their hair out over it.

I know polars only partially supports Decimal, I'm aware that it's a "work-in-progress feature", I tend to believe that this issue should be fixed before polars can claim to support this datatype. I marked this as a python bug since that is how I stumbled over it but I wouldn't be surprised if the same bug exists in Rust.

This could be related to #15037 in the sense that in both cases the root of the issue is likely a silent overflow.

Expected behavior

An explicit exception should be raised or a warning should be displayed.

Installed versions

--------Version info---------
Polars:               1.0.0-beta.1
Index type:           UInt32
Platform:             Linux-5.15.0-107-generic-x86_64-with-glibc2.31
Python:               3.11.9 (main, Apr  6 2024, 17:59:24) [GCC 9.4.0]
@scur-iolus scur-iolus added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Jun 18, 2024
@deanm0000 deanm0000 added A-dtype-decimal Area: decimal data type P-high Priority: high A-ops Area: operations and removed needs triage Awaiting prioritization by a maintainer labels Jun 18, 2024
@ritchie46 ritchie46 added P-medium Priority: medium and removed P-high Priority: high labels Jun 18, 2024
@ritchie46
Copy link
Member

ritchie46 commented Jun 18, 2024

Set to p-medium as the decimal type is still unstable.

Note that if its is overflow, this is not something we check during execution. We follow numpy in that regard. Numeric arithmetic can overflow:

pl.Series([233], dtype=pl.UInt8) + 30
shape: (1,)
Series: '' [u8]
[
	7
]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dtype-decimal Area: decimal data type A-ops Area: operations bug Something isn't working P-medium Priority: medium python Related to Python Polars
Projects
Status: Ready
Development

No branches or pull requests

3 participants