Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mod seemingly dispatches to the rust remainder operator rather than the modulo operator #10570

Closed
2 tasks done
deanm0000 opened this issue Aug 17, 2023 · 8 comments · Fixed by #14026
Closed
2 tasks done
Assignees
Labels
accepted Ready for implementation bug Something isn't working P-medium Priority: medium python Related to Python Polars

Comments

@deanm0000
Copy link
Collaborator

Checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
import numpy as np

x = 2.1
df = pl.DataFrame({'x':x})

print(x % 2) # Prints 0.1
print(np.mod(x,2)) # Prints 0.1
print(df.select(pl.col('x').mod(2))) # Prints 0.1

x = -.1
df = pl.DataFrame({'x':x})

print(x % 2) # Prints 1.9
print(np.mod(x,2)) # Prints 1.9
print(df.select(pl.col('x').mod(2))) #Prints -0.1

This originated at this SO question

Issue description

I found this which confirms BallPointBen's assertion that there's a mismatch in operators.

Expected behavior

mod should match the python % operator and np.mod

Installed versions

--------Version info---------
Polars:              0.18.15
Index type:          UInt32
Platform:            Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python:              3.10.12 (main, Jun  7 2023, 19:32:10) [GCC 10.2.1 20210110]

----Optional dependencies----
adbc_driver_sqlite:  <not installed>
cloudpickle:         <not installed>
connectorx:          0.3.1
deltalake:           <not installed>
fsspec:              2023.6.0
matplotlib:          <not installed>
numpy:               1.25.2
pandas:              2.0.3
pyarrow:             12.0.1
pydantic:            2.1.1
sqlalchemy:          <not installed>
xlsx2csv:            0.8.1
xlsxwriter:          3.1.2
@deanm0000 deanm0000 added bug Something isn't working python Related to Python Polars labels Aug 17, 2023
@orlp
Copy link
Collaborator

orlp commented Aug 17, 2023

I much prefer Python's definition of mod to Rust's, so I'm all in favor of changing it to match Python's behavior.

However this also means we must update integer division to be consistent with it. The identity (a / b) * b + (a % b) == a must always hold.

@ritchie46
Copy link
Member

There has been a discussion about this before where we decided we keep true to our backend.

I cannot find that issue atm.

@henryharbeck
Copy link
Contributor

I cannot find that issue atm.

I think this is it here

@deanm0000
Copy link
Collaborator Author

@orlp I don't think it's a question of preference. Modulo division is well established so if a programming language or library says it's doing modulo division then the result should be the same or else it's doing the wrong thing.

@ritchie46 In the other issue, the confusion was strictly about the % operator. Rust defines the % operator as remainder division whereas python defines % as modulo division. That's just a question of convention as there's nothing in math that says what % should mean. That's an open point of confusion but I don't think there's an inherently right or wrong decision.

By contrast rust has mod too.

If I'm not mistaken, the current situation is that polars's mod uses rust's % instead of rust's mod.

@orlp
Copy link
Collaborator

orlp commented Aug 18, 2023

By contrast rust has mod too.

@deanm0000 Euclidean division/modulo in Rust is still wrong for negative divisors.

I don't think it's a question of preference. Modulo division is well established so if a programming language or library says it's doing modulo division then the result should be the same or else it's doing the wrong thing.

Unfortunately, that's not the case. There are many variants all referred to as 'modulo': https://en.wikipedia.org/wiki/Modulo

Again, I would strongly prefer Python's definition because I think it's the most mathematically elegant. But it's not (sadly) universal.

@eitsupi
Copy link
Contributor

eitsupi commented Nov 20, 2023

However this also means we must update integer division to be consistent with it. The identity (a / b) * b + (a % b) == a must always hold.

If I am correct, I don't think floor_div (floordiv) and rem (mod) in polars currently satisfy this.
Wouldn't this be grounds for changing the behavior of the mod?

>>> x = -1
>>> y = 3
>>> x == (x % y) + y * (x // y)
True
>>> import polars as pl
>>> x_pl = pl.lit(x)
>>> y_pl = pl.lit(y)
>>> pl.select(x_pl == (x_pl % y_pl) + y_pl * (x_pl // y_pl))
shape: (1, 1)
┌─────────┐
│ literal │
│ ---     │
│ bool    │
╞═════════╡
│ false   │
└─────────┘
>>> pl.select(x == (x % y) + y * (x // y))
shape: (1, 1)
┌─────────┐
│ literal │
│ ---     │
│ bool    │
╞═════════╡
│ true    │
└─────────┘

@samedii
Copy link

samedii commented Dec 5, 2023

Current behaviour is very unexpected for a python math lib. It felt like debugging if something as simple as addition is working as expected.

@stinodego stinodego added the needs triage Awaiting prioritization by a maintainer label Jan 13, 2024
@orlp orlp self-assigned this Jan 26, 2024
@orlp orlp added P-medium Priority: medium and removed needs triage Awaiting prioritization by a maintainer labels Jan 26, 2024
@c-peters c-peters added the accepted Ready for implementation label Feb 5, 2024
@eitsupi
Copy link
Contributor

eitsupi commented Feb 6, 2024

However this also means we must update integer division to be consistent with it. The identity (a / b) * b + (a % b) == a must always hold.

If I am correct, I don't think floor_div (floordiv) and rem (mod) in polars currently satisfy this. Wouldn't this be grounds for changing the behavior of the mod?

>>> x = -1
>>> y = 3
>>> x == (x % y) + y * (x // y)
True
>>> import polars as pl
>>> x_pl = pl.lit(x)
>>> y_pl = pl.lit(y)
>>> pl.select(x_pl == (x_pl % y_pl) + y_pl * (x_pl // y_pl))
shape: (1, 1)
┌─────────┐
│ literal │
│ ---     │
│ bool    │
╞═════════╡
│ false   │
└─────────┘
>>> pl.select(x == (x % y) + y * (x // y))
shape: (1, 1)
┌─────────┐
│ literal │
│ ---     │
│ bool    │
╞═════════╡
│ true    │
└─────────┘

This does not seem to have been fixed yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Ready for implementation bug Something isn't working P-medium Priority: medium python Related to Python Polars
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

8 participants