fix: prevent panic with arg_sort_by #15247

Open
wants to merge 2 commits into
base: main

eitsupi
Contributor

@eitsupi eitsupi commented Mar 23, 2024

From pola-rs/r-polars#929

Fix the panic that occurs when multiple column names are passed as the first argument of arg_sort_by.

In Python, current release version:

>>> import polars as pl
>>> df = pl.DataFrame(
...     {
...         "a": [0, 1, 1, 0],
...         "b": [3, 2, 3, 2],
...     }
... )
>>> df.select(pl.arg_sort_by(pl.col("a")))
shape: (4, 1)
┌─────┐
│ a   │
│ --- │
│ u32 │
╞═════╡
│ 0   │
│ 3   │
│ 1   │
│ 2   │
└─────┘
>>> df.select(pl.arg_sort_by(pl.col("a", "b")))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspaces/polars/py-polars/polars/functions/lazy.py", line 1601, in arg_sort_by
    return wrap_expr(plr.arg_sort_by(exprs, descending))
polars.exceptions.ComputeError: this expression may produce multiple output names
>>> df.select(pl.arg_sort_by("*"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspaces/polars/py-polars/polars/functions/lazy.py", line 1601, in arg_sort_by
    return wrap_expr(plr.arg_sort_by(exprs, descending))
polars.exceptions.ComputeError: cannot determine output column without a context for this expression
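
For reference, the row order a multi-column arg sort would be expected to produce for this frame can be worked out in plain Python. This is only an illustrative sketch and does not use Polars: `arg_sort_by_keys` is a hypothetical helper, and it assumes ascending order with ties broken by original row position.

```python
from typing import Sequence


def arg_sort_by_keys(*keys: Sequence[int]) -> list[int]:
    """Return row indices ordered by the given key columns (hypothetical helper)."""
    n = len(keys[0])
    if any(len(k) != n for k in keys):
        raise ValueError("all key columns must have the same length")
    # Sort the row-index range 0..n by the tuple of key values at each row.
    # sorted() is stable, so rows with equal keys keep their original order.
    return sorted(range(n), key=lambda i: tuple(k[i] for k in keys))


a = [0, 1, 1, 0]
b = [3, 2, 3, 2]

single_key = arg_sort_by_keys(a)
two_keys = arg_sort_by_keys(a, b)
print(single_key)  # [0, 3, 1, 2], the same order as the single-column output above
print(two_keys)    # [3, 0, 1, 2]
```

With only `a` as the key, the sketch reproduces the indices from the single-column call; adding `b` as a second key reorders the ties within each value of `a`.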

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Mar 23, 2024
@eitsupi eitsupi marked this pull request as ready for review March 23, 2024 05:00

codecov bot commented Mar 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (main@0a33859).

❗ Current head 9122f91 differs from pull request most recent head 04fa3cf. Consider uploading reports for the commit 04fa3cf to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #15247   +/-   ##
=======================================
  Coverage        ?   81.32%           
=======================================
  Files           ?     1359           
  Lines           ?   176086           
  Branches        ?     2524           
=======================================
  Hits            ?   143204           
  Misses          ?    32399           
  Partials        ?      483           


Member

@stinodego stinodego left a comment

This shouldn't panic, but I think it should just work rather than throw an error.

Also, can you add the example you gave in the PR description as a test in the Python test suite?

@eitsupi
Contributor Author

eitsupi commented Mar 23, 2024

> This shouldn't panic, but I think it should just work rather than throw an error.

My understanding is that if something like Expr::Wildcard is passed as a column name, it cannot be reused as a column name in the result, so an error will occur.

>>> import polars as pl
>>> pl.arg_sort_by("*")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/workspaces/polars/py-polars/polars/functions/lazy.py", line 1601, in arg_sort_by
    return wrap_expr(plr.arg_sort_by(exprs, descending))
polars.exceptions.ComputeError: cannot determine output column without a context for this expression
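
The reasoning above can be illustrated with a toy model of output-name resolution. This is an assumption-laden sketch, not Polars internals: `resolve_output_name` is a hypothetical function, the error type is a plain `ValueError`, and only the two error messages are taken from the tracebacks in this thread.

```python
def resolve_output_name(selected: list[str]) -> str:
    """Toy model (not Polars internals) of picking one output column name."""
    if not selected:
        raise ValueError("no column selected")
    if "*" in selected:
        # A wildcard can only be expanded against a concrete schema,
        # so no single name is available without that context.
        raise ValueError(
            "cannot determine output column without a context for this expression"
        )
    if len(selected) > 1:
        # Several explicit columns give several candidate names.
        raise ValueError("this expression may produce multiple output names")
    return selected[0]


print(resolve_output_name(["a"]))  # a

# Collect the error raised for each ambiguous selection.
errors: dict[str, str] = {}
for label, selection in (("two_columns", ["a", "b"]), ("wildcard", ["*"])):
    try:
        resolve_output_name(selection)
    except ValueError as exc:
        errors[label] = str(exc)
        print(f"{label}: {exc}")
```

In this model, a result column can reuse the input name only when the first argument resolves to exactly one column, which is the situation described in the comment above.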

Ok(
    int_range(lit(0 as IdxSize), len().cast(IDX_DTYPE), 1, IDX_DTYPE)
        .sort_by(by, descending)
        .alias(name.as_ref()),
Member

Can you try keep_name here? That might work.

Contributor Author

@eitsupi eitsupi Mar 27, 2024

I tried `name().keep()`, but the error "`keep`, `suffix`, `prefix` should be last expression" occurred in the Python tests.


codspeed-hq bot commented Apr 9, 2024

CodSpeed Performance Report

Merging #15247 will not alter performance

Comparing eitsupi:fix-panic-arg_sort_by (04fa3cf) with main (42d3697)

Summary

✅ 21 untouched benchmarks


3 participants