Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow broadcasting in ranges #11900

Merged
merged 20 commits into from
Dec 12, 2023
Merged

fix: Allow broadcasting in ranges #11900

merged 20 commits into from
Dec 12, 2023

Conversation

stinodego
Copy link
Member

@stinodego stinodego commented Oct 20, 2023

Closes #11892

Also changes the way capacity is allocated for int_ranges.

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Oct 20, 2023
@stinodego stinodego marked this pull request as ready for review October 20, 2023 16:09
Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of broadcasting the input arrays and increasing memory, can we add branches where we resolve the broadcasting optimally. This is how we deal with broadcasting in most places and makes it most efficient.

You can do that quite elegantly by creating a function that accepts two different iterators. Where for the broadcasting branches one iterator is the iterator over the array and the other is tstd::iter::repeat.

@stinodego
Copy link
Member Author

Right, I just stole the existing logic from int_ranges, but indeed I have seen the logic you describe elsewhere. I'll see if I can implement it the way you describe!

@ritchie46
Copy link
Member

Any news on this one?

@stinodego stinodego force-pushed the fix-date-range branch 2 times, most recently from 36b0f2d to 4acdc4d Compare November 29, 2023 13:02
@stinodego stinodego force-pushed the fix-date-range branch 2 times, most recently from aa6ea75 to 11f184a Compare December 6, 2023 09:38
@stinodego stinodego marked this pull request as ready for review December 6, 2023 10:20
let start = start.downcast_iter().next().unwrap();
let end = end.downcast_iter().next().unwrap();

// First do a pass to determine the required value capacity.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this pass and set a default capacity of 5 times the length of the column, similar to the other ranges functions. If this capacity calculation must be preserved, I have to make some adjustments as it relied on fully materializing start/end.

Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there. :)

crates/polars-plan/src/dsl/function_expr/range/utils.rs Outdated Show resolved Hide resolved
crates/polars-plan/src/dsl/function_expr/range/utils.rs Outdated Show resolved Hide resolved
crates/polars-plan/src/dsl/function_expr/range/utils.rs Outdated Show resolved Hide resolved
Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one @stinodego. It looks great and nice that we are optimal now in all branches. :)

@ritchie46 ritchie46 merged commit a6483c6 into main Dec 12, 2023
26 checks passed
@ritchie46 ritchie46 deleted the fix-date-range branch December 12, 2023 09:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

date_ranges: ComputeError and different behaviour than int_ranges
2 participants