Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump rust-polars to 0.34 #442

Merged
merged 49 commits into from
Oct 31, 2023
Merged

Bump rust-polars to 0.34 #442

merged 49 commits into from
Oct 31, 2023

Conversation

etiennebacher
Copy link
Collaborator

@etiennebacher etiennebacher commented Oct 27, 2023

Close #404

Changelog: https://github.com/pola-rs/polars/releases/tag/rs-0.34.0

@eitsupi @sorhawell no rush on this, I'm just opening this PR to get the ball rolling. I'm fine with releasing polars 0.9.0 either before or after this is merged.

@eitsupi eitsupi marked this pull request as draft October 28, 2023 09:34
etiennebacher

This comment was marked as outdated.

@etiennebacher
Copy link
Collaborator Author

There shouldn't be remaining test failures on R or Rust side. However, examples using $cumsum() on Series fail because I commented the code on Rust side since I couldn't find the fix

@eitsupi
Copy link
Collaborator

eitsupi commented Oct 29, 2023

Thank you for working on this!
It seems that MSRV is changing to Rust 1.71, so I would like to release 0.9.0 first.

@etiennebacher
Copy link
Collaborator Author

Sure, go ahead with 0.9.0

@etiennebacher
Copy link
Collaborator Author

Looks like the last thing to do is to fix the $cumsum() error

@eitsupi eitsupi added this to the 0.10 milestone Oct 31, 2023
@etiennebacher
Copy link
Collaborator Author

I tried several things and I'm stuck here:

pub fn cumsum(&self, reverse: bool) -> Series {
        pl::cumsum(self, reverse)
                .map(Series)
                .unwrap()
}

The compiler tells me that self is a series::Series while cumsum() expects a polars::prelude::Series. Can you take a look @eitsupi ?

@eitsupi
Copy link
Collaborator

eitsupi commented Oct 31, 2023

Can you take a look @eitsupi ?

I think it could probably be fixed in a74b6af.

@etiennebacher etiennebacher marked this pull request as ready for review October 31, 2023 11:33
@etiennebacher
Copy link
Collaborator Author

@sorhawell if you have time, it would be great if you could review the changes related to the string cache since you discussed it with the rust-polars team

Copy link
Collaborator

@eitsupi eitsupi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for accomplishing this almost single-handedly!
I am excited to get this merged and have access to the new features.

Obviously there is some testing and documentation missing on the R side, but those can be added later.

src/rust/Cargo.toml Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@etiennebacher
Copy link
Collaborator Author

Good thing I had a lot of travel by train these last days 😄 many thanks for your help @eitsupi

Just waiting a bit to see if @sorhawell has some remarks, otherwise I'll merge later

@sorhawell
Copy link
Collaborator

Not bad :) I think I have something to add pl$with_string_cache ...

@sorhawell
Copy link
Collaborator

I was initially concerned for a corner of nested pl$with_string_cache() like

pl$with_string_cache({
  df1 = pl$DataFrame(head(iris, 2))
  pl$with_string_cache("foobar")
  df2 = pl$DataFrame(tail(iris, 2))
  pl$concat(list(df1, df2))
})

as when disabling an inner with_string_cache() that would the affect the outer. You cleverly avoid all that with the if (!pl$using_string_cache()) { ...} guard. So all good. I cannot point out a bad behavior if a user only uses pl$with_string_cache

I commit and suggest we derive pl$with_string_cache() from StringCacheHolder instead, only as that likely is more future proof as py-polars relies on that too and is not derived from enable/disable_string_cache().

Feel free to roll back this commit and we will be fine also with your solution :)

Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etiennebacher thank you so much, this is very good work

@etiennebacher
Copy link
Collaborator Author

Many thanks @sorhawell, that's definitely something I missed in rust-polars. I intend to use the string cache more in tidypolars so that will be the occasion to test it more extensively.

@etiennebacher etiennebacher merged commit 62d40c1 into main Oct 31, 2023
31 checks passed
@etiennebacher etiennebacher deleted the rust-polars-0.34 branch October 31, 2023 23:09
@grantmcdermott grantmcdermott mentioned this pull request Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Read partition columns of Hive dataset
3 participants