Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement LazyFrame$schema #250

Merged
merged 10 commits into from
Jun 24, 2023
Merged

Implement LazyFrame$schema #250

merged 10 commits into from
Jun 24, 2023

Conversation

Sicheng-Pan
Copy link
Collaborator

This PR aims to implement schema, columns, and dtypes for LazyFrame, as requested by #243

@Sicheng-Pan
Copy link
Collaborator Author

@sorhawell It seems that the assignment symbol is changed from = to <- somehow in the wrapper generated by extendr. Any ideas how to configure this so that we can keep using = for assignment?

@sorhawell
Copy link
Collaborator

sorhawell commented Jun 22, 2023

@sorhawell It seems that the assignment symbol is changed from = to <- somehow in the wrapper generated by extendr. Any ideas how to configure this so that we can keep using = for assignment?

my bad, I auto-styled all files in your previous PR, but extendr-wrappers.R should not have been styled.

#251 should revert this, feel free to overrride = immediately

@sorhawell
Copy link
Collaborator

@Sicheng-Pan I recommend that you work and submit PRs from individual feature branches in your fork. I got this advice from the rlang maintainer. It will make it easier for you to work on simultanous PRs and a reviewer like me will not mess up your main branch (and all branches thereof) by pushing changes to your PR.

@sorhawell
Copy link
Collaborator

sorhawell commented Jun 22, 2023

looks good :)
I changed to derive columns and dtypes from internal wrapper to show "in $columns" and not "in $schema". It should also be possible to derive from a public method but then something like this pattern should be used

  self$schema |> 
    result() |> 
    unwrap("in $column():") |>
    names()

however currently then err will have two times the same call-info + two times a "in $xxx" info. This is annoying. Currently alot of polars methods are derived from pl$col and pl$lit and the error messages look messy. Before with string error, it was not simple to correct.

I was contemplating to let RPolarsErr have this type

struct RPolarsErr {
  ctx_chain: VecDeque<Rctx>,
  call: Option<Rctx>,
  wherein: Option<Rctx>
}

then any error can only have one call and one wherein. When deriving from another public method, the call and wherein will just be overwritten if an error.

@sorhawell
Copy link
Collaborator

@vincentarelbundock this will allow us to bring back names s3 method for LazyFrame without a collect and performance drop :)

@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review June 23, 2023 07:53
Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Sicheng-Pan

@sorhawell sorhawell merged commit 7635353 into pola-rs:main Jun 24, 2023
@vincentarelbundock
Copy link
Collaborator

@vincentarelbundock this will allow us to bring back names s3 method for LazyFrame without a collect and performance drop :)

This is awesome! I haven't time to do anything about this recently, and I probably won't have more time in the next couple months. Will probably check in then to see what's to do. Sorry...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants