Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cov, corr, rolling_cov, rolling_corr #351

Merged
merged 6 commits into from
Aug 7, 2023
Merged

Conversation

Sicheng-Pan
Copy link
Collaborator

macronova and others added 4 commits August 6, 2023 17:21
Merge branch 'main' into stat_funcs

# Conflicts:
#	src/rust/src/lazy/dsl.rs
@sorhawell
Copy link
Collaborator

sorhawell commented Aug 7, 2023

Thank you for implementing these. We should avoid using extendr-internal conversion via &Expr as it will bypass our own conversions, not throw an RPolarsErr and will accept fewer valid inputs. See also explanation here.

robj_to! is currently the way. extendr does not support custom internal conversions. Alternatives are generic trait wrappers or a macro like robj_to!. See some examples here

My aim is to centralize what it means to take a type like Expr as input into one place. If we handle type check ad-hoc in various R or rust snippets, we risk to deviate in behavior and rules for input args. Sometimes we have to, and that is ok.
Currently the central definition converting into Expr is pl$lit and pl$col called via robj_to! where possible.

...otherwise when robj_to! rarely is not possible, you can use wrap_e result & unwrap with on R side.
result and unwrap is used to catch the conversion error and blame pl$cov

pl$cov = function(a,b) {
  result(
    pl_cov(wrap_e(a,str_as_lit = FALSE),wrap_e(b, str_as_lit = FALSE))
  ) |> unwrap("in pl_cov")
}

Now robj_to!(Expr) calls R function pl$lit() and robj_to!(ExprCol) calls R function pl$col().
In a future R function pl$lit and pl$col could derived from robj_to! via pure rust code. That could be cleaner and faster. However query generation is already plenty fast and it would make the code base harder to read for those who are not very comfortable with rust.

Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add unit test which compares to equivalents in R see e.g. "pl$std pl$var" -test . You may use something from data.table to test expected rolling behavior. It is not uncommon there is a difference in behavior. If it is a bug in rust-polars (it happens) we can write a PR to them, or otherwise explain in docs when/why polars is behaving different than plain R.

R/functions__lazy.R Outdated Show resolved Hide resolved
R/functions__lazy.R Outdated Show resolved Hide resolved
R/functions__lazy.R Outdated Show resolved Hide resolved
src/rust/src/lazy/dsl.rs Outdated Show resolved Hide resolved
sorhawell

This comment was marked as duplicate.

Copy link
Collaborator

@sorhawell sorhawell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good to me. Many thanks :)

@Sicheng-Pan Sicheng-Pan merged commit 53a6243 into main Aug 7, 2023
11 checks passed
@Sicheng-Pan Sicheng-Pan deleted the stat_funcs branch August 7, 2023 23:07
@eitsupi eitsupi mentioned this pull request Nov 3, 2023
29 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants