Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor colsubset exprs tracking so it an be overridden in split funs #795

Closed

Conversation

gmbecker
Copy link
Collaborator

This closes #785, allowing custom split functions, specifically functions which override the core function, to specify subsetting expressions for the facets they generate, which is necessary for the custom split to function properly in column space.

This is backwards compatible other than the fact that attempting to use a custom splitting function which overrides core splitting behavior in column space was previously an error, and any TableTree objects that have been serialized to rds/rda files would need to be refreshed as one of the underlying classes has changed. Tests have been added/updated for the new behavior

@shajoezhu
Copy link
Collaborator

Hi @gmbecker , thanks a lot for the changes. We will need to create PRs and trigger CICD pipelines to investigate if any downstream break changes. Will get back to this as I know more.

@shajoezhu shajoezhu self-requested a review December 1, 2023 05:12
@shajoezhu shajoezhu added the sme label Dec 1, 2023
#'
#' @examples
#' splres <- make_split_result(
#' values = c("hi", "lo"),
#' datasplit = list(hi = mtcars, lo = mtcars[1:10, ]),
#' labels = c("more data", "less data")
#' labels = c("more data", "less data"),
#' subset_exprs = list(expression(TRUE), expression(seq_along(wt) <= 10))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this would do a subset only in the column space, while it is ignored in the row space, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Faceting in column space is tracked via expressions, while faceting in row space is immediately materialized during tabulation. This is largely for a combination of performance and legacy reasons.

validate_split_result(splres)
newstuff <- make_split_result(values, datasplit, labels, extras)
newstuff <- make_split_result(values, datasplit, labels, extras, subset_exprs = list(sub_expr))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why sub_expr is not defaulting to the splres values if sub_expr is NULL?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are two cases that can happen here:

  1. values are strings (this is more common as this is being called in custom code), in which case value_expr(values[.]) will always be NULL, so that wouldn't change anything.
  2. values are already SplitValue objects, in which case they (should) already have their expressions set (if they are not the default, which will still work in most cases), so they don't need to be changed.

Currently, as written, subset_expr ends up being ignored when values are already SplitValue objects.

This is defensible behavior, I think (as the construction time of the SplitValue is where the custom expression "should" be set), but probably not optimal. I'll change things slightly to make it easier to override subset expressions if (for some reason) you're reusing the same value object but want different subsetting behavior (which would be weird but I can think of a a corner case you might want to). As written originally you'd have been expected to reset the expressions on the value objects before passing to make_split_result

R/make_split_fun.R Outdated Show resolved Hide resolved
R/make_split_fun.R Outdated Show resolved Hide resolved
}
}


.or_combine_exprs <- function(ex1, ex2) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember us talking about making expression handling more general and contained in a single system of functions. Do you think it is worth it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean adding this to make_subset_expr (or is this too specific?)

R/00tabletrees.R Outdated
Comment on lines 1019 to 1022
sub = .combine_subset_exprs(
pos_subset(parpos),
make_subset_expr(newspl, nsplitval)
pos_subset(parpos),
## this will grab the value's custom subset expression if present
make_subset_expr(newspl, nsplitval)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the core effect of altering column col splits

@@ -1,6 +1,7 @@
context("Checking utility functions")

test_that("func_takes works with different inputs", {
func_takes <- rtables:::func_takes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed as tests access to all the namespace (with hidden functions)

Suggested change
func_takes <- rtables:::func_takes

Copy link
Contributor

@Melkiades Melkiades left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is very good! Thanks Gabe, I will unlock the draft to see if tests sail smoothly ;)

@Melkiades Melkiades marked this pull request as ready for review December 22, 2023 11:37
@gmbecker gmbecker closed this Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support core split overriding split functions in column space
3 participants