What should make_with_*() return? #2
Comments
I just had a thought: introduce an `if_uptodate` argument:

    make_with_recipe(
      dependencies = RAW_DATA,
      targets = FINAL_DATA,
      recipe = {
        raw_dat <- readRDS(RAW_DATA)
        out <- do_stuff(raw_dat)
        saveRDS(out, FINAL_DATA)
        out
      },
      if_uptodate = {
        readRDS(FINAL_DATA)
      }
    )
@gorcha I can't make my mind up about this. Do you have any advice? Currently I've got:
Yo! It seems to me like this package is mostly about flow control, so I think it makes more sense to be A couple of associated things:
@gorcha Thanks for your input! Great point about the

I've just had a thought, what if

The one nice thing about the current behavior is that you can get a comparison object only when it's needed:

    cmp <- make_with_recipe(
      targets = blah,
      dependencies = bleh,
      recipe = {
        dat <- do_stuff()
        srcutils::write_df_rds(dat, blah, 'key')
      }
    )
That would be cool, but it potentially keeps a bunch of stuff in memory unnecessarily.

    res1 <- make_with_source(...)
    res2 <- make_with_source(...)
    res3 <- make_with_source(...)
    res4 <- make_with_source(...)

Maybe it would be good to have a "return evaluation environment"

You could possibly just return the final value like a function (would that work with

This begs the question - would it be better to have a more defined return type, like a make result object that stores result metadata? It'd be good for storing more detailed info about the outcome of the run, like:

You could also attach this to the pipeline somehow so you can show status kind of like targets 😉
Well, you've always got the option of not binding the returned object. So, if the default behavior is to run the sourced script in a fresh environment whose parent is

    make_with_source(...)
    make_with_source(...)
    make_with_source(...)
    make_with_source(...)

would be cleaned up by the garbage collector. Or is there some memory allocation detail I'm forgetting?

Good idea about attaching execution information to the pipeline! I'm going to open up a separate issue for that as I'll do it no matter what.

Implementing a returner is intriguing. By default I suppose the make returner could work like this:

    make_return <- function(x) {
      # Signal a custom condition carrying the value to hand back
      signalCondition(rlang::cnd("makepipe_return_cnd", res = x))
      # Only reached when no handler is established (i.e. outside a make call)
      invisible(x)
    }

    make_with_source <- function(...) {
      ...
      out <- tryCatch(
        source(...),
        # The exiting handler unwinds out of source() and yields the value
        makepipe_return_cnd = function(x) {
          x$res
        }
      )
      ...
      out
    }

This would replicate the early exit behaviour of
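The early-exit idea above can be tried out in isolation. What follows is a minimal base-R sketch under stated assumptions, not makepipe's actual implementation: the condition is built with `structure()` rather than `rlang::cnd()`, and `run_script()` is a hypothetical stand-in for `make_with_source()`.

```r
# Signal a custom condition that carries the value to return.
# Outside of run_script() no handler exists, so the value is just returned.
make_return <- function(x) {
  cnd <- structure(
    class = c("makepipe_return_cnd", "condition"),
    list(message = "make_return() called", call = NULL, res = x)
  )
  signalCondition(cnd)
  invisible(x)
}

# Hypothetical stand-in for make_with_source(): source the script in a fresh
# environment and catch the condition with an exiting handler.
run_script <- function(path) {
  env <- new.env(parent = environment(make_return))
  tryCatch(
    {
      source(path, local = env)
      NULL # the script finished without calling make_return()
    },
    makepipe_return_cnd = function(cnd) cnd$res
  )
}

# A throwaway script: the stop() call must never run.
script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- 1 + 1",
  "make_return(x)",
  "stop('never reached')"
), script)

res <- run_script(script)
```

Because `tryCatch()` handlers are exiting handlers, signalling the condition unwinds straight out of `source()`, so nothing after `make_return(x)` in the script is evaluated.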
Yeah, I reckon
@gorcha Thanks for your help!
Hey @kinto-b,

What I meant with the environments keeping stuff in memory is that a pretty natural pattern is to store the results of the make in an object to check whether it ran, and if you return the evaluation environment it's easy to unwittingly hold onto it.

Re: passing things back, I was thinking about passing back multiple objects rather than a single return value, like a

E.g. if this is my source script "blah.R"

    x <- readRDS("blah.rds")
    x %<>% do_things()
    x %>% check_me() %>% make_register("x_check")
    x %>% check_me_more() %>% make_register("x_check2")
    x %>% do_more_things %>% saveRDS("blah_out.rds")

Then I run it, and have access to the objects as part of the result:

    res <- make_with_source("blah.R", ...)
    res$objects$x_check
    res$objects$x_check2

This could be handled via an environment, so

    make_with_source <- function(...) {
      ...
      out_envir <- new.env(parent = emptyenv())
      assign("__make_register__", out_envir, envir)
      source(...)
      ...
      out$objects <- out_envir
      out
    }

    make_register <- function(value, name) {
      # check if __make_register__ exists first and throw a warning if not for non-make contexts
      assign(name, value, `__make_register__`)
      invisible(value)
    }
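For anyone wanting to experiment with the register idea, here is a small self-contained sketch. It swaps in a different technique from the one proposed above: instead of assigning `__make_register__` into the evaluation environment, it keeps a pointer to the active register in a state environment (`.make_state`, a hypothetical name), so `make_register()` does not depend on how the name is looked up. `run_script()` is again a hypothetical stand-in for `make_with_source()`.

```r
# Hypothetical package-level state: the register of the make call that is
# currently running, or NULL outside of a make context.
.make_state <- new.env(parent = emptyenv())
.make_state$register <- NULL

make_register <- function(value, name) {
  register <- .make_state$register
  if (is.null(register)) {
    warning("make_register() called outside of a make context", call. = FALSE)
  } else {
    assign(name, value, envir = register)
  }
  invisible(value)
}

run_script <- function(path) {
  register <- new.env(parent = emptyenv())
  .make_state$register <- register
  # Always clear the pointer, even if the script errors
  on.exit(.make_state$register <- NULL, add = TRUE)
  source(path, local = new.env(parent = environment(make_register)))
  list(executed = TRUE, objects = register)
}

script <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(1, 2, 3)",
  "make_register(sum(x), 'x_sum')",
  "make_register(length(x), 'x_len')"
), script)

res <- run_script(script)
res$objects$x_sum
res$objects$x_len
```

Since `make_register()` returns its value invisibly, it can still sit at the end of a pipe as in the "blah.R" example.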
@gorcha Yo! I did consider that approach, but it seems to me that a

a) It means the object returned by

    dat <- make_with_source(...)
    dat %>% mutate(...) %>% etc()

b) It still allows you to return multiple objects in the usual way:

    # This is my source script, blah.R
    ...
    make_return(list(x = x, y = y))

c) It cleanly separates metadata (held in the Pipeline) from data

But what are your thoughts? One advantage I can think of for
I like the register kind of approach because it's more explicit and localised (i.e. you can signal the registration when the object is created rather than having to do it all at the bottom of the script). And it makes more sense to me to return a result object from the make call that prints nicely so you can look at the other metadata as well (I guess basically the stuff that would be attached to the pipeline). Returning the final line of the file seems eminently reasonable in
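To make the "result object that prints nicely" idea concrete, here is a rough S3 sketch. The class name `make_result_sketch` and the fields `executed`, `result` and `objects` are assumptions for illustration only; the thread does not specify what the real result object would hold.

```r
# Hypothetical constructor for a make result object.
new_make_result <- function(executed, result = NULL, objects = list()) {
  structure(
    list(executed = executed, result = result, objects = objects),
    class = "make_result_sketch"
  )
}

# Print method: show the metadata rather than dumping the whole list.
print.make_result_sketch <- function(x, ...) {
  registered <- if (length(x$objects) > 0) {
    paste(names(x$objects), collapse = ", ")
  } else {
    "(none)"
  }
  cat("<make result>\n")
  cat("  executed: ", x$executed, "\n", sep = "")
  cat("  registered: ", registered, "\n", sep = "")
  invisible(x)
}

res <- new_make_result(TRUE, result = 42, objects = list(x_check = "ok"))
printed <- capture.output(print(res))
```

The data itself stays reachable via `res$result` and `res$objects`, while printing only summarises the outcome of the run.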
Yeah, that's a good point. I might have a play around with a few options and see what feels most natural. Going to reopen this issue for now.
## General cleanup

* Expanded CI
* Expand tests (closes #12)
* Rename package (closes #11)

## Execution and return behaviour

* Implement `makepipe_result` class and `make_register()` (closes #2, closes #14)
* Add `envir` argument to `make_with_source()` (closes #16)
* Force `make_*()` evaluation to, by default, occur in a fresh environment which is a child of the execution environment
The options are:

* `NULL` invisibly
* `TRUE`/`FALSE` invisibly, depending on whether or not the targets were up-to-date
* `NULL` or the result of executing the `recipe`. Note that this only applies to `make_with_recipe`, since whether the result(s) of sourcing the `source` are attached to the global environment depends on the arguments passed through to `source()`
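As a way of comparing the options, the third one can be sketched in a few lines. This is a simplified illustration, not the package's code: `make_with_recipe_sketch()` is a hypothetical name, and the up-to-date check is a plain comparison of file modification times, which is an assumption about how staleness is decided.

```r
# Return the recipe's value when it runs, and NULL invisibly otherwise.
make_with_recipe_sketch <- function(recipe, targets, dependencies) {
  recipe <- substitute(recipe) # capture the expression without evaluating it
  caller <- parent.frame()
  outdated <- !all(file.exists(targets)) ||
    max(file.mtime(dependencies)) > min(file.mtime(targets))
  if (!outdated) {
    return(invisible(NULL))
  }
  # Evaluate in a fresh environment whose parent is the calling environment
  eval(recipe, envir = new.env(parent = caller))
}

raw_path <- tempfile(fileext = ".rds")
final_path <- tempfile(fileext = ".rds")
saveRDS(1:3, raw_path)

run <- function() {
  make_with_recipe_sketch(
    recipe = {
      out <- readRDS(raw_path) * 2L
      saveRDS(out, final_path)
      out
    },
    targets = final_path,
    dependencies = raw_path
  )
}

first <- run()  # target missing, so the recipe runs
second <- run() # target now up-to-date, so nothing runs
```

Under this option, `is.null()` on the return value doubles as a rough "did it run?" check, with the caveat that a recipe whose last expression is `NULL` is indistinguishable from a skipped one.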