What should `make_with_*()` return? #2

kinto-b · 2021-04-23T09:27:06Z

The options are:

NULL invisibly
TRUE/FALSE invisibly depending on whether or not the targets were up-to-date
NULL or the result of executing the recipe. Note that this only applies to make_with_recipe, since whether the result(s) of sourcing the source are attached to the global environment depends on the arguments passed through to source()

The text was updated successfully, but these errors were encountered:

kinto-b · 2021-05-20T01:24:36Z

I just had a thought: introduce an on_exit argument which stipulates what is returned or an if_uptodate argument which stipulates what is returned if the targets are up to date. Example:

make_with_recipe(
  dependencies = RAW_DATA, 
  targets = FINAL_DATA,
  recipe = {
    raw_dat <- readRDS(RAW_DATA)
    out <- do_stuff(raw_dat)
    saveRDS(out, FINAL_DATA)
    out
  },
  if_uptodate = {
    readRDS(FINAL_DATA)
  }
)

kinto-b · 2021-09-14T09:45:49Z

@gorcha I can't make my mind up about this. Do you have any advice? Currently I've got:

make_with_recipe() invisibly returns the result of evaluating the recipe if the recipe is executed and NULL otherwise
make_with_source() invisibly returns NULL always, though by default whatever names are bound in the sourced script will be attached to the global environment.

gorcha · 2021-09-14T12:11:24Z

Yo! It seems to me like this package is mostly about flow control, so I think it makes more sense to be TRUE or FALSE (TRUE if the data is out of date and the step runs).

A couple of associated things:

Instead of using the usual source() default you should add an envir argument to make_with_source(), so the signature is the same as make_as_recipe() and they both have the same default evaluation behaviour.
Would it be better to run in a fresh environment by default so there are no side effects from a make call and it's not affected by other goings on in the calling context?

kinto-b · 2021-09-15T07:15:59Z

@gorcha Thanks for your input! Great point about the envir argument. It doesn't make sense for make_with_source() to evaluate in the global by default while make_with_recipe() evaluates in the parent.frame().

I've just had a thought, what if make_*() invisibly returns the evaluation environment? That way, if any evaluation goes on, the names bound will all be available to the user should they need them.

The one nice thing about the current behavior is that you can get a comparison object only when it's needed:

cmp <- make_with_recipe(
  targets = blah,
  dependencies = bleh,
  recipe = {
    dat <- do_stuff()
    srcutils::write_df_rds(dat, blah, 'key')
  }
)

gorcha · 2021-09-15T11:28:57Z

That would be cool, but it potentially keeps a bunch of stuff in memory unnecessarily.
For e.g. the example below, you suddenly have all objects produced from the 4 script runs in memory...

res1 <- make_with_source(...)
res2 <- make_with_source(...)
res3 <- make_with_source(...)
res4 <- make_with_source(...)

Maybe it would be good to have a "return evaluation environment" TRUE/FALSE kind of argument for debugging?

You could possibly just return the final value like a function (would that work with source()?) instead, but if you want to be able to pass specific things back to the calling environment I think it'd be better thinking about a more formal mechanism (maybe having a function that can be called from inside the script?).

This begs the question - would it be better to have a more defined return type, like a make result object that stores result metadata? It'd be good for storing more detailed info about the outcome of the run, like:

Was it run (outdated targets or not)?
Did it run successfully? (e.g. no errors)
The aforementioned additional objects somehow returned from the executing expression/script

You could also attach this to the pipeline somehow so you can show status kind of like targets 😉

kinto-b · 2021-09-16T00:44:20Z

Well, you've always got the option of not binding the returned object. So, if the default behavior is to run the sourced script in a fresh environment whose parent is parent.frame(), then

make_with_source(...)
make_with_source(...)
make_with_source(...)
make_with_source(...)

would be cleaned up by the garbage collector. Or is there some memory allocation detail I'm forgetting?

Good idea about attaching execution information to the pipeline! I'm going to open up a separate issue for that as I'll do it no matter what.

Implementing a returner is intriguing. By default source() appears to return whatever is evaluated on the final line of the sourced file, which is not ideal.

I suppose the make returner could work like this:

make_return <- function(x) {
    signalCondition(rlang::cnd("makepipe_return_cnd", res = x))
    invisible(x)
}

make_with_source <- function(...) {
  ...
  out <- tryCatch(
    source(...),
    makepipe_return_cnd = function(x) {
        x$res
    }
  )
 ... 
 
  out
}

This would replicate the early exit behaviour of return()

kinto-b · 2021-09-16T00:52:48Z

Yeah, I reckon make_return() is the best go. It unifies both _recipe() and _source() variants in a neat way

kinto-b · 2021-09-16T00:53:10Z

@gorcha Thanks for your help!

gorcha · 2021-09-16T01:33:25Z

Hey @kinto-b,

What I meant with the environments keeping stuff in memory is that a pretty natural pattern is to store the results of the make in an object to check whether it ran, and if you return the evaluation environment it's easy to unwittingly hold onto it.

Re: passing things back, I was thinking about passing back multiple objects rather than a single return value, like a make_register() or make_keep() style thing.

E.g. if this is my source script "blah.R"

x <- readRDS("blah.rds")
x %<>% do_things()

x %>% check_me() %>% make_register("x_check")
x %>% check_me_more() %>% make_register("x_check2")

x %>% do_more_things %>% writeRDS("blah_out.rds")

Then I run it, and have access to the objects as part of the result:

res <- make_with_source("blah.R", ...)

res$objects$x_check
res$objects$x_check2

This could be handled via an environment, so make_with_* attaches an environment to the executing environment, which is used by make_register().

make_with_source <- function(...) {
  ...
  out_envir <- new.env(parent = emptyenv())
  assign("__make_register__", out_envir, envir)
  source(...)
  ... 

  out$objects <- out_envir
  out
}

make_register <- function(value, name) {
  # check if __make_register__ exists first and throw a warning if not for non-make contexts
  assign(name, value, `__make_register__`)
  invisible(value)
}

kinto-b · 2021-09-16T01:54:39Z

@gorcha Yo! I did consider that approach, but it seems to me that a return()-like function would be preferable since:

a) It means the object returned by make_*() is a bit easier to work with in the case when you just want to return one value:

dat <- make_with_source(...)
dat %>% mutate(...) %>% etc()

b) It still allows you to return multiple objects in the usual way:

# This is my source script, blah.R
...
make_return(list(x = x, y = y))

c) It cleanly separates metadata (held in the Pipeline) from data

But what are your thoughts? One advantage I can think of for make_register() is that it makes it more obvious that what is registered isn't necessarily related to what the targets are

gorcha · 2021-09-16T02:51:51Z

I like the register kind of approach because it's more explicit and localised (i.e. you can signal the registration when the object is created rather than having to do it all at the bottom of the script).

And it makes more sense to me to return a result object from the make call that prints nicely so you can look at the other metadata as well (I guess basically the stuff that would be attached to the pipeline). Returning the final line of the file seems eminently reasonable in make_with_recipe() but a bit weird in make_with_source() since there's so much separation between the make call and the actual script.

kinto-b · 2021-09-17T00:22:28Z

Yeah, that's a good point. I might have a play around with a few options and see what feels most natural. Going to reopen this issue for now.

##General cleanup * Expanded CI * Expand tests (closes #12) * Rename package (closes #11) ## Execution and return behaviour * Implement `makepipe_result` class and `make_register()` (closes #2, closes #14) * Add `envir` argument to `make_with_source()` (closes #16) * Force `make_*()` evaluation to, by default, occur in a fresh environment which is a child of the execution environment

kinto-b added the question Further information is requested label Apr 23, 2021

kinto-b added this to the CRAN/0.0.3 milestone Sep 14, 2021

kinto-b closed this as completed Sep 16, 2021

kinto-b mentioned this issue Sep 16, 2021

Implement make_return() #15

Closed

kinto-b reopened this Sep 17, 2021

kinto-b mentioned this issue Sep 18, 2021

Add envir argument to make_with_source() #16

Closed

kinto-b mentioned this issue Oct 17, 2021

Updates for CRAN submission #13

Merged

kinto-b closed this as completed in eb8041d Oct 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

What should `make_with_*()` return? #2

What should `make_with_*()` return? #2

kinto-b commented Apr 23, 2021

kinto-b commented May 20, 2021

kinto-b commented Sep 14, 2021

gorcha commented Sep 14, 2021

kinto-b commented Sep 15, 2021

gorcha commented Sep 15, 2021

kinto-b commented Sep 16, 2021 •

edited

Loading

kinto-b commented Sep 16, 2021

kinto-b commented Sep 16, 2021

gorcha commented Sep 16, 2021

kinto-b commented Sep 16, 2021

gorcha commented Sep 16, 2021

kinto-b commented Sep 17, 2021

What should make_with_*() return? #2

What should make_with_*() return? #2

Comments

kinto-b commented Apr 23, 2021

kinto-b commented May 20, 2021

kinto-b commented Sep 14, 2021

gorcha commented Sep 14, 2021

kinto-b commented Sep 15, 2021

gorcha commented Sep 15, 2021

kinto-b commented Sep 16, 2021 • edited Loading

kinto-b commented Sep 16, 2021

kinto-b commented Sep 16, 2021

gorcha commented Sep 16, 2021

kinto-b commented Sep 16, 2021

gorcha commented Sep 16, 2021

kinto-b commented Sep 17, 2021

What should `make_with_*()` return? #2

What should `make_with_*()` return? #2

kinto-b commented Sep 16, 2021 •

edited

Loading