Incorporate rlang tidy evaluation in drake #200

kendonB · 2018-01-30T02:45:37Z

Ref: https://cran.r-project.org/web/packages/rlang/vignettes/tidy-evaluation.html

The !! operator would be especially useful when building up complicated commands.

This, for example, works in tibble, which is my current workaround:

> library(tibble)
> library(drake)
>   
> little_b <- "b"

> tibble(target = "letter", command = !!little_b)
# A tibble: 1 x 2
  target command
  <chr>  <chr>  
1 letter b    


> drake_plan(letter = !!little_b)
  target      command
1 letter !(!little_b)

Using tibble in the background might be a good way to achieve this without much pain.

The text was updated successfully, but these errors were encountered:

wlandau · 2018-01-30T03:32:14Z

This could even help with @AlexAxthelm's #77 and #79, which are dplyr-focused. I am new to tidy evaluation, and @lionel-'s webinar last month was a great first exposure. I am super excited for @hadley's Extending the Tidyverse workshop this week, and I may be better versed in it afterwards.

I expect reimplementing drake_plan() will be a good first exercise. make() is a whole other story. To support tidy evaluation of drake workflow commands, there is some tricky sensitive work to do in run.R, particularly in with_timeout() and wrap_in_try_statement(). Looking back, this is a rather inelegant part of the code base. I expect tidy evaluation to have a better way of evaluating commands. Requirements:

On success, return the target's value and make sure it is assigned to the target itself in config$envir.
On error, return an informative object of class "error" just like evaluate::try_capture_stack(). This is essential to drake::diagnose(), which gives users access to the complete error logs of failed targets.
Play nicely with R.utils::withTimeout().

wlandau · 2018-01-30T04:00:29Z

On the other hand, changing how commands are evaluated in make() would be a big move. I need to step back and think about the big picture. Drake is tidyverse-friendly in the sense that the data frame is the first-class citizen: workflow plans are data frames, not remake YAML files or Makefiles. But elevating drake to be a proper extension of the tidyverse is something else. If done correctly, this will major contribution to the user experience. If done poorly, it will cause more unintuitive behavior that requires extra documentation. Drake is an unfamiliar idea, and most people already find it difficult to learn.

A major strength of the current drake is that a command is just an ordinary code chunk. Evaluating it in make() is the same as evaluating it in your console (provided that your dependencies are loaded in your environment: loadd(..., deps = TRUE)). Building on top of the tidy evaluation would make those two modes of execution different. I am not sure I am comfortable with that. On the other hand, if we do make the change, then updates to the documentation will be easy because the section on diagnosing failed targets is already almost there.

kendonB · 2018-01-30T04:16:12Z

This is certainly a big move and you would probably want it on it's own branch to start.

I am not so concerned about code running differently from the console vs in make as the use of this feature is really just for power users who would be able to navigate fine. The nse interface in rlang is so particular that no one would put that in inadvertently.

One thing I've found a couple of times is that I want to create commands that depend on targets. For example, a regression command depends on the names of an intermediate dataset that only exists as a target. I workaround this by calling make twice. Once after the intermediate dataset is built. tidy evaluation might allow users to overcome these awkward workarounds with slightly less awkward workarounds.

wlandau · 2018-01-30T04:25:29Z

Yeah, I guess tidy evaluation does not totally overturn evaluation itself. And if the code in actual commands could depend on the values of previous targets, #77 would have a lot more potential.

Begins to address #200. Tries to support the !! operator and quosures.

wlandau · 2018-01-30T06:24:05Z

From this blog, it should be enough to replace my use of quote() with rlang::expr(), as far as make() is concerned. In fact, c89f028#diff-74379f67ceff780b514be78e35c7e21cL101 should have done the trick. Unfortunately, I rely on formatR::tidy_source() to tidy up commands here, and there is an odd bug:

formatR::tidy_source(text = "!!little_b", output = FALSE)$text.tidy
## [1] "!(!little_b)"

Is there a pre-packaged equivalent that is more tidyverse-compliant?

wlandau · 2018-01-30T06:54:56Z

Tidying commands is important because that's how drake ignores changes to comments and whitespace. Unfortunately, tidying text is likely to involve some form of deparse(parse(...)). (See this StackOverflow post.)

For maximum compatibility and increased safety all around, here is what I am thinking: the command that is reformatted with tidy_command() need not be the command that is actually evaluated. There should really be two versions of each command:

An evaluation command, which is as close to the command in the original workflow plan as possible. This is the command that will run and actually evaluate the target. There is no need to cache/hash it because the workflow plan already has it.
A standardized command. Standardization is a kind of reformatting that strips away non-standard whitespace and comments. This is what allows drake to avoid triggering excessive rebuilds from changes to whitespace and comments.

wlandau · 2018-01-30T07:37:51Z

Another setback: non-lexical scoping in tidy eval:

my_plan <- drake_plan(list = c(
  little_b = "\"b\"",
  letter = "!!little_b"
))
make(my_plan, envir = new.env(parent = globalenv()))

## fail letter

diagnose(letter)

## [1] "object 'little_b' not found"

wlandau · 2018-01-30T07:39:50Z

On the plus side, the following works in d8431e0 since all the environments agree. We're close.

my_plan <- drake_plan(list = c(
  little_b = "\"b\"",
  letter = "!!little_b"
))
make(my_plan)
readd(letter)

## [1] "b"

Now, is there a way to force a quasiquotation to use old-fashioned lexical scoping?

wlandau · 2018-01-30T08:05:39Z

What we really need is a way to decide the environment where quasiquotation symbols are resolved. See this post.

Quasiquotation with !! is enabled in commands, etc. Supports #200.

wlandau · 2018-01-30T12:46:44Z

ccd967c seems to work for all parallel computing options currently implemented. Confirmation:

install.packages("drake", type = "source", repos = NULL)
devtools::load_all("drake")
for(s in testing_scenario_names()){ # parallel computing and global/custom environment scenarios
  set_testing_scenario(s)
  print(get_testing_scenario())

test_with_dir("make() does tidy eval in commands", {
  con <- dbug()
  con$plan <- drake_plan(list = c(
    little_b = "\"b\"",
    letter = "!!little_b"
  ))
  con$targets <- con$plan$target
  testrun(con)
  expect_equal(readd(letter), "b")
})
}

Now, drake_plan() needs just work, and we're all set.

wlandau · 2018-01-30T13:37:51Z

@kendonB if !! is evaluated indrake_plan(), then there would no longer be a !! to evaluate in make(). Users would need the list argument of drake_plan() to suppress the evaluation of !! and save it for make(). Are you okay with that?

wlandau · 2018-01-30T14:19:52Z

From this discussion, the only option may be to evaluate !! right away.

AlexAxthelm · 2018-01-30T14:38:14Z

I'm curious about the use case here. It seems that the goal here is to use a evaluate an object while creating a plan, so that plans can be altered dynamically, but isn't that the domain of evaluate_plan()? I agree that it can be a little kludgy for the toy example here, and I know that I've been running into issues with multiple rules not expanding the way I want (will file separately), but I believe that this is the equivalent in current drake paradigm:

library("drake")
little_b <- "b"
letter_template <- drake_plan(
	letter = letter__
	)
letter_plan <- evaluate_plan(
	letter_template,
	rules = list(letter__ = little_b)
	)
letter_plan
#    target command
#1 letter_b       b

I agree that being tidy-friendly is a good thing, but I worry about what to do if I want to do if I want to delay evaluation of a variable until make. Personally, I often make plans along the lines of

drake_plan(foo = bar %>% summarize(baz = mean(rlang::UQ(as.name("baz")))

in order for drake to not identify baz as a dependency, when it is actually just a column.

Supports #200

wlandau · 2018-01-30T14:54:58Z

Since you brought that up, I am planning to go ahead and use tidy evaluation in ... by default, but implement a tidy_evaluation argument in drake_plan() to disable it. From this post, I think recovering !!x unevaluated would be really messy if it is possible at all, which makes me think it would be safer to just embrace tidy evaluation. For most users, the current lack of tidy evaluation is a waste: the other parsing techniques I know just turn !!x into !(!x), so we're in a "use it or lose it" situation there.

Support tidy evaluation (solve #200)

wlandau · 2018-01-30T15:31:55Z

Closed via #202. Thanks @kendonB and @AlexAxthelm!

kendonB · 2018-03-06T22:01:25Z

Coming back to this now. As far as I can see, there is still no way to delay tidy evaluation until the making stage? The use case of constructing commands that depend on the values of targets is then still unavailable in drake?

krlmlr · 2018-03-06T22:40:18Z

Seems so. I have proposed some ideas in #233.

krlmlr · 2018-03-06T22:41:41Z

It seems worthwhile to incorporate concepts from tidy evaluation there.

wlandau · 2018-03-07T02:09:28Z

I agree that like so many other drake issues, the right approach is #233. In the short term, you can delay evaluation by writing your commands as text (clearly not ideal in the long run).

library(drake)
plan <- drake_plan(list = c(index = "sample.int(26, 1)", letter = "LETTERS[!!index]"))
make(plan)
#> cache /tmp/Rtmp4BtHQk/.drake
#> connect 1 import: plan
#> connect 2 targets: index, letter
#> check 2 items: LETTERS, sample.int
#> check 1 item: index
#> target index
#> check 1 item: letter
#> target letter
readd(letter)
#> cache /tmp/Rtmp4BtHQk/.drake
#> [1] "L"

AlexAxthelm · 2018-03-07T14:38:12Z

nb: I have a stopgap workaround that sort of works for me, but it involves calling make on 2 plans, and only works on simplistic expansions.

I use one plan to generate the data and create the levels for my splits, and then another to evaluate those levels. It loses the explicit dependency between the data source and the final analysis, but for something where there are changing levels, or the levels aren't known prior to runtime, I haven't found anything better in the current paradigm.

An example:

library("drake")
library("dplyr")
write.csv(iris, "iris.csv")

data_and_levels <- drake_plan(strings_in_dots = "literals",
  data = read.csv(file_in("iris.csv")),
  species_list = data %>% pull("Species") %>% unique()
)

make(data_and_levels)
loadd(species_list)

process_data_template <- drake_plan(strings_in_dots = "literals",
  species_data = dplyr::filter(data, Species == "XXSPECIES"),
  species_analysis = lm(
    formula = Sepal.Length ~ Sepal.Width + Petal.Length,
    data =species_data_XXSPECIES
  )
)

process_data <- evaluate_plan(
  plan = process_data_template,
  rules = list(XXSPECIES = species_list)
)

second_plan <- process_data %>%
  bind_rows(gather_plan(., target = "end"))

vis_drake_graph(drake_config(second_plan))

make(second_plan)

wlandau added the type: new feature label Jan 30, 2018

wlandau pushed a commit that referenced this issue Jan 30, 2018

Replace quote() with rlang::expr() in run.R

c89f028

Begins to address #200. Tries to support the !! operator and quosures.

wlandau mentioned this issue Jan 30, 2018

Decide the environment where !! is evaluated r-lib/rlang#399

Closed

wlandau pushed a commit that referenced this issue Jan 30, 2018

Enable make() to use rlang::expr()

ccd967c

Quasiquotation with !! is enabled in commands, etc. Supports #200.

wlandau pushed a commit that referenced this issue Jan 30, 2018

Use quasiquotation in drake_plan()

2d5bee1

Supports #200

wlandau pushed a commit that referenced this issue Jan 30, 2018

Merge pull request #202 from ropensci/tidy_eval

1e50514

Support tidy evaluation (solve #200)

wlandau closed this as completed Jan 30, 2018

This was referenced Feb 1, 2018

Consider styler to replace formatR #201

Closed

Use tibbles and language objects internally instead of data frames and text #247

Closed

wlandau mentioned this issue Feb 13, 2018

Make it easier to load workflows structured as packages. #208

Closed

wlandau mentioned this issue Apr 8, 2018

CRAN check errors in drake 5.1.0 #356

Closed

wlandau mentioned this issue Jul 3, 2018

Metaprogramming Difficulties (vignette potentially helpful) #451

Closed

wlandau mentioned this issue Sep 25, 2018

Challenges in progamming with tidyeval #518

Closed

wlandau mentioned this issue Apr 2, 2019

How to use unquote in actual code #811

Closed

3 tasks

wlandau mentioned this issue Aug 26, 2019

add support for strings and files for combine() #996

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incorporate rlang tidy evaluation in drake #200

Incorporate rlang tidy evaluation in drake #200

kendonB commented Jan 30, 2018

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018 •

edited

kendonB commented Jan 30, 2018

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

AlexAxthelm commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

kendonB commented Mar 6, 2018

krlmlr commented Mar 6, 2018 •

edited

krlmlr commented Mar 6, 2018

wlandau commented Mar 7, 2018

AlexAxthelm commented Mar 7, 2018

Incorporate rlang tidy evaluation in drake #200

Incorporate rlang tidy evaluation in drake #200

Comments

kendonB commented Jan 30, 2018

wlandau commented Jan 30, 2018 • edited

wlandau commented Jan 30, 2018 • edited

kendonB commented Jan 30, 2018

wlandau commented Jan 30, 2018 • edited

wlandau commented Jan 30, 2018 • edited

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

AlexAxthelm commented Jan 30, 2018

wlandau commented Jan 30, 2018

wlandau commented Jan 30, 2018

kendonB commented Mar 6, 2018

krlmlr commented Mar 6, 2018 • edited

krlmlr commented Mar 6, 2018

wlandau commented Mar 7, 2018

AlexAxthelm commented Mar 7, 2018

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018 •

edited

wlandau commented Jan 30, 2018 •

edited

krlmlr commented Mar 6, 2018 •

edited