Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should there be a "suppress" trigger? #529

Closed
wlandau opened this issue Oct 4, 2018 · 11 comments
Closed

Should there be a "suppress" trigger? #529

wlandau opened this issue Oct 4, 2018 · 11 comments

Comments

@wlandau
Copy link
Member

wlandau commented Oct 4, 2018

trigger(suppress = some_code) would work like trigger(condition = ...) except it would override all the other triggers (except condition) and avoid the build. The second part of #528 brings up a use case, but I am not sure if it is worth making triggers more complicated. Deserves a couple days of thought.

@idavydov
Copy link

idavydov commented Oct 4, 2018

I think suppress is a great idea. Additionally suppressed nodes could even have a different color in the graph visualization.

For my use-case an alternative would be to have a permanent error output, which will suppress recomputation of erroneous nodes.

@idavydov
Copy link

idavydov commented Oct 4, 2018

Another solution which could solve my problem is a special value, e.g. drake_stop(). This also looks like a very flexible solution. So the following code

plan <- drake_plan(
  a = rnorm(1),
  b = if (a>0) sqrt(a) else NULL,
  c = if (is.null(b)) NULL else b+1,
  d = if (is.null(c)) NULL else c+1
)

Would become:

plan <- drake_plan(
  a = rnorm(1),
  b = if (a>0) sqrt(a) else drake_stop(),
  c = b+1,
  d = c+1
)

On top of that node b (and its' children?) could be visualized using another color.

Also, the workaround with NULL becomes rather complicated when a target depends on multiple targets. So far I came up with the following workaround:

result.or.null <- function(FUN, ...) {
  args <- list(...)
  if (any(sapply(is.null, args))) NULL else do.call(FUN, args)
}

plan <- drake_plan(
  a = rnorm(1),
  b = if (a>0) sqrt(a) else NULL,
  c = result.or.null(my.func, b),
  d = result.or.null(my.func, c),
  e = result.or.null(my.func, c, d)
)

But this solution is very repetitive and rather limiting. E.g., I cannot use expressions anymore.
And you might expect some functions which expect NULL in normal conditions. (Here, again I could replace NULL with a special value)

@wlandau
Copy link
Member Author

wlandau commented Oct 5, 2018

I could picture drake_stop() as a function that outputs a special S3 object. In should_build_target(), drake could conceivably go through the dependencies to check for the output of drake_stop(), but this could be slow for large workflows.

Another side note for clarity. The condition trigger really should have followed the following convention:

  • TRUE: always build the target.
  • FALSE: never build the target.
  • NULL: defer to the other triggers.

The current convention is:

  • TRUE: always build the target.
  • FALSE: defer to the other triggers.

But it is too late to make the change now.

@idavydov
Copy link

idavydov commented Oct 5, 2018

  1. It seems that trigger condition is a more flexible and less computationally intensive approach. It could be possible to have an option to enable drake_stop() not to slowdown normal workflows, although this sounds slightly complicated.
  2. In principle it is possible to have the following conventions without breaking backward compatibility:
  • "always" or TRUE: always build the target.
  • "defer" or FALSE: defer to the other triggers.
  • "never": never build the target.
  1. Alternatively it could be
pkgconfig::set_config("drake::trigger_condition" = "three-states")

So far it seems that the second option is the least ugly one and doesn't seem to hit the performance.

There should be also a way for drake to understand not build targets which depend non-existing targets.

@wlandau
Copy link
Member Author

wlandau commented Oct 6, 2018

The more I think about it, the more I actually like TRUE/FALSE because logicals are more convenient to generate than special strings.

After some reflection, I am basically convinced that we need a new trigger. At the moment, I think trigger(skip = ...) should have the power to blacklist targets, and it should override all other triggers, including condition.

The problem I see with drake_stop() is that it is trying to do what triggers do, but in commands. I strongly prefer to keep commands and triggers separate.

@wlandau
Copy link
Member Author

wlandau commented Oct 7, 2018

Nope, that idea has too much refactoring. A new skip trigger would require more dependency detection, which means a lot more sensitive code in build_drake_graph() and slower preprocessing.

How about a new mode argument to trigger()? Here is what I am thinking.

  • trigger(mode = "whitelist"): current behavior. Default.
  • trigger(mode = "blacklist"): if condition evaluates to FALSE, override all the other triggers and skip the target. Otherwise, defer to the other triggers.
  • trigger(mode = "condition"): the condition trigger is the only decider. In other words, if condition evaluates to TRUE, ignore the other triggers and build the target. If it evaluates to FALSE, ignore the other triggers and skip the target.

wlandau pushed a commit that referenced this issue Oct 7, 2018
@wlandau
Copy link
Member Author

wlandau commented Oct 7, 2018

cc @kendonB because you originally requested triggers. I think trigger modes as described above would combine the best of several worlds.

wlandau pushed a commit that referenced this issue Oct 7, 2018
@wlandau wlandau closed this as completed in 6f666c7 Oct 7, 2018
@wlandau
Copy link
Member Author

wlandau commented Oct 7, 2018

See ?trigger for the new functionality.

@idavydov
Copy link

idavydov commented Oct 8, 2018

Thanks a lot for implementing this!

However, it seems that the new trigger functionality can control if a targets gets rebuilt, but it cannot control if it's built for the first time.

plan <- drake_plan(
  a = -100,
  b = target(
    sqrt(a),
    trigger=trigger(condition=a>0, mode='blacklist')
  ),
  c = b+1
)
clean(destroy=TRUE)
make(plan)

The output:

target a
target b
Warning: target b warnings:
  NaNs produced
target c
Warning message:
In sqrt(a) : NaNs produced

Is this an expected behavior?

It is consistent with the current documentation, however not sure if it solves the use-case from #528.

@wlandau
Copy link
Member Author

wlandau commented Oct 8, 2018

I may allow missing targets at some point, but does make things more complicated (both because of drake's philosophy and the current internals) and I do not think the majority of use cases need it. I will still think about it. For now, you can ignore those initial warnings.

2 recommendations that I think meet your use case:

  1. Start with a fake plan to prime the pump.
  2. Use the same blacklist trigger for targets downstream of b.
library(drake)
pkgconfig::set_config("drake::strings_in_dots" = "literals")
fake_plan <- data.frame(target = letters[1:3], command = 1)
make(fake_plan)
#> target a
#> target b
#> target c
plan <- drake_plan(
  a = target(-100, trigger = trigger()),
  b = sqrt(a),
  c = b + 1
)
make(plan, trigger = trigger(condition = a > 0, mode = "blacklist"))
#> target a
#> Used non-default triggers. Some targets may not be up to date.

Created on 2018-10-08 by the reprex package (v0.2.1)

@wlandau
Copy link
Member Author

wlandau commented Jan 12, 2020

FYI: the new cancel() and cancel_if() functions may be useful in similar situations: #1135.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants