Targets for Notebooks #469

nviets started this conversation in Ideas
May 7, 2021 · 4 comments · 47 replies

Are there any plans for recommended strategies to integrate targets with Rmd Notebooks? Python has dataflow, and Julia has Pluto. Targets seems like a natural starting point for similar functionality in Rmd.

Replies

Through tarchetypes, targets seamlessly integrates with R Markdown, including parameterized reports. This setup allows you to run reports as part of a pipeline, knit them interactively, and easily go back and forth between these two options. In this scenario, a good R Markdown report is 99% prose and 1% R code, taking advantage of targets you already computed upstream in the pipeline. Details:
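As a concrete sketch of that setup (tar_render() is the real tarchetypes function; the file, target, and function names here are hypothetical placeholders):

```r
# _targets.R: the report is just another target at the end of the pipeline.
# get_data(), run_analysis(), and report.Rmd are illustrative placeholders.
library(targets)
library(tarchetypes)
list(
  tar_target(data, get_data()),
  tar_target(analysis, run_analysis(data)),
  # tar_render() tracks report.Rmd and re-renders only when the report
  # source or the targets it reads change.
  tar_render(report, "report.Rmd")
)
```

Inside report.Rmd, chunks call targets::tar_read(analysis), so the report itself stays mostly prose.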

20 replies
@nviets

Thanks for the reply! We love targets' integration with Markdown in the sense of generating parameterized reports from the targets cache. Above, I was thinking of targets more as a backend to power an interactive document experience in the sense of Pluto and dataflow. A lot of users program interactively in Rmd notebooks, and there isn't really a solution in Rmd today similar to Pluto and Dataflow. If we could turn Rmd chunks into targets and have chunks activate a tar_make command, we'd get all the benefits of targets paired with the interactive experience of notebooks.

@wlandau

If we could turn Rmd chunks into targets...

I usually push back against R Markdown-driven development because literate programming is not designed for computationally intense work. That said, at the interface level, it is possible to represent a pipeline as an R Markdown file.


---
title: "Example report"
output: tarchetypes::tar_render_pipeline
---

```{r, memory = "transient"}
library(tidyverse)
get_data <- function() {
  # ...
}
run_analysis <- function(data) {
  # ...
}
```

Prose for the first target:

```{r analysis_target, format = "qs"}
data <- get_data()
run_analysis(data)
```

Prose for the second target:

```{r summary_target, format = "parquet"}
analysis_target %>%
  group_by(model, method) %>%
  summarize(accuracy = hits / total) %>%
  ungroup()
```

A preprocessor, maybe a new function in tarchetypes, could convert this R Markdown file into a _targets.R file.

# _targets.R
targets::tar_option_set(memory = "transient") # from the options of the unnamed chunks

# Unnamed chunks:

library(targets)
library(tidyverse)

get_data <- function() {
  # ...
}
run_analysis <- function(data) {
  # ...
}

# Named chunks are targets:
list(
  tar_target(
    analysis_target, {
      data <- get_data()
      run_analysis(data)
    },
    format = "qs" # from the chunk options
  ),
  tar_target(
    summary_target, {
      analysis_target %>%
        group_by(model, method) %>%
        summarize(accuracy = hits / total) %>%
        ungroup()
    },
    format = "parquet" # from the chunk options
  )
)

Then all the functions in targets would be available, from tar_make() to tar_visnetwork(). And it might be possible to configure the knitr engine to run the preprocessor and then tar_make().
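A rough sketch of what that engine hook could look like (knitr::knit_engines$set() and knitr::engine_output() are real knitr APIs; record_chunk() and write_targets_file() are hypothetical helpers):

```r
# Hedged sketch: a custom knitr language engine that accumulates chunk
# code and, when asked, writes _targets.R and runs the pipeline.
knitr::knit_engines$set(targets = function(options) {
  record_chunk(options)            # hypothetical: store this chunk's code
  if (isTRUE(options$tar_make)) {
    write_targets_file()           # hypothetical: emit the _targets.R file
    targets::tar_make()
  }
  knitr::engine_output(options, code = options$code, out = "")
})
```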

Is that pretty much what you had in mind?

@wlandau

Edit: IIRC, the output field of the YAML front matter lets us take control of how knitr runs the report. I added a placeholder above for output: tarchetypes::tar_render_pipeline.

A few more thoughts on this R Markdown interface idea:

Rendered reports

What should the actual rendered report do? Since most code chunks will be individual targets, one obvious choice is to print each target just below its respective code chunk. This would flow nicely and naturally from a literate programming perspective, but we should limit the targets printed in case the data is too large. We could either disable reading and printing by default, or we could only print small non-dynamic targets. I like the latter.

How to implement the rendered report

I need to read up on custom language engines and custom chunk engines.

Naively, we could create a temporary copy of the report that runs tar_read_raw() in each target chunk instead of the command.
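For instance (tar_read_raw() is the real targets function; the chunk/target name is illustrative), each target chunk in the temporary copy could become:

```r
# In the temporary report, replace the command of the chunk named
# analysis_target with a read of its stored value from the data store:
analysis_target <- targets::tar_read_raw("analysis_target")
analysis_target
```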

Code chunk behavior

When run interactively in notebook mode, a chunk should run with no side effects (maybe inside local()), assign the return value to an object with the same name as the chunk, and print the object to the screen. These guardrails would enforce the purity and immutability requirements of targets and counteract the dangerous R-Markdown-style looseness that folks might not otherwise be wary of.
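A minimal sketch of that guardrail in base R (run_chunk() is a hypothetical helper, not part of targets or knitr):

```r
# Evaluate a chunk's code in a child environment so stray assignments
# stay contained, then bind only the return value to the chunk's name.
run_chunk <- function(name, code, envir = globalenv()) {
  tmp <- new.env(parent = envir)            # side effects land here, not in envir
  value <- eval(parse(text = code), envir = tmp)
  assign(name, value, envir = envir)        # chunk name becomes the object name
  print(value)
  invisible(value)
}
```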

Other files

Besides the output HTML, we know we need to generate a _targets.R file. Beyond that, there is an opportunity to enforce some degree of modularity. But I hesitate to do so because there is no one-size-fits-all way to organize files, the report does not provide enough structure anyway, and the unpredictable file names might disrupt projects. At the risk of creating a monolithic dumping ground for messy code (a vice which literate programming enables with impunity) I think it would be okay to go with a single _targets.R file, at least to start.

3 replies
@wlandau

Guardrails

  1. The generated _targets.R file should be treated as read-only in the RStudio IDE, e.g. begin it with the comment "# Generated by targets::tar_renv(): do not edit by hand".
  2. We should have some way to nudge users away from creating multiple pipeline-generating notebooks in the same working directory.
@nviets

Thanks @wlandau ! For generating _targets.R, it might be worth looking into knitr's purl, which allows code from chunks to be selectively written out to an external file.
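For reference, a minimal purl() call (assuming a report.Rmd in the working directory):

```r
# knitr::purl() extracts the code chunks of an R Markdown file into a
# script; documentation = 0 drops the prose and keeps only the code.
knitr::purl("report.Rmd", output = "_targets.R", documentation = 0)
```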

@wlandau

Yeah, on reflection this feature may end up looking something like a glorified purl, depending on what turns out to be possible through output and language engines.

Some chunk options (e.g. pattern) should be taken as language objects and not evaluated as R code. I can make that happen in non-interactive mode using the eval.after knit option and knitr::engine_output(). But if I run chunks in the RStudio IDE, options like pattern are executed like ordinary R code. Any ideas on how to suppress this? Filed an issue at rstudio/rstudio#9407.

[Screenshot: the RStudio IDE evaluating the pattern chunk option as ordinary R code]

21 replies
@cderv

How is pattern processed?

I believe chunk options in knitr are supposed to be evaluated. Using eval.after is something of a trick: it is meant only to delay evaluation (e.g. a figure caption that uses an object defined in the chunk), not to cancel evaluation by suppressing the option during code processing. Chunk options are just R code, so an option that should evaluate to a language object should be a language object in R too, shouldn't it?

I am trying to understand better how this should work at a low level.

# error like in chunk
option1 <- map(data)
#> Error in map(data): could not find function "map"
# object
option2 <- expression(map(data))
option2
#> expression(map(data))
is.language(option2)
#> [1] TRUE
# rlang ? 
option3 <- rlang::expr(map(data))
option3
#> map(data)
is.language(option3)
#> [1] TRUE

Created on 2021-05-25 by the reprex package (v2.0.0)

What you're looking for feels like an NSE chunk option? Something like that?

An idea you may have tried already, and maybe not helpful: could an options hook help you process the option the way you want?

I'll look closer at the engine you wrote to see how this is processed.

By the way, what you encounter with the IDE is an IDE thing: eval.after does not work there the way you would expect from knitr. I believe chunk options are evaluated as code by the interactive chunk mechanism, as you saw in the code.

@wlandau

The pattern argument drives dynamic branching: https://books.ropensci.org/targets/dynamic.html. It is a DSL that allows users to define new targets while the pipeline is running. We do not know in advance which targets or how many targets will be created because that all depends on upstream targets. In the following example, y is a dynamic target that depends on target x.

# _targets.R file
library(targets)
list(
  tar_target(x, seq_len(sample.int(10, 1))),
  tar_target(y, 100 * x, pattern = map(x))
)

We cannot predict precisely how many y branches we will have, but we do know that we will end up with one branch of y for every vector element of x.

# R console
tar_make()
#> • start target x
#> • built target x
#> • start branch y_29239c8a
#> • built branch y_29239c8a
#> • start branch y_7cc32924
#> • built branch y_7cc32924
#> • start branch y_bd602d50
#> • built branch y_bd602d50
#> • built pattern y
#> • end pipeline

When a target is defined, pattern = map(x) is quoted and turned into an expression object.

target_object <- tar_target(y, 100 * x, pattern = map(x))
target_object$settings$pattern
#> expression(map(x))

The expression map(x) does not actually run until the pipeline is running and it is time to process target y (only after processing target x). And when it does run, it runs only in a special environment that defines special versions of map(), cross(), etc. that act on special representations of upstream dependency targets (each dependency target is represented as a one-column data frame of branch/bud names).

So the pattern argument is a DSL that needs special delayed processing.
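A simplified illustration of that delayed processing in base R (this is not the actual targets internals): capture the pattern unevaluated, then evaluate it later in an environment that supplies its own map().

```r
# Capture the pattern expression without evaluating it.
capture_pattern <- function(pattern) substitute(pattern)
p <- capture_pattern(map(x))
is.language(p)
#> [1] TRUE

# Much later, evaluate it against a special environment whose map()
# stands in for the real branching logic.
dsl <- new.env()
dsl$map <- function(x) paste0("branch for each of ", length(x), " elements")
eval(p, envir = list(x = 1:3), enclos = dsl)
#> [1] "branch for each of 3 elements"
```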

For Target Markdown code chunks, I thought I had a workaround for interactive mode:

  1. Set pattern in eval.after.
  2. Take control of how options$code is evaluated, so evaluation behaves like a target.

envir_knitr <- knitr::knit_global()
envir <- new.env(parent = envir_knitr)
expr <- parse(text = options$code)
tidy_eval <- options$tidy_eval %|||% TRUE
expr <- tar_tidy_eval(expr = expr, envir = envir, tidy_eval = tidy_eval)
value <- eval(expr, envir = envir)
assign(x = name, value = value, envir = envir_knitr)

  3. Set eval = FALSE for further processing so the eval.after chunk options like pattern never get evaluated.

options$eval <- FALSE

  4. Use knitr::engine_output() to wrap up:

knitr::engine_output(options = options, code = options$code, out = out)

But as you know from rstudio/rstudio#9407, IDEs may ignore eval.after and incorrectly evaluate pattern.

Maybe this snag is minor: not all targets use dynamic branching, and non-interactive mode is totally unaffected. Still, other target factories may have quoted arguments, and this same issue could present other problems. With pattern and potential arguments like it, I have to make a decision:

  1. Go forward as planned, with the understanding that pattern will break, with no ability to throw an informative error message (because the engine breaks). Or,
  2. Encourage users to put language arguments in quotes, e.g. {tar_target y, pattern = "map(x)"}. Or,
  3. Avoid chunk options altogether and choose a different interface for supplying target arguments.

I don't like (3) because chunk options are idiomatic and mostly convenient, despite the bugs and limitations. And I don't like (2) because the pattern argument of the real tar_target() function does not accept character strings, and this little inconsistency could confuse users. At least with (1), users will see an error in the console in interactive mode, and there may still be some sort of workaround. If we could at least get ahead of the chunk option processing just to prevent errors, then we could put the engines in control and be home free.

@wlandau

Idea you may have tried and maybe not helpful: Does options hook could help you process the option the way you want?

I tried, but it does not seem to help. I am having trouble getting option hooks to run in interactive mode. In the following report, chunk b does not call browser() as I would have expected based on chunk a.

---
title: "test"
output: html_document
---

```{r a}
knitr::opts_hooks$set(
  opt = function(options) {
    browser()
  }
)
```

```{r b, opt = "value"}
1
```

I love how this feature turned out! Resources:

My last question is about syntax highlighting. Is there a way to use the same syntax highlighting for the {targets} engine as the {r} engine?

3 replies
@cderv

In the RStudio IDE, this is a matter of IDE support. I am not sure how it detects which language to highlight in the chunk, but I suspect it uses the chunk engine.

Regarding the HTML output, the markup for the source code is derived from the engine name, so this is a matter of what your engine outputs. I would need to look at your code to know the specifics in your case, but basically you may be able to modify the relevant option after your engine has done its work.

Or we need to teach knitr about your engine name 😉

@cderv

So I believe your engine is this one

targets/R/tar_engine.R

Lines 68 to 71 in 82ead80

tar_engine_output <- function(options, out) {
  code <- paste(options$code, collapse = "\n")
  knitr::engine_output(options = options, code = code, out = out)
}

So you could do

tar_engine_output <- function(options, out) {
  code <- paste(options$code, collapse = "\n")
+ # to get the correct markup later in knitr's source hook
+ options$engine <- "r"
  knitr::engine_output(options = options, code = code, out = out)
}

This happens after your engine has done its work, so I don't think changing the engine will affect how the code is evaluated. It will just be used for further processing, and deciding which class to apply to the fenced code block's attributes in the Markdown output is one of those steps.

I am a bit lazy as it is Friday night here in France, so I did not try it :) Shall I let you try?

Another solution would be for you to set the class.source option to r, either through your engine or using an option hook (though it seems the targets chunk option is not meant to be provided each time):
https://bookdown.org/yihui/rmarkdown-cookbook/chunk-styling.html#chunk-styling

Adding the class is what triggers the Pandoc highlighting (https://pandoc.org/MANUAL.html#fenced-code-blocks)

If this does not work as I thought, we'll surely be able to do something (maybe adding a language option or similar).

@wlandau

Thanks so much, Christophe! options$engine <- "r" is perfect! Works out of the box.
