pretty printing of plans #489

wlandau · 2018-08-02T13:02:54Z

It is difficult to visually inspect long commands in workflow plan data frames. Example: #257 (comment).

library(drake)
pkgconfig::set_config("drake::strings_in_dots" = "literals")
drake_plan(
  u_auckland = make_place(
    Name = "University of Auckland",
    Latitude = -36.8521369,
    Longitude = 174.7688785
  ),
  shapefile = {
    file_out("u-auckland.prj", "u-auckland.shx", "u-auckland.dbf")
    st_write(
      obj = u_auckland,
      dsn = file_out("u-auckland.shp"),
      driver = "ESRI Shapefile",
      delete_dsn = TRUE
    )
  }
)
#> # A tibble: 2 x 2
#>   target     command                                                      
#> * <chr>      <chr>                                                        
#> 1 u_auckland "make_place(Name = \"University of Auckland\", Latitude = -3…
#> 2 shapefile  "{\n    file_out(\"u-auckland.prj\", \"u-auckland.shx\", \"u…

Desired output:

# A tibble: 2 x 2
  target     command                                             
* <chr>      <drake_cmd>
1 u_auckland make_place(
               Name = "University of Auckland",
               Latitude = -36.8521369,
               Longitude = 174.7688785
             )
2 shapefile  {
               file_out(
                 "u-auckland.prj",
                 "u-auckland.shx",
                 "u-auckland.dbf"
               )
               st_write(
                 obj = u_auckland,
                 dsn = file_out("u-auckland.shp"),
                 driver = "ESRI Shapefile",
                 delete_dsn = TRUE
               )
             }

I am not sure if this is possible, but I think it may be worth trying some combination of pillar::pillar_shaft(), styler, highlight, and crayon.

The text was updated successfully, but these errors were encountered:

connoralbright · 2018-08-02T13:50:44Z

This would be a very, very nice feature.

krlmlr · 2018-08-02T21:23:11Z

How about a pretty-print method that prints/returns a drake_plan() call, with indentation?

wlandau · 2018-08-03T00:44:15Z

Brilliant! Much easier and just as good.

wlandau · 2018-08-03T12:26:04Z

We will just have to make it really clear that the plan is just a data frame and that people do not always need to use drake_plan(). Ref: #451 (comment). Maybe a comment would be enough.

> print(plan)
data frame generated by drake_plan(
  u_auckland = make_place(
                 Name = "University of Auckland",
                 Latitude = -36.8521369,
                 Longitude = 174.7688785
               )
  shapefile = {
                file_out(
                  "u-auckland.prj",
                  "u-auckland.shx",
                  "u-auckland.dbf"
                )
                st_write(
                  obj = u_auckland,
                  dsn = file_out("u-auckland.shp"),
                  driver = "ESRI Shapefile",
                  delete_dsn = TRUE
                )
              }
)

wlandau · 2018-08-05T02:26:54Z

We should define a special S3 class and print method for printing, and I am having some initial trouble. The guide on extending tibble helped me get started, but I still do not know how to make sure an S3 "drake_plan" class gets preserved during dplyr operations.

as_drake_plan <- function(x, ...){
  class(x) <- c("drake_plan", "tbl_df", "tbl", "data.frame")
  x
}

drake_plan.as.data.frame <- as_drake_plan
drake_plan.as_data_frame <- as_drake_plan
drake_plan.as_tibble <- as_drake_plan
drake_plan.as.tibble <- as_drake_plan

`[.drake_plan` <- function(...){
  as_drake_plan(NextMethod())
}

library(dplyr)
library(drake)
plan <- drake_plan(
  x = get_data(),
  y = analyze_data(x)
) %>%
  as_drake_plan() %>%
  print
#> # A tibble: 2 x 2
#>   target command        
#> * <chr>  <chr>          
#> 1 x      get_data()     
#> 2 y      analyze_data(x)
class(plan)
#> [1] "drake_plan" "tbl_df"     "tbl"        "data.frame"
filter(plan, target == "x") %>%
  class()
#> [1] "tbl_df"     "tbl"        "data.frame"

Related Stack Overflow post here.

wlandau · 2018-08-05T02:41:49Z

Just started to construct the call in the drake_plan_call branch.

wlandau · 2018-08-05T03:30:12Z

We now have an internal drake_plan_call() function to generate calls to drake_plan() from actual plans. Ref: #493.

wlandau · 2018-08-05T03:30:20Z

wlandau · 2018-08-05T12:33:41Z

Update: from #494, we now have a special "drake_plan" S3 class for plans. dplyr functions still seem to drop the class, but other than that, we have the ground work now. A new print.drake_plan() method will come eventually. I am not exactly sure on the best style, but I am convinced that it should be some cleverly indented and colored version of a call to drake_plan().

wlandau · 2018-08-05T12:34:06Z

Continued work on this will be in the files R/drake_plan_class.R and R/drake_plan_print.R.

wlandau · 2018-08-05T12:41:23Z

So apparently, dropping tibble subclasses is already a known dplyr issue: tidyverse/dplyr#2532. Makes me think we should just roll with it and expose the as_drake_plan() method.

wlandau · 2018-08-05T14:08:39Z

@lorenzwalthert, I would like to include you in this conversation too. This feature is likely to require a custom styler style guide for calls to drake_plan().

lorenzwalthert · 2018-08-05T16:35:30Z

Thanks. I think the fact that attributes and sub classes are generally not preserved in R are a pitty and limit the usefulness of this mission slightly. Anyways, as_drake_plan() seems a way out and I am happy to provide guidance regarding styling. Just let me know what the requirements are.

wlandau · 2018-08-06T04:22:08Z

Yeah, not the best circumstances, but I still think this feature is worth exploring.

I laid the groundwork for R/drake_plan_class.R and R/drake_plan_print.R in #493 and #494. Beyond that, I think we're in a brainstorming phase (which is the most fun part). So I think it is a bit early to have all the requirements. But I do think our raw drake_plan() call should be styled.

drake::load_mtcars_example()
my_plan$trigger <- NA
my_plan$trigger[4] <- "trigger(condition = is_tuesday(), file = FALSE)"
my_plan$non_standard_column <- 1234
# Long commands are often truncated this way:
print(my_plan)
#> # A tibble: 15 x 4
#>    target      command                  trigger           non_standard_co…
#>    <chr>       <chr>                    <chr>                        <dbl>
#>  1 report      "knit(knitr_in(\"report… <NA>                          1234
#>  2 small       simulate(48)             <NA>                          1234
#>  3 large       simulate(64)             <NA>                          1234
#>  4 regression… reg1(small)              trigger(conditio…             1234
#>  5 regression… reg1(large)              <NA>                          1234
#>  6 regression… reg2(small)              <NA>                          1234
#>  7 regression… reg2(large)              <NA>                          1234
#>  8 summ_regre… suppressWarnings(summar… <NA>                          1234
#>  9 summ_regre… suppressWarnings(summar… <NA>                          1234
#> 10 summ_regre… suppressWarnings(summar… <NA>                          1234
#> 11 summ_regre… suppressWarnings(summar… <NA>                          1234
#> 12 coef_regre… suppressWarnings(summar… <NA>                          1234
#> 13 coef_regre… suppressWarnings(summar… <NA>                          1234
#> 14 coef_regre… suppressWarnings(summar… <NA>                          1234
#> 15 coef_regre… suppressWarnings(summar… <NA>                          1234
# So we might style a drake_plan() call that could generate the plan.
call <- drake:::drake_plan_call(my_plan)
str(call)
#>  language drake_plan(report = knit(knitr_in("report.Rmd"), file_out("report.md"),      quiet = TRUE), small = simulate(48),| __truncated__ ...
# But the default print method does not guide intuition.
print(call)
#> drake_plan(report = knit(knitr_in("report.Rmd"), file_out("report.md"), 
#>     quiet = TRUE), small = simulate(48), large = simulate(64), 
#>     regression1_small = target(command = reg1(small), trigger = trigger(condition = is_tuesday(), 
#>         file = FALSE)), regression1_large = reg1(large), regression2_small = reg2(small), 
#>     regression2_large = reg2(large), summ_regression1_small = suppressWarnings(summary(regression1_small$residuals)), 
#>     summ_regression1_large = suppressWarnings(summary(regression1_large$residuals)), 
#>     summ_regression2_small = suppressWarnings(summary(regression2_small$residuals)), 
#>     summ_regression2_large = suppressWarnings(summary(regression2_large$residuals)), 
#>     coef_regression1_small = suppressWarnings(summary(regression1_small))$coefficients, 
#>     coef_regression1_large = suppressWarnings(summary(regression1_large))$coefficients, 
#>     coef_regression2_small = suppressWarnings(summary(regression2_small))$coefficients, 
#>     coef_regression2_large = suppressWarnings(summary(regression2_large))$coefficients)

My main preference is to arrange this output in a 2-column grid with each target on its own set of rows. drake plans are just data frames, and I think it is important to preserve the overall look and feel of a data frame/tibble. Beyond that, we can experiment with per-column widths, the overall width, max lines per target, max lines overall, and syntax highlighting.

Thank you for offering to provide guidance. It will really help to draw on your aesthetic opinions and styler expertise.

wlandau · 2018-08-18T17:45:49Z

After letting this issue sit for a few days, I no longer think the drake plan itself should have special printing. It is so crucial to emphasize that the plan is just a data frame, and printing out a function call by default could make the plan too mysterious. But what if we expose drake_plan_call() and add a nice print method for the output?

Most of the remaining work for that is to create a custom style guide. Relevant: https://lorenzwalthert.netlify.com/posts/customizing-styler-the-quick-way/

wlandau · 2018-08-18T17:54:43Z

I think the syntax highlighting piece may be better suited to styler itself. Ref: r-lib/styler#417.

lorenzwalthert · 2018-08-18T23:50:56Z

To be honest, I think you don't even need styler for the highlighting only. Can we just use cat(prettycode::highlight(...), sep = "\n")?

wlandau · 2018-08-19T00:06:01Z

Sure, seems to work fine in my own console.

lorenzwalthert · 2018-08-19T00:09:30Z

Ok, cool. I mean you can still use styler to prettify your code beforehand according to a (custom) style guide.

wlandau · 2018-08-19T00:20:48Z

Absolutely. I am in the beginnings of constructing a custom style guide to manage line breaks and indentation. The desired output is a drake_plan() call as formatted below.

pkgconfig::set_config("drake::strings_in_dots" = "literals")
plan <- drake::drake_plan(
  small_data = download_data("https://some_website.com") %>%
    select_my_columns() %>%
    munge(),
  large_data_raw = target(
    command = download_data("https://lots_of_data.com") %>%
      select_top_columns,
    trigger = trigger(
      change = time_last_modified("https://lots_of_data.com"),
      command = FALSE,
      depend = FALSE
    ),
    timeout = 1e3
  )
)

Text to begin with:

call <- drake:::drake_plan_call(plan)
text <- rlang::expr_text(call, width = 70)
text
#> [1] "drake_plan(small_data = download_data(\"https://some_website.com\") %>% select_my_columns() %>% \n    munge(), large_data_raw = target(command = download_data(\"https://lots_of_data.com\") %>% \n    select_top_columns, trigger = trigger(change = time_last_modified(\"https://lots_of_data.com\"), \n    command = FALSE, depend = FALSE), timeout = 1000))"

wlandau · 2018-08-19T00:45:16Z

I'm a little unclear about the role of pd_flat$newlines vs pd_flat$lag_newlines. Would you elaborate? Here is what I am seeing. Using pd_flat$newlines, no new line breaks are added.

add_line_breaks <- function(pd_flat){
  pd_flat$newlines <- pd_flat$newlines + (pd_flat$token == "SYMBOL_SUB")
  pd_flat
}
guide <- function(){
  styler::create_style_guide(line_break = tibble::lst(add_line_breaks))
}
styled <- styler::style_text(text, style = guide)
cat(prettycode::highlight(styled), sep = "\n")
#> drake_plan(small_data = download_data("https://some_website.com") %>% select_my_columns() %>%
#> munge(), large_data_raw = target(command = download_data("https://lots_of_data.com") %>%
#> select_top_columns, trigger = trigger(change = time_last_modified("https://lots_of_data.com"),
#> command = FALSE, depend = FALSE), timeout = 1000))

And with pd_flat$lag_newlines:

add_line_breaks <- function(pd_flat){
  pd_flat$lag_newlines <- pd_flat$lag_newlines | (pd_flat$token == "SYMBOL_SUB")
  pd_flat
}
guide <- function(){
  styler::create_style_guide(line_break = tibble::lst(add_line_breaks))
}
styled <- styler::style_text(text, style = guide)
cat(prettycode::highlight(styled), sep = "\n")
#> drake_plan(
#> small_data = download_data("https://some_website.com") %>% select_my_columns() %>%
#> munge(),
#> large_data_raw = target(
#> command = download_data("https://lots_of_data.com") %>%
#> select_top_columns,
#> trigger = trigger(
#> change = time_last_modified("https://lots_of_data.com"),
#> command = FALSE,
#> depend = FALSE),
#> timeout = 1000))

wlandau · 2018-08-19T00:52:33Z

I think it is appropriate to have:

A new line and a 2-space indent for each target name.
A new line and a 4-space indent for each argument to target() (only appears at the target name level).
A new line and a 6-space indent for each argument to trigger() (only appears as the trigger argument to target()).
Line breaks and appropriate indentation for other code long enough to wrap (for commands and custom triggers).

Any suggestions on wrangling pd_flat to accomplish this?

wlandau · 2018-08-19T12:50:41Z

I figured out a solution: at the top level, recurse the AST of a drake_plan() call to make sure we have the proper line breaks and indentation. Then, apply styler::style_text() to the leaf nodes. The result is quite nice, and we have syntax highlighting with prettycode::highlight() in the console.

library(drake)
plan <- drake_plan(
  small_data = download_data("https://some_website.com") %>%
    select_my_columns() %>%
    munge(),
  large_data_raw = target(
    command = download_data("https://lots_of_data.com") %>%
      select_top_columns,
    trigger = trigger(
      change = time_last_modified("https://lots_of_data.com"),
      command = FALSE,
      depend = FALSE
    ),
    timeout = 1e3
  ),
  strings_in_dots = "literals"
)
print(plan)
#> # A tibble: 2 x 4
#>   target    command                     trigger                    timeout
#> * <chr>     <chr>                       <chr>                        <dbl>
#> 1 small_da… "download_data(\"https://s… <NA>                            NA
#> 2 large_da… "download_data(\"https://l… "trigger(change = time_la…    1000
drake_plan_source(plan)
#> drake_plan(
#>   small_data = download_data("https://some_website.com") %>%
#>     select_my_columns() %>%
#>     munge(),
#>   large_data_raw =target(
#>     command = download_data("https://lots_of_data.com") %>% select_top_columns(),
#>     trigger =trigger(
#>       change = time_last_modified("https://lots_of_data.com"),
#>       command = FALSE,
#>       depend = FALSE
#>     ),
#>     timeout = 1000
#>   ),
#>   strings_in_dots = "literals"
#> )

wlandau · 2018-08-19T12:54:08Z

I do like the simplicity of deferring to styler's defaults, but the code for the recursion is longer and more complicated than I would like. A custom style guide would make it more concise.

lorenzwalthert · 2018-08-19T16:59:55Z

This indeed seems complicated. What are the requirements for the styling? E.g. line break after drake_plan( and after every target code?

wlandau · 2018-08-19T19:29:39Z

That's part of it. I would prefer each of the following to start on its own line:

Every target (argument to drake_plan())
Every argument to target()
Every argument to trigger()
Every piece of code following %>%
Every closing parenthesis.

This, plus the indentation above, are what I am after. I take it this is difficult to do in styler? Do you think a custom style guide would require more code than the existing solution?

krlmlr · 2018-11-07T10:04:21Z

FYI: The "vertical" class in pillar is being renamed to "pillar_vertical". Seems safer to use the "glue" class for this purpose, or actually any other class that just prints the contents of the object via cat() .

Ref: #489 (comment)

wlandau · 2018-11-09T13:07:11Z

Good point. I forgot I already defined a "drake_plan_source" class and a special print method, so we actually do not need to borrow an S3 class from another package.

lorenzwalthert · 2018-11-09T14:23:27Z

This, plus the indentation above, are what I am after. I take it this is difficult to do in styler? Do you think a custom style guide would require more code than the existing solution?

The problem with a style guide is that all rules are not exported from styler, so you needed :::, which gives R CMD Check notes, or implement the style guide in styler for which you can't control release schedules and other things and it adds overhead, so it's a bad dependency I guess. Plus we have not even decided on how to handle third-party style guides. So until we expose rules for styling in stylerinfra or similar, I think you are better served with the current solution.

wlandau · 2019-05-06T12:31:42Z

Update: the current behavior is to create a special expr_list class for the command, trigger, and transform columns of the plan. Using S3 magic, these columns get deparsed to characters when the plan gets printed, and an <expr> label appears at the top of the tibble.

library(drake)
drake_plan(
  data = get_data(),
  y = analyze_data(data)
)
#> # A tibble: 2 x 2
#>   target command           
#>   <chr>  <expr>            
#> 1 data   get_data()        
#> 2 y      analyze_data(data)

^{Created on 2019-05-06 by the reprex package (v0.2.1)}

Drawbacks:

This is a lot of bookkeeping for an feature as minor as pretty printing.
The entire column is deparsed when the tibble gets printed, which does not scale well. Printing should be instantaneous, we have a low tolerance for delays.

So I would rather rely on r-lib/pillar#34 and r-lib/pillar#153. But because pillar's enhanced printing relies on vctrs, there is a snag: r-lib/pillar#34 (comment). Not sure yet if it totally solves (2), but I think has potential to solve (1).

wlandau added the topic: style label Aug 2, 2018

wlandau mentioned this issue Aug 2, 2018

Can custom pillar_shaft() methods print elements over multiple lines? r-lib/pillar#125

Closed

wlandau added the difficulty: advanced label Aug 5, 2018

wlandau mentioned this issue Aug 5, 2018

Construct a call to drake_plan() from a workflow plan data frame #493

Merged

7 tasks

wlandau removed the difficulty: advanced label Aug 5, 2018

wlandau mentioned this issue Aug 5, 2018

Use an S3 "drake_plan" class #494

Merged

7 tasks

wlandau mentioned this issue Aug 18, 2018

syntax highlighting in print.vertical() r-lib/styler#417

Closed

wlandau pushed a commit that referenced this issue Aug 19, 2018

Implement a rough version of drake_plan_source() (ref: #489)

17714b9

wlandau closed this as completed in 3c73b92 Aug 19, 2018

wlandau pushed a commit that referenced this issue Nov 9, 2018

Drop the "vertical" class

87e473d

Ref: #489 (comment)

wlandau mentioned this issue May 6, 2019

Lists of language objects r-lib/vctrs#321

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pretty printing of plans #489

pretty printing of plans #489

wlandau commented Aug 2, 2018 •

edited

Loading

connoralbright commented Aug 2, 2018

krlmlr commented Aug 2, 2018

wlandau commented Aug 3, 2018

wlandau commented Aug 3, 2018

wlandau commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

lorenzwalthert commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 6, 2018 •

edited

Loading

wlandau commented Aug 18, 2018

wlandau commented Aug 18, 2018

lorenzwalthert commented Aug 18, 2018

wlandau commented Aug 19, 2018

lorenzwalthert commented Aug 19, 2018

wlandau commented Aug 19, 2018 •

edited

Loading

wlandau commented Aug 19, 2018

wlandau commented Aug 19, 2018 •

edited

Loading

wlandau commented Aug 19, 2018

wlandau commented Aug 19, 2018

lorenzwalthert commented Aug 19, 2018

wlandau commented Aug 19, 2018

krlmlr commented Nov 7, 2018

wlandau commented Nov 9, 2018

lorenzwalthert commented Nov 9, 2018

wlandau commented May 6, 2019 •

edited

Loading

pretty printing of plans #489

pretty printing of plans #489

Comments

wlandau commented Aug 2, 2018 • edited Loading

connoralbright commented Aug 2, 2018

krlmlr commented Aug 2, 2018

wlandau commented Aug 3, 2018

wlandau commented Aug 3, 2018

wlandau commented Aug 5, 2018 • edited Loading

wlandau commented Aug 5, 2018 • edited Loading

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018 • edited Loading

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

wlandau commented Aug 5, 2018

lorenzwalthert commented Aug 5, 2018 • edited Loading

wlandau commented Aug 6, 2018 • edited Loading

wlandau commented Aug 18, 2018

wlandau commented Aug 18, 2018

lorenzwalthert commented Aug 18, 2018

wlandau commented Aug 19, 2018

lorenzwalthert commented Aug 19, 2018

wlandau commented Aug 19, 2018 • edited Loading

wlandau commented Aug 19, 2018

wlandau commented Aug 19, 2018 • edited Loading

wlandau commented Aug 19, 2018

wlandau commented Aug 19, 2018

lorenzwalthert commented Aug 19, 2018

wlandau commented Aug 19, 2018

krlmlr commented Nov 7, 2018

wlandau commented Nov 9, 2018

lorenzwalthert commented Nov 9, 2018

wlandau commented May 6, 2019 • edited Loading

wlandau commented Aug 2, 2018 •

edited

Loading

wlandau commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 5, 2018 •

edited

Loading

lorenzwalthert commented Aug 5, 2018 •

edited

Loading

wlandau commented Aug 6, 2018 •

edited

Loading

wlandau commented Aug 19, 2018 •

edited

Loading

wlandau commented Aug 19, 2018 •

edited

Loading

wlandau commented May 6, 2019 •

edited

Loading