Use language objects for commands #700

Closed
wlandau opened this issue Feb 1, 2019 · 3 comments

wlandau commented Feb 1, 2019

drake currently uses character strings to represent code.

drake_plan(x = simulate_data())
#> # A tibble: 1 x 2
#>   target command        
#>   <chr>  <chr>          
#> 1 x      simulate_data()
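
To make the contrast concrete, here is a minimal base R sketch of the same command held as text and as a language object. It does not use drake itself, and the `n = 10` argument is made up for illustration.

```r
# One command, stored two ways (base R only; `n = 10` is a hypothetical argument).
cmd_text <- "simulate_data(n = 10)"
cmd_lang <- quote(simulate_data(n = 10))

class(cmd_text)  # "character"
class(cmd_lang)  # "call"

# A call can be inspected structurally, with no string matching.
fn_name <- as.character(cmd_lang[[1]])      # name of the function being called
arg_names <- names(as.list(cmd_lang)[-1])   # names of the supplied arguments

# Converting between the two representations.
from_text <- parse(text = cmd_text)[[1]]              # text -> language
to_text <- paste(deparse(cmd_lang), collapse = " ")   # language -> text
```

Neither representation evaluates `simulate_data()`, so the function does not need to exist for the sketch to run.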

Now that we have a language-based DSL (#233) we can start to move away from text. Specific considerations:

  1. Printing. r-lib/pillar#34 ("Better display for list columns if all elements have the same type") has been quiet, so we may need a special S3 class and a print() method for drake plans. The trouble is that data frame operations often drop attributes.
  2. Wildcard functions. In functions like evaluate_plan(), we should deparse the command column, do the text replacement, call safe_parse() to get language objects back, and then restore the S3 class.
  3. Command standardization. I think this part will clean up and speed up nicely. We just need to remember to use all.vars(functions = TRUE) instead of grepl() to set look_for_ignore in standardize_command().
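
A rough base R sketch of points 2 and 3, under stated assumptions: `replace_wildcard()` and `uses_ignore()` are hypothetical helpers, plain `parse()` stands in for safe_parse(), and the `grepl()` pattern is a guess at what a text-based check might look like. None of this is drake's actual implementation.

```r
# Point 2: deparse -> text replacement -> parse, applied to a list of calls.
# replace_wildcard() is a hypothetical helper, not drake's evaluate_plan().
replace_wildcard <- function(commands, wildcard, value) {
  lapply(commands, function(cmd) {
    text <- paste(deparse(cmd), collapse = "\n")       # language -> text
    text <- gsub(wildcard, value, text, fixed = TRUE)  # literal replacement
    parse(text = text)[[1]]                            # stand-in for safe_parse()
  })
}

commands <- list(quote(simulate_data(n__)), quote(summary(x)))
expanded <- replace_wildcard(commands, "n__", "100")

# Point 3: look for ignore() among the symbols of a call instead of in its text.
# all.vars(functions = TRUE) also returns the names of called functions.
uses_ignore <- function(cmd) "ignore" %in% all.vars(cmd, functions = TRUE)

with_ignore <- quote(f(ignore(g(x)), y))
string_only <- quote(paste("please ignore(this)"))

found_call <- uses_ignore(with_ignore)    # ignore() really is called here
found_string <- uses_ignore(string_only)  # "ignore" only appears inside a string
# A text search cannot tell the two cases apart (assumed pattern):
text_match <- grepl("ignore(", paste(deparse(string_only), collapse = ""), fixed = TRUE)
```

The symbol-based check skips string literals entirely, which is the main reason it should be more reliable than matching on deparsed text.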

wlandau self-assigned this Feb 1, 2019
wlandau changed the title from "Treat commands as language objects" to "Use language objects for commands" Feb 1, 2019
wlandau added this to the Version 7.0.0 milestone Feb 1, 2019

wlandau commented Feb 2, 2019

Using new_tibble(subclass = "drake_plan") with custom printing actually seems to work. In the 700 branch, drake deparses list columns of language objects and then overloads pillar::type_sum() to show <lang> instead of <chr> when printing. Deparsing should not be a bottleneck because it happens after the rows are subset.
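
The "subset first, deparse second" idea can be sketched in base R alone. This swaps in a plain data.frame subclass for the tibble and pillar approach described above, and every name here (`new_plan_sketch()`, `format_plan_sketch()`, the `plan_sketch` class) is hypothetical.

```r
# Hypothetical plan class: a data frame whose command column is a list of calls.
new_plan_sketch <- function(target, command) {
  stopifnot(length(target) == length(command))
  plan <- data.frame(target = target, stringsAsFactors = FALSE)
  plan$command <- command  # list column of language objects
  class(plan) <- c("plan_sketch", "data.frame")
  plan
}

# Keep only the rows to display, then deparse just those commands.
format_plan_sketch <- function(x, n = 10L) {
  shown <- x[seq_len(min(n, nrow(x))), , drop = FALSE]  # subset rows first
  class(shown) <- "data.frame"
  shown$command <- vapply(
    shown$command,
    function(cmd) paste(deparse(cmd), collapse = " "),  # language -> one string
    character(1)
  )
  shown
}

# print() method: display the deparsed view and return the plan unchanged.
print.plan_sketch <- function(x, ...) {
  print(format_plan_sketch(x, ...))
  invisible(x)
}

plan <- new_plan_sketch(
  target = c("x", "y"),
  command = list(quote(simulate_data()), quote(summary(x)))
)
first_row <- format_plan_sketch(plan, n = 1L)
```

Because only the displayed rows are deparsed, the cost of printing depends on `n` rather than on the size of the plan, and the stored commands remain language objects.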

Now, I need to update the wildcard functions and the tests. Could take some time.

wlandau commented Feb 3, 2019

Fixed via #708. cc @krlmlr

wlandau closed this as completed Feb 3, 2019
wlandau added this to Done in "A flexible, agile, domain-specific API" via automation Feb 3, 2019

wlandau commented Feb 3, 2019

Printing seems to work out okay now. Note the <expr> type summary below. Hopefully the deparsing is not too slow. I have not yet decided whether to skip most of the deparsing or how best to do that.

library(drake)
drake_plan(
  raw_data = readxl::read_excel(file_in("raw_data.xlsx")),
  data = raw_data %>%
    mutate(Species = forcats::fct_inorder(Species)),
  hist = create_plot(data),
  fit = lm(Sepal.Width ~ Petal.Width + Species, data),
  report = rmarkdown::render(
    knitr_in("report.Rmd"),
    output_file = file_out("report.html"),
    quiet = TRUE
  )
)
#> # A tibble: 5 x 2
#>   target   command                                                         
#>   <chr>    <expr>                                                          
#> 1 raw_data readxl::read_excel(file_in("raw_data.xlsx"))                   …
#> 2 data     raw_data %>% mutate(Species = forcats::fct_inorder(Species))   …
#> 3 hist     create_plot(data)                                              …
#> 4 fit      lm(Sepal.Width ~ Petal.Width + Species, data)                  …
#> 5 report   rmarkdown::render(knitr_in("report.Rmd"), output_file = file_ou…
