Row-oriented workflows in R with the tidyverse

Materials for RStudio webinar:

Thinking inside the box: you can do that inside a data frame?!
Jenny Bryan
Wednesday, April 11 at 1:00pm ET / 10:00am PT

PDF of slides: 2018-04-11_row-oriented-work-rstudio-webinar.pdf

Note: this is a static copy of materials taken from this repo: https://github.com/jennybc/row-oriented-workflows

Abstract

The data frame is a crucial data structure in R and, especially, in the tidyverse. Working on a column or a variable is a very natural operation, which is great. But what about row-oriented work? That also comes up frequently and is more awkward. In this webinar I’ll work through concrete code examples, exploring patterns that arise in data analysis. We’ll discuss the general notion of "split-apply-combine", row-wise work in a data frame, splitting vs. nesting, and list-columns.

Code examples

Beginner --> intermediate --> advanced
Not all are used in the webinar.

  • Leave your data in that big, beautiful data frame. ex01_leave-it-in-the-data-frame: Show the evil of creating copies of certain rows of certain variables, using Magic Numbers and cryptic names, just to save some typing.
  • Adding or modifying variables. ex02_create-or-mutate-in-place: df$var <- ... versus dplyr::mutate(). Recycling/safety, df's as data mask, aesthetics.
  • Are you SURE you need to iterate over rows? ex03_row-wise-iteration-are-you-sure: Don't fixate on the most obvious generalization of your pilot example and risk overlooking a vectorized solution. Features a paste() example, then goes out with some glue glory.
  • Working with non-vectorized functions. ex04_map-example: Small example using purrr::map() to apply nrow() to a list of data frames.
  • Row-wise thinking vs. column-wise thinking. ex05_attack-via-rows-or-columns: Data rectangling example. Both are possible, but I find building a tibble column-by-column is less aggravating than building rows, then row binding.
  • Iterate over rows of a data frame. iterate-over-rows: Empirical study of reshaping a data frame into this form: a list with one component per row. Revisits a study originally done by Winston Chang. Run times for different numbers of rows or columns.
  • Generate data from different distributions via purrr::pmap(). ex06_runif-via-pmap: Use purrr::pmap() to generate U[min, max] data for various combinations of (n, min, max), stored as rows of a data frame.
  • Are you SURE you need to iterate over groups? ex07_group-by-summarise: Use dplyr::group_by() and dplyr::summarise() to compute group-wise summaries, without explicitly splitting up the data frame and re-combining the results. Use list() to package multivariate summaries into something summarise() can handle, creating a list-column.
  • Group-and-nest. ex08_nesting-is-good: How to explicitly work on groups of rows via nesting (our recommendation) vs. splitting.
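
To give a flavor of two of the items above, here is a minimal sketch of the pmap() and group_by() + summarise() patterns. It is not copied from the example files: the params table and the object names sims and qtiles are made up for illustration, and iris is used only because it ships with R.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical parameter table: one row per call to runif().
params <- tribble(
  ~n, ~min, ~max,
   1,    0,    1,
   2,   10,  100,
   3,  100, 1000
)

set.seed(2018)
# pmap() matches the column names to runif()'s argument names (n, min, max)
# and calls it once per row; the results are stored in a list-column.
sims <- mutate(params, data = pmap(params, runif))

# Group-wise summary without splitting the data frame: list() wraps the
# multi-value quantile() result so summarise() can keep it as a list-column.
qtiles <- iris %>%
  group_by(Species) %>%
  summarise(pl_qtile = list(quantile(Petal.Length, c(0.25, 0.5, 0.75))))
```

In both cases the inputs and outputs stay side by side in one data frame, which is the point of the webinar title.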

More tips and links

Big thanks to everyone who weighed in on the related Twitter thread. This was very helpful for planning content.

45 minutes is not enough! Here are a few notes about more specialized functions and patterns for row-driven work. Maybe we need to do a follow-up ...

tibble::enframe() and deframe() are handy for getting into and out of the data frame state.
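
A small sketch of the round trip, assuming a named numeric vector as the starting point (the vector x and its names are invented for illustration):

```r
library(tibble)

# Hypothetical named vector.
x <- c(a = 1, b = 2, c = 3)

# enframe() turns it into a two-column tibble (default columns: name, value).
df <- enframe(x)

# deframe() goes the other way: the first column becomes the names,
# the second column becomes the values.
back <- deframe(df)
```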

map() and map2() are useful for working with list-columns inside mutate().
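
For instance, a sketch with a made-up tibble that has two list-columns, using the typed variants map_int(), map_dbl() and map2_dbl() so the new columns come out as plain vectors:

```r
library(tibble)
library(dplyr)
library(purrr)

# Hypothetical data: each row holds a vector of values and a vector of weights.
df <- tibble(
  id      = c("a", "b"),
  vals    = list(c(1, 2, 3), c(10, 20)),
  weights = list(c(1, 1, 2), c(3, 1))
)

out <- df %>%
  mutate(
    n     = map_int(vals, length),                    # one list-column in
    total = map_dbl(vals, sum),
    wmean = map2_dbl(vals, weights, weighted.mean)    # two list-columns in
  )
```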

tibble::add_row() is handy for adding a single row at an arbitrary position in a data frame.
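
A minimal sketch with an invented tibble; the .before argument (there is also .after) controls where the new row lands:

```r
library(tibble)

# Hypothetical starting data.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

# Insert a new row so that it becomes row 2.
df2 <- add_row(df, x = 99L, y = "z", .before = 2)
```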

imap() is handy for iterating over something and its names or integer indices at the same time.
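
A short sketch using the typed variant imap_chr() on a made-up list. With a named input the second argument is the name; with an unnamed input it is the integer position:

```r
library(purrr)

# Hypothetical named list of scores.
scores <- list(alice = c(90, 80), bob = c(70, 75, 95))

# .x is the element, .y is its name.
by_name <- imap_chr(scores, ~ paste0(.y, ": ", mean(.x)))

# Without names, the second argument is the integer index.
by_index <- imap_chr(
  unname(scores),
  function(x, i) paste0("#", i, " has ", length(x), " values")
)
```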

When you have multiple values for a single unit in one row (e.g. repeated measures), consider reshaping for easier computation. That turns a row-oriented problem into group_by() + summarise(), which is usually easier.
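
A sketch of that reshaping, assuming tidyr 1.0.0 or later for pivot_longer() (older versions would use gather() instead). The wide table and its column names are invented for illustration:

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical repeated measures: one row per subject, one column per time point.
wide <- tibble(
  subject = c("s1", "s2"),
  t1 = c(5, 2),
  t2 = c(7, 4),
  t3 = c(9, 6)
)

# Reshape to one row per subject-time combination ...
long <- wide %>%
  pivot_longer(cols = t1:t3, names_to = "time", values_to = "score")

# ... so the row-wise summary becomes an ordinary group-wise summary.
per_subject <- long %>%
  group_by(subject) %>%
  summarise(mean_score = mean(score), max_score = max(score))
```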

dplyr::case_when() helps you get rid of hairy, nested if () {...} else {...} statements.
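
For example, a sketch that labels a made-up numeric vector. Conditions are checked in order and the first match wins, so the NA check goes first and TRUE acts as the catch-all:

```r
library(dplyr)

# Hypothetical input.
x <- c(-5, 0, 3, NA)

label <- case_when(
  is.na(x) ~ "missing",
  x < 0    ~ "negative",
  x == 0   ~ "zero",
  TRUE     ~ "positive"
)
```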

Great resource on the "why?" of functional programming approaches (such as map()): https://github.com/getify/Functional-Light-JS/blob/master/manuscript/ch1.md/