Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that row ordering doesn't affect summary row calculations #556

Merged
merged 9 commits into from May 21, 2020

Conversation

rich-iannone
Copy link
Member

@rich-iannone rich-iannone commented Apr 23, 2020

If there was to be row reordering, the intermediate table used to generate summary values wouldn't take that into account, providing erroneous summary calculations. This change ensures that summary row calculations (by group and through the grand summary method) are independent of the final ordering of rows.

Here are some examples that can be used to test the fix.

tbl_1 <-
    tibble::tibble(
      id = c("1", "2", "3", "4", "5", "6"),
      value = c(1, 10, 1, 10, 99, 1),
      group = c("b", "a", "b", "a", "c", "b")
    )

  tbl_2 <-
    tbl_1 %>%
    dplyr::slice(6, 3, 5, 1, 4, 2)

  tbl_3 <-
    tbl_2 %>%
    dplyr::arrange(group, id)

  # Prepare a set gt tables with summary rows (using the same
  # `summary_rows()` call each time)
  gt_tbl_1 <-
    tbl_1 %>%
    dplyr::group_by(group) %>%
    gt(rowname_col = "id") %>%
    summary_rows(groups = TRUE, columns = vars(value), fns = list("sum"))

  gt_tbl_1b <-
    tbl_1 %>%
    gt(rowname_col = "id", groupname_col = "group") %>%
    summary_rows(groups = TRUE, columns = vars(value), fns = list("sum"))

  gt_tbl_2 <-
    tbl_2 %>%
    gt(rowname_col = "id", groupname_col = "group") %>%
    summary_rows(groups = TRUE, columns = vars(value), fns = list("sum"))

  gt_tbl_3 <-
    tbl_3 %>%
    gt(rowname_col = "id", groupname_col = "group") %>%
    summary_rows(groups = TRUE, columns = vars(value), fns = list("sum"))

After the fix is applied, all of the generated tables should have obviously correct summary lines in each of the groups (no matter the arrangement of the table rows). Prior to the fix, the summary rows in each of these gt objects would be wrong (unless it was sorted in just the right way).

Numerous tests were added to verify that different row combinations and methods of specifying groups don't affect the summary value calculations.

Fixes #553.
Fixes #470.

Strangely, changes to `export.R` were not checked in as part of #527 (the roxygenized .Rd was, however). This change ensures that the source doc is now in sync.
R/dt_summary.R Outdated Show resolved Hide resolved
@jcheng5
Copy link
Member

jcheng5 commented Apr 24, 2020

Failing case:

tbl_1 <-
  tibble::tibble(
    id = c("1", "2", "3", "4", "5", "6"),
    value = c(1, 10, 1, 10, 99, 1),
    columns = c(2, 20, 2, 20, 198, 2),
    group = c("b", "a", "b", "a", "c", "b")
  )

gt_tbl_1 <-
  tbl_1 %>%
  dplyr::group_by(group) %>%
  gt(rowname_col = "id") %>%
  summary_rows(groups = TRUE, columns = vars(value, columns), fns = list("sum")) %>%
  grand_summary_rows(columns = vars(value, columns), fns = list("sum"))

gt_tbl_1

R/dt_summary.R Outdated Show resolved Hide resolved
R/dt_summary.R Outdated Show resolved Hide resolved
Copy link
Member

@jcheng5 jcheng5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments

jcheng5
jcheng5 previously approved these changes Apr 27, 2020
jcheng5
jcheng5 previously approved these changes Apr 29, 2020
@rich-iannone rich-iannone requested a review from jcheng5 May 21, 2020 22:31
@rich-iannone rich-iannone merged commit 1e42417 into master May 21, 2020
@rich-iannone rich-iannone deleted the summary-rows-calc-fix branch May 21, 2020 22:40
rich-iannone added a commit that referenced this pull request May 22, 2020
* master:
  Ensure that row ordering doesn't affect summary row calculations (#556)
rich-iannone added a commit that referenced this pull request May 22, 2020
* master:
  Ensure that row ordering doesn't affect summary row calculations (#556)
rich-iannone added a commit that referenced this pull request May 22, 2020
* master:
  Ensure that row ordering doesn't affect summary row calculations (#556)
rich-iannone added a commit that referenced this pull request May 22, 2020
* master:
  Ensure that row ordering doesn't affect summary row calculations (#556)
rich-iannone added a commit that referenced this pull request May 26, 2020
* Update DESCRIPTION and NEWS.md

* Update NEWS.md

* Merge branch 'master' into v0.2.1-rc

* master:
  Ensure that row ordering doesn't affect summary row calculations (#556)

* Update NEWS.md

* Update DESCRIPTION
rich-iannone added a commit that referenced this pull request Jun 22, 2020
* master:
  Refactor `data_color()` so that it executes faster (#576)
  Update R-CMD-check.yaml (#599)
  Release gt 0.2.1 (#588)
  Ensure that row ordering doesn't affect summary row calculations (#556)
  Update failing example (#586)
  Remove test of scales behaviour
  Squelch warnings from tibble 3.0.0/3.0.1 (#557)
  Bump cache on pkgdown.yaml for GH workflow (#570)
  Update GH Actions workflow for R CMD check (#568)
  Update Description of package to be less confusing (#569)
rich-iannone added a commit that referenced this pull request Jun 22, 2020
* master:
  Restore row striping option in stub cells (`row.striping.include_stub = TRUE`) (#564)
  Refactor `data_color()` so that it executes faster (#576)
  Update R-CMD-check.yaml (#599)
  Release gt 0.2.1 (#588)
  Ensure that row ordering doesn't affect summary row calculations (#556)
  Update failing example (#586)
  Remove test of scales behaviour
rich-iannone added a commit that referenced this pull request Jun 22, 2020
* master:
  Fix issues with defining column widths in `cols_width()` (#561)
  Add `scale_values` arg to `fmt_percent()` (#565)
  Restore row striping option in stub cells (`row.striping.include_stub = TRUE`) (#564)
  Refactor `data_color()` so that it executes faster (#576)
  Update R-CMD-check.yaml (#599)
  Release gt 0.2.1 (#588)
  Ensure that row ordering doesn't affect summary row calculations (#556)
  Update failing example (#586)
  Remove test of scales behaviour
  Squelch warnings from tibble 3.0.0/3.0.1 (#557)
  Bump cache on pkgdown.yaml for GH workflow (#570)
  Update GH Actions workflow for R CMD check (#568)
  Update Description of package to be less confusing (#569)
rich-iannone added a commit that referenced this pull request Oct 13, 2020
* master:
  Rewrite of RTF building functions and `as_rtf()` (#638)
  v0.2.2 Release Candidate (#629)
  Settable font options (#591)
  Add options for sig figs / inclusion of trailing dec marks (#546)
  Fix issues with defining column widths in `cols_width()` (#561)
  Add `scale_values` arg to `fmt_percent()` (#565)
  Restore row striping option in stub cells (`row.striping.include_stub = TRUE`) (#564)
  Refactor `data_color()` so that it executes faster (#576)
  Update R-CMD-check.yaml (#599)
  Release gt 0.2.1 (#588)
  Ensure that row ordering doesn't affect summary row calculations (#556)
rich-iannone added a commit that referenced this pull request Oct 25, 2020
* master: (30 commits)
  Spanner alignment correction (#662)
  PEN currency fix (#663)
  Fix for `gtsave()` when saving an image and specifying a `path` value (#592)
  Rewrite of RTF building functions and `as_rtf()` (#638)
  v0.2.2 Release Candidate (#629)
  Settable font options (#591)
  Add options for sig figs / inclusion of trailing dec marks (#546)
  Fix issues with defining column widths in `cols_width()` (#561)
  Add `scale_values` arg to `fmt_percent()` (#565)
  Restore row striping option in stub cells (`row.striping.include_stub = TRUE`) (#564)
  Refactor `data_color()` so that it executes faster (#576)
  Update R-CMD-check.yaml (#599)
  Release gt 0.2.1 (#588)
  Ensure that row ordering doesn't affect summary row calculations (#556)
  Update failing example (#586)
  Remove test of scales behaviour
  Squelch warnings from tibble 3.0.0/3.0.1 (#557)
  Bump cache on pkgdown.yaml for GH workflow (#570)
  Update GH Actions workflow for R CMD check (#568)
  Update Description of package to be less confusing (#569)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants