feat: Implement fill() #113

Merged

etiennebacher
Contributor

This PR reproduces tidyr::fill(). I took the main part of the function (the filling algorithm) from a Stack Overflow answer. I must say I'm not fully confident about how it works, but it works and it's fast.

A few tests are missing, for lists for example (see the tidyr tests). There's also a weird message with groups (see the last example).
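
For readers wondering how a vectorized fill can work, here is a minimal sketch of one common base R idiom, built on `cumsum()` over the non-missing flags. It is only an assumption that this resembles the Stack Overflow approach mentioned above; the helper names `fill_down()` and `fill_up()` are hypothetical and are not the functions added in R/fill.R.

```r
# Hypothetical helpers for illustration; not the code in R/fill.R.
fill_down <- function(x) {
  not_na <- !is.na(x)
  # For each position, count the non-missing values seen so far. That count
  # is the index, within x[not_na], of the most recent non-missing value.
  idx <- cumsum(not_na)
  # Positions before the first non-missing value have a count of 0.
  # Indexing with 0 would drop them, so map them to NA to keep the length.
  idx[idx == 0L] <- NA
  x[not_na][idx]
}

# Filling upwards is filling downwards on the reversed vector.
fill_up <- function(x) rev(fill_down(rev(x)))

fill_down(c(2000, NA, NA, 2001, NA))
#> [1] 2000 2000 2000 2001 2001
fill_up(c(NA, NA, "Dog", NA, "Cat"))
#> [1] "Dog" "Dog" "Dog" "Cat" "Cat"
```

Under this sketch, a "downup" fill would be `fill_up(fill_down(x))`: values are first carried forward, and any leading gaps are then filled from the first observed value.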

Examples:

suppressPackageStartupMessages(library(poorman))

# Value (year) is recorded only when it changes
sales <- data.frame(
  quarter = c(
    "Q1", "Q2", "Q3", "Q4", "Q1", "Q2", "Q3", "Q4", "Q1", "Q2",
    "Q3", "Q4", "Q1", "Q2", "Q3", "Q4"
  ),
  year = c(2000, NA, NA, NA, 2001, NA, NA, NA, 2002, NA, NA, NA, 2004, NA, NA, NA),
  sales = c(
    66013, 69182, 53175, 21001, 46036, 58842, 44568, 50197, 39113, 41668, 30144,
    52897, 32129, 67686, 31768, 49094
  )
)
sales
#>    quarter year sales
#> 1       Q1 2000 66013
#> 2       Q2   NA 69182
#> 3       Q3   NA 53175
#> 4       Q4   NA 21001
#> 5       Q1 2001 46036
#> 6       Q2   NA 58842
#> 7       Q3   NA 44568
#> 8       Q4   NA 50197
#> 9       Q1 2002 39113
#> 10      Q2   NA 41668
#> 11      Q3   NA 30144
#> 12      Q4   NA 52897
#> 13      Q1 2004 32129
#> 14      Q2   NA 67686
#> 15      Q3   NA 31768
#> 16      Q4   NA 49094
sales %>% fill(year)
#>    quarter year sales
#> 1       Q1 2000 66013
#> 2       Q2 2000 69182
#> 3       Q3 2000 53175
#> 4       Q4 2000 21001
#> 5       Q1 2001 46036
#> 6       Q2 2001 58842
#> 7       Q3 2001 44568
#> 8       Q4 2001 50197
#> 9       Q1 2002 39113
#> 10      Q2 2002 41668
#> 11      Q3 2002 30144
#> 12      Q4 2002 52897
#> 13      Q1 2004 32129
#> 14      Q2 2004 67686
#> 15      Q3 2004 31768
#> 16      Q4 2004 49094


# Value (pet_type) is missing above
tidy_pets <- data.frame(
  rank = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L),
  pet_type = c(NA, NA, NA, NA, NA, "Dog", NA, NA, NA, NA, NA, "Cat"),
  breed = c(
    "Boston Terrier", "Retrievers (Labrador)", "Retrievers (Golden)",
    "French Bulldogs", "Bulldogs", "Beagles", "Persian", "Maine Coon",
    "Ragdoll", "Exotic", "Siamese", "American Short"
  )
)
tidy_pets
#>    rank pet_type                 breed
#> 1     1     <NA>        Boston Terrier
#> 2     2     <NA> Retrievers (Labrador)
#> 3     3     <NA>   Retrievers (Golden)
#> 4     4     <NA>       French Bulldogs
#> 5     5     <NA>              Bulldogs
#> 6     6      Dog               Beagles
#> 7     1     <NA>               Persian
#> 8     2     <NA>            Maine Coon
#> 9     3     <NA>               Ragdoll
#> 10    4     <NA>                Exotic
#> 11    5     <NA>               Siamese
#> 12    6      Cat        American Short
tidy_pets %>%
  fill(pet_type, .direction = "up")
#>    rank pet_type                 breed
#> 1     1      Dog        Boston Terrier
#> 2     2      Dog Retrievers (Labrador)
#> 3     3      Dog   Retrievers (Golden)
#> 4     4      Dog       French Bulldogs
#> 5     5      Dog              Bulldogs
#> 6     6      Dog               Beagles
#> 7     1      Cat               Persian
#> 8     2      Cat            Maine Coon
#> 9     3      Cat               Ragdoll
#> 10    4      Cat                Exotic
#> 11    5      Cat               Siamese
#> 12    6      Cat        American Short


# Value (n_squirrels) is missing above and below within a group
squirrels <- data.frame(
  group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  name = c(
    "Sam", "Mara", "Jesse", "Tom", "Mike", "Rachael", "Sydekea",
    "Gabriela", "Derrick", "Kara", "Emily", "Danielle"
  ),
  role = c(
    "Observer", "Scorekeeper", "Observer", "Observer", "Observer",
    "Observer", "Scorekeeper", "Observer", "Observer", "Scorekeeper",
    "Observer", "Observer"
  ),
  n_squirrels = c(NA, 8, NA, NA, NA, NA, 14, NA, NA, 9, NA, NA)
)
squirrels
#>    group     name        role n_squirrels
#> 1      1      Sam    Observer          NA
#> 2      1     Mara Scorekeeper           8
#> 3      1    Jesse    Observer          NA
#> 4      1      Tom    Observer          NA
#> 5      2     Mike    Observer          NA
#> 6      2  Rachael    Observer          NA
#> 7      2  Sydekea Scorekeeper          14
#> 8      2 Gabriela    Observer          NA
#> 9      3  Derrick    Observer          NA
#> 10     3     Kara Scorekeeper           9
#> 11     3    Emily    Observer          NA
#> 12     3 Danielle    Observer          NA
squirrels %>%
  group_by(group) %>%
  fill(n_squirrels, .direction = "downup") %>%
  ungroup()
#> Adding missing grouping variables: `group`
#> Adding missing grouping variables: `group`
#> Adding missing grouping variables: `group`
#>    group     name        role n_squirrels
#> 1      1      Sam    Observer           8
#> 2      1     Mara Scorekeeper           8
#> 3      1    Jesse    Observer           8
#> 4      1      Tom    Observer           8
#> 5      2     Mike    Observer          14
#> 6      2  Rachael    Observer          14
#> 7      2  Sydekea Scorekeeper          14
#> 8      2 Gabriela    Observer          14
#> 9      3  Derrick    Observer           9
#> 10     3     Kara Scorekeeper           9
#> 11     3    Emily    Observer           9
#> 12     3 Danielle    Observer           9

Created on 2022-08-26 by the reprex package (v2.0.1)
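
The grouped example above fills values within each group only. As a rough illustration of why the grouping matters, here is a base R sketch that applies a per-vector fill inside each group with `ave()`. The helpers `fill_down()` and `fill_downup()` are hypothetical names used for this sketch, and the approach is an assumption for illustration rather than how poorman handles grouped data.

```r
# Hypothetical helpers for illustration; not the code in R/fill.R.
fill_down <- function(x) {
  not_na <- !is.na(x)
  idx <- cumsum(not_na)   # index of the latest non-missing value so far
  idx[idx == 0L] <- NA    # nothing observed yet: keep the position as NA
  x[not_na][idx]
}
# Fill down first, then fill the remaining leading gaps upwards.
fill_downup <- function(x) rev(fill_down(rev(fill_down(x))))

group <- c(1, 1, 1, 1, 2, 2, 2, 2)
n_squirrels <- c(NA, 8, NA, NA, NA, NA, 14, NA)

# ave() splits n_squirrels by group, applies FUN to each piece, and
# puts the results back in the original row order.
grouped <- ave(n_squirrels, group, FUN = fill_downup)
grouped
#> [1]  8  8  8  8 14 14 14 14

# Without grouping, the value 8 leaks into the first rows of group 2.
ungrouped <- fill_downup(n_squirrels)
ungrouped
#> [1]  8  8  8  8  8  8 14 14
```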

@codecov-commenter

Codecov Report

Merging #113 (d3f0898) into master (bbb9aed) will increase coverage by 0.05%.
The diff coverage is 96.29%.

@@            Coverage Diff             @@
##           master     #113      +/-   ##
==========================================
+ Coverage   93.53%   93.58%   +0.05%     
==========================================
  Files          58       59       +1     
  Lines        1455     1482      +27     
==========================================
+ Hits         1361     1387      +26     
- Misses         94       95       +1     
| Impacted Files | Coverage Δ |
|---|---|
| R/fill.R | 96.29% <96.29%> (ø) |

@nathaneastwood
Owner

I have today returned from holiday. I will try and look at this this evening. Thanks for the hard work.

@etiennebacher
Contributor Author

No problem, take your time (to be honest, I had forgotten about this PR)

@etiennebacher
Contributor Author

Small bump (I'm going through my open PRs to see if they can be merged/closed)

@nathaneastwood self-requested a review October 3, 2022 20:38
Owner

@nathaneastwood left a comment

This is really great work. Thanks a lot @etiennebacher and sorry for the slow response on this one.

@nathaneastwood merged commit 89aaae8 into nathaneastwood:master Oct 3, 2022
@etiennebacher deleted the feat_implement_fill branch October 3, 2022 21:06
@etiennebacher
Contributor Author

No problem ;)
