New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 2134 #2136
Issue 2134 #2136
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2136 +/- ##
=======================================
Coverage 96.14% 96.14%
=======================================
Files 87 87
Lines 4952 4953 +1
=======================================
+ Hits 4761 4762 +1
Misses 191 191
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I was thinking however whether we should add an option to group the NAs - for instance use.na = TRUE
. I investigated a few similar functions, and they work like this:
-
split()
Drops NA as a group.Any missing values in f are dropped together with the corresponding values of x.
-
dplyr::group_split()
KeepsNA
as a group
df <- structure(list(
x = 1:10,
y = structure(c(
NA, 1L, 1L, 1L, 1L,
NA, 3L, 3L, 3L, 3L
),
.Label = c("a", "b", "c"),
class = "factor"
)
),
row.names = c(NA, -10L), class = "data.frame"
)
df
## x y
## 1 1 <NA>
## 2 2 a
## 3 3 a
## 4 4 a
## 5 5 a
## 6 6 <NA>
## 7 7 c
## 8 8 c
## 9 9 c
## 10 10 c
# drops NA
split(df, df$y, drop = TRUE)
## $a
## x y
## 2 2 a
## 3 3 a
## 4 4 a
## 5 5 a
##
## $c
## x y
## 7 7 c
## 8 8 c
## 9 9 c
## 10 10 c
# keeps NA
dplyr::group_split(df, y, .drop = TRUE)
## <list_of<
## tbl_df<
## x: integer
## y: factor<af15a>
## >
## >[3]>
## [[1]]
## # A tibble: 4 × 2
## x y
## <int> <fct>
## 1 2 a
## 2 3 a
## 3 4 a
## 4 5 a
##
## [[2]]
## # A tibble: 4 × 2
## x y
## <int> <fct>
## 1 7 c
## 2 8 c
## 3 9 c
## 4 10 c
##
## [[3]]
## # A tibble: 2 × 2
## x y
## <int> <fct>
## 1 1 <NA>
## 2 6 <NA>
And aggregate()
has an option na.action = na.omit
(although it applies only to formulas):
> na.action
a function which indicates what should happen when the data contain NA
values. The default is to ignore missing values in the given variables.
This PR only makes |
Fix #2134