Closed
I want to compare each element in a factor vector against the previous element in the same vector in order to identify those elements that are different from the previous one.
For example (expected behaviour):
> test_factor <- factor(rep(c('A','B','C'), each = 3))
> test_factor
[1] A A A B B B C C C
Levels: A B C
> test_factor != lag(test_factor)
[1] NA FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE
However, when using the same technique inside a mutate() function, it does not work as intended.
> test_df <- tbl_df(data.frame(test = test_factor))
> str(test_df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 9 obs. of 1 variable:
$ test: Factor w/ 3 levels "A","B","C": 1 1 1 2 2 2 3 3 3
> test_df %>% mutate(is_diff = (test != lag(test)))
Source: local data frame [9 x 2]
test is_diff
1 A NA
2 A TRUE
3 A TRUE
4 B TRUE
5 B TRUE
6 B TRUE
7 C TRUE
8 C TRUE
9 C TRUE
I found a workaround, which is to explicitly convert the factor to either a numeric or a character vector. However, this is inconvenient and reduces code readability.
> test_df %>% mutate(is_diff = (as.numeric(test) != lag(as.numeric(test))))
Source: local data frame [9 x 2]
test is_diff
1 A NA
2 A FALSE
3 A FALSE
4 B TRUE
5 B FALSE
6 B FALSE
7 C TRUE
8 C FALSE
9 C FALSE
> test_df %>% mutate(is_diff = (as.character(test) != lag(as.character(test))))
Source: local data frame [9 x 2]
test is_diff
1 A NA
2 A FALSE
3 A FALSE
4 B TRUE
5 B FALSE
6 B FALSE
7 C TRUE
8 C FALSE
9 C FALSE
Please let me know if this is because of something I misunderstood. Thanks.
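For reference, the character-conversion workaround can be wrapped in a small helper so that the conversion does not clutter the mutate() call. This is only a minimal sketch: the name differs_from_previous is hypothetical (it is not part of dplyr), it uses base R only, and it assumes that comparing the factor labels is what is wanted.

```r
# Hypothetical helper, not part of dplyr: flags elements that differ
# from the element immediately before them. The first element has no
# predecessor, so its result is NA.
differs_from_previous <- function(x) {
  if (length(x) == 0) {
    return(logical(0))
  }
  # Compare labels rather than the factor object itself.
  x_chr <- as.character(x)
  # tail(x_chr, -1) drops the first element, head(x_chr, -1) drops the last,
  # so the two vectors line up each element with its predecessor.
  c(NA, tail(x_chr, -1) != head(x_chr, -1))
}

test_factor <- factor(rep(c('A','B','C'), each = 3))
differs_from_previous(test_factor)
# [1]    NA FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
```

Inside mutate() this would be called as mutate(is_diff = differs_from_previous(test)), which keeps the pipeline readable while avoiding lag() on the factor.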