
n_distinct giving error "Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'language'" #1657

@stevencb


The code below works when calling "dplyr::n_distinct" but fails when calling plain "n_distinct". The error from the failing case is shown below.

Is this a bug, or am I missing something? "n_distinct" does not appear to be picked up from a different package: RStudio indicates it comes from 'dplyr'. I have also included the equivalent "length(unique(data))", which does work.

I have included the code, output, and the session info.

Simplified Example

This smaller example reproduces the issue.

library(dplyr)
dat3 <- data.frame(id = c(2,6,7,10,10))
dat3 %>% summarise(n_unique = length(unique(id[id>6])))
##   n_unique
## 1        2

dat3 %>% summarise(n_unique = n_distinct(id[id>6]))
## Error in summarise_impl(.data, dots) : 
##   Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'language'

dat3 %>% summarise(n_unique = dplyr::n_distinct(id[id>6]))
##   n_unique
## 1        2

Original Example
NOTE: this is just a reproducible example to show the issue. I know the operation can be approached in a different way.

library(dplyr)

yrdf2 <- data.frame(year = 2012:2015)
dat2  <- data.frame(id = 1:4, start_yr = c(2012, 2012, 2013, 2013))

### WORKS
yrdf2 %>% group_by(year) %>% mutate(count = dplyr::n_distinct( dat2$id[ dat2$start_yr <= year ] ))
## Source: local data frame [4 x 2]
## Groups: year [4]
## 
##    year count
##   (int) (int)
## 1  2012     2
## 2  2013     4
## 3  2014     4
## 4  2015     4

### FAILS
yrdf2 %>% group_by(year) %>% mutate(count = n_distinct( dat2$id[ dat2$start_yr <= year ] ))
## Error in mutate_impl(.data, dots) : 
##   Value of SET_STRING_ELT() must be a 'CHARSXP' not a 'language'

### WORKS
yrdf2 %>% group_by(year) %>% mutate(count = length(unique(( dat2$id[ dat2$start_yr <= year ]))))
## Source: local data frame [4 x 2]
## Groups: year [4]
## 
##    year count
##   (int) (int)
## 1  2012     2
## 2  2013     4
## 3  2014     4
## 4  2015     4


Session Info

R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.4.3.9000 plyr_1.8.3       scales_0.3.0     ggplot2_2.0.0    data.table_1.9.6
[6] bit64_0.9-5      bit_1.1-12       tidyr_0.3.1      lubridate_1.5.0 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3          magrittr_1.5         munsell_0.4.2        statmod_1.4.22      
 [5] colorspace_1.2-6     lattice_0.20-33      R6_2.1.2             stringr_1.0.0       
 [9] tools_3.2.3          parallel_3.2.3       grid_3.2.3           gtable_0.1.2        
[13] nlme_3.1-122         h2o_3.6.0.8          DBI_0.3.1            lazyeval_0.1.10.9000
[17] assertthat_0.1       bitops_1.0-6         RCurl_1.95-4.7       stringi_1.0-1       
[21] jsonlite_0.9.19      chron_2.3-47        
