dplyr::distinct appears to consider empty rows different #2954

@JohnMount

dplyr::distinct() appears to treat empty (zero-column) rows as different from one another. In the example below, dplyr::distinct() returns a 2-row data frame in which both rows are identical. This is a corner case, since there are no columns, but I think dplyr::distinct() should return no more than 1 row here. Notice that adding a column to the result then reduces the number of rows considered distinct.

suppressPackageStartupMessages(library("dplyr"))
packageVersion("dplyr")
#> [1] '0.7.1.9000'

d <- data.frame(x= c(1, 1))

d0 <- select(d, one_of(character(0)))
dD <- distinct(d0)
print(dD)
#> data frame with 0 columns and 2 rows

d2 <- mutate(dD, newCol = 1)
print(d2)
#>   newCol
#> 1      1
#> 2      1

distinct(d2)
#>   newCol
#> 1      1
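
Until the behavior is settled, a small wrapper can enforce the expectation described above. This is only a sketch: `distinct_rows` is a hypothetical helper name, not part of dplyr, and it assumes that a data frame with no columns should collapse to at most one row.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): behaves like distinct(),
# except that a zero-column data frame is reduced to at most one row.
distinct_rows <- function(df) {
  if (ncol(df) == 0L) {
    # With no columns, every row is identical, so keep only the first one
    # (or none, if the data frame has no rows at all).
    return(head(df, 1L))
  }
  distinct(df)
}

d <- data.frame(x = c(1, 1))
d0 <- select(d, one_of(character(0)))
nrow(distinct_rows(d0))
```

The zero-column case is handled before calling distinct(), so the wrapper does not depend on how any particular dplyr version treats that input.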

Labels

feature: a feature request or enhancement
