arrange()-ing by data frame (instead of by variable) results in strange output instead of error message #3153

@huftis

It’s currently possible to use arrange() to sort a data frame by another data frame instead of by a variable. This does not make sense and should result in an error message. Instead, it returns a strangely sorted, truncated version of the original data frame.

Here is an example. I generate a 150-row tibble with two variables, x and iri. I want to sort the tibble by the iri variable, but accidentally misspell it as iris, which exists as an example data set. The result is a 5-row subset of the original tibble, with the rows in a seemingly random order.

library(dplyr)
set.seed(1)
d = tibble(x = 1:150, iri = rnorm(150))
arrange(d, iris)
#> # A tibble: 5 x 2
#>       x        iri
#>   <int>      <dbl>
#> 1     4  1.5952808
#> 2     3 -0.8356286
#> 3     2  0.1836433
#> 4     5  0.3295078
#> 5     1 -0.6264538

Here’s a similar example, sorting the first 32 rows of d by the mtcars data set. The 32 rows are reduced to 11 rows, again in a seemingly random order:

arrange(d[1:32,], mtcars)
#> # A tibble: 11 x 2
#>        x        iri
#>    <int>      <dbl>
#>  1     7  0.4874291
#>  2    11  1.5117812
#>  3     6 -0.8204684
#>  4     5  0.3295078
#>  5    10 -0.3053884
#>  6     1 -0.6264538
#>  7     2  0.1836433
#>  8     4  1.5952808
#>  9     3 -0.8356286
#> 10     9  0.5757814
#> 11     8  0.7383247
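
Until arrange() rejects such input itself, a guard along the following lines would surface the typo immediately. This is only a minimal sketch: arrange_checked() is a hypothetical helper, not part of dplyr, and it sorts with base order() rather than calling arrange(), so it makes no claim about how arrange() works internally. It assumes the rlang package (a dplyr dependency) is installed. Incidentally, the row counts in the outputs above (5 and 11) equal the number of columns in iris and mtcars, which may hint at where the truncation comes from.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): evaluate the sort key first and
# refuse anything that is not a plain vector with one value per row.
arrange_checked <- function(.data, key) {
  key <- enquo(key)
  # Look the key up in the data first, then in the calling environment.
  values <- rlang::eval_tidy(key, data = .data)
  if (is.data.frame(values) || !is.atomic(values) ||
      length(values) != nrow(.data)) {
    stop("sort key must be an atomic vector with one value per row",
         call. = FALSE)
  }
  # Sort with base order() so that all rows are kept.
  .data[order(values), , drop = FALSE]
}

set.seed(1)
d <- tibble(x = 1:150, iri = rnorm(150))

# Correct spelling: sorted by iri, all 150 rows kept.
sorted <- arrange_checked(d, iri)

# Misspelled as iris: the data frame is caught and an error is raised.
res <- tryCatch(arrange_checked(d, iris), error = function(e) "error")
```

With this guard, the misspelled call fails loudly instead of silently returning a truncated tibble.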

Labels: bug (an unexpected problem or unintended behavior)
