slice_sample() errors if n bigger than number of rows #6185

@swannyy14

In the man page for slice functions, the description for the argument n states:

If n is greater than the number of rows in the group (or prop > 1), the result will be silently truncated to the group size. If the proportion of a group size does not yield an integer number of rows, the absolute value of prop*nrow(.data) is rounded down.
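To make the quoted rule concrete, here is a small base R sketch of the arithmetic it describes. The helper name `documented_size()` is hypothetical and is not part of dplyr; it only mirrors the wording of the man page for a non-negative `n` or `prop`.

```r
# Hypothetical helper (not a dplyr function): the number of rows the man
# page says a slice should return for a group with `group_size` rows.
# Assumes n and prop are non-negative.
documented_size <- function(group_size, n = NULL, prop = NULL) {
  if (!is.null(n)) {
    # n larger than the group is truncated to the group size
    return(min(n, group_size))
  }
  # a fractional prop * group_size is rounded down; prop > 1 is truncated
  min(floor(prop * group_size), group_size)
}

documented_size(6, n = 15)        # 6
documented_size(4, n = 15)        # 4
documented_size(10, prop = 0.25)  # 2, since floor(2.5) is 2
```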

slice_sample() used to return the same data.frame (with the rows in a different order) when n was greater than the number of rows, but it now throws an error.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

tmp <- data.frame(x = c(rep('a',6), rep('b',4)), y = 1:10)
print(tmp)
#>    x  y
#> 1  a  1
#> 2  a  2
#> 3  a  3
#> 4  a  4
#> 5  a  5
#> 6  a  6
#> 7  b  7
#> 8  b  8
#> 9  b  9
#> 10 b 10

tmp %>% slice_sample(n = 15)
#> Error in `slice_sample()`:
#> ! Problem while computing indices.
#> Caused by error in `sample.int()`:
#> ! cannot take a sample larger than the population when 'replace = FALSE'

tmp %>% group_by(x) %>% slice_sample(n = 15)
#> Error in `slice_sample()`:
#> ! Problem while computing indices.
#> ℹ The error occurred in group 1: x = "a".
#> Caused by error in `sample.int()`:
#> ! cannot take a sample larger than the population when 'replace = FALSE'

Created on 2022-02-11 by the reprex package (v2.0.1)
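In the meantime, one possible workaround is to cap the sample size by hand with `slice()` and `n()`, which gives the number of rows in the current group. This is only a sketch of the truncation the man page describes, not the fix adopted in dplyr; the object names `capped` and `capped_by_group` are made up for illustration.

```r
library(dplyr)

tmp <- data.frame(x = c(rep("a", 6), rep("b", 4)), y = 1:10)

# Cap the requested size at the size of the current group. n() is the
# number of rows in the group (or in the whole data frame if ungrouped).
capped <- tmp %>%
  slice(sample.int(n(), min(n(), 15L)))
nrow(capped)  # 10: every row, in shuffled order

capped_by_group <- tmp %>%
  group_by(x) %>%
  slice(sample.int(n(), min(n(), 15L))) %>%
  ungroup()
count(capped_by_group, x)  # a: 6, b: 4
```

Because `min(n(), 15L)` never exceeds the group size, `sample.int()` is never asked for more rows than exist, so the `replace = FALSE` error above cannot occur.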

Labels: bug (an unexpected problem or unintended behavior); rows ↕️ (operations on rows: filter(), slice(), arrange())
