feat: rewrite $head, $limit, and $tail to be equivalent to Python Polars #840

Merged
merged 5 commits into from
Feb 24, 2024
2 changes: 2 additions & 0 deletions NEWS.md
@@ -7,6 +7,8 @@
- In the when-then-otherwise expressions, the last `$otherwise()` is now optional,
as in Python Polars. If `$otherwise()` is not specified, rows that don't respect
the condition set in `$when()` will be filled with `null` (#836).
- `<DataFrame>$head()` and `<DataFrame>$tail()` methods now support negative
row numbers (#840).

## Polars R Package 0.14.1

59 changes: 27 additions & 32 deletions R/dataframe__frame.R
@@ -769,44 +769,39 @@ DataFrame_with_columns = function(...) {
}



#' Limit a DataFrame
#' @name DataFrame_limit
#' @description Take some maximum number of rows.
#' @param n Positive number not larger than 2^32.
#'
#' @details Any number will be converted to u32. A negative value raises an error.
#' @keywords DataFrame
#' @return DataFrame
#' @inherit LazyFrame_head title details
#' @param n Number of rows to return. If a negative value is passed,
#' return all rows except the last [`abs(n)`][abs].
#' @return A [DataFrame][DataFrame_class]
#' @examples
#' pl$DataFrame(iris)$limit(6)
DataFrame_limit = function(n) {
self$lazy()$limit(n)$collect()
}

#' Head of a DataFrame
#' @name DataFrame_head
#' @description Get the first `n` rows of the query.
#' @param n Positive number not larger than 2^32.
#' df = pl$DataFrame(foo = 1:5, bar = 6:10, ham = letters[1:5])
#'
#' @inherit DataFrame_limit details
#' @keywords DataFrame
#' @return DataFrame

DataFrame_head = function(n) {
#' df$head(3)
#'
#' # Pass a negative value to get all rows except the last `abs(n)`.
#' df$head(-3)
DataFrame_head = function(n = 5L) {
if (isTRUE(n < 0)) n = max(0, self$height + n)
self$lazy()$head(n)$collect()
}

#' Tail of a DataFrame
#' @name DataFrame_tail
#' @description Get the last `n` rows.
#' @param n Positive number not larger than 2^32.
#'
#' @inherit DataFrame_limit details
#' @keywords DataFrame
#' @return DataFrame
#' @rdname DataFrame_head
DataFrame_limit = DataFrame_head

DataFrame_tail = function(n) {

#' @inherit LazyFrame_tail title
#' @param n Number of rows to return. If a negative value is passed,
#' return all rows except the first [`abs(n)`][abs].
#' @inherit DataFrame_head return
#' @examples
#' df = pl$DataFrame(foo = 1:5, bar = 6:10, ham = letters[1:5])
#'
#' df$tail(3)
#'
#' # Pass a negative value to get all rows except the first `abs(n)`.
#' df$tail(-3)
DataFrame_tail = function(n = 5L) {
if (isTRUE(n < 0)) n = max(0, self$height + n)
self$lazy()$tail(n)$collect()
}
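
For illustration only: a minimal base R sketch of the clamping step `n = max(0, self$height + n)` that `DataFrame_head` and `DataFrame_tail` apply above before delegating to the lazy methods. `resolve_n` is a hypothetical helper that is not part of the package, and a plain `data.frame` stands in for a polars DataFrame.

```r
# Hypothetical helper (not in the package): turn a negative `n` into the
# equivalent non-negative row count, clamped at zero.
resolve_n = function(n, height) {
  if (isTRUE(n < 0)) max(0, height + n) else n
}

# A base R data.frame stands in for a polars DataFrame here.
df = data.frame(foo = 1:5, bar = 6:10)

# head(-3): 5 + (-3) = 2, so the first 2 rows are kept
# (all rows except the last 3).
n_head = resolve_n(-3, nrow(df))
first_rows = df[seq_len(n_head), ]

# tail(-3): also resolves to 2, so the last 2 rows are kept
# (all rows except the first 3).
n_tail = resolve_n(-3, nrow(df))
last_rows = df[seq(to = nrow(df), length.out = n_tail), ]

# A magnitude larger than the height is clamped to zero rows.
clamped = resolve_n(-10, nrow(df))
```

Because the negative value is resolved against `self$height` before the lazy query is built, the same trick is not available on a LazyFrame, whose height is unknown until it is collected.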

2 changes: 0 additions & 2 deletions R/extendr-wrappers.R
@@ -1118,8 +1118,6 @@ RPolarsLazyFrame$select <- function(exprs) .Call(wrap__RPolarsLazyFrame__select,

RPolarsLazyFrame$select_str_as_lit <- function(exprs) .Call(wrap__RPolarsLazyFrame__select_str_as_lit, self, exprs)

RPolarsLazyFrame$limit <- function(n) .Call(wrap__RPolarsLazyFrame__limit, self, n)

RPolarsLazyFrame$tail <- function(n) .Call(wrap__RPolarsLazyFrame__tail, self, n)

RPolarsLazyFrame$filter <- function(expr) .Call(wrap__RPolarsLazyFrame__filter, self, expr)
53 changes: 34 additions & 19 deletions R/lazyframe__lazy.R
@@ -844,24 +844,31 @@ LazyFrame_sink_ndjson = function(
}


#' @title Limit a LazyFrame
#' @inherit DataFrame_limit description params details
#' @return A `LazyFrame`
#' @examples pl$LazyFrame(mtcars)$limit(4)$collect()
LazyFrame_limit = function(n) {
unwrap(.pr$LazyFrame$limit(self, n), "in $limit():")
}

#' @title Head of a LazyFrame
#' @inherit DataFrame_head description params details
#' Get the first `n` rows.
#'
#' @examples pl$LazyFrame(mtcars)$head(4)$collect()
#' A shortcut for [`$slice(0, n)`][LazyFrame_slice].
#' Consider using the [`$fetch()`][LazyFrame_fetch] method if you want to test your query.
#' The [`$fetch()`][LazyFrame_fetch] operation will load the first `n` rows at
#' the scan level, whereas `$head()` is applied at the end.
#'
#' `$limit()` is an alias for `$head()`.
#' @param n Number of rows to return.
#' @inherit LazyFrame_slice return
#' @examples
#' lf = pl$LazyFrame(a = 1:6, b = 7:12)
#'
#' lf$head()$collect()
#'
#' lf$head(2)$collect()
#' @return A new `LazyFrame` object with applied filter.

LazyFrame_head = function(n) {
unwrap(.pr$LazyFrame$limit(self, n), "in $head():")
LazyFrame_head = function(n = 5L) {
result(self$slice(0, n)) |>
unwrap("in $head():")
}

LazyFrame_limit = LazyFrame_head


#' @title Get the first row of a LazyFrame
#' @keywords DataFrame
#' @return A LazyFrame with one row
@@ -1033,6 +1040,7 @@ LazyFrame_reverse = use_extendr_wrapper
#' @title Slice
#' @description Get a slice of the LazyFrame.
#' @inheritParams DataFrame_slice
#' @return A [LazyFrame][LazyFrame_class]
#' @examples
#' pl$LazyFrame(mtcars)$slice(2, 4)$collect()
#' pl$LazyFrame(mtcars)$slice(30)$collect()
@@ -1041,11 +1049,18 @@ LazyFrame_slice = function(offset, length = NULL) {
unwrap(.pr$LazyFrame$slice(self, offset, length), "in $slice():")
}

#' @title Tail of a DataFrame
#' @inherit DataFrame_tail description params details
#' @return A LazyFrame
#' @examples pl$LazyFrame(mtcars)$tail(2)$collect()
LazyFrame_tail = function(n) {
#' Get the last `n` rows.
#'
#' @inherit LazyFrame_head return params
#' @inheritParams LazyFrame_head
#' @seealso [`<LazyFrame>$head()`][LazyFrame_head]
#' @examples
#' lf = pl$LazyFrame(a = 1:6, b = 7:12)
#'
#' lf$tail()$collect()
#'
#' lf$tail(2)$collect()
LazyFrame_tail = function(n = 5L) {
unwrap(.pr$LazyFrame$tail(self, n), "in $tail():")
}

44 changes: 29 additions & 15 deletions R/s3_methods.R
@@ -138,32 +138,46 @@
pl$select(x)[i, , drop = TRUE]
}

#' Take the first n rows
#'
#' @param x A [DataFrame][DataFrame_class] or [LazyFrame][LazyFrame_class]
#' @param n Number of rows
#' @param ... Not used
#'
#' Return the first or the last `n` parts of an object
#'
#' They are equivalent to `$head()` and `$tail()` methods.
#' @param x A polars object
#' @param n An integer vector of length 1.
#' Note that negative values are not supported if `x` is a [LazyFrame][LazyFrame_class].
#' @param ... Ignored
#' @return A polars object of the same class as `x`
#' @seealso
#' - [`<DataFrame>$head()`][DataFrame_head]
#' - [`<LazyFrame>$head()`][LazyFrame_head]
#' - [`<DataFrame>$tail()`][DataFrame_tail]
#' - [`<LazyFrame>$tail()`][LazyFrame_tail]
#' - [`<LazyFrame>$fetch()`][LazyFrame_fetch]
#' @export
#' @rdname S3_head
head.RPolarsDataFrame = function(x, n = 6L, ...) x$limit(n = n)
#' @examples
#' df = pl$DataFrame(foo = 1:5, bar = 6:10, ham = letters[1:5])
#' lf = df$lazy()
#'
#' head(df, 2)
#' tail(df, 2)
#'
#' head(lf, 2)
#' tail(lf, 2)
#'
#' head(df, -2)
#' tail(df, -2)
head.RPolarsDataFrame = function(x, n = 6L, ...) x$head(n = n)

#' @export
#' @rdname S3_head
head.RPolarsLazyFrame = head.RPolarsDataFrame

#' Take the last n rows
#'
#' @param x A [DataFrame][DataFrame_class] or [LazyFrame][LazyFrame_class]
#' @param n Number of rows
#' @param ... Not used
#'
#' @export
#' @rdname S3_tail
#' @rdname S3_head
tail.RPolarsDataFrame = function(x, n = 6L, ...) x$tail(n = n)

#' @export
#' @rdname S3_tail
#' @rdname S3_head
tail.RPolarsLazyFrame = tail.RPolarsDataFrame

#' Get the dimensions
25 changes: 18 additions & 7 deletions man/DataFrame_head.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

24 changes: 0 additions & 24 deletions man/DataFrame_limit.Rd

This file was deleted.

19 changes: 12 additions & 7 deletions man/DataFrame_tail.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

19 changes: 13 additions & 6 deletions man/LazyFrame_head.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 0 additions & 23 deletions man/LazyFrame_limit.Rd

This file was deleted.

3 changes: 3 additions & 0 deletions man/LazyFrame_slice.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
