refactor!: remove deprecated functions for version 0.13.0 #714

Merged
merged 4 commits into from
Jan 20, 2024
3 changes: 3 additions & 0 deletions NEWS.md
Expand Up @@ -4,6 +4,9 @@

### Breaking changes

- Deprecated functions from 0.12.x are removed (#714).
- `<Expr>$apply()` and `<Expr>$map()`, use `$map_elements()` and `$map_batches()` instead.
- `pl$polars_info()`, use `polars_info()` instead.
- The environment variables used when building the library have been changed. (#693)
This only affects selecting the feature flag and selecting profiles during source installation.
- `RPOLARS_PROFILE` is renamed to `LIBR_POLARS_PROFILE`
Expand Down
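For readers migrating existing code, the renames listed in the NEWS entry can be sketched as follows. This is a minimal illustration, assuming r-polars 0.13.0 or later is installed; the `mtcars` columns, the aliases `kml` and `cyl2`, and the lambda bodies are illustrative choices, not part of this PR.

```r
library(polars) # assumption: r-polars >= 0.13.0

df = pl$DataFrame(mtcars)

# Removed: pl$col("mpg")$map(...)
# Replacement: $map_batches(), which receives the whole column as a Series
batches = df$select(
  pl$col("mpg")$map_batches(\(s) s * 0.43)$alias("kml")
)

# Removed: pl$col("cyl")$apply(...)
# Replacement: $map_elements(), which receives one element at a time
elements = df$select(
  pl$col("cyl")$map_elements(\(x) x * 2)$alias("cyl2")
)

# Removed: pl$polars_info()
# Replacement: the top-level function polars_info()
info = polars_info()
```

Calling any of the removed methods after upgrading should fail rather than warn, because the deprecation shims are deleted in this PR.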
45 changes: 9 additions & 36 deletions R/expr__expr.R
Expand Up @@ -647,21 +647,23 @@ construct_ProtoExprArray = function(...) {
#' up some slow R functions as they can run in parallel R sessions. The
#' communication speed between processes is quite slower than between threads.
#' This will likely only give a speed-up in a "low IO - high CPU" use case.
#' If there are multiple `$map(in_background = TRUE)` calls in the query, they
#' will be run in parallel.
#' If there are multiple [`$map_batches(in_background = TRUE)`][Expr_map_batches]
#' calls in the query, they will be run in parallel.
#'
#' @return Expr
#' @details
#' It is sometimes necessary to apply a specific R function on one or several
#' columns. However, note that using R code in `$map()` is slower than native
#' polars. The user function must take one polars `Series` as input and the return
#' columns. However, note that using R code in [`$map_batches()`][Expr_map_batches]
#' is slower than native polars.
#' The user function must take one polars `Series` as input and the return
#' should be a `Series` or any Robj convertible into a `Series` (e.g. vectors).
#' Map fully supports `browser()`.
#'
#' If `in_background = FALSE` the function can access any global variable of the
#' R session. However, note that several calls to `$map()` will sequentially
#' share the same main R session, so the global environment might change between
#' the start of the query and the moment a `map()` call is evaluated. Any native
#' R session. However, note that several calls to [`$map_batches()`][Expr_map_batches]
#' will sequentially share the same main R session,
#' so the global environment might change between the start of the query and the moment
#' a [`$map_batches()`][Expr_map_batches] call is evaluated. Any native
#' polars computations can still be executed meanwhile. If `in_background = TRUE`,
#' the map will run in one or more other R sessions and will not have access
#' to global variables. Use `pl$set_options(rpool_cap = 4)` and `pl$options$rpool_cap`
Expand Down Expand Up @@ -716,18 +718,6 @@ Expr_map_batches = function(f, output_type = NULL, agg_list = FALSE, in_backgrou
unwrap("in $map_batches():")
}

Expr_map = function(f, output_type = NULL, agg_list = FALSE, in_background = FALSE) {
warning("$map() is deprecated and will be removed in 0.13.0. Use $map_batches() instead.", call. = FALSE)
if (isTRUE(in_background)) {
out = .pr$Expr$map_batches_in_background(self, f, output_type, agg_list)
} else {
out = .pr$Expr$map_batches(self, f, output_type, agg_list)
}

out |>
unwrap("in $map():")
}

#' Map a custom/user-defined function (UDF) to each element of a column
#'
#' The UDF is applied to each element of a column. See Details for more information
Expand Down Expand Up @@ -886,23 +876,6 @@ Expr_map_elements = function(f, return_type = NULL, strict_return_type = TRUE, a
unwrap("in $map_elements():")
}

Expr_apply = function(f, return_type = NULL, strict_return_type = TRUE,
allow_fail_eval = FALSE, in_background = FALSE) {
warning("$apply() is deprecated and will be removed in 0.13.0. Use $map_elements() instead.", call. = FALSE)
if (in_background) {
return(.pr$Expr$map_elements_in_background(self, f, return_type))
}

# use series apply
wrap_f = function(s) {
s$map_elements(f, return_type, strict_return_type, allow_fail_eval)
}

# return expression from the functions above, activate agg_list (grouped mapping)
.pr$Expr$map_batches(self, lambda = wrap_f, output_type = return_type, agg_list = TRUE) |>
unwrap("in $apply():")
}

#' Create a literal value
#'
#' @param x A vector of any length
Expand Down
4 changes: 0 additions & 4 deletions R/info.R
Expand Up @@ -19,10 +19,6 @@ polars_info = function() {
structure(out, class = "polars_info")
}

pl_polars_info = function() {
warning("pl$polars_info() is deprecated and will be removed in 0.13.0. Use polars_info() instead.", call. = FALSE)
polars_info()
}

#' @noRd
#' @export
Expand Down
6 changes: 3 additions & 3 deletions R/lazyframe__lazy.R
Expand Up @@ -438,16 +438,16 @@ LazyFrame_collect = function(
#' It is useful to not block the R session while query executes. If you use
#' [`<Expr>$map_batches()`][Expr_map_batches] or
#' [`<Expr>$map_elements()`][Expr_map_elements] to run R functions in the query,
#' then you must pass `in_background = TRUE` in `$map_batches()` (or
#' `$map_elements()`). Otherwise, `$collect_in_background()` will fail because
#' then you must pass `in_background = TRUE` in [`$map_batches()`][Expr_map_batches] (or
#' [`$map_elements()`][Expr_map_elements]). Otherwise, `$collect_in_background()` will fail because
#' the main R session is not available for polars execution. See also examples
#' below.
#'
#' @keywords LazyFrame DataFrame_new
#' @return RThreadHandle, a future-like thread handle for the task
#' @examples
#' # Some expression which does contain a map
#' expr = pl$col("mpg")$map(
#' expr = pl$col("mpg")$map_batches(
#' \(x) {
#' Sys.sleep(.1)
#' x * 0.43
Expand Down
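The roxygen example above is cut off by the collapsed diff. A self-contained sketch of the pattern it describes might look like this, assuming r-polars 0.13.0 or later and that the returned `RThreadHandle` exposes `$is_finished()` and `$join()`; the alias `kml` and the variable names are illustrative.

```r
library(polars) # assumption: r-polars >= 0.13.0

# An expression that runs R code; in_background = TRUE is required here
# because the main R session is not available to a background collect.
expr = pl$col("mpg")$map_batches(
  \(x) {
    Sys.sleep(.1)
    x * 0.43
  },
  in_background = TRUE
)$alias("kml")

# Start the query without blocking the R session
handle = pl$LazyFrame(mtcars)$select(expr)$collect_in_background()

# Optionally poll the handle, then block until the result is ready
if (!handle$is_finished()) message("query still running")
res = handle$join()
```

With `in_background = FALSE` in the expression, the same `$collect_in_background()` call is documented to raise an error.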
8 changes: 5 additions & 3 deletions R/options.R
Expand Up @@ -38,7 +38,8 @@ polars_optreq$rpool_cap = list() # rust-side options already check args
#' spawned in pool. `pl$options$rpool_cap` indicates the maximum number of new R
#' sessions that can be spawned. Anytime a polars thread worker needs a background
#' R session specifically to run R code embedded in a query via
#' `$map_batches(..., in_background = TRUE)` or `$map_elements(..., in_background = TRUE)`, it
#' [`$map_batches(..., in_background = TRUE)`][Expr_map_batches] or
#' [`$map_elements(..., in_background = TRUE)`][Expr_map_elements], it
#' will obtain any R session idling in rpool, or spawn a new R session (process)
#' and add it to the rpool if `rpool_cap` is not already reached. If `rpool_cap`
#' is already reached, the thread worker will sleep until an R session is idling.
Expand Down Expand Up @@ -300,8 +301,9 @@ pl_with_string_cache = function(expr) {
#' processes. `pl$options$rpool_active` is the number of R sessions are already spawned
#' in the pool. `rpool_cap` is the limit of new R sessions to spawn. Anytime a polars
#' thread worker needs a background R session specifically to run R code embedded
#' in a query via `$map(..., in_background = TRUE)` or
#' `$map_elements(..., in_background = TRUE)`, it will obtain any R session idling in
#' in a query via [`$map_batches(..., in_background = TRUE)`][Expr_map_batches]
#' or [`$map_elements(..., in_background = TRUE)`][Expr_map_elements],
#' it will obtain any R session idling in
#' rpool, or spawn a new R session (process) if `capacity`
#' is not already reached. If `capacity` is already reached, the thread worker
#' will sleep and in a R job queue until an R session is idle.
Expand Down
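As a rough illustration of the pool behaviour described in these docs, the sketch below caps the pool at two background R sessions and runs two background maps in one query. It assumes r-polars 0.13.0 or later; the helper `slow_double`, the cap value, and the column choices are hypothetical.

```r
library(polars) # assumption: r-polars >= 0.13.0

# Allow at most two background R sessions in the pool
pl$set_options(rpool_cap = 2)

# Hypothetical slow R function applied to a whole column (a Series)
slow_double = \(s) {
  Sys.sleep(0.1)
  s * 2
}

# Two background maps in one query; each needs a background R session
out = pl$LazyFrame(mtcars)$select(
  pl$col("mpg")$map_batches(slow_double, in_background = TRUE)$alias("mpg2"),
  pl$col("hp")$map_batches(slow_double, in_background = TRUE)$alias("hp2")
)$collect()

# Inspect the configured cap and the number of sessions spawned so far
cap = pl$options$rpool_cap
active = pl$options$rpool_active
```

If more background maps were requested than `rpool_cap` allows, the docs state that the extra thread workers wait until a session becomes idle.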
4 changes: 2 additions & 2 deletions R/rbackground.R
Expand Up @@ -48,11 +48,11 @@ print.RPolarsRThreadHandle = function(x, ...) as.character(x) |> cat("\n")
#'
#' NOTICE:
#' The background thread cannot use the main R session, but can access the pool of extra R sessions
#' to process R code embedded in polars query via `$map_batches(..., background = TRUE)` or
#' to process R code embedded in polars query via [`$map_batches(..., in_background = TRUE)`][Expr_map_batches] or
#' `$map_elements(background=TRUE)`. Use [`pl$set_options(rpool_cap = XX)`][pl_set_options] to limit number of
#' parallel R sessions.
#' Starting polars [`<LazyFrame>$collect_in_background()`][LazyFrame_collect_in_background] with
#' e.g. some `$map_batches(..., background = FALSE)` will raise an Error as the main R session is not
#' e.g. some [`$map_batches(..., in_background = FALSE)`][Expr_map_batches] will raise an Error as the main R session is not
#' available to process the R part of the polars query. Native polars query does not need any R
#' session.
#' @return see methods:
Expand Down
8 changes: 0 additions & 8 deletions R/series__series.R
Expand Up @@ -363,14 +363,6 @@ Series_map_elements = function(
) |> unwrap("in $map_elements():")
}

Series_apply = function(f, datatype = NULL, strict_return_type = TRUE,
allow_fail_eval = FALSE) {
warning("$apply() is deprecated and will be removed in 0.13.0. Use $map_elements() instead.")
Series_map_elements(f,
datatype = datatype, strict_return_type = strict_return_type,
allow_fail_eval = allow_fail_eval
)
}

#' Series_len
#' @description Length of this Series.
Expand Down
16 changes: 9 additions & 7 deletions man/Expr_map_batches.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/LazyFrame_collect_in_background.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/RThreadHandle_class.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 3 additions & 2 deletions man/global_rpool_cap.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion man/pl_options.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/pl_pl.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
