Series$mean median std var #170

Merged
merged 5 commits into from
Apr 25, 2023
Changes from 2 commits
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Package: polars
Title: Polars ported to R
-Version: 0.5.0.9000
+Version: 0.5.0.9001
Depends: R (>= 4.1.0)
Imports: utils, codetools
Authors@R:
1 change: 1 addition & 0 deletions NEWS.md
@@ -2,6 +2,7 @@

## What's changed
- `DataFrame` objects can be subsetted using brackets like standard R data frames: `pl$DataFrame(mtcars)[2:4, c("mpg", "hp")]` (#140 @vincentarelbundock)
- `Series` gains new methods: `$mean`, `$median`, `$std`, `$var` (#170 @vincentarelbundock)

# polars v0.5.0

13 changes: 8 additions & 5 deletions R/extendr-wrappers.R
@@ -1,7 +1,4 @@
# Generated by extendr: Do not edit by hand

# nolint start

#
# This file was created with the following call:
# .Call("wrap__make_polars_wrappers", use_symbols = TRUE, package_name = "polars")
@@ -975,12 +972,20 @@ Series$append_mut <- function(other) .Call(wrap__Series__append_mut, self, other

Series$apply <- function(robj, rdatatype, strict, allow_fail_eval) .Call(wrap__Series__apply, self, robj, rdatatype, strict, allow_fail_eval)

Series$mean <- function() .Call(wrap__Series__mean, self)

Series$median <- function() .Call(wrap__Series__median, self)

Series$min <- function() .Call(wrap__Series__min, self)

Series$max <- function() .Call(wrap__Series__max, self)

Series$sum <- function() .Call(wrap__Series__sum, self)

Series$std <- function(ddof) .Call(wrap__Series__std, self, ddof)

Series$var <- function(ddof) .Call(wrap__Series__var, self, ddof)

Series$ceil <- function() .Call(wrap__Series__ceil, self)

Series$floor <- function() .Call(wrap__Series__floor, self)
@@ -1015,5 +1020,3 @@ PolarsBackgroundHandle$is_exhausted <- function() .Call(wrap__PolarsBackgroundHa
#' @export
`[[.PolarsBackgroundHandle` <- `$.PolarsBackgroundHandle`


# nolint end
46 changes: 46 additions & 0 deletions R/series__series.R
@@ -653,6 +653,32 @@ Series_cumsum = function(reverse = FALSE) {
#' pl$Series(c(1:2,3,Inf,4,-Inf,5))$sum() # Inf-Inf is NaN
Series_sum = "use_extendr_wrapper"

#' Mean
#' @description Reduce Series with mean
#' @return Series
#' @keywords Series
#' @details
#' Dtypes in {Int8, UInt8, Int16, UInt16} are cast to
#' Int64 before computing the mean to prevent overflow issues.
#' @examples
#' pl$Series(c(1:2,NA,3,5))$mean() # NA is always dropped
#' pl$Series(c(1:2,NA,3,NaN,4,Inf))$mean() # NaN carries / poisons
#' pl$Series(c(1:2,3,Inf,4,-Inf,5))$mean() # Inf-Inf is NaN
Series_mean = "use_extendr_wrapper"
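
The three example comments above (NA dropped, NaN poisons, Inf-Inf is NaN) describe ordinary floating-point behaviour once missing values are skipped. The sketch below illustrates those semantics in plain Rust. `mean_skip_missing` is a hypothetical helper written for this note and is not the polars implementation; it assumes a missing value (R's `NA`) is modelled as `None`.

```rust
/// Mean over optional values: `None` (a missing value) is skipped,
/// while NaN and infinities take part in the arithmetic as usual.
/// Hypothetical helper for illustration only; not how polars computes it.
fn mean_skip_missing(xs: &[Option<f64>]) -> Option<f64> {
    let (sum, count) = xs
        .iter()
        .flatten() // drops the `None` entries, yields `&f64`
        .fold((0.0_f64, 0_usize), |(s, c), x| (s + x, c + 1));
    if count == 0 {
        None // nothing left to average
    } else {
        Some(sum / count as f64)
    }
}

fn main() {
    // Missing value is dropped: (1 + 2 + 3 + 5) / 4
    let dropped = mean_skip_missing(&[Some(1.0), Some(2.0), None, Some(3.0), Some(5.0)]);
    println!("{:?}", dropped);

    // A single NaN makes the whole result NaN.
    let poisoned = mean_skip_missing(&[Some(1.0), Some(f64::NAN), Some(4.0)]);
    println!("{:?}", poisoned);

    // Inf + (-Inf) is NaN, so the mean is NaN as well.
    let inf = mean_skip_missing(&[Some(1.0), Some(f64::INFINITY), Some(f64::NEG_INFINITY)]);
    println!("{:?}", inf);
}
```

Returning `Option<f64>` keeps "no non-missing values" distinct from a NaN result. Whether the real method makes the same distinction is not stated in this diff.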

#' Median
#' @description Reduce Series with median
#' @return Series
#' @keywords Series
#' @details
#' Dtypes in {Int8, UInt8, Int16, UInt16} are cast to
#' Int64 before computing the median to prevent overflow issues.
#' @examples
#' pl$Series(c(1:2,NA,3,5))$median() # NA is always dropped
#' pl$Series(c(1:2,NA,3,NaN,4,Inf))$median() # NaN carries / poisons
#' pl$Series(c(1:2,3,Inf,4,-Inf,5))$median() # Inf and -Inf sort to the ends; the median stays finite
Series_median = "use_extendr_wrapper"

#' max
#' @description Reduce Series with max
#' @return Series
@@ -679,6 +705,26 @@ Series_max = "use_extendr_wrapper"
#' pl$Series(c(1:2,3,Inf,4,-Inf,5))$min() # Inf-Inf is NaN
Series_min = "use_extendr_wrapper"

#' @title Var
#' @description Reduce this Series to its variance.
#' @keywords Series
#' @param ddof integer Delta Degrees of Freedom: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.
#' @return A new `Series` object with applied aggregation.
#' @examples pl$Series(1:10)$var()
Series_var = function(ddof = 1) {
unwrap(.pr$Series$var(self, ddof))
}

#' @title Std
#' @description Reduce this Series to its standard deviation.
#' @keywords Series
#' @param ddof integer Delta Degrees of Freedom: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.
#' @return A new `Series` object with applied aggregation.
#' @examples pl$Series(1:10)$std()
Series_std = function(ddof = 1) {
unwrap(.pr$Series$std(self, ddof))
}
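
The `ddof` parameter documented above sets the divisor to `N - ddof`. A minimal sketch of that formula in plain Rust follows. `variance` and `std_dev` are hypothetical helpers written for this note, not the polars implementation, and returning `None` when `N <= ddof` is an assumption made here rather than documented behaviour.

```rust
/// Variance with divisor `n - ddof`.
/// Hypothetical helper for illustration only; not the polars implementation.
fn variance(xs: &[f64], ddof: u8) -> Option<f64> {
    let n = xs.len();
    let ddof = ddof as usize;
    if n <= ddof {
        return None; // assumption: undefined when the divisor would be <= 0
    }
    let mean = xs.iter().sum::<f64>() / n as f64;
    // Sum of squared deviations from the mean.
    let ss: f64 = xs.iter().map(|x| (x - mean).powi(2)).sum();
    Some(ss / (n - ddof) as f64)
}

/// Standard deviation is the square root of the variance with the same `ddof`.
fn std_dev(xs: &[f64], ddof: u8) -> Option<f64> {
    variance(xs, ddof).map(f64::sqrt)
}

fn main() {
    let xs: Vec<f64> = (1..=10).map(f64::from).collect();
    // ddof = 1 is the sample variance, matching base R's var() and sd().
    println!("{:?}", variance(&xs, 1));
    println!("{:?}", std_dev(&xs, 1));
    // ddof = 0 is the population variance.
    println!("{:?}", variance(&xs, 0));
}
```

With `ddof = 1` the result corresponds to base R's `var()` and `sd()`, which is the comparison the test cases in this PR rely on.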

#' Get data type of Series
#' @keywords Series
#' @return DataType
28 changes: 28 additions & 0 deletions man/Series_mean.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

28 changes: 28 additions & 0 deletions man/Series_median.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 7 additions & 2 deletions man/Series_std.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 7 additions & 2 deletions man/Series_var.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 17 additions & 0 deletions src/rust/src/series.rs
@@ -9,6 +9,7 @@ use crate::handle_type;
use crate::make_r_na_fun;
use crate::rdatatype::RPolarsDataType;
use crate::utils::{r_error_list, r_result_list};
use crate::robj_to;

use crate::conversion_r_to_s::robjname2series;
use crate::conversion_s_to_r::pl_series_to_list;
@@ -453,6 +454,14 @@ impl Series {
// Wrap(self.series.min_as_series().get(0)).into_py(py)
// }

pub fn mean(&self) -> Series {
self.0.mean_as_series().into()
}

pub fn median(&self) -> Series {
self.0.median_as_series().into()
}

pub fn min(&self) -> Series {
self.0.min_as_series().into()
}
@@ -465,6 +474,14 @@ impl Series {
self.0.sum_as_series().into()
}

pub fn std(&self, ddof: u8) -> Series {
self.0.std_as_series(ddof).into()
}

pub fn var(&self, ddof: u8) -> Series {
self.0.var_as_series(ddof).into()
}

pub fn ceil(&self) -> List {
r_result_list(
self.0
21 changes: 21 additions & 0 deletions tests/testthat/test-series.R
@@ -504,3 +504,24 @@ test_that("internal method get_fmt and to_fmt_char", {
c('"foo"', '"bar"')
)
})


make_cases <- function() {
tibble::tribble(
~ .test_name, ~ base,
"mean", mean,
"median", median,
"std", sd,
"var", var,
)
}
patrick::with_parameters_test_that("mean, median, std, var", {
s = pl$Series(rnorm(100))
a = s[[.test_name]]()
# upstream .std_as_series() does not appear to return Series
if (inherits(a, "Series")) a = a$to_vector()
b = base(s$to_vector())
expect_equal(a, b)
},
.cases = make_cases()
)