ARROW-8024: [R] Bindings for BinaryType and FixedSizeBinaryType
You still can't do much with them yet, but at least you can now create them safely.

Closes #6554 from nealrichardson/binary-type and squashes the following commits:

570b2e6 <Neal Richardson> (manually) update docs
e761c4a <Neal Richardson> Fix doclet
197fb92 <Neal Richardson> Update docs
754d4d2 <Neal Richardson> Add constructor for binary and fixed size binary types; more validation

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
nealrichardson committed Mar 9, 2020
1 parent 7db3855 commit c03f2f6
Showing 7 changed files with 79 additions and 1 deletion.
1 change: 1 addition & 0 deletions r/NAMESPACE
@@ -142,6 +142,7 @@ export(TimeUnit)
export(Type)
export(UnionDataset)
export(arrow_available)
export(binary)
export(bool)
export(boolean)
export(buffer)
4 changes: 4 additions & 0 deletions r/R/arrowExports.R

Some generated files are not rendered by default.

15 changes: 15 additions & 0 deletions r/R/type.R
@@ -140,6 +140,8 @@ Float32 <- R6Class("Float32", inherit = FixedWidthType)
Float64 <- R6Class("Float64", inherit = FixedWidthType)
Boolean <- R6Class("Boolean", inherit = FixedWidthType)
Utf8 <- R6Class("Utf8", inherit = DataType)
Binary <- R6Class("Binary", inherit = DataType)
FixedSizeBinary <- R6Class("FixedSizeBinary", inherit = FixedWidthType)

DateType <- R6Class("DateType",
  inherit = FixedWidthType,
@@ -202,6 +204,9 @@ NestedType <- R6Class("NestedType", inherit = DataType)
#' either "s" or "ms", while `time64()` can be "us" or "ns". `timestamp()` can
#' take any of those four values.
#' @param timezone For `timestamp()`, an optional time zone string.
#' @param byte_width For `binary()`, an optional integer width to create a
#' `FixedSizeBinary` type. The default `NULL` results in a `BinaryType` with
#' variable width.
#' @param precision For `decimal()`, precision
#' @param scale For `decimal()`, scale
#' @param type For `list_of()`, a data type to make a list-of-type
@@ -280,6 +285,16 @@ bool <- boolean
#' @export
utf8 <- function() shared_ptr(Utf8, Utf8__initialize())

#' @rdname data-type
#' @export
binary <- function(byte_width = NULL) {
  if (is.null(byte_width)) {
    shared_ptr(Binary, Binary__initialize())
  } else {
    shared_ptr(FixedSizeBinary, FixedSizeBinary__initialize(byte_width))
  }
}

#' @rdname data-type
#' @export
string <- utf8
7 changes: 7 additions & 0 deletions r/man/data-type.Rd

Some generated files are not rendered by default.

15 changes: 15 additions & 0 deletions r/src/arrowExports.cpp

Some generated files are not rendered by default.

14 changes: 13 additions & 1 deletion r/src/datatype.cpp
@@ -73,6 +73,9 @@ std::shared_ptr<arrow::DataType> Boolean__initialize() { return arrow::boolean()
// [[arrow::export]]
std::shared_ptr<arrow::DataType> Utf8__initialize() { return arrow::utf8(); }

// [[arrow::export]]
std::shared_ptr<arrow::DataType> Binary__initialize() { return arrow::binary(); }

// [[arrow::export]]
std::shared_ptr<arrow::DataType> Date32__initialize() { return arrow::date32(); }

@@ -85,11 +88,20 @@ std::shared_ptr<arrow::DataType> Null__initialize() { return arrow::null(); }
 // [[arrow::export]]
 std::shared_ptr<arrow::DataType> Decimal128Type__initialize(int32_t precision,
                                                             int32_t scale) {
-  return arrow::decimal(precision, scale);
+  // Use the builder that validates inputs
+  std::shared_ptr<arrow::DataType> out;
+  STOP_IF_NOT_OK(arrow::Decimal128Type::Make(precision, scale, &out));
+  return out;
 }

 // [[arrow::export]]
 std::shared_ptr<arrow::DataType> FixedSizeBinary__initialize(int32_t byte_width) {
+  if (byte_width == NA_INTEGER) {
+    Rcpp::stop("'byte_width' cannot be NA");
+  }
+  if (byte_width < 1) {
+    Rcpp::stop("'byte_width' must be > 0");
+  }
   return arrow::fixed_size_binary(byte_width);
 }
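
The order of the two checks in `FixedSizeBinary__initialize` matters: R stores an integer NA as the smallest 32-bit integer, so an NA width would also fail the `< 1` test, only with a less helpful message. A minimal standalone sketch of that logic follows, without Rcpp or Arrow. The names `kNaInteger`, `ValidateByteWidth`, and `ByteWidthError` are hypothetical and exist only for illustration, and `std::invalid_argument` stands in for `Rcpp::stop`.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Assumption: stand-in for R's NA_INTEGER, which R defines as INT_MIN.
constexpr std::int32_t kNaInteger = std::numeric_limits<std::int32_t>::min();

// Hypothetical helper mirroring the checks in FixedSizeBinary__initialize.
// Throws std::invalid_argument where the binding calls Rcpp::stop.
void ValidateByteWidth(std::int32_t byte_width) {
  // Test for NA first: as INT_MIN it would also fail the "< 1" test below,
  // but the caller would then see the misleading "must be > 0" message.
  if (byte_width == kNaInteger) {
    throw std::invalid_argument("'byte_width' cannot be NA");
  }
  if (byte_width < 1) {
    throw std::invalid_argument("'byte_width' must be > 0");
  }
}

// Returns the error message for a width, or an empty string if it is accepted.
std::string ByteWidthError(std::int32_t byte_width) {
  try {
    ValidateByteWidth(byte_width);
  } catch (const std::invalid_argument& e) {
    return e.what();
  }
  return "";
}
```

With this sketch, `ByteWidthError(-1)` and `ByteWidthError(0)` both report the range error, while `ByteWidthError(kNaInteger)` reports the NA error instead.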

24 changes: 24 additions & 0 deletions r/tests/testthat/test-data-type.R
@@ -379,12 +379,36 @@ test_that("DictionaryType validation", {
    dictionary(utf8(), int32()),
    "Dictionary index type should be signed integer, got string"
  )
  expect_error(dictionary(4, utf8()), 'index_type must be a "DataType"')
  expect_error(dictionary(int8(), "strings"), 'value_type must be a "DataType"')
})

test_that("decimal type and validation", {
  expect_error(decimal(), 'argument "precision" is missing, with no default')
  expect_error(decimal("four"), '"precision" must be an integer')
  expect_error(decimal(4), 'argument "scale" is missing, with no default')
  expect_error(decimal(4, "two"), '"scale" must be an integer')
  expect_error(decimal(NA, 2), '"precision" must be an integer')
  expect_error(decimal(0, 2), "Invalid: Decimal precision out of range: 0")
  expect_error(decimal(100, 2), "Invalid: Decimal precision out of range: 100")
  expect_error(decimal(4, NA), '"scale" must be an integer')

  expect_is(decimal(4, 2), "Decimal128Type")

})

test_that("Binary", {
  expect_is(binary(), "Binary")
  expect_equal(binary()$ToString(), "binary")
})

test_that("FixedSizeBinary", {
  expect_is(binary(4), "FixedSizeBinary")
  expect_equal(binary(4)$ToString(), "fixed_size_binary[4]")

  # input validation
  expect_error(binary(NA), "'byte_width' cannot be NA")
  expect_error(binary(-1), "'byte_width' must be > 0")
  expect_error(binary("four"), class = "Rcpp::not_compatible")
  expect_error(binary(c(2, 4)), class = "Rcpp::not_compatible")
})
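
The `Binary` and `FixedSizeBinary` tests above pin down how `binary()` dispatches on `byte_width` and which strings `ToString()` reports. As a rough standalone sketch of that dispatch (not the package's implementation), a null pointer stands in for R's `NULL` default and `BinaryTypeName` is a hypothetical helper:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns the type name that the tests expect from
// ToString(). A null pointer plays the role of R's default byte_width = NULL.
std::string BinaryTypeName(const std::int32_t* byte_width) {
  if (byte_width == nullptr) {
    return "binary";  // variable-width Binary
  }
  // Fixed-width case; width validation is assumed to have happened already.
  return "fixed_size_binary[" + std::to_string(*byte_width) + "]";
}
```

Using one entry point with an optional width keeps the R-facing API small, at the cost of returning two different classes from the same function.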
