ARROW-3628: [R] Expose Decimal128Array using vctrs #2845
I created a JIRA for this and updated the issue title.
I opened https://issues.apache.org/jira/browse/ARROW-3747. We should probably go ahead and do that if it causes no problems, as it will make things easier here.
Please, yes. It was very confusing that I could not just use the 16 bytes of an Rcomplex and of an arrow::Decimal128 interchangeably. The ComplexVector is indeed used as a host for a contiguous array of things that each occupy 16 bytes. That's the same workaround the bit64 📦 uses to represent 64-bit integers hosted in a double vector.
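To make the "host" idea concrete, here is a minimal base-R sketch. It only illustrates that one complex element carries 16 bytes and that an arbitrary 16-byte payload survives a round trip through a complex vector. It is not the code from this PR, and the payload (the integer 12345 written as a little-endian 128-bit value) is an illustrative assumption rather than Arrow's actual conversion path.

```r
# One complex element is a pair of doubles: 2 * 8 = 16 bytes.
z <- complex(real = 1, imaginary = 2)
z_bytes <- writeBin(z, raw())
length(z_bytes)

# Hypothetical payload: 12345 (0x3039) as a little-endian 128-bit integer.
payload <- as.raw(c(0x39, 0x30, rep(0, 14)))

# Reinterpret the 16 bytes as a single complex element (the "host").
host <- readBin(payload, what = "complex", n = 1, endian = "little")

# The numeric value of `host` is meaningless; only its bytes matter.
round_trip <- writeBin(host, raw(), endian = "little")
identical(round_trip, payload)
```

The complex value itself should never be printed or computed with; as with bit64, every operation has to go through methods that reinterpret the underlying bytes.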
OK, if you rebase then things ought to work better.
Sorry, you'll need to wait for #2940 to be merged.
This still needs some care about:
but the basics are in place, and it works more naturally with #2940:
library(arrow)
a <- array(1:10, new_decimal128())
a
#> arrow::Decimal128Array
#> [
#> 1,
#> 2,
#> 3,
#> 4,
#> 5,
#> 6,
#> 7,
#> 8,
#> 9,
#> 10
#> ]
a$as_vector()
#> <arrow_decimal128[10]>
#> [1] 1 2 3 4 5 6 7 8 9 10
Created on 2018-11-12 by the reprex package (v0.2.1.9000)
Maybe a question for @hadley: what's the NA strategy for custom vectors?
I would be inclined to use something binary compatible with the arrow bitmap buffers, hosted in e.g. a raw vector, but I'm not sure. In this example, we'd have a field
bitmap <- function(n) {
  # each byte (element of a raw vector) gives
  # 8 values
  nbytes <- ceiling(n / 8)
  new_vctr(raw(nbytes), n = n, class = "bitmap")
}
vec_size.bitmap <- function(x) {
  attr(x, "n")
}
# ... other methods to fool vctrs
The bitmap could also be an ALTREP logical vector, so that as far as R is concerned it's just a regular logical vector, but internally its memory comes from the arrow::Buffer. We can do that in R 3.6 with wch/r-source@b60247b. The advantage is that vctrs would just see this as a logical vector; the downside is that it may be materialized into a true contiguous logical vector (i.e. a vector of int) if
The alternative is to use some sort of sentinel here too; this is what e.g. integer64 does in bit64.
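As a rough illustration of the sentinel approach, the base-R sketch below reserves one bit pattern to mean "missing" and detects it by comparing bytes. It assumes, as I understand bit64 to do, that the reserved pattern is the smallest 64-bit signed integer; the helper `is_sentinel()` is hypothetical and not part of bit64 or arrow.

```r
# Bit pattern of the smallest 64-bit signed integer, little-endian bytes.
# Assumption: this is the pattern reserved as the missing-value sentinel.
int64_min_bytes <- as.raw(c(0, 0, 0, 0, 0, 0, 0, 0x80))

# Hosted in a double, that pattern happens to read as negative zero,
# so numeric comparison cannot tell it apart from an ordinary 0.
host <- readBin(int64_min_bytes, what = "double", n = 1, endian = "little")
host == 0

# Hypothetical helper: detect the sentinel by bytes, not by value.
is_sentinel <- function(x) {
  vapply(
    x,
    function(v) identical(writeBin(v, raw(), endian = "little"), int64_min_bytes),
    logical(1)
  )
}

flags <- is_sentinel(c(host, 0, 1))
flags
```

The trade-off against a bitmap is that a sentinel needs no extra buffer but gives up one representable value, and a 128-bit decimal hosted in a complex would need its own reserved 16-byte pattern.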
🤔 Maybe not as a field, but as another attribute that holds buffer + offset. I'll try that.
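A small base-R sketch of what a bit-packed validity buffer with an offset could look like, assuming the least-significant-bit-first layout that Arrow validity bitmaps use. The helpers `pack_validity()` and `unpack_validity()` are hypothetical names for illustration; they are not arrow or vctrs functions.

```r
# Pack a logical validity vector into bytes, 8 values per byte,
# least-significant bit first; the last byte is padded with FALSE.
pack_validity <- function(valid) {
  n <- length(valid)
  padded <- c(valid, rep(FALSE, (8 - n %% 8) %% 8))
  packBits(padded, type = "raw")
}

# Read `n` validity bits starting `offset` bits into the buffer.
unpack_validity <- function(bytes, n, offset = 0L) {
  bits <- rawToBits(bytes) == as.raw(1L)
  bits[offset + seq_len(n)]
}

valid <- c(TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)
bytes <- pack_validity(valid)
bytes                                      # fd 02

all_bits <- unpack_validity(bytes, length(valid))
tail_bits <- unpack_validity(bytes, 3L, offset = 7L)  # elements 8 to 10
```

Keeping the offset next to the buffer means a slice of an array can share the parent's bitmap without copying or re-packing bits.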
Personally, I'd recommend either holding off on Decimal128 coercion for now, or putting it in the simplest possible structure that an expert could still compute with. There is no tooling available in R to work with numbers of this type, so there's no reason to invest in them at this time (beyond what is needed to facilitate a round trip between arrow and R).
Makes sense, having a minimum structure to work with for now is fine, and then we can leave things like
What we have now can host the value buffer in a
I'll see what I can do for the nulls, but yeah, I don't want to spend too much time on this until there's a need.
Author: Kouhei Sutou <kou@clear-code.com> Closes apache#4069 from kou/release-fix-typo-in-mail-template and squashes the following commits: 15dc3c8 <Kouhei Sutou> Fix typos in vote e-mail template
* Use jq in local * Ensure making variables for "for" function local for parallel processing Author: Kouhei Sutou <kou@clear-code.com> Closes apache#4070 from kou/release-improve-binary-upload-performance and squashes the following commits: 2463d3c <Kouhei Sutou> Improve 03-binary performance
…dapt to apache#2940 , ARROW-3747
…teger64 and arrow_decimal128
./r/lint.sh --fix
Closing until a consensus about this feature can be reached.
This is a WIP for holding decimal types in a vctrs record, which I am submitting for initial feedback.
I'm probably doing something wrong, because the ToString method gives me nonsense:
The basic idea is to host a Decimal128 (which takes 128 bits) in a complex (which also takes 128 bits), so that copies are minimized (eventually, once we use ALTREP).