-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-3614: [R] Support for timestamps #2887
Conversation
Some unit tests to add here? |
I could fix https://issues.apache.org/jira/projects/ARROW/issues/ARROW-3702, which will give us unit tests for this as well. Including fix for ARROW-3702. CC: @romainfrancois |
6092463
to
1f22f12
Compare
Codecov Report
@@ Coverage Diff @@
## master #2887 +/- ##
==========================================
+ Coverage 86.22% 87.18% +0.96%
==========================================
Files 490 422 -68
Lines 68599 63439 -5160
==========================================
- Hits 59147 55310 -3837
+ Misses 9358 8129 -1229
+ Partials 94 0 -94 Continue to review full report at Codecov.
|
Yes that sounds good to me |
I just merged #2928, so I think this needs to be rebased, and then in this PR maybe we can sort out the POSIXct issue |
8c915ed
to
bc0aa37
Compare
K, rebased. |
+1 thanks |
r/tests/testthat/test-Array.R
Outdated
@@ -230,13 +230,13 @@ test_that("array supports Date (ARROW-3340)", { | |||
test_that("array supports POSIXct (ARROW-3340)", { | |||
times <- lubridate::ymd_hms("2018-10-07 19:04:05") + 1:10 | |||
a <- array(times) | |||
expect_equal(a$type(), date64()) | |||
expect_equal(a$type(), timestamp(unit = TimeUnit$NANO)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right? Why not use the native unit of time
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since timestamp()
does not define a default: https://github.com/apache/arrow/blob/master/r/R/R6.R#L319. Or, maybe I'm misunderstanding what you mean by native?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you converting the representation of times
to nanoseconds? I thought POSIXct was represented internally as seconds
It is seconds, but represented as a double. |
I see. It might make sense for the output to be second resolution anyway to reflect the intent of the data type |
Makes sense to use |
As long as we're fine losing sub second information |
Is the sub-second information supported? I was googling around trying to find some documentation If people actually use the subsecond information, it might make sense to cast to microseconds |
> str(Sys.time() - Sys.time())
'difftime' num -9.05990600585938e-06
- attr(*, "units")= chr "secs" So you have 9e-6 seconds difference |
Something else to consider... Spark only supports microsecond timestamps, so someone needs to make a conversion somewhere, see arrow/ArrowUtils.scala. I can send a PR to Spark; however, from spark/pull/22913, Spark is trying to minimize conversions in Java, which are probably slower than in Rcpp. So I would still need a way to convert to microseconds somewhere. |
OK. How about doing microseconds here, then? I think that's a reasonable compromise |
Tests updated. The default was already microseconds, but the test was only validating the type, not the unit. |
Oh, sorry, I wasn't aware the decimal issue was fixed already in that PR, let me know if we want to revert this last fix, but shouldn't hurt to take both. |
That shouldn’t be necessary, esp that i’m not sure the decimal pr will be merged soon. It’s still very experimental. |
Can we merge this one? |
+1 I don't have the rights to do it yet, and I guess I need some sort of training on the process used, but yeah LGTM. |
I'll merge it. @romainfrancois you need to get set up on https://gitbox.apache.org/. I know you have an ASF account but not sure if @jacques-n added you to the committer roster yet |
Support for
TimeStamp
in R bindings: https://issues.apache.org/jira/browse/ARROW-3614CC: @romainfrancois