Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-3614: [R] Support for timestamps #2887

Closed
wants to merge 5 commits into from

Conversation

javierluraschi
Copy link
Contributor

Support for TimeStamp in R bindings: https://issues.apache.org/jira/browse/ARROW-3614

CC: @romainfrancois

@wesm
Copy link
Member

wesm commented Nov 8, 2018

Some unit tests to add here?

@javierluraschi
Copy link
Contributor Author

javierluraschi commented Nov 9, 2018

I could fix https://issues.apache.org/jira/projects/ARROW/issues/ARROW-3702, which will give us unit tests for this as well. Including fix for ARROW-3702. CC: @romainfrancois

@codecov-io
Copy link

Codecov Report

Merging #2887 into master will increase coverage by 0.96%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2887      +/-   ##
==========================================
+ Coverage   86.22%   87.18%   +0.96%     
==========================================
  Files         490      422      -68     
  Lines       68599    63439    -5160     
==========================================
- Hits        59147    55310    -3837     
+ Misses       9358     8129    -1229     
+ Partials       94        0      -94
Impacted Files Coverage Δ
rust/src/record_batch.rs
go/arrow/array/table.go
rust/src/array.rs
go/arrow/math/uint64_amd64.go
go/arrow/internal/testing/tools/bool.go
go/arrow/array/bufferbuilder.go
go/arrow/internal/bitutil/bitutil.go
go/arrow/memory/memory_avx2_amd64.go
go/arrow/array/null.go
go/arrow/datatype_nested.go
... and 59 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14b8aa7...8c915ed. Read the comment docs.

@wesm
Copy link
Member

wesm commented Nov 9, 2018

Yes that sounds good to me

@wesm
Copy link
Member

wesm commented Nov 10, 2018

I just merged #2928, so I think this needs to be rebased, and then in this PR maybe we can sort out the POSIXct issue

@javierluraschi
Copy link
Contributor Author

K, rebased.

@romainfrancois
Copy link
Contributor

+1 thanks

@@ -230,13 +230,13 @@ test_that("array supports Date (ARROW-3340)", {
test_that("array supports POSIXct (ARROW-3340)", {
times <- lubridate::ymd_hms("2018-10-07 19:04:05") + 1:10
a <- array(times)
expect_equal(a$type(), date64())
expect_equal(a$type(), timestamp(unit = TimeUnit$NANO))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? Why not use the native unit of time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since timestamp() does not define a default: https://github.com/apache/arrow/blob/master/r/R/R6.R#L319. Or, maybe I'm misunderstanding what you mean by native?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you converting the representation of times to nanoseconds? I thought POSIXct was represented internally as seconds

@romainfrancois
Copy link
Contributor

It is seconds, but represented as a double.

@wesm
Copy link
Member

wesm commented Nov 11, 2018

I see. It might make sense for the output to be second resolution anyway to reflect the intent of the data type

@javierluraschi
Copy link
Contributor Author

Makes sense to use TimeUnit::Second in R by default, changing.

@romainfrancois
Copy link
Contributor

As long as we're fine losing sub second information

@wesm
Copy link
Member

wesm commented Nov 12, 2018

Is the sub-second information supported? I was googling around trying to find some documentation

If people actually use the subsecond information, it might make sense to cast to microseconds

@romainfrancois
Copy link
Contributor

> str(Sys.time() - Sys.time())
 'difftime' num -9.05990600585938e-06
 - attr(*, "units")= chr "secs"

So you have 9e-6 seconds difference

@javierluraschi
Copy link
Contributor Author

javierluraschi commented Nov 12, 2018

Something else to consider... Spark only supports microsecond timestamps, so someone needs to make a conversion somewhere, see arrow/ArrowUtils.scala.

I can send a PR to Spark; however, from spark/pull/22913, Spark is trying to minimize conversions in Java, which are probably slower than in Rcpp. So I would still need a way to convert to microseconds somewhere.

@wesm
Copy link
Member

wesm commented Nov 12, 2018

OK. How about doing microseconds here, then? I think that's a reasonable compromise

@javierluraschi
Copy link
Contributor Author

Tests updated. The default was already microseconds, but the test was only validating the type, not the unit.

@javierluraschi
Copy link
Contributor Author

Oh, sorry, I wasn't aware the decimal issue was fixed already in that PR, let me know if we want to revert this last fix, but shouldn't hurt to take both.

@romainfrancois
Copy link
Contributor

That shouldn’t be necessary, esp that i’m not sure the decimal pr will be merged soon. It’s still very experimental.

@javierluraschi
Copy link
Contributor Author

Can we merge this one?

@romainfrancois
Copy link
Contributor

romainfrancois commented Nov 14, 2018

+1

I don't have the rights to do it yet, and I guess I need some sort of training on the process used, but yeah LGTM.

@wesm
Copy link
Member

wesm commented Nov 14, 2018

I'll merge it. @romainfrancois you need to get set up on https://gitbox.apache.org/. I know you have an ASF account but not sure if @jacques-n added you to the committer roster yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants