ARROW-12994: [R] stringr tests fails on non-UTC machines due to strptime defaulting to local timezone and Arrow defaulting to UTC [WIP] #10495
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
or
See also: |
I'm wary of hardcoding UTC in our tests because it opens up all kinds of ways where we might do the wrong thing if you're not on UTC. We've had bug reports about this in the past, and the problem literally was that we were hardcoding UTC in the data translation layer itself. Here are some tests I added at that time to test all of the combinations: timezone-naive, UTC, and timezone specified but definitely not your timezone (hence Pyongyang): https://github.com/apache/arrow/blob/master/r/tests/testthat/test-Array.R#L253-L292 I'm not sure that we can do the exact same thing here, but maybe we can. One of the bugs I recall seeing (aside from the UTC hardcoding) was that R and Arrow appeared to be inconsistent with timezones, but the inconsistency was entirely in the print method: R localizes timezone-naive POSIXt data when it prints it, whereas Arrow just prints it as-is, which matches R only if you're in UTC. If there are limitations or incompatibilities in what Arrow's strptime function does, let's be sure to document them and/or link to JIRAs for addressing the issues.
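For readers following along, here is a minimal base R sketch of the two effects described in this comment: parsing a timezone-naive string uses the session timezone unless `tz` is given, and the same instant is displayed differently depending on the timezone used for formatting. It does not call arrow at all; the sample string and variable names are illustrative assumptions, not taken from the PR.

```r
# A timezone-naive string, as in a strptime-style test (sample value is an assumption).
x <- "2021-06-07 12:00:00"
fmt <- "%Y-%m-%d %H:%M:%S"

# Base R: with no tz argument, the string is interpreted in the session timezone.
local_time <- as.POSIXct(x, format = fmt)
# Interpreting the same string as UTC, which is what the PR title says Arrow defaults to.
utc_time <- as.POSIXct(x, format = fmt, tz = "UTC")

# The two instants differ by the session's UTC offset,
# so they agree only on a machine whose timezone is UTC.
offset_hours <- as.numeric(difftime(local_time, utc_time, units = "hours"))

# Display is a separate matter: one instant, two renderings.
shown_utc <- format(utc_time, fmt, tz = "UTC")            # "2021-06-07 12:00:00"
shown_kst <- format(utc_time, fmt, tz = "Asia/Pyongyang") # "2021-06-07 21:00:00"
```

On a UTC machine `offset_hours` is 0, which is why a test that compares the two interpretations can pass in CI and still fail on a contributor's laptop.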
This ticket has been opened about the C++ function: https://issues.apache.org/jira/browse/ARROW-12820 Do I need to do anything else here, @nealrichardson, or should I just close this PR?
Isn't the issue here that R's
Those are general tests about timezone handling (e.g. when roundtripping between R <-> Arrow), I think? While here it is a specific test about the behaviour of the
That ticket is about not ignoring the timezone if it is present in the string. The current test you are modifying here doesn't have that, so I think this is unrelated to ARROW-12820.
I guess if the aim here is to write R bindings which mimic the expected behaviour of the R function itself, the best solution would be to adapt the R binding to use the machine timezone.
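As a rough illustration of what "use the machine timezone" could mean, the base R sketch below resolves the session timezone and passes it explicitly when parsing. Both `session_tz()` and `parse_in_session_tz()` are hypothetical helpers written for this example; they are not part of arrow and are not how the binding is actually implemented.

```r
# Hypothetical helper (not part of arrow): find the timezone base R
# would use when no tz argument is supplied.
session_tz <- function() {
  tz <- Sys.getenv("TZ", unset = NA)
  if (is.na(tz) || tz == "") tz <- Sys.timezone()
  # Fall back to UTC if the system timezone cannot be determined.
  if (is.na(tz) || tz == "") "UTC" else tz
}

# Hypothetical helper: parse a timezone-naive string in the session timezone,
# making explicit what base R does implicitly when tz is omitted.
parse_in_session_tz <- function(x, format) {
  as.POSIXct(x, format = format, tz = session_tz())
}

fmt <- "%Y-%m-%d %H:%M:%S"
explicit <- parse_in_session_tz("2021-06-07 12:00:00", fmt)
implicit <- as.POSIXct("2021-06-07 12:00:00", format = fmt)
same_instant <- as.numeric(explicit) == as.numeric(implicit)
```

A test written against a helper like this compares like with like on any machine, without hardcoding UTC.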
Just chatted with Nic about this, and if we want to follow the R |
Closing in favor of #10706