[R] Fix tests that assume UTC local tz #28711

Closed
asfimport opened this issue Jun 7, 2021 · 4 comments

@asfimport

Here's the problem I detected while triaging tickets.

This was run locally after merging from apache/arrow at commit 8773b9d and rebuilding both the Arrow library and the Arrow R package.

library(arrow)
#> See arrow_info() for available features
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(testthat)
#> 
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#> 
#>     matches
#> The following object is masked from 'package:arrow':
#> 
#>     matches

tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

expect_equal(
  tstring %>%
    Table$create() %>%
    mutate(
      x = strptime(x, format = "%m-%d-%Y")
    ) %>%
    collect(),
  tstamp,
  check.tzone = FALSE
)
#> Error: `%>%`(...) not equal to `tstamp`.
#> Component "x": Mean absolute difference: 14400

Removing the expectation shows that the two timestamps differ by exactly 4 hours:

library(arrow)
#> See arrow_info() for available features
#> 
#> Attaching package: 'arrow'
#> The following object is masked from 'package:utils':
#> 
#>     timestamp
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(testthat)
#> 
#> Attaching package: 'testthat'
#> The following object is masked from 'package:dplyr':
#> 
#>     matches
#> The following object is masked from 'package:arrow':
#> 
#>     matches

tstring <- tibble(x = c("08-05-2008", NA))
tstamp <- tibble(x = c(strptime("08-05-2008", format = "%m-%d-%Y"), NA))

tstring %>%
  Table$create() %>%
  mutate(
    x = strptime(x, format = "%m-%d-%Y")
  ) %>%
  collect()
#> # A tibble: 2 x 1
#>   x                  
#>   <dttm>             
#> 1 2008-08-04 20:00:00
#> 2 NA

tstamp
#> # A tibble: 2 x 1
#>   x                  
#>   <dttm>             
#> 1 2008-08-05 00:00:00
#> 2 NA

Created on 2021-06-07 by the reprex package (v2.0.0)

Reporter: Mauricio 'Pachá' Vargas Sepúlveda / @pachadotdev
Assignee: Neal Richardson / @nealrichardson
Watchers: Rok Mihevc / @rok

Note: This issue was originally created as ARROW-12994. Please see the migration documentation for further details.

@asfimport

Mauricio 'Pachá' Vargas Sepúlveda / @pachadotdev:

Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.0 (2021-05-18)
 os       Ubuntu 20.04.2 LTS          
 system   x86_64, linux-gnu           
 ui       RStudio                     
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/Santiago            
 date     2021-06-07

Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version    date       lib source        
 arrow       * 4.0.1.9000 2021-06-07 [1] local         
 assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.1.0)
 bit           4.0.4      2020-08-04 [1] CRAN (R 4.1.0)
 bit64         4.0.5      2020-08-30 [1] CRAN (R 4.1.0)
 cachem        1.0.5      2021-05-15 [1] CRAN (R 4.1.0)
 callr         3.7.0      2021-04-20 [1] CRAN (R 4.1.0)
 cli           2.5.0      2021-04-26 [1] CRAN (R 4.1.0)
 crayon        1.4.1      2021-02-08 [1] CRAN (R 4.1.0)
 DBI           1.1.1      2021-01-15 [1] CRAN (R 4.1.0)
 desc          1.3.0      2021-03-05 [1] CRAN (R 4.1.0)
 devtools    * 2.4.1      2021-05-05 [1] CRAN (R 4.1.0)
 dplyr       * 1.0.6      2021-05-05 [1] CRAN (R 4.1.0)
 ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
 fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.0)
 fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.1.0)
 fs            1.5.0      2020-07-31 [1] CRAN (R 4.1.0)
 generics      0.1.0      2020-10-31 [1] CRAN (R 4.1.0)
 glue          1.4.2      2020-08-27 [1] CRAN (R 4.1.0)
 lifecycle     1.0.0      2021-02-15 [1] CRAN (R 4.1.0)
 magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.0)
 memoise       2.0.0      2021-01-26 [1] CRAN (R 4.1.0)
 pillar        1.6.1      2021-05-16 [1] CRAN (R 4.1.0)
 pkgbuild      1.2.0      2020-12-15 [1] CRAN (R 4.1.0)
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)
 pkgload       1.2.1      2021-04-06 [1] CRAN (R 4.1.0)
 prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.1.0)
 processx      3.5.2      2021-04-30 [1] CRAN (R 4.1.0)
 ps            1.6.0      2021-02-28 [1] CRAN (R 4.1.0)
 purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)
 R6            2.5.0      2020-10-28 [1] CRAN (R 4.1.0)
 remotes       2.3.0      2021-04-01 [1] CRAN (R 4.1.0)
 rlang         0.4.11     2021-04-30 [1] CRAN (R 4.1.0)
 rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.1.0)
 rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)
 sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.1.0)
 testthat    * 3.0.2      2021-02-14 [1] CRAN (R 4.1.0)
 tibble        3.1.2      2021-05-16 [1] CRAN (R 4.1.0)
 tidyselect    1.1.1      2021-04-30 [1] CRAN (R 4.1.0)
 usethis     * 2.0.1      2021-02-10 [1] CRAN (R 4.1.0)
 utf8          1.2.1      2021-03-12 [1] CRAN (R 4.1.0)
 vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)
 withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.0)

[1] /home/pacha/R/x86_64-pc-linux-gnu-library/4.1
[2] /usr/local/lib/R/site-library
[3] /usr/lib/R/site-library
[4] /usr/lib/R/library

@asfimport

Nicola Crane / @thisisnic:
@pachadotdev - this test fails locally for you and me but not on the CI, because we are both in non-UTC time zones. The docs for the tz argument of strptime say "A character string specifying the time zone to be used for the conversion. System-specific (see as.POSIXlt), but "" is the current time zone". The default time zone on the CI is UTC, so the tests pass there.

I have submitted a PR that specifies tz = "UTC", which should fix this.
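
To illustrate the explanation above, here is a minimal base R sketch (not the code from the PR). It passes the time zone explicitly instead of relying on the session default, and it assumes the "America/Santiago" zone from the reporter's session info is available in the local time zone database. The variable names are made up for illustration.

```r
# Parse the same string twice: once as local midnight in Santiago,
# once as midnight UTC. With tz = "" (the default), strptime() would
# use whatever time zone the current session happens to be in.
santiago_ts <- as.POSIXct(
  strptime("08-05-2008", format = "%m-%d-%Y", tz = "America/Santiago")
)
utc_ts <- as.POSIXct(
  strptime("08-05-2008", format = "%m-%d-%Y", tz = "UTC")
)

# Both values print as "2008-08-05", but they are different instants.
# The gap in seconds is what an equality check on the underlying
# numeric values would report.
diff_secs <- as.numeric(santiago_ts) - as.numeric(utc_ts)
diff_secs
```

If the expected value in a test is built with tz = "UTC" as well, both sides of the comparison should refer to the same instant regardless of the machine's local time zone.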

@asfimport

Jonathan Keane / @jonkeane:
It looks like we've acquired a few more of these that fail due to local timezone issues. It would be good to clean them up in the same way:

══ Failed ════════════════════════════════════════════════════════════════════════════════════════════════════════════
── 1. Failure (test-dplyr-lubridate.R:153:3): extract hour from date ─────────────────────────────────────────────────
`object` not equivalent to `expected`.
Component “x”: Mean relative difference: 1
Backtrace:
 1. arrow:::expect_dplyr_equal(...) test-dplyr-lubridate.R:153:2
 2. arrow:::expect_equivalent(via_batch, expected, ...) helper-expectation.R:88:4
 3. testthat::expect_equivalent(object, expected, ...) helper-expectation.R:46:2

── 2. Failure (test-dplyr-lubridate.R:153:3): extract hour from date ─────────────────────────────────────────────────
`object` not equivalent to `expected`.
Component “x”: Mean relative difference: 1
Backtrace:
 1. arrow:::expect_dplyr_equal(...) test-dplyr-lubridate.R:153:2
 2. arrow:::expect_equivalent(via_table, expected, ...) helper-expectation.R:98:4
 3. testthat::expect_equivalent(object, expected, ...) helper-expectation.R:46:2

── 3. Failure (test-dplyr-string-functions.R:706:3): strptime ────────────────────────────────────────────────────────
`%>%`(...) not equal to `tstamp`.
Component “x”: Mean absolute difference: 18000
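
As a rough illustration of how an "extract hour" test can depend on the local time zone, the base R sketch below reads the hour of a single instant in two zones. It is only an assumption about the kind of mismatch involved, not the actual test code from test-dplyr-lubridate.R, and the instant and zone names are chosen for illustration.

```r
# One fixed instant, defined unambiguously in UTC.
instant <- as.POSIXct("2008-08-05 00:00:00", tz = "UTC")

# The hour component depends on the zone used to break the instant
# into calendar fields.
hour_utc <- as.POSIXlt(instant, tz = "UTC")$hour
hour_santiago <- as.POSIXlt(instant, tz = "America/Santiago")$hour

c(utc = hour_utc, santiago = hour_santiago)
```

A test that computes the expected hour in the session's local zone and the actual hour in UTC would only agree on machines whose local zone is UTC, which is consistent with these tests passing on the CI.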

@asfimport

Neal Richardson / @nealrichardson:
Issue resolved by pull request 10706
#10706

@asfimport asfimport added this to the 5.0.0 milestone Jan 11, 2023