GH-49866: [R][Release] Restore using tzdb on Windows for tzdata #49867

Merged
amoeba merged 8 commits into apache:main from amoeba:GH-XXXXX/winbuilder-on-mingw-fix
Apr 28, 2026

Conversation

@amoeba
Member

@amoeba amoeba commented Apr 26, 2026

Rationale for this change

Winbuilder is currently failing for the tip of the maint-24.0.0-r branch. #48601 introduced a new mechanism for tzdata resolution for Arrow C++ that requires non-MSVC Windows users to set up the tzdata database on their own.

We need to restore the mechanism we had before, which uses the tzdb package to provide a tzdata database, because all CRAN builds are MinGW (GNU) builds.

This change was cherry-picked onto the R 24.0.0 maint branch to avoid any disruption.

What changes are included in this PR?

  • Puts tzdb back in Suggests so it's available
  • Restores a modified detection mechanism that uses the tzdb package on non-MSVC Windows systems. This covers effectively all R users on Windows, since CRAN builds with MinGW.
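
A minimal sketch of the kind of check the second bullet describes, not the code from this PR. `should_use_tzdb` and its arguments are hypothetical names, and treating `"MSVC"` as the compiler identifier for MSVC builds is an assumption (the CRAN Windows binary reports `"GNU"`).

```r
# Hypothetical helper, for illustration only; not part of arrow's API.
# `compiler` is assumed to be the C++ compiler identifier of the build,
# e.g. "GNU" for MinGW builds. "MSVC" as the MSVC identifier is an assumption.
should_use_tzdb <- function(sysname = Sys.info()[["sysname"]],
                            compiler,
                            tzdb_installed = requireNamespace("tzdb", quietly = TRUE)) {
  # Only non-MSVC Windows builds need tzdb to supply a timezone database,
  # and only if the (Suggested) tzdb package is actually installed.
  identical(sysname, "Windows") &&
    !identical(compiler, "MSVC") &&
    isTRUE(tzdb_installed)
}

should_use_tzdb("Windows", "GNU", TRUE)   # TRUE: MinGW build with tzdb installed
should_use_tzdb("Windows", "MSVC", TRUE)  # FALSE: MSVC builds don't need tzdb
should_use_tzdb("Linux", "GNU", TRUE)     # FALSE: not Windows
```

Passing the platform and compiler in as arguments keeps the check easy to exercise on any machine, which matters here because the failure only shows up on Windows runners without a tzdata directory.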

Are these changes tested?

Yes. Here in CI.

Are there any user-facing changes?

No.

@amoeba
Member Author

amoeba commented Apr 26, 2026

I'm using CI to test a potential fix. c1b8c30 should recreate an environment where the runner doesn't have tzdata and the package can't handle it, so windows-r should fail.

@amoeba
Member Author

amoeba commented Apr 27, 2026

Removing the tzdata install step caused the expected failures:

── Error ('test-dplyr-funcs-datetime.R:3723:3'): timestamp rounding takes place in local time ──
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot locate or parse timezone 'UTC': Timezone database not found at "C:\Users\runneradmin\Downloads\tzdata"
Backtrace:
     ▆
  1. ├─arrow (local) check_timezone_rounding_vs_lubridate(tz_times, ".001 second") at test-dplyr-funcs-datetime.R:3723:3

but also a ton more, such as,

── Error ('test-dplyr-query.R:650:3'): Scalars in expressions match the type of the field, if possible ──
Error: Error: NotImplemented: Function 'greater' has no kernel matching input types (timestamp[us, tz=UTC], string)
── Error ('test-dplyr-funcs-type.R:974:3'): format date/time ───────────────────
Error in `compute.arrow_dplyr_query(x)`: Invalid: Cannot locate or parse timezone 'Pacific/Marquesas': Timezone database not found at "C:\Users\runneradmin\Downloads\tzdata"

So that's fun. Went from 3 identical test failures to 60 total test failures.

See https://github.com/apache/arrow/actions/runs/24970086455/job/73112987124?pr=49867.

This reverts commit c1b8c30.
@amoeba
Member Author

amoeba commented Apr 27, 2026

Testing my potential fix in 89a6ed9. I expect CI to pass.

@jonkeane
Member

Those extra failures look like they are from us setting that funky TZ in CI to confirm we are doing the right thing. But presumably the CRAN machines will have a more reasonable TZ 😝

In seriousness though, if we need to make that timezone something else that matches the TZ databases, that's ok. We tried to find one that had very few people in it (and had an odd-hour shift) but I don't think we've ever had issues where we had a confluence of coincidental timezone matching and a developer stuck, so maybe we don't need to be so cute about that...

@amoeba
Member Author

amoeba commented Apr 27, 2026

Thanks for taking a look. The fix here seems like it worked, at least in our CI. I'll mark this as ready for review and try it on Winbuilder in a bit.

@amoeba amoeba marked this pull request as ready for review April 27, 2026 03:21
This reverts commit 88dc550.
amoeba added a commit that referenced this pull request Apr 27, 2026
Member

@jonkeane jonkeane left a comment

Thanks for this. This looks ok to me, a few small additions to the initialization flow. I might not be familiar enough with the C++ build system for R and Windows to know this off the top of my head, but this will in effect impact all Windows users of the arrow R package, right?

Comment thread r/R/arrow-package.R Outdated
Comment on lines +182 to +183
tzdb::tzdb_initialize()
set_timezone_database(tzdb::tzdb_path("text"))
Member

Should we wrap this in a try/catch and warn if this goes wrong? It would effectively be the same as if they didn't have it, so it might be nice to not block folks from using arrow at all.

Member Author

I think that's a good idea, I'll add it.

Member Author

Done in deb3bc6. I did it as a packageStartupMessage which felt right.
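
For readers following along, a rough sketch of the pattern discussed in this thread: run the configuration step, and on failure emit a `packageStartupMessage` instead of stopping. This is not the code from deb3bc6. `configure_tzdb_safely` and its `configure` argument are hypothetical names, and the message wording is made up for illustration.

```r
# Hypothetical wrapper, for illustration only.
# `configure` is a zero-argument function standing in for the two calls
# under review (tzdb::tzdb_initialize() and set_timezone_database(...)).
configure_tzdb_safely <- function(configure) {
  tryCatch(
    {
      configure()
      TRUE
    },
    error = function(e) {
      # Report the problem without blocking the package from loading.
      packageStartupMessage(
        "Could not configure the timezone database: ", conditionMessage(e)
      )
      FALSE
    }
  )
}

# A failing step is reported as a message and returns FALSE.
configure_tzdb_safely(function() stop("tzdb not available"))

# Intended use, assuming tzdb is installed and arrow's internal
# set_timezone_database() is in scope:
# configure_tzdb_safely(function() {
#   tzdb::tzdb_initialize()
#   set_timezone_database(tzdb::tzdb_path("text"))
# })
```

A startup message (rather than a warning or error) can be silenced with `suppressPackageStartupMessages()`, which fits a failure that only matters once timezone-aware functions are used.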

@github-actions github-actions Bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Apr 27, 2026
amoeba and others added 2 commits April 27, 2026 21:05
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
@amoeba
Member Author

amoeba commented Apr 28, 2026

@github-actions crossbow submit r-binary-packages

@github-actions

Revision: deb3bc6

Submitted crossbow builds: ursacomputing/crossbow @ actions-b895c1045c

Task Status
r-binary-packages GitHub Actions

@amoeba
Member Author

amoeba commented Apr 28, 2026

Hi @jonkeane, Re: your question above: Yes, this essentially affects all Windows users because CRAN builds with MinGW and most Windows users will be using the CRAN binary. With our latest CRAN Windows binary, I get:

> arrow_info()$build_info[[2]]
[1] "GNU"

@amoeba amoeba changed the title GH-49866: [R][Release] Fix failing tests on Winbuilder due to missing tzdata GH-49866: [R][Release] Restore using tzdb on Windows for tzdata Apr 28, 2026
@amoeba
Member Author

amoeba commented Apr 28, 2026

PR title and body updated to better reflect the change.

@amoeba
Member Author

amoeba commented Apr 28, 2026

Crossbow failure is fine to ignore.

Member

@jonkeane jonkeane left a comment

Thanks for this!

@amoeba amoeba merged commit caf7f5b into apache:main Apr 28, 2026
23 of 25 checks passed
@amoeba amoeba removed the awaiting merge Awaiting merge label Apr 28, 2026