Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-45505: [CI][R] Use Ubuntu 22.04 instead of 20.04 as much as possible for nightly jobs #45507

Merged
merged 5 commits into from
Feb 19, 2025

Conversation

kou
Copy link
Member

@kou kou commented Feb 12, 2025

Rationale for this change

Ubuntu 20.04 will reach EOL on 2025-05.

What changes are included in this PR?

  • Use Ubuntu 22.04 instead of Ubuntu 20.04 for Apache Arrow C++ 4.0.0 or later.
  • Keep using Ubuntu 20.04 for Apache Arrow C++ 3.0.0 or earlier because we can't build Apache Arrow C++ 3.0.0 or earlier on Ubuntu 22.04. We can use pre-built binaries forApache Arrow C++ 3.0.0 or earlier on Ubuntu 20.04.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@kou
Copy link
Member Author

kou commented Feb 12, 2025

@github-actions crossbow submit -g r

Copy link

⚠️ GitHub issue #45505 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review Awaiting committer review label Feb 12, 2025
@@ -313,8 +313,8 @@ jobs:
# fedora-clang-devel cannot use binaries bc of libc++ (uncomment to see the error)
# - {image: "rhub/fedora-clang-devel", libarrow_binary: "TRUE"}
- {image: "rhub/ubuntu-release"} # currently ubuntu-22.04
- {image: "rocker/r-ver:4.0.0"} # ubuntu-20.04
- {image: "rstudio/r-base:4.1-focal"}
- {image: "rstudio/r-base:4.0-jammy"}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK?

I assume that R 4.0 (not Ubuntu 20.04) is important here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonkeane Do you have any opinion for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, you're absolutely right that 4.0 is important and not the version — that comment was mostly to call attention to the fact that rocker/r-ver would end up being ubuntu 20.04

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for confirming this!

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Feb 12, 2025
Copy link

Revision: be959e5

Submitted crossbow builds: ursacomputing/crossbow @ actions-b1cf7ac8c9

Task Status
r-binary-packages GitHub Actions
r-recheck-most GitHub Actions
test-r-arrow-backwards-compatibility GitHub Actions
test-r-clang-sanitizer GitHub Actions
test-r-depsource-bundled Azure
test-r-depsource-system GitHub Actions
test-r-dev-duckdb GitHub Actions
test-r-devdocs GitHub Actions
test-r-extra-packages GitHub Actions
test-r-gcc-11 GitHub Actions
test-r-gcc-12 GitHub Actions
test-r-install-local GitHub Actions
test-r-install-local-minsizerel GitHub Actions
test-r-linux-as-cran GitHub Actions
test-r-linux-rchk GitHub Actions
test-r-linux-valgrind GitHub Actions
test-r-macos-as-cran GitHub Actions
test-r-minimal-build Azure
test-r-offline-maximal GitHub Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-release-latest Azure
test-r-rocker-r-ver-latest Azure
test-r-rstudio-r-base-4.1-opensuse155 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 GitHub Actions
test-r-versions GitHub Actions
test-ubuntu-r-sanitizer GitHub Actions

@@ -89,7 +89,7 @@ jobs:
- { old_arrow_version: '1.0.1', r: '4.0' }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove 1.0.1 - 5.0.0?
It seems that we don't have Arrow C++ binaries for them on Ubuntu 22.04.

https://github.com/ursacomputing/crossbow/actions/runs/13279720095/job/37133898415#step:7:21

Failure ('test-read-files.R:25:3'): Can read the file (parquet)
`df <- read_parquet(pq_file)` threw an error.
Message: Cannot call io___MemoryMappedFile__Open(). See https://arrow.apache.org/docs/r/articles/install.html for help installing Arrow C++ libraries. 
Class:   simpleError/error/condition
Backtrace:
    ▆
 1. ├─testthat::expect_error(df <- read_parquet(pq_file), NA) at test-read-files.R:25:3
 2. │ └─testthat:::quasi_capture(...)
 3. │   ├─testthat (local) .capture(...)
 4. │   │ └─base::withCallingHandlers(...)
 5. │   └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
 6. └─arrow::read_parquet(pq_file)
 7.   └─arrow:::make_readable_file(file)
 8.     └─arrow::mmap_open(file)
 9.       └─arrow:::io___MemoryMappedFile__Open(path, mode)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm, we could, but would it be possible to install them from source instead? Or is that not possible at this point?

It would be great to be able to assert that we are still backwards compatible, though in practice going back this far is probably not super important

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. It seems that binary "arrow" package (not source "arrow" package) is used:

https://github.com/ursacomputing/crossbow/actions/runs/13279720095/job/37133898415#step:4:275

* installing *binary* package ‘arrow’ ...

Can we install Apache Arrow C++ from source when we use binary "arrow" package...?

@@ -89,7 +89,7 @@ jobs:
- { old_arrow_version: '1.0.1', r: '4.0' }
env:
ARROW_R_DEV: "TRUE"
RSPM: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
RSPM: "https://packagemanager.rstudio.com/cran/__linux__/jammy/latest"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we moved this later and did something like:

  - name: Set RSPM
        if: ${{ matrix.config.use_rspm  }}
        run: echo "RSPM =https://packagemanager.rstudio.com/cran/__linux__/jammy/latest" >> $GITHUB_ENV

And then set use_rspm in the matrix:

        - { old_arrow_version: '14.0.2.1', r: '4.3', use_rspm = true }
        - { old_arrow_version: '13.0.0.1', r: '4.3', use_rspm = true }
        - { old_arrow_version: '12.0.1.1', r: '4.3', use_rspm = true }
        - { old_arrow_version: '11.0.0.3', r: '4.2', use_rspm = true }
        - { old_arrow_version: '10.0.1', r: '4.2', use_rspm = true }
        - { old_arrow_version: '9.0.0', r: '4.2', use_rspm = true }
        - { old_arrow_version: '8.0.0', r: '4.2', use_rspm = true }
        - { old_arrow_version: '7.0.0', r: '4.1', use_rspm = true }
        - { old_arrow_version: '6.0.1', r: '4.1', use_rspm = true }
        - { old_arrow_version: '5.0.0', r: '4.1', use_rspm = false }
        - { old_arrow_version: '4.0.0', r: '4.0', use_rspm = false }
        - { old_arrow_version: '3.0.0', r: '4.0', use_rspm = false }
        - { old_arrow_version: '2.0.0', r: '4.0', use_rspm = false }
        - { old_arrow_version: '1.0.1', r: '4.0', use_rspm = false }

That should prevent setting RSPM and then it will use source installs when that's not set. This might not work / have other issues, but it's worth trying IMHO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Let's try it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, it may be better that we do this backward compatibility test in C++ not R...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be better that we do this backward compatibility test in C++ not R...

Yeah, totally. There are some R-specific pieces of metadata that it's nice to test, but if we did have c++ compatibility testing for the underlying parquet implementation, the decision to drop 1.0.1 - 5.0.0 testing for R does become less of a worry (yes, the R-specific metadata might not work, but that's also not so catastrophic)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I didn't know that these tests have some R-specific checks.

@kou
Copy link
Member Author

kou commented Feb 14, 2025

@github-actions crossbow submit test-r-arrow-backwards-compatibility

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Feb 14, 2025

This comment was marked as outdated.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Feb 14, 2025
@kou kou force-pushed the ci-r-ubuntu-20.04 branch from 8bdd4ef to 3636c5c Compare February 14, 2025 05:17
@kou
Copy link
Member Author

kou commented Feb 14, 2025

@github-actions crossbow submit test-r-arrow-backwards-compatibility

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Feb 14, 2025
Copy link

Revision: 3636c5c

Submitted crossbow builds: ursacomputing/crossbow @ actions-4f1a2ba4a1

Task Status
test-r-arrow-backwards-compatibility GitHub Actions

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Feb 14, 2025
@jonkeane
Copy link
Member

We got a few more versions with that! 1.0.1–3.0.0 look like they fail when trying to build some of our dependencies. That's not totally surprising, I guess. Arrow 3.0.0 was just over 4 years ago, so maybe we do drop those from the matrix. And especially if we had C++ tests for this kind of backwards compatibility like you mention @kou that would give us more confidence too.

One other alternative (though I think it's more work than it's worth right here / right now) would be to refactor this action so that the old version of arrow is run on an older OS that it is compatible with.

@kou
Copy link
Member Author

kou commented Feb 16, 2025

We can use Ubuntu 20.04 for 1.0.0-3.0.0 but we need to remove them eventually because Ubuntu 20.04 will reach EOL.

Should we keep 1.0.0-3.0.0 while we don't get any error with Ubuntu 20.04? Or can we drop 1.0.0-3.0.0 in this PR?

@jonkeane
Copy link
Member

Should we keep 1.0.0-3.0.0 while we don't get any error with Ubuntu 20.04? Or can we drop 1.0.0-3.0.0 in this PR?

Let's keep them for now (with 20.04, and we might / will eventually get an error) and I can look into what we want to do going forward

@kou kou force-pushed the ci-r-ubuntu-20.04 branch from 3636c5c to eb57ce6 Compare February 18, 2025 00:03
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Feb 18, 2025
@kou
Copy link
Member Author

kou commented Feb 18, 2025

@github-actions crossbow submit test-r-arrow-backwards-compatibility

Copy link

Revision: 83c378f

Submitted crossbow builds: ursacomputing/crossbow @ actions-288e5d5ed0

Task Status
test-r-arrow-backwards-compatibility GitHub Actions

@kou
Copy link
Member Author

kou commented Feb 18, 2025

Let's keep them for now (with 20.04, and we might / will eventually get an error) and I can look into what we want to do going forward

OK! Done!

@kou kou changed the title GH-45505: [CI][R] Use Ubuntu 22.04 instead of 20.04 for nightly jobs GH-45505: [CI][R] Use Ubuntu 22.04 instead of 20.04 as much as possible for nightly jobs Feb 18, 2025
Copy link
Member

@jonkeane jonkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the work on this ❤️

Comment on lines +69 to +75
# condition && true-case || false-case
# ==
# condition ? true-case : false-case
#
# We use Ubuntu 20.04 for 3.0.0 or earlier because 3.0.0 or
# earlier can't be built on Ubuntu 22.04. We will drop 3.0.0 or
# earlier when Ubuntu 20.04 is unavailable on GitHub Actions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comment + the helpful hint about how these work — I always have to go and look these up when I edit them.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Feb 18, 2025
@kou kou merged commit 15f1d94 into apache:main Feb 19, 2025
10 checks passed
@kou kou deleted the ci-r-ubuntu-20.04 branch February 19, 2025 01:56
@kou kou removed the awaiting merge Awaiting merge label Feb 19, 2025
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 15f1d94.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

kou added a commit to kou/arrow that referenced this pull request Feb 25, 2025
…possible for nightly jobs (apache#45507)

### Rationale for this change

Ubuntu 20.04 will reach EOL on 2025-05.

### What changes are included in this PR?

* Use Ubuntu 22.04 instead of Ubuntu 20.04 for Apache Arrow C++ 4.0.0 or later.
* Keep using Ubuntu 20.04 for Apache Arrow C++ 3.0.0 or earlier because we can't build Apache Arrow C++ 3.0.0 or earlier on Ubuntu 22.04. We can use pre-built binaries forApache Arrow C++ 3.0.0 or earlier on Ubuntu 20.04.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#45505

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@assignUser
Copy link
Member

@jonkeane fyi they moved total deprecation of 20.04 up a month to beginning of april

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants