Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump GitHub Actions #266

Merged
merged 4 commits into from
Jun 11, 2024
Merged

Bump GitHub Actions #266

merged 4 commits into from
Jun 11, 2024

Conversation

mjameswh
Copy link
Contributor

@mjameswh mjameswh commented Jun 5, 2024

What was changed

  • Updated all GHA actions to latest.

Why?

  • GitHub is pulling the plug on download-artifact@v2 and upload-artifact@v2 on June 30th;
  • actions-rs/toolchain@v1 is no longer maintained;
  • All GHA official actions based on Node 16 are deprecated.

# as Node 20 won't run on the docker image we use for linux builds
# (Node 20 require glibc 2.28+, but container image has glibc 2.17).
# https://github.blog/changelog/2024-05-17-updated-dates-for-actions-runner-using-node20-instead-of-node16-by-default/
ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which action still needs node 16 here? I wonder if it'd be best to comment above those too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are FIXME comments on each actions in that job that are postponed from updating to Node 20.

      - name: Upload bridge library
        # FIXME: v4+ requires Node 20
        uses: actions/upload-artifact@v3
        with:
          name: ${{ matrix.out-prefix }}-bridge

The reason I made these FIXMEs is that this is only a temporary measure. GHA will completely pull the plug on Node 16 (no longer "soft deprecated") in a few months (exact date to be announced early October). By then, we'll have to either drop support for glibc releases older than 2.28 for all our SDKs, or figure out another way to work around this.

Copy link
Member

@cretz cretz Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question is why not update to node 20 now and update those action versions now? Which steps do not have a node-20 equivalent that is keeping us from upgrading? If it's just a glibc concern, don't worry there. We build our Linux binaries in a certain container w/ older glibc (or is this the problem?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's just a glibc concern, don't worry there. We build our Linux binaries in a certain container w/ older glibc (or is this the problem?).

Ah… Yes, that's specifically the problem. Most GHA Actions are written in JS. When running a job in a container, actions also run in that container. At this time, GHA maps a volume in the container containing Node 16 and Node 20, all the actions git repos (I mean all actions you use in your workflow), and a few other things, hence that works even if your container image doesn't have Node 16/20. But starting on June 30th, they will no longer include the Node 16 binary on the linked volume, unless you set that environment variable.

So the problem is that the Node 20 binary they bring into your container itself requires glibc 2.28. That's a questionable decision of their part, given that they could have use a Node 20 statically linked against musl, but they didn't, and FWIW, let's just assume they had good reasons for doing it the way they did.

Anyway, that's where we get these conflicting requirements. On one side, we want a runner or container image that has glibc 2.17, so that are builds artifacts are compatible with older distros, but on the other side, it must have at least glibc 2.28 to be able to run Node 20, which is required for most GHA actions…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so that's why you made sure to only do this in the package workflow and not the CI one. I wonder how long ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION will work for, heh. I also think they "good reasons" they have are probably hinted in the env var name, they don't consider older glibc secure.

In the future we can probably look to isolate just the building part to the container instead of the whole workflow (we lose Rust caching, but this is a packaging CI that probably doesn't need it much and I should probably turn it off on push to main and make it manual dispatch only). This looks good for now, approving/merging...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A possible alternative would be to use an up-to-date linux runner and no container at the job level, but use a run step to execute the actual build step (and only that step) inside a docker container (i.e. we'd have a step like - run: docker run .... build-native-code-for-linux, or wrap that in a shell script).

That would involve a distinct build path for the Linux production-grade native build artifacts vs other platforms. Can't use GHA actions to install required tools in that case. Caching actions would still be usable, but may require some special handling (they would run outside of the container, so the cached directory might need to be volume-mapped inside the container if they are not already contained in the working directory).

That's essentially what happens for the Python build. I'm not familiar with the details of Python packaging, but essentially the native build is done against the manylinux image, but it is some python-specific build tool that starts the docker container, not the GHA workflow file (i.e. you don't see a container: 'manylinux-2014' directive in the workflow file). But then, that build tool is also responsible for the build process on other platforms, so you don't get the downside of having to support two distinct build processes.

Really, it shouldn't be difficult to create that setup. But that would add extra maintenance that isn't required at this point (probably some custom docker images to maintain, e.g. based on manylinux, but install extra deps). And by the time that GHA actually pull the plug on Node 16 (they will announce the exact date in early October, which would presumably be a few months after that), it's very possible that the "glibc 2.28 minimal requirement" might feel more reasonable (AFAIK, all major distros releases that still predates glibc 2.28 will have reached TLS EOL). If it turns to be so, then we won't need to make things more complicated than they need to be. Otherwise, we'll bite the bullet and implement that alternative build process.

@cretz
Copy link
Member

cretz commented Jun 11, 2024

@mjameswh - I changed the failing/flaky test as part of #270 which is now merged. May want to merge into here and see if it fixes issue.

@cretz
Copy link
Member

cretz commented Jun 11, 2024

LGTM merge at will

@mjameswh mjameswh merged commit af5e004 into main Jun 11, 2024
7 checks passed
@mjameswh mjameswh deleted the bump-gha-artifact-v2 branch June 11, 2024 15:49
@mjameswh mjameswh mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants