Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ats-ci.yml to build images for arm64 #201

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

rfiorella
Copy link

Updates the ats-ci.yml GitHub action to build images for both amd64 and arm64 using buildx. It is built on the TPLs image, which is currently only amd64 and will need to be updated as well.

Currently only runs the tests on the amd64 image. I think the CI action could be modified to run on both architectures - let me know if this would be preferred and I'll update the PR.

One of the regression tests fails on a local test of the arm64 image, but this was due to the timeout threshold being too low. I suspect that this issue will go away when the TPL image is also built for arm64.

@ecoon
Copy link
Collaborator

ecoon commented Jul 18, 2023

That sounds great. I don't have an opinion either way on whether tests should run on both. If they're slow, let's just run amd64.

@jd-moulton
Copy link
Member

The tests don't take that long to run (usually), particularly compared to the build itself. So I think the goal should be to run the tests on both architectures.

@ecoon
Copy link
Collaborator

ecoon commented Jul 18, 2023

Yeah, that's a fair point. What is the plan for TPLs then? Is that getting updated in Amanzi, and this PR should wait until that one is done? Or should we accept this one, make a ticket, and then do a second PR once @rfiorella is ready to turn on arm tests?

@rfiorella
Copy link
Author

We're working on updating the TPL images (ticket amanzi/amanzi#761).

I should be able to add arm64 to this PR pretty quickly. I think the odds of generating merge conflicts here is fairly minor since only one file is changing - I'm happy to do this as one PR or two, though slight preference from me for one as there won't be much performance benefit to using the arm64 image until the TPLs image is updated (probably should have waited a bit to open the PR!)

Adds arm64 tests to CI workflow. Runs through the emulator,
so may be slower than the amd64 image. Will keep looking for
a way to run native arm64, but I think all GitHub runners are
still x86_64.
@rfiorella
Copy link
Author

@jd-moulton @ecoon An interesting wrinkle here with running the arm64 tests: all of the github actions task runners are x86_64, so it doesn't seem to be possible to run the arm64 container tests on an arm64 machine without using an external service (e.g., Azure) or a self-hosted runner.

Last commit (d2d2a58) added a step running the tests on arm64 through emulation. All tests passed here, but I had a failure on an M1-Ultra where a test just didn't run fast enough (it will pass if given more time) that oddly didn't happen through the emulator on GitHub actions.

I'll trigger the workflow again when the TPLs library is updated.

@rfiorella rfiorella marked this pull request as draft July 25, 2023 15:23
@rfiorella
Copy link
Author

Converted PR to draft as compiling TPLs and ATS is unworkably slow. Looking into a few alternatives (primarily Azure pipelines and/or cross-compiling).

@ecoon
Copy link
Collaborator

ecoon commented Jan 19, 2024

@rfiorella Where are we at with this? I've noticed that our current images are not debuggable -- if you install gdb into the container after a CI has failed, I get a bunch of errors about ptrace that suggest that the problem might be fixed by building an arm64 container, see docker/for-mac#5191 , so I'm interested in checking that if this is ready to go.

@rfiorella
Copy link
Author

@ecoon I do remember having some CI failures on arm64/aarch64 images but not the the x86 ones, maybe this is the same issue. We can build arm64/aarch64 images for amanzi and ats fine, I think I've manually added a few versions here and there (e.g., for the IDEAS all hands), but it didn't work on CI as compiling through the QEMU emulator was fatally slow (like 10x longer).

Assuming GitHub doesn't roll out arm64 runners and make this a lot easier, @jd-moulton and I had a few discussions about either using cross-compilers (which we could include in the tpls image, but would have to update all the cmake workflows to pick up these compilers in a cross compiling case) or using a self-hosted arm64 runner somewhere. The latter has some issues that maybe we should talk about in a different venue, but might be a possibility of cross-compiling doesn't work

I had tried the cross-compiler route a few months ago but was pulled away by other priorities before I could fix the cmake part of it and it seemed like lower priority when we could just have occasional releases - obviously though if we want to include this in CI we'll have to revisit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants