-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
single dockerfile and single step in the docker building CI #863
single dockerfile and single step in the docker building CI #863
Conversation
Action is running on my branch with 24 jobs in parallel |
All jobs on my fork CI passed, it took 3 hours 28mins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for getting this started @shimwell.
I think it went a little to far/too fast and has lost the narrative of what we want to do:
- build docker images to be used by CI later on that only have appropriate dependencies - up to moab (this part can proceed in parallel)
- push those images to the registry to be used later using a temporary tag
- build and test DAGMC using those images as the container (this part can proceed in parallel)
- rename the docker images in the registry to a more stable tag
BONUS: figure out how to populate a cache for the first docker build using the existing docker images available in the registry. (separate PR?)
.github/workflows/docker_publish.yml
Outdated
- name: Set up QEMU | ||
uses: docker/setup-qemu-action@v1 | ||
uses: docker/setup-qemu-action@v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need QEMU
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QEMU has been removed from the yml file so I think this is resolved
.github/workflows/docker_publish.yml
Outdated
with: | ||
file: CI/Dockerfile | ||
target: external_deps | ||
target: dagmc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recall that this should only build the images to be used in CI for building/testing DAGMC.
We should only build to moab
, and then test using this image for building DAGMC as we will later be doing in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server-stage of multi stage build is set to moab so I think this is now resolved
.github/workflows/docker_publish.yml
Outdated
uses: actions/checkout@v3 | ||
MOAB=${{ matrix.moab_versions }} | ||
build_mw_reg_tests=ON | ||
tags: ghcr.io/${{ github.repository_owner }}/dagmc-ci-ubuntu-${{ matrix.ubuntu_versions }}-${{ matrix.compiler}}-ext-hdf5_${{ matrix.hdf5_versions}}-moab_${{ matrix.moab_versions }}-dagmc:ci_testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is push: false
then tags don't matter, but I think we should push.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pushing now taken care of by the multistage docker build action so I think this is resolved
.github/workflows/docker_publish.yml
Outdated
- name: Set up QEMU | ||
uses: docker/setup-qemu-action@v1 | ||
if: ${{ github.repository_owner == 'svalinn' }} | ||
uses: docker/setup-qemu-action@v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also don't need QEMU here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QEMU has been removed from the yml file so I think this is resolved
.github/workflows/docker_publish.yml
Outdated
|
||
build-housekeeping-img: | ||
needs: build-base-img | ||
build-test-dagmc-img: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename based on comments below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has now been renamed to build-dependency-img
so I think this is resolved
Had a go at all those comments and pushed, CI is running on my fork https://github.com/shimwell/DAGMC/actions/runs/4325712183 |
CI on fork passed https://github.com/shimwell/DAGMC/actions/runs/4325712183 2nd stage only took a few seconds so the local stage cache appears to have worked. This used the local stage cache instead of the container repo. But the layer has the same tag locally and I'm the container repo. Perhaps this is not what we want |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely looks simpler and probably easier to maintain over the long term. I think there is still a discussion point about how to use the docker image cache.
.github/workflows/docker_publish.yml
Outdated
OFF, | ||
ON, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't appear to be passing this into the docker build or the tag naming and therefore have duplicate images being built and tagged. We should discuss whether we really need the statically linked version. Not sure I remember who requires/desires that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_exe has been removed from the matrix so I think this is resolved
.github/workflows/docker_publish.yml
Outdated
- name: Set up QEMU | ||
uses: docker/setup-qemu-action@v1 | ||
build_mw_reg_tests=ON | ||
dependency_image_location=ghcr.io/${{ github.repository_owner }}/dagmc-ci-ubuntu-${{ matrix.ubuntu_versions }}-${{ matrix.compiler}}-ext-hdf5_${{ matrix.hdf5_versions}}-moab_${{ matrix.moab_versions }}:ci_testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to see this working, but wonder if we are better off with a different strategy that uses the cache instead? something to discuss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dependency_image_location has now been removed and the PR has moved to a multistage build so I think this is resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple of additional comments
.github/workflows/docker_publish.yml
Outdated
@@ -141,14 +63,17 @@ jobs: | |||
UBUNTU_VERSION=${{ matrix.ubuntu_versions }} | |||
COMPILER=${{ matrix.compiler }} | |||
HDF5=${{ matrix.hdf5_versions }} | |||
tags: ghcr.io/${{ github.repository_owner }}/dagmc-ci-ubuntu-${{ matrix.ubuntu_versions }}-${{ matrix.compiler}}-ext-hdf5_${{ matrix.hdf5_versions}}:ci_testing | |||
MOAB=${{ matrix.moab_versions }} | |||
build_mw_reg_tests=ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want this off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this build arg has been removed from the yml and dockerfile so I think this is now resolved
.github/workflows/docker_publish.yml
Outdated
steps: | ||
- name: Set up QEMU | ||
uses: docker/setup-qemu-action@v1 | ||
build_mw_reg_tests=ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want this off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this build arg has been removed from the yml and dockerfile so I think this is now resolved
.github/workflows/docker_publish.yml
Outdated
with: | ||
file: CI/Dockerfile | ||
target: moab | ||
target: dagmc_test | ||
context: . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we may not need this line in this case. The action documentation indicates that the right version of the repository will be checked out in the Docker image by default. This isn't true in the case of our dependencies though, so I can see why we'd need it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line has been removed from the yml file so I think this one is resolved
I just opened a PR to the branch this PR is on to replace the docker build actions with the multistage build action. |
Use multistage docker build action
@shimwell - this appears to need a rebase... |
This also failed here (https://github.com/shimwell/DAGMC/actions/runs/4752350398/jobs/8442595961) - apparently it failed to push to GHCR? and it was successful for @bquan0 here (https://github.com/bquan0/DAGMC/actions/runs/4749240986/jobs/8436860572) but was based on cached version?? |
I ran into that problem a few times on my workflow runs too and I mentioned it in the PR. I solved it by going to the settings of the package it was trying to push to, then checking the DAGMC repo under the "Manage Actions access" section. |
Another PR for some final cleanup |
Github actions for this PR are successful |
Two small changes to clean things up a little further
Thanks Paul, I've merged that in. |
Confirmation of success here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is in good shape and is a useful model for many projects going forward. I'd love another review since I had a hand in authoring this in the end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small line comments from me, but this looks really nice! I was expecting a much larger changeset to accomplish this.
Q: Which of the stages gets uploaded in the end?
.github/workflows/docker_publish.yml
Outdated
src: ghcr.io/${{ github.repository_owner }}/dagmc-ci-ubuntu-${{ matrix.ubuntu_versions }}-${{ matrix.compiler}}-ext-hdf5_${{ matrix.hdf5_versions}}-moab_${{ matrix.moab_versions }}/dagmc:refs_heads_${{ github.ref_name }}-bk0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the bk0
suffix specific to something or arbitrary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its generated by the multistage-build-action - not sure why the author uses that string.
Co-authored-by: Patrick Shriwise <pshriwise@gmail.com>
Every stage that is explicitly referenced in a
They each get pushed with the custom tag and then we convert push the |
Co-authored-by: Paul Wilson <paul.wilson@wisc.edu>
I'd love to approve and merge this, but there a runner isn't picking up the Mac testing job unfortunately. |
Saw that... |
Github has turned off the MacOS 10.15 runners, so I made this PR to move us forward. |
update MacOS runner
This is still passing here: https://github.com/shimwell/DAGMC/actions/runs/4904019862 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the teamwork @shimwell @bquan0 & @pshriwise ! |
Delighted this one got in, a real team effort. Nice work all. What shall we do next 😁 |
posting error message here so we don't lose it when the CI gets old
|
|
Yep - working on that, too |
Description
This PR is an alternative to #822
In addition to moving the scripts into a single dockerfile this PR also changes the CI for publishing dockerfiles.
Currently we had a branching and caching approach to the CI dockerfile building that is efficient but complex.
This approach reduces the complexity of the CI and allows all the perturbations to run in parallel.
This might actually end up being quicker to build and quicker to fail when something is broken as the CI doesn't have breaks waiting for all the docker stages to reach the same stage to proceed to the next stage. Also if caching can be figured out then this workflow could make use of pre-built images to be very fast
Motivation and Context
simpler is nice for maintainers 😄
Changes
refactored CI
Behavior
single dockerfile instead of scripts and simpler CI