Switch to cross-compiled docker containers & container build #1571
/lgtm
/approve
/lgtm
Looking at the failure, it looks like an E2E test failed because of insufficient resources on the cluster. Could someone with permission run it again?
Let me restart it now.
Don't worry. We will restart CI as soon as we see a failure. GitHub Actions has not been very stable, which has bothered us for some time.
/approve
Looks like it's still having some E2E failures. I can dig into those in more detail, but if this is a known issue with GHA E2E tests being flaky, I'll leave it to y'all to restart the tests as needed :) Thanks :)
Seems GitHub Actions is unstable :(
/lgtm
CI has been fixed, and commits can be rebased and pushed again. Refer to #1595.
By the way, how about the current experience of using
Hi Team, this PR has been awaiting review for a long time. May I know if anything is stopping it from merging?
@@ -12,8 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

FROM golang:1.16.5 AS builder
There are three places where the golang image is used. If I want to update the version of Go that compiles Volcano, I need to update all three. Is there any room for optimization here?
@@ -12,8 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.


FROM golang:1.16.5 AS builder
The golang image is also used here.
@@ -15,8 +15,13 @@

# The base image is created via `Dockerfile.base`, the base image is cached
# since the required packages change very rarely.
FROM golang:1.16 AS builder
The golang image also used here
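One hedged way to catch drift between the three Dockerfiles (for instance `golang:1.16.5` in two of them and `golang:1.16` in the third, as in the hunks above) is a small check script. This is only a sketch: the function names `golang_tags` and `check_single_golang_tag` are hypothetical and are not part of the Volcano repository.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Volcano): detect golang base-image drift.

# Print the distinct golang image tags found in the FROM lines
# of the Dockerfiles passed as arguments.
golang_tags() {
    sed -n 's/^FROM golang:\([^[:space:]]*\).*/\1/p' "$@" | sort -u
}

# Succeed only when all given Dockerfiles agree on exactly one golang tag.
check_single_golang_tag() {
    count=$(golang_tags "$@" | wc -l | tr -d ' ')
    [ "$count" -eq 1 ]
}
```

Assuming the Dockerfiles live under `installer/dockerfile/<name>/Dockerfile`, as the Makefile snippet in this thread suggests, a CI step could run `check_single_golang_tag installer/dockerfile/*/Dockerfile` and fail when the tags differ.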
Sorry for not responding. The only problem is that this PR cannot pass the CI. We are all OK with the modifications. Is there any update on a fix? @odidev
@holdenk, I have checked the changes in this PR, and I am confused about some of them. I can see that you have replaced
With regard to the test failures in the CI, these E2E tests are failing to locate Kubernetes pods, as can be seen here: https://github.com/volcano-sh/volcano/runs/4137023885?check_suite_focus=true#step:7:348 . I am not sure why that happened, but the latest source code passes those tests, so a rebase may help. However, if multi-arch Docker images are not released by the CI with these changes, then the basic motive of this PR is not achieved. Kindly share your suggestions. Please correct me if my understanding is wrong.
volcano/.github/workflows/release.yaml Line 44 in eaf1f59
Looks like we should also add the buildx and qemu deps for GitHub Actions?
@odidev As in my last comment, what we need to do is just add a
@Yikun, yes, you are correct. We need to install qemu before using buildx. But my concern is that docker buildx will build images only for amd64, since "DOCKER_PLATFORMS" is initialized with $(shell uname -m), which will read Linux/amd64 on GitHub Actions CI. To build and release multi-arch Docker images for both AMD64 and ARM64, we will have to pass DOCKER_PLATFORMS="linux/amd64,linux/arm64" with the docker buildx command.
@odidev Ah, I get your idea now. But I think it's probably OK.
By default, the platform is set to the host arch, which can reduce build time to some degree. If the user specifies a platform, we defer to the user's choice. We can get this merged first, then add a follow-up PR to set up qemu/buildx in CI and enable it there.
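The default-versus-override behaviour described here can be sketched in shell. This is an illustration under assumptions, not the PR's actual Makefile logic: `host_platform` and `resolve_platforms` are hypothetical names, and the mapping from `uname -m` output to a platform string is assumed.

```shell
#!/bin/sh
# Hypothetical helpers (not from the PR): choose the platform list for buildx.

# Map a machine name (default: `uname -m`) to a Docker platform string.
host_platform() {
    machine="${1:-$(uname -m)}"
    case "$machine" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)             echo "linux/$machine" ;;
    esac
}

# $1: user-supplied platform list (may be empty).
# $2: optional machine name, used instead of `uname -m` (handy for testing).
# A non-empty user choice wins; otherwise fall back to the host platform.
resolve_platforms() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        host_platform "$2"
    fi
}
```

With this shape, a local developer who sets nothing builds only for the host architecture, while a release job can pass `linux/amd64,linux/arm64` explicitly.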
@Yikun, yes, correct. I can see that the volcanosh Docker Hub repo contains separate images for ARM64 that were pushed 7 days ago. GitHub Actions only supports native Linux/AMD64 builds. So, may I know whether you are releasing the Linux/ARM64 Docker images manually? Also, are there any plans to release multi-arch Docker images (AMD64 and ARM64)?
@william-wang helped to update latest images ( @odidev I hope we can deliver this soon, maybe v1.5.1. Could you also validate this if you have time, thanks! Also cc @martin-g
No problem. The images will be updated automatically once the PR is merged.
I've just tested the PR locally! I've made the following minor change locally because I was running out of disk space due to many new Docker images being downloaded:

diff --git Makefile Makefile
index 5ec17dab..082e0a17 100644
--- Makefile
+++ Makefile
@@ -93,7 +93,7 @@ image_bins: init
fi;
images:
- for name in controller-manager scheduler webhook-manager; do\
+ for name in controller-manager ; do\
docker buildx build -t "${IMAGE_PREFIX}-$$name:$(TAG)" . -f ./installer/dockerfile/$$name/Dockerfile --output=type="${BUILDX_OUTPUT_TYPE}" --platform "${DOCKER_PLATFORMS}"; \
done

The following has been executed on my amd64 laptop:

$ make images DOCKER_PLATFORMS="linux/amd64,linux/arm64" BUILDX_OUTPUT_TYPE=registry IMAGE_PREFIX=ghcr.io/martin-g/volcano

and it created https://github.com/users/martin-g/packages/container/package/volcano-controller-manager

And on my Linux ARM64 machine:

$ make images DOCKER_PLATFORMS="linux/amd64,linux/arm64" BUILDX_OUTPUT_TYPE=registry IMAGE_PREFIX=ghcr.io/martin-g/volcano-from-aarch64

created https://github.com/users/martin-g/packages/container/package/volcano-from-aarch64-controller-manager

Now I will update and test the release.yaml GitHub workflow!
Related-to: volcano-sh#1571 Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
https://github.com/martin-g?tab=packages - the ones with names starting with "volcano-from-gha**" are created by https://github.com/martin-g/volcano/pull/1/files#diff-e426ed45842837026e10e66af23d9c7077e89eacbe6958ce7cb991130ad05adaR1
Related-to: volcano-sh#1571 Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
What could be the reason for this warning:
I think the vcctl e2e tests fail because of it:
@holdenk Any progress?
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This allows one container name to support multiple archs. Once the images are pushed, we can remove more of the arm64-specific installation files, since the same container names will support both archs.
Docker does "the right thing" and pulls the container associated with the arch.
You can take a look at the containers I built with this change in my own Docker Hub at https://hub.docker.com/repository/docker/holdenk/volcanosh-scheduler , https://hub.docker.com/repository/docker/holdenk/volcanosh-controller-manager , https://hub.docker.com/repository/docker/holdenk/volcanosh-webhook-manager-base , etc.
This is in response to #1570 (although it could also solve #1568 since we wouldn't need volcano-development-arm64.yaml anymore).
To preserve backwards compatibility for users who might be developing locally in single-arch mode, I have made that the default. If there are release docs I should update as well, let me know.
Signed-off-by: Holden Karau holden@pigscanfly.ca