-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ppc64le helm image build #1533
ppc64le helm image build #1533
Conversation
Hi @clnperez. Thanks for your PR. I'm waiting for a operator-framework or openshift member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I tested this as much as possible using our internal Travis, but, we'll see what happens when it runs here. |
Oh, I've also tested out using the cli and the helm image to create and deploy an operator on Power (ppc64le) using https://github.com/IBM/charts/tree/master/stable/ibm-mqadvanced-server-dev. |
Cool! A couple of quick comments.
@AlexNPavel Any consequences this would have on the transition to prow? |
@joelanford If we use go cross-compilation, we should still be fine migrating to prow since we're not testing the ppc64 images, only building them. |
/ok-to-test |
@AlexNPavel What about the need for |
The UBI images are multi-arch (actually manifest lists) already. We could cross-compile instead of using the power nodes built into Travis, but then we'd need to know the hash of the ppc64le UBI image (unless they're pushed with an arch-name...which would be nice for this case, and I'm not sure whether they are or who to ask). |
To answer @joelanford 's other questions:
You can cross compile, but some of this, IIRC, does some things in docker (like setting permission bits of the cli binary, etc.) that will blow up if you try to pull down a power image and then do anything other than just COPY into it.
That's the manifest list part that thankfully the redhat registry supports and the creators of that image went ahead and configured.
I had that thought as well, but, I was thinking one at a time and also to see how this one goes. :) And if you want a ppc64le build node for prow, we can get you a free one hosted at OSU since this is an open-source project. |
@joelanford That's good point. We might need to stick to travis for the |
Has any more thought been put into whether merging this as-is is okay? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the typical build/push workflow for multi-arch images? Is it to build and push all of the images then do a single manifest push? Is it possible to incrementally build the manifest as the individual images are being pushed?
Makefile
Outdated
./hack/image/build-helm-image.sh $(HELM_BASE_IMAGE):dev | ||
image/build/helm: build/operator-sdk-dev-${BIN_ARCH}-linux-gnu | ||
./hack/image/build-helm-image.sh ${GOARCH} ${BIN_ARCH} $(HELM_BASE_IMAGE):dev | ||
./hack/image/build-helm-image.sh ${GOARCH} ${BIN_ARCH} $(HELM_ARCH_IMAGE):dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment in build-helm-image.sh
. I could be forgetting something, but I don't think we need to build the image twice. Can we leave these lines unchanged and rely on the fact that our build host handles the architecture specific things?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was me being lazy and not wanting to redo the script. The image doesn't build twice b/c of docker's caching. It just gets you both names. the arch image is needed for pushing, but the base one is needed for testing locally. I can find a non-lazy solution to that if you'd prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I could just put in a docker tag
there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already do docker tag
in the push script to change the image tag from :dev
to whatever tags we want to push. I'm wondering if we can update the push script to also include the architecture?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i tackled that. will see what happens in travis land
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i made a dryrun test function and this still needs more work so hold off on reviewing for the time being
That's how I set this up.
Nope. The manifest list references image layers in a registry so you have to do the build and push of all the images, and then do the manifest create (to get those layers) and push (to put the list on the registry too). |
f598f32
to
27fe8fb
Compare
@clnperez Sorry, we've been pretty busy over the past week. Last time I tried, I was still having issues with the quay manifest list support, and I didn't have time to dig into it further. I should have more time this week to pick this back up. |
@clnperez I was able to manually run the make targets locally and get things pushed to quay and then confirm that the manifest and manifests lists made it there correctly with However, when trying to push the manifest list from travis, I'm still getting a "No such manifest" error. I tried adding a Have you had success with travis pushing the manifest list to quay? One difference I see between travis and my local environment is that I'm on docker 19.03.1-ce and travis is on 18.06.0-ce. |
@joelanford it looks like your Travis local docker doesn't have the quay certs: https://travis-ci.org/joelanford/operator-sdk/jobs/581841450#L365 -- Here's where I fixed that issue (only seen, IIRC, with rhel/centos distros): You can try adding the BTW, you don't need to pull the images down. The manifest info is always pulled from the registry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip about that permissions error! I had been ignoring it since all of the other push/pull commands seemed to be working. Looks like we need to chmod /etc/docker
.
@joelanford it looks like the failing travis job was cancelled but i can't see why or by whom. any ideas? can we kick that off again? |
Not sure why it was cancelled. Just restarted it. I've also got a separate build on my fork that enables the deploy tasks, so I'm babysitting that one too :) https://travis-ci.org/joelanford/operator-sdk/builds/583868166 |
thanks @joelanford. looks like all the tests here passed 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hack/image/push-manifest-list.sh
Outdated
push_image=$1; shift || fatal "${FUNCNAME} usage error" | ||
arches=$@ | ||
|
||
image_name=$push_image # @TODO bug workaround |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove this line that sets image_name
if we add the line in docker_login
that I mentioned in this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, yah that's better. i misread your comment then. thanks. fixed.
hack/image/push-image-tags.sh
Outdated
docker_login | ||
check_can_push || return 0 | ||
|
||
image_name=$push_image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto with this line.
hack/image/push-image-tags.sh
Outdated
image_name=$push_image | ||
docker_login $image_name | ||
|
||
check_can_push |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we no longer return 0
if check_can_push
fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. fixed.
hack/image/push-image-tags.sh
Outdated
|
||
check_can_push | ||
|
||
docker tag $source_image $push_image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this docker tag
necessary? I think we're doing this in line 31, which also includes the correct image tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i played around with it, trying to remember how it had gotten there. pretty sure it's not, so im taking it out
hack/image/push-manifest-list.sh
Outdated
image_name=$push_image # @TODO bug workaround | ||
docker_login $push_image | ||
|
||
check_can_push || exit 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this return
like we do in other places, or do we need to exit
here instead?
check_can_push || exit 0 | |
check_can_push || return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yah, changed that to return
.travis.yml
Outdated
@@ -68,6 +69,24 @@ x_base_steps: | |||
services: | |||
- docker | |||
|
|||
# Manifest list deploy job | |||
- &manifest-deploy | |||
stage: deploy-manifest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (and I'm not positive this works): Can this be the following:
stage: deploy-manifest | |
stage: "Deploy multi-arch manifest lists" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's fine. b/c you can create a manifest list with x86_64 windows and x86_64 linux images :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, did you mean you're not sure if it works functionally b/c there are spaces @joelanford? that i don't know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that does indeed work just fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Great! Let's make this change then.
Verified it created a ppc binary
|
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
|
the sdk cli the helm operator image and create a "multi-arch" image for helm Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>
@joelanford all those changes made and pushed after a little more verification this morning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for all your reviews and attention to detail @joelanford! |
|
||
release: clean $(release_x86_64) $(release_x86_64:=.asc) | ||
release: clean $(release_builds) $(release_builds:=.asc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're supporting ppc64le release binaries now then we should update the release doc to mention that.
https://github.com/operator-framework/operator-sdk/blob/master/doc/dev/release.md#operating-systems-and-architectures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok for a followon pr so it dosn't invalidate this lgtm again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's go ahead and fix it here. I can babysit the builds if necessary :)
EDIT: on second thought, wordsmithing that change might involve some back and forth, so maybe it is best left for a follow-up.
@hasbro17 Are you okay with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep we can do a follow up. Just a nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 nit but LGTM overall.
build for ppc64le:
and create a "multi-arch" image for helm
Signed-off-by: Christy Norman christy@linux.vnet.ibm.com
Description of the change:
Motivation for the change: