Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancements for the exporting output image digest #939

Merged
merged 2 commits into from
Jul 1, 2019

Conversation

nader-ziada
Copy link
Member

@nader-ziada nader-ziada commented May 31, 2019

Changes

  • use pod termination message to export the image digest instead of logs
  • add more logs when index.json is not found
  • create a container to create the default folder for image output digest dir

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

See the contribution guide
for more details.

Release Notes

release-note

/cc @bobcatfish

@googlebot googlebot added the cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit label May 31, 2019
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 31, 2019
@nader-ziada
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@ghost
Copy link

ghost commented May 31, 2019

It looks like both this PR and #936 are suffering the same integration test problem. I don't fully understand what is going wrong.

@bobcatfish
Copy link
Collaborator

It looks like both this PR and #936 are suffering the same integration test problem. I don't fully understand what is going wrong.

I think it's tektoncd/plumbing#29 😅

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, only one question.
Also, I wonder if we should add more and more initContainers or if we should have a generic init container that can do all those initialization if needed (thinking outloud, might be dumb to do)

@@ -65,5 +65,18 @@ func main() {
if err != nil {
log.Fatalf("Unexpected error converting images to json %v: %v", output, err)
}
fmt.Println(string(imagesJSON))
log.Printf("Image digest exporter output: %s ", string(imagesJSON))
f, err := os.OpenFile("/workspace/builder/termination-log", os.O_WRONLY|os.O_CREATE, 0666)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be /builder/… here ?
I am thinking of the case where a resource would be cloned in /workspace/builder (a git resource with a target path)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, makes sense, should not use /workspace, will fix that

@nader-ziada
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

@dlorenc
Copy link
Contributor

dlorenc commented Jun 4, 2019

Looks like a real failure now:

I0604 02:05:08.877] {"level":"warn","logger":"controller.taskrun-controller","caller":"taskrun/taskrun.go:204","msg":"Failed to update TaskRun labels/annotations{error 25 0  Operation cannot be fulfilled on taskruns.tekton.dev \"run-giraffe\": the object has been modified; please apply your changes to the latest version and try again}","knative.dev/controller":"taskrun-controller"}
I0604 02:05:08.878] {"level":"error","logger":"controller.taskrun-controller","caller":"controller/controller.go:294","msg":"Reconcile error","knative.dev/controller":"taskrun-controller","error":"Operation cannot be fulfilled on taskruns.tekton.dev \"run-giraffe\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).handleErr\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:294\ngithub.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).processNextWorkItem\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:280\ngithub.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).Run.func1\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:231"}
I0604 02:05:08.878] {"level":"info","logger":"controller.taskrun-controller","caller":"controller/controller.go:281","msg":"Reconcile failed. Time taken: 153.861787ms.","knative.dev/controller":"taskrun-controller","knative.dev/key":"arendelle-r92l6/run-giraffe"}

@vdemeester
Copy link
Member

Looks like a real failure now:

I0604 02:05:08.877] {"level":"warn","logger":"controller.taskrun-controller","caller":"taskrun/taskrun.go:204","msg":"Failed to update TaskRun labels/annotations{error 25 0  Operation cannot be fulfilled on taskruns.tekton.dev \"run-giraffe\": the object has been modified; please apply your changes to the latest version and try again}","knative.dev/controller":"taskrun-controller"}
I0604 02:05:08.878] {"level":"error","logger":"controller.taskrun-controller","caller":"controller/controller.go:294","msg":"Reconcile error","knative.dev/controller":"taskrun-controller","error":"Operation cannot be fulfilled on taskruns.tekton.dev \"run-giraffe\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).handleErr\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:294\ngithub.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).processNextWorkItem\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:280\ngithub.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller.(*Impl).Run.func1\n\t/go/src/github.com/tektoncd/pipeline/vendor/github.com/knative/pkg/controller/controller.go:231"}
I0604 02:05:08.878] {"level":"info","logger":"controller.taskrun-controller","caller":"controller/controller.go:281","msg":"Reconcile failed. Time taken: 153.861787ms.","knative.dev/controller":"taskrun-controller","knative.dev/key":"arendelle-r92l6/run-giraffe"}

Ah ! I was wondering when this one would show up… We can reproduce this a but with cancelling a pipelinerun too. If the object we got has been updated by the controller between the time we got it and the time we send an update with it (the update could be, a taskrun started, so the status got updated). I wonder why it manifest on this particular test and now, but it is real indeed (unrelated to the PR but real… 😓 )

@nader-ziada
Copy link
Member Author

@vdemeester I'm confused to what the problem is, is there a real failure with the tests. as far as I can tell the YAML tests passed

@bobcatfish
Copy link
Collaborator

/>.>
<.<

/test pull-tekton-pipeline-integration-tests

Even if it's a legit failure it doesn't seem like it's related to this change 🤔 🤔 🤔

Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for continuing to improve this @nader-ziada !!!

use pod termination message to export the image digest instead of logs

Would you be able to update the commit message with a bit more detail about why we wanted to make the switch? I know we were debating between the 2 and I'm curious why we decided not to stick with logs. (Was it so that the binary itself could log?)

/meow space

@@ -65,5 +65,18 @@ func main() {
if err != nil {
log.Fatalf("Unexpected error converting images to json %v: %v", output, err)
}
fmt.Println(string(imagesJSON))
log.Printf("Image digest exporter output: %s ", string(imagesJSON))
f, err := os.OpenFile("/builder/termination-log", os.O_WRONLY|os.O_CREATE, 0666)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we pass the path to write the termination log to into this binary? (i.e. minimize the number of places we hardcode this path)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also would it make sense to put this file into the same dir that the rest of the image related files go in? (/builder/home/image-outputs/)

}
if err := f.Sync(); err != nil {
log.Fatalf("Unexpected error converting images to json %v: %v", output, err)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe in another iteration/PR it would be great to see some tests for this - esp as it gets a bit more complex (even if its only a bit!)

@@ -194,7 +194,7 @@ spec:
```

If no value is specified for `outputImageDir`, it will default to
`/builder/image-outputs/{resource-name}`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm why is this moving to /builder/home? (maybe it could be explained in the commit message?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the commit message "move the default image output folder to /builder/home since this will always be mounted in the containers"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"move the default image output folder to /builder/home since this will always be mounted in the containers"

this is still a bit odd to me - as far as i can tell /builder/home is actually the home dir on the executing node? which means that we're expecting the output dir to be the home dir on the node 🤔 i dont want to draw this PR out any further tho so let's go with it for now - I also created #1030 to look further into the paths we're using

- -ce
- |
set -ex
cat <<EOF > /builder/home/image-outputs/builtImage1/index.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we include some comments around here or somewhere else in this file indicating that this example hardcodes the digest and doesn't actually use the value returned by kaniko? (I would be confused about that if I found this while looking for an example of how to expose the digest of the actually built image)

- name: url
value: https://github.com/GoogleContainerTools/skaffold
---
#Builds an image via kaniko and pushes it to registry.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this comment is a bit out of date (since its two images!)

If we have this example, do we still need https://github.com/tektoncd/pipeline/blob/2373feefd5ec6e9c8cbfb7d53f062776fc372a9e/examples/taskruns/task-output-image.yaml ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one uses outputImageDir while the other one uses the default

@@ -59,6 +59,11 @@ type TaskSpec struct {
var _ apis.Validatable = (*Task)(nil)
var _ apis.Defaultable = (*Task)(nil)

const (
// TaskOutputImageDefaultDir is the default directory for output image resource,
TaskOutputImageDefaultDir = "/builder/home/image-outputs"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to be hardcoding the /builder/home path a bunch in https://github.com/tektoncd/pipeline/blob/3d87e2e18174f1d7b088bdd14065d5f06ef86462/pkg/reconciler/v1alpha1/taskrun/resources/pod.go - instead of hardcoding it in these 2 places, maybe we could define it once and share it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -211,6 +211,26 @@ func makeWorkingDirInitializer(steps []corev1.Container) *corev1.Container {
return nil
}

// initOutputResourcesDefaultDir checks if there are any output image resources expecting a default path
// and creates an init container to create that folder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

want *corev1.PodSpec
wantErr error
}{{
desc: "task-with-output-image-resource",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm if we just have one test case, maybe we don't need the for loop? (might make the test a bit simpler to read)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will actually move it inside the already existing table test, no need to duplicate this separately

@tekton-robot
Copy link
Collaborator

@bobcatfish: cat image

In response to this:

Thanks for continuing to improve this @nader-ziada !!!

use pod termination message to export the image digest instead of logs

Would you be able to update the commit message with a bit more detail about why we wanted to make the switch? I know we were debating between the 2 and I'm curious why we decided not to stick with logs. (Was it so that the binary itself could log?)

/meow space

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@nader-ziada
Copy link
Member Author

@bobcatfish @vdemeester ready for another look (if the tests pass :D )

@afrittoli
Copy link
Member

Nice work, thank you!

The latest kaniko still does not support OCI / index.json, but it has a new flag that let's you write the digest to a file. So you can now do something like this:

    - name: build-and-push
      image: gcr.io/kaniko-project/executor@sha256:c6e1f98d3ee4383078929d7b23f55328f5ae2bfeb234f91da61b2e423f5a58fe
      command:
        - /kaniko/executor
      args:
        - --cache=${inputs.params.useImageCache}
        - --cache-dir=/cache
        - --dockerfile=${inputs.params.pathToDockerFile}
        - --reproducible
        - --destination=${outputs.resources.builtImage.url}:${inputs.params.imageTag}
        - --context=/workspace/workspace/${inputs.params.pathToContext}
        - --digest-file=/workspace/workspace/__digest__
      volumeMounts:
        - name: kaniko-base-image-cache
          mountPath: /cache
    - name: build-index-json
      image: busybox
      command:
      - /bin/sh
      args:
      - -ce
      - |
        set -e
        cat <<EOF > /workspace/workspace/index.json
        {
          "schemaVersion": 2,
          "manifests": [
            {
              "mediaType": "application/vnd.oci.image.index.v1+json",
              "size": 314,
              "digest": "$(cat /workspace/workspace/__digest__)"
            }
          ]
        }
        EOF

@nader-ziada
Copy link
Member Author

@afrittoli thanks for the example, didn't know kaniko had this feature.

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me, letting @bobcatfish put her final review 👼

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 27, 2019
Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it! Thanks for your patience @nader-ziada 🙏

/lgtm

@@ -194,7 +194,7 @@ spec:
```

If no value is specified for `outputImageDir`, it will default to
`/builder/image-outputs/{resource-name}`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"move the default image output folder to /builder/home since this will always be mounted in the containers"

this is still a bit odd to me - as far as i can tell /builder/home is actually the home dir on the executing node? which means that we're expecting the output dir to be the home dir on the node 🤔 i dont want to draw this PR out any further tho so let's go with it for now - I also created #1030 to look further into the paths we're using

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 27, 2019
@bobcatfish
Copy link
Collaborator

I0627 21:12:45.436] pkg/reconciler/v1alpha1/taskrun/resources/pod_test.go:396:5: undefined: nopContainer

hm maybe some merge problems?

@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 27, 2019
@nader-ziada
Copy link
Member Author

had to do a rebase to fix a couple of conflicts in pod_test.go

@bobcatfish
Copy link
Collaborator

/lgtm

thanks again @nader-ziada !!! 😻

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 28, 2019
Name: "outputimage",
},
}},
Params: []v1alpha1.Param{},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this might be causing a problem now b/c of 93de2fc :(

@bobcatfish
Copy link
Collaborator

I0628 23:24:02.691] pkg/reconciler/v1alpha1/taskrun/resources/pod_test.go:511:5: unknown field 'Params' in struct literal of type "github.com/tektoncd/pipeline/pkg/apis/pipeline/v1alpha1".TaskRunOutputs

@bobcatfish
Copy link
Collaborator

ugh, sorry this is getting so drawn out @nader-ziada !!

@vdemeester @imjasonh @dlorenc I'm going to be out tomorrow but if @nader-ziada gets a chance to fix this plz lgtm! (or anyone else in the org really!) 🙏

 - use pod termination message to export the image digest instead of logs
 - add more logs when index.json is not found
 - create a container to create the default folder for image output digest dir
@tekton-robot tekton-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 29, 2019
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jun 29, 2019

CLA Check
All committers are now authorized under a signed CLA.

@nader-ziada
Copy link
Member Author

/check-cla

1 similar comment
@nader-ziada
Copy link
Member Author

/check-cla

…image output digest exporter

- use termination message instead of std out to allow the image exporter binary to log erros or warnings
- move the default image output folder to /builder/home since this will always be mounted in the containers
- move the termination log file to be in the same path as output images and pass it to the binary to reduce
hard-coding
@nader-ziada
Copy link
Member Author

@bobcatfish the builder/home is a volume mounted on all pods as part of the implicit volumes added by default, check here https://github.com/tektoncd/pipeline/blob/master/pkg/reconciler/v1alpha1/taskrun/resources/pod.go#L70
at least thats my understanding

@nader-ziada
Copy link
Member Author

/test pull-tekton-pipeline-integration-tests

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 1, 2019
@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nader-ziada, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot merged commit 723b9a9 into tektoncd:master Jul 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants