Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cherry-pick panic fix into v0.10.x #2239

Merged
1 commit merged into from Mar 18, 2020
Merged

Cherry-pick panic fix into v0.10.x #2239

1 commit merged into from Mar 18, 2020

Conversation

ghost
Copy link

@ghost ghost commented Mar 17, 2020

Changes

Fixes #2220

Cherry-picks #2222 into v0.10.x branch.

I'm making a PR just to see how integration runs against the backported patch.

Here's the original PR's blurb:

Panic in controller when step fails before image digest exporter

The image digest exporter (part of the Image Output Resource) is
configured with "terminationMessagePolicy": "FallbackToLogsOnError",.

When a previous step has failed in a Task our entrypoint wrapping the
exporter emits a log line like 2020/03/13 12:03:26 Skipping step because a previous step failed. Because the image digest exporter
is set to FallbackToLogsOnError Kubernetes slurps up this log line as
the termination message.

That line gets read by the Tekton controller, which is looking for JSON
in the termination message. It fails to parse and stops trying to read
step statuses.

That in turn results in a mismatch in the length of the list of steps and
the length of the list of step statuses. Finally we attempt to sort
the list of step statuses alongside the list of steps. This method
panics with an out of bounds error because it assumes the lengths of the two
lists are the same.

So, this PR does the following things:

  1. The image digest exporter has the FallbackToLogsOnError policy
    removed. I can't think of a reason that we need this anymore.
  2. The Tekton controller no longer breaks out of the loop while it's
    parsing step statuses and instead simply ignores non-JSON termination
    messages.

(cherry picked from commit 153f1d1)

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

The image digest exporter (part of the Image Output Resource) is
configured with `"terminationMessagePolicy": "FallbackToLogsOnError",`.

When a previous step has failed in a Task our entrypoint wrapping the
exporter emits a log line like `2020/03/13 12:03:26 Skipping
step because a previous step failed`. Because the image digest exporter
is set to FallbackToLogsOnError Kubernetes slurps up this log line as
the termination message.

That line gets read by the Tekton controller, which is looking for JSON
in the termination message. It fails to parse and stops trying to read
step statuses.

That in turn results in a mismatch in the length of the list of steps and
the length of the list of step statuses. Finally we attempt to sort
the list of step statuses alongside the list of steps. This method
panics with an out of bounds error because it assumes the lengths of the two
lists are the same.

So, this PR does the following things:

1. The image digest exporter has the FallbackToLogsOnError policy
removed. I can't think of a reason that we need this anymore.
2. The Tekton controller no longer breaks out of the loop while it's
parsing step statuses and instead simply ignores non-JSON termination
messages.

(cherry picked from commit 153f1d1)
@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign sbwsg
You can assign the PR to them by writing /assign @sbwsg in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@googlebot googlebot added the cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit label Mar 17, 2020
@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 17, 2020
@ghost ghost changed the title Cherry-pick into v0.10.x Cherry-pick panic fix into v0.10.x Mar 17, 2020
@ghost ghost merged commit f66970e into tektoncd:release-v0.10.x Mar 18, 2020
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Trying to make the CLA bot happy with ppl from different companies work on one commit size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants