Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1779490: Handle missing pids during sidecar teardown #148

Merged

Conversation

ironcladlou
Copy link
Contributor

Make the sidecar container script tolerate missing pids during cleanup
to prevent spurious errors on shutdown.

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Dec 4, 2019
@openshift-ci-robot
Copy link
Contributor

@ironcladlou: This pull request references Bugzilla bug 1779491, which is invalid:

  • expected the bug to target the "4.4.0" release, but it targets "4.3.0" instead
  • expected dependent Bugzilla bug 1779490 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1779491: Handle missing pids during sidecar teardown

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 4, 2019
@ironcladlou ironcladlou changed the title Bug 1779491: Handle missing pids during sidecar teardown Bug 1779490: Handle missing pids during sidecar teardown Dec 4, 2019
@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Dec 4, 2019
@openshift-ci-robot
Copy link
Contributor

@ironcladlou: This pull request references Bugzilla bug 1779490, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1779490: Handle missing pids during sidecar teardown

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Dec 4, 2019
@ironcladlou ironcladlou changed the title Bug 1779490: Handle missing pids during sidecar teardown WIP: Bug 1779490: Handle missing pids during sidecar teardown Dec 4, 2019
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 4, 2019
@ironcladlou
Copy link
Contributor Author

Need to add diffing logic to the operator to detect/apply container command changes

@openshift-ci-robot openshift-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 4, 2019
@ironcladlou ironcladlou changed the title WIP: Bug 1779490: Handle missing pids during sidecar teardown Bug 1779490: Handle missing pids during sidecar teardown Dec 4, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 4, 2019
@@ -194,6 +194,14 @@ func daemonsetConfigChanged(current, expected *appsv1.DaemonSet) (bool, *appsv1.
changed = true
}

for i, a := range current.Spec.Template.Spec.Containers {
b := expected.Spec.Template.Spec.Containers[i]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be prudent to verify that len(current.Spec.Template.Spec.Containers) equals len(expected.Spec.Template.Spec.Containers).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — I updated the logic and expanded the test cases

Copy link
Contributor

@Miciah Miciah Dec 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional test case checks for a change in the number of words in the command, not a change in the number of containers. Did you mean to test the latter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new command diffing and the existing image diffing are against the contents of container arrays which are assumed to be the same structurally.

Seems like we would want a separate check entirely to detect changes to length of containers and if so just take the expected containers and don't bother diffing the contents of the container arrays — agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested comparing lengths of the container arrays because you were using the iterator of one container array to index the other container array. Detecting changes to the length of the containers arrays would be orthogonal but for that—we don't want the operator to panic if a naughty administrator or a future version of the operator changes the number of containers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we can just copy the containers array wholesale if any change is detected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested comparing lengths of the container arrays because you were using the iterator of one container array to index the other container array. Detecting changes to the length of the containers arrays would be orthogonal but for that—we don't want the operator to panic if a naughty administrator or a future version of the operator changes the number of containers.

That was my understanding as well, and I implemented that guard. Then you said

The additional test case checks for a change in the number of words in the command, not a change in the number of containers. Did you mean to test the latter?

Which led to the discussion about the containers array itself (not the command array), which you now seem to say is orthogonal (which I would agree with).

I think the current patch is correct — agree?

Copy link
Contributor

@Miciah Miciah Dec 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which led to the discussion about the containers array itself (not the command array), which you now seem to say is orthogonal (which I would agree with).

I think the current patch is correct — agree?

It is correct. I asked about the intention of the additional test case because it seemed unnecessary to test both that daemonsetConfigChanged detects a change to the command and a that daemonsetConfigChanged detects a change to the length of the command when we're just using cmp.Equal to compare the command arrays, and since you were adding logic to detect changes to the length of the containers array, I thought maybe your intention was to add a test case for that. If not, never mind. We've probably spent more time discussing this matter than it deserves—sorry for the distraction!

Make the sidecar container script tolerate missing pids during cleanup
to prevent spurious errors on shutdown.
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 4, 2019
@Miciah
Copy link
Contributor

Miciah commented Dec 4, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 4, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ironcladlou, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 0ac94ad into openshift:master Dec 4, 2019
@ironcladlou
Copy link
Contributor Author

/cherrypick release-4.3

@openshift-cherrypick-robot

@ironcladlou: failed to push cherry-picked changes in GitHub: pushing failed, output: "remote: Not Found\nfatal: repository 'https://openshift-cherrypick-robot:CENSORED@github.com/openshift-cherrypick-robot/openshift/cluster-dns-operator/' not found\n", error: exit status 128

In response to this:

/cherrypick release-4.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -194,6 +194,21 @@ func daemonsetConfigChanged(current, expected *appsv1.DaemonSet) (bool, *appsv1.
changed = true
}

// Detect changes to container commands
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact you are having to write this code is terrifying. Is there a library-go function you should be using instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like library-go's resourceapply package might work. However, the prospect of using library-go presents at least two salient challenges for us: (1) library-go is poorly documented and (2) library-go uses client-go whereas the DNS operator uses the controller-runtime client library. Given these considerations, I don't see why using library-go would be less terrifying.

@ironcladlou
Copy link
Contributor Author

/cherrypick release-4.3

@openshift-cherrypick-robot

@ironcladlou: new pull request created: #149

In response to this:

/cherrypick release-4.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants