Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/daemon: remove annotation equality test from validation #138

Merged
merged 1 commit into from
Oct 31, 2018

Conversation

sdemos
Copy link
Contributor

@sdemos sdemos commented Oct 17, 2018

the node state validation function had a test right at the beginning
that checked if the values of the current annotation and the desired
annotation were equal. if they were, it returned, under the assumption
that because the annotations were equal the node was in the state we
wanted it to be in.

with the way that we try to behave, in general that would be true.
however, we instead opt for more rigorous testing of the node - every
time the validation function is called, it does a full validation of the
node state. if something changed, we attempt to reconcile the state back
to the desired. this also reflects the fact that nobody should be
modifying the node outside of the daemon, and if they do they should
expect to get clobbered.

also, my editor ran gofmt on the file and fixed the import ordering.

fixes #98

/cc @ashcrow @jlebon

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 17, 2018
@ashcrow
Copy link
Member

ashcrow commented Oct 18, 2018

/test e2e-aws

@ashcrow
Copy link
Member

ashcrow commented Oct 18, 2018

/lgtm

// system state is valid.
if strings.Compare(dcAnnotation, ccAnnotation) == 0 {
return true, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. Though... now I wonder about the opposite case. If the annotations do match, but isDesiredMachineState is false, then something has gone wrong, right? I guess this gets into #99, but it's slightly different: there, we're talking about repeatedly trying and failing to meet the desired config so that we can set the current config label to the same. Here, we've already in the past validated the node for that config (since the current config label already matches), but somehow the state doesn't match. I feel like we should just set it to degraded in that case?

So basically, it'd mean still doing this comparison but e.g. in process after the call to isDesiredMachineState(). Anyway, we can do that in a follow-up!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my thought was that if there was a one-time change of a file under our control, that we can (and I think should) easily just change it back to what we want. if it happens multiple times right now, we don't do anything, but that was already true (maybe there are more cases of it now). I think doing some kind of backoff like described in #99 is the way to go in the long run, which I think would cover this case as well.

"path/filepath"
"strings"
"os"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably from me not getting in the habit yet of running gofmt. :)

@jlebon
Copy link
Member

jlebon commented Oct 18, 2018

/lgtm

@ashcrow
Copy link
Member

ashcrow commented Oct 18, 2018

I don't think the e2e test is going to return. It's been "building" for 4 hours and the Jenkins link 404s.

@sdemos
Copy link
Contributor Author

sdemos commented Oct 18, 2018

I see a page when I click on the details link - it says it's not finished, started at 7:30ish PDT and doesn't have any build logs. seems busted. will it restart if I tell it to retest?

@sdemos
Copy link
Contributor Author

sdemos commented Oct 18, 2018

let's try it

/retest e2e-aws

@sdemos
Copy link
Contributor Author

sdemos commented Oct 18, 2018

well that didn't seem to restart it

@smarterclayton any ideas?

@ashcrow
Copy link
Member

ashcrow commented Oct 18, 2018

@sdemos the bot is currently down and being fixed.

@sdemos
Copy link
Contributor Author

sdemos commented Oct 19, 2018

/retest e2e-aws

@ashcrow
Copy link
Member

ashcrow commented Oct 19, 2018

/lgtm

(to retrigger based on @jlebon's command)

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 19, 2018
@sdemos
Copy link
Contributor Author

sdemos commented Oct 22, 2018

/test e2e-aws

@sdemos
Copy link
Contributor Author

sdemos commented Oct 22, 2018

/retest

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 22, 2018
@sdemos
Copy link
Contributor Author

sdemos commented Oct 23, 2018

/test e2e-aws

1 similar comment
@ashcrow
Copy link
Member

ashcrow commented Oct 23, 2018

/test e2e-aws

@ashcrow
Copy link
Member

ashcrow commented Oct 23, 2018

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 23, 2018
@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

7 similar comments
@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@jlebon
Copy link
Member

jlebon commented Oct 29, 2018

Heh, OK this is failing on:

--> RUN WHAT=machine-config-daemon ./hack/build-go.sh
Using version from git...
Building github.com/openshift/machine-config-operator/cmd/machine-config-daemon (v3.11.0-193-g038d67de-dirty)
# github.com/openshift/machine-config-operator/pkg/daemon
pkg/daemon/daemon.go:285:1: syntax error: unexpected <<, expecting }
error: build error: running 'WHAT=machine-config-daemon ./hack/build-go.sh' failed with exit code 2

Which took me a while to understand, but I think it's due to the merge conflicts. Can you rebase this?

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ashcrow
Copy link
Member

ashcrow commented Oct 29, 2018

rebase 😄

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

4 similar comments
@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ashcrow
Copy link
Member

ashcrow commented Oct 30, 2018

@sdemos

# github.com/openshift/machine-config-operator/pkg/daemon
pkg/daemon/daemon.go:285:1: syntax error: unexpected <<, expecting }
error: build error: running 'WHAT=machine-config-controller ./hack/build-go.sh' failed with exit code 2

Specifically:

pkg/daemon/daemon.go:285:1: syntax error: unexpected <<, expecting }

@openshift-merge-robot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

the node state validation function had a test right at the beginning
that checked if the values of the current annotation and the desired
annotation were equal. if they were, it returned, under the assumption
that because the annotations were equal the node was in the state we
wanted it to be in.

with the way that we try to behave, in general that would be true.
however, we instead opt for more rigorous testing of the node - every
time the validation function is called, it does a full validation of the
node state. if something changed, we attempt to reconcile the state back
to the desired. this also reflects the fact that nobody should be
modifying the node outside of the daemon, and if they do they should
expect to get clobbered.

also, my editor ran gofmt on the file and fixed the import ordering.

fixes openshift#98
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Oct 30, 2018
@sdemos
Copy link
Contributor Author

sdemos commented Oct 30, 2018

alright, it's rebased. let's see if the tests pass.

@ashcrow
Copy link
Member

ashcrow commented Oct 31, 2018

/lgtm

... while we can!

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 31, 2018
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, jlebon, sdemos

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 214b930 into openshift:master Oct 31, 2018
@sdemos sdemos deleted the always-validate branch October 31, 2018 20:59
osherdp pushed a commit to osherdp/machine-config-operator that referenced this pull request Apr 13, 2021
remove extra get on crd during imagestream processing (a conflict win…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MCD: consider removing annotation equality checking from node validation logic
5 participants