
daemon: restructure the process function #109

Merged
merged 2 commits into openshift:master from sdemos:process-restructure on Oct 9, 2018

Conversation

sdemos (Contributor) commented Oct 3, 2018

there was a bug where, if the only difference in a config change was
unreconcilable, the node wouldn't be marked as degraded. it happened
because the update would not be applied if node verification was
successful, and the verification logic only checks things it
understands, skipping unreconcilable stuff.

this restructuring of the top-level logic sets up the process function
to verify the machine every time the mcd starts and after a successful
update that didn't require a reboot, but only then. essentially this
means verification happens after every update. however, unlike before,
the update is always attempted regardless of whether or not the
machine seems to be in the correct state. the update logic is
responsible for checking whether a config is reconcilable, so that is
still called first.

fixes #97

/cc: @ashcrow @jlebon
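
To make the described flow concrete, here is a minimal Go sketch of the restructured loop, not the actual implementation: the `Daemon` stand-in and the `triggerUpdate`, `setUpdateDone`, and `setUpdateDegraded` helpers are hypothetical names; only `isDesiredMachineState` comes from the diff discussed below.

```go
package daemon

import "fmt"

// Daemon is a minimal stand-in for the MCD's daemon type; only the methods
// used below are stubbed out.
type Daemon struct{}

// Hypothetical helpers: names are illustrative, not the real API.
func (dn *Daemon) isDesiredMachineState() (bool, error) { return true, nil }
func (dn *Daemon) triggerUpdate() error                 { return nil }
func (dn *Daemon) setUpdateDone() error                 { return nil }
func (dn *Daemon) setUpdateDegraded() error             { return nil }

// process sketches the restructured top-level flow: verify the machine on
// startup (and again after a rebootless update), but always hand new configs
// to the update path, which owns the reconcilability check.
func (dn *Daemon) process() error {
	// on startup, verify the node actually matches its current config
	ok, err := dn.isDesiredMachineState()
	if err != nil {
		return err
	}
	if !ok {
		if derr := dn.setUpdateDegraded(); derr != nil {
			return derr
		}
		return fmt.Errorf("machine is not in its desired state")
	}

	for {
		// triggerUpdate is assumed to block until a new desired config
		// appears and then attempt to apply it; it is the update logic's
		// job to reject unreconcilable changes with an error.
		if err := dn.triggerUpdate(); err != nil {
			if derr := dn.setUpdateDegraded(); derr != nil {
				return derr
			}
			return err
		}
		// a successful update that did not require a reboot falls through
		// to here, so verification runs after every update
		ok, err := dn.isDesiredMachineState()
		if err != nil {
			return err
		}
		if !ok {
			if derr := dn.setUpdateDegraded(); derr != nil {
				return derr
			}
			return fmt.Errorf("post-update verification failed")
		}
		if err := dn.setUpdateDone(); err != nil {
			return err
		}
	}
}
```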

daemon: restructure the process function
there was a bug where, if the only difference in a config change was
unreconcilable, the node wouldn't be marked as degraded. it happened
because the update would not be applied if node verification was
successful, and the verification logic only checks things it
understands, skipping unreconcilable stuff.

this restructuring of the top-level logic sets up the process function
to verify the machine every time the mcd starts and after a successful
update that didn't require a reboot, but only then. essentially this
means verification happens after every update. however, unlike before,
the update is _always_ attempted regardless of whether or not the
machine seems to be in the correct state. the update logic is
responsible for checking whether a config is reconcilable, so that is
still called first.
ashcrow approved these changes Oct 4, 2018

LGTM. Will let @jlebon do a second review and merge when ready.

sdemos (Contributor) commented Oct 4, 2018

/hold

after playing around with this a bit more I found something I want to dig into before this gets merged in

sdemos (Contributor) commented Oct 4, 2018

okay yeah I confirmed - this pr is actually a regression on the current behavior. if there are only irreconcilable changes it successfully recognizes them and sets the node degraded, but then the daemon stops and restarts, comes back up, validates the node (which succeeds for the same reason this change is required), sets the update to done, and waits for another change.

it seems like if we figure out what the daemon is supposed to do when the node is degraded (it seems like it shouldn't be doing anything? just sleep forever?), that'll fix this issue, and I think that's a better way to fix this pr than restructuring the logic here in any other way.

ashcrow (Member) commented Oct 4, 2018

When degraded it should note to the cluster that it is in that state and continue its normal behavior. The node may be destroyed via the cluster and re-provisioned.

jlebon (Member) commented Oct 4, 2018

If we mark it as degraded, there's nothing else to do, right? Isn't it like saying "I give up, just nuke me"? I'd prefer the daemon just sleep forever then (since exiting will just spawn us up again) to reinforce the idea that setting the degraded label is a one way street. So e.g. when setting degraded, exit, but then also add a check at the top that if the node is degraded, we just sleep forever?
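
For illustration, a minimal sketch of the sleep-forever option; nothing in this PR implements it, and `Daemon` and `nodeIsDegraded` are hypothetical stand-ins.

```go
package daemon

import "log"

// Daemon repeats the minimal stand-in from the earlier sketch.
type Daemon struct{}

// nodeIsDegraded is a hypothetical helper standing in for whatever reads the
// node's MCD state annotation and reports whether it says "degraded".
func (dn *Daemon) nodeIsDegraded() (bool, error) { return false, nil }

// checkDegradedOnStart sketches the "one-way street" idea: if the node is
// already degraded when the daemon starts, don't exit (exiting just gets the
// pod restarted); block forever instead and leave the node for the cluster
// to recycle.
func (dn *Daemon) checkDegradedOnStart() error {
	degraded, err := dn.nodeIsDegraded()
	if err != nil {
		return err
	}
	if degraded {
		log.Print("node is degraded; sleeping forever")
		select {} // blocks forever
	}
	return nil
}
```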

Inline review thread on this hunk of the diff:

}
if err := dn.syncOnce(); err != nil {
// validate machine state
isDesired, err := dn.isDesiredMachineState()

jlebon (Member) commented Oct 4, 2018

We still need to enhance isDesiredMachineState() to check if the configs are reconcilable, right?

sdemos (Contributor) commented Oct 4, 2018

I wasn't initially going to do that, but it would also fix the issue with this pr, and it's probably a good idea, since it makes the validation function much more complete.

sdemos (Contributor) commented Oct 4, 2018

the daemon is already not going to continue its normal behavior if the node is in a degraded state. it can't apply the config for whatever reason, so it fundamentally can't do what it's supposed to. the current behavior without this pr is to crash loop, particularly if the node is degraded because the config is irreconcilable, since it starts up, tries to apply the config, fails, quits, and starts again. I think the best option is either to sleep forever after starting up and finding a degraded status, or to do something with taints/tolerations to prevent the daemon from being rescheduled again once it quits.

ashcrow (Member) commented Oct 4, 2018

> so it fundamentally can't do what it's supposed to.

That could be the case. It depends heavily on the pods running on the node. But I follow your logic.

> I think the best option is either to sleep forever after starting up and finding a degraded status, or to do something with taints/tolerations to prevent the daemon from being rescheduled again once it quits.

I'm 👍 for tainting the node when degraded so that no new pods are deployed to the node. If the node is recycled then the pods will reschedule on their own. By adding in taint logic to the MCD we also can take advantage of that for the SSH access taint work.

FWIW I'm also OK with sleeping forever and letting the MCO and higher deal with the details.

sdemos (Contributor) commented Oct 4, 2018

> I'm 👍 for tainting the node when degraded so that no new pods are deployed to the node

that was actually not what I intended when I wrote that suggestion, but I like that approach. it's also worth noting that at this point we have already drained the node and set it unschedulable, and haven't undone that yet, so this is basically a more extreme version of that: tainting the node so that even things that tolerate the unschedulable cordon won't get scheduled there anymore.
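
As a rough sketch of the tainting idea (not part of this PR), assuming current client-go signatures; the taint key is made up for illustration, and NoExecute would go further by evicting pods that don't tolerate it.

```go
package daemon

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// taintDegraded puts a NoSchedule taint on a degraded node so nothing new
// lands on it. The taint key below is a hypothetical example.
func taintDegraded(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("getting node %s: %w", nodeName, err)
	}
	taint := corev1.Taint{
		Key:    "machineconfiguration.openshift.io/degraded", // hypothetical key
		Effect: corev1.TaintEffectNoSchedule,
	}
	for _, t := range node.Spec.Taints {
		if t.Key == taint.Key {
			return nil // already tainted
		}
	}
	node.Spec.Taints = append(node.Spec.Taints, taint)
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}
```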

ashcrow (Member) commented Oct 4, 2018

The more I think about it the more I like it as well. It also gives a head start on the SSH node tainting work.

/cc @miabbott as he reviewed the spike document recently.

pkg/daemon: validation function checks reconcilability
we can make the validation function much more complete by guaranteeing
that the change is one that we are in fact checking for. we still need
to check reconcilability at the start of the update function as well,
though, since validation is not a gate to start updating.

this also fixes a bug in the previous commit, where if irreconcilable
changes were made, the daemon would set the node degraded, restart,
successfully validate the node is in the "correct" state (which it's
not, but it didn't know that) and then set the state from degraded to
done and wait for another update.
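
A minimal sketch of the idea in this commit, with hypothetical type and helper names standing in for the real ones (`machineConfig`, `getCurrentConfig`, `getDesiredConfig`, `checkFilesAndUnits`, `isReconcilable`):

```go
package daemon

// machineConfig stands in for the rendered config type; fields elided.
type machineConfig struct{}

// Daemon is the same minimal stand-in used in the earlier sketches.
type Daemon struct{}

// Hypothetical helpers for illustration only.
func (dn *Daemon) getCurrentConfig() (*machineConfig, error) { return &machineConfig{}, nil }
func (dn *Daemon) getDesiredConfig() (*machineConfig, error) { return &machineConfig{}, nil }
func (dn *Daemon) checkFilesAndUnits(cfg *machineConfig) (bool, error) { return true, nil }
func isReconcilable(oldCfg, newCfg *machineConfig) bool { return true }

// isDesiredMachineState reports whether the node matches its desired config.
func (dn *Daemon) isDesiredMachineState() (bool, error) {
	oldCfg, err := dn.getCurrentConfig()
	if err != nil {
		return false, err
	}
	newCfg, err := dn.getDesiredConfig()
	if err != nil {
		return false, err
	}
	// If the diff contains anything the daemon can't reconcile, the node is
	// by definition not in (and can't reach) the desired state, so don't
	// report success just because the checkable parts happen to match.
	if !isReconcilable(oldCfg, newCfg) {
		return false, nil
	}
	// Otherwise fall through to the existing on-disk checks (files, units,
	// and so on), summarized here by one stand-in helper.
	return dn.checkFilesAndUnits(newCfg)
}
```
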
ashcrow (Member) commented Oct 9, 2018

@sdemos let folks know when it's ready for a re-review!

sdemos (Contributor) commented Oct 9, 2018

/hold cancel

@ashcrow should be good now. the behavior right now is to crashloop if the node is degraded - it keeps trying to update, discovering it can't reconcile the node, and exiting. we can add the tainting in a different pr.

jlebon (Member) commented Oct 9, 2018

Checking for reconcilability in isDesiredMachineState() looks good!

> the behavior right now is to crashloop if the node is degraded - it keeps trying to update, discovering it can't reconcile the node, and exiting. we can add the tainting in a different pr.

Agreed on adding taints in a separate PR. Though... that looks like a pretty indirect way of crash looping. Let's just check right at the start if it's degraded and exit? (This also complements the tainting; i.e. if somehow we are scheduled on a tainted node, we immediately jump ship). Open to doing this in a separate PR as well, though.

sdemos (Contributor) commented Oct 9, 2018

I agree that we should probably do the check more explicitly, but I think it would be fine to do in another pr. crashlooping is essentially the behavior that already existed, so I would prefer to keep this pr focused on fixing the issue and deal with it later.

jlebon (Member) commented Oct 9, 2018

OK, this PR looks good to me. Will let @ashcrow have a look and do the merge.

ashcrow (Member) commented Oct 9, 2018

/lgtm

sdemos (Contributor) commented Oct 9, 2018

/test e2e-aws

ashcrow (Member) commented Oct 9, 2018

/approve

openshift-ci-robot commented Oct 9, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, sdemos

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-merge-robot merged commit 2923802 into openshift:master on Oct 9, 2018

4 checks passed

ci/prow/e2e-aws: Job succeeded.
ci/prow/images: Job succeeded.
ci/prow/unit: Job succeeded.
tide: In merge pool.

sdemos deleted the sdemos:process-restructure branch on Oct 9, 2018

jlebon added a commit to jlebon/machine-config-operator that referenced this pull request Oct 10, 2018

daemon: exit early if node is degraded
As discussed in openshift#109, if the node is marked degraded, then we don't want
the MCD to do anything at all. As a first step, just check for this on
start and exit right away if that's the case. Down the line, we should
investigate tainting the node to make sure we don't get scheduled in the
first place as was discussed in openshift#109.
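
A minimal sketch of that exit-early check, assuming a hypothetical getNodeState helper and a "degraded" state string; the real follow-up may differ in both naming and mechanism.

```go
package daemon

import "fmt"

// Daemon is the same minimal stand-in used in the earlier sketches.
type Daemon struct{}

// getNodeState is a hypothetical helper that reads the MCD's node state
// annotation and returns it as a string.
func (dn *Daemon) getNodeState() (string, error) { return "", nil }

// exitIfDegraded bails out right at startup if the node is already marked
// degraded, instead of re-running the update/validate path at all.
func (dn *Daemon) exitIfDegraded() error {
	state, err := dn.getNodeState()
	if err != nil {
		return err
	}
	if state == "degraded" {
		// Returning an error makes the daemon exit; combined with a future
		// taint, a degraded node stays untouched until it is recycled.
		return fmt.Errorf("node is marked degraded; exiting")
	}
	return nil
}
```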

sdemos referenced this pull request on Oct 15, 2018: MCD: pods are crashlooping #127 (Closed)
