sdn: better warning when non-restarted 3.3 node process calls updated 3.4 script #11990

dcbw wants to merge 1 commit into openshift:master
Conversation
Thanks, we should really be restarting the service and we can probably fix that. Cc: @dgoodwin
I think there's an issue with restarting the node as soon as the rpm is updated; at least on masters the rpms get updated during control plane upgrade, which pulls in node/ovs packages as dependencies. At this point it's not safe to restart the node service, as it hasn't been unscheduled safely; that part comes a little later during the node+docker upgrade.
pkg/sdn/plugin/bin/openshift-sdn-ovs (Outdated)
```sh
case "$action" in
    setup)
        # With openshift 3.3 $2 was the netns path; check for that
        if [[ "${veth_host}" =~ "/proc" ]]; then
```
should we put this at the top level rather than making it be setup-specific?
…usters

When the openshift-sdn RPM gets updated, that drops a new /usr/bin/openshift-sdn-ovs script on disk. But if the openshift-node process hasn't been restarted yet for whatever reason, openshift-sdn-ovs will print unhelpful error messages due to argument mismatches between the 3.3 and 3.4 versions of that script.

Bring back the 3.3 script and redirect the request to the 3.3 script when we detect one. And yell at people to restart their node process.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1396919
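The shim the commit message describes could look roughly like the sketch below. This is hypothetical, not the actual patch: the legacy-script path, the function names, and the warning text are all illustrative. The one grounded detail is the detection trick discussed in the review: in 3.3 the argument after the action was a netns path under `/proc`, whereas in 3.4 it is a host veth name.

```shell
#!/bin/bash
# Hypothetical sketch of the 3.3-compat dispatch, not the real
# openshift-sdn-ovs patch. LEGACY_SCRIPT is an assumed install location.
LEGACY_SCRIPT=/usr/bin/openshift-sdn-ovs-3.3

looks_like_33_call() {
    # In 3.3 the second argument was a netns path such as
    # /proc/<pid>/ns/net; in 3.4 it is a veth interface name.
    [[ "$1" == /proc/* ]]
}

dispatch() {
    local second=$2
    if looks_like_33_call "$second"; then
        # Yell at the admin, then hand off to the bundled 3.3 script.
        echo "WARNING: the openshift-node process predates the" \
             "openshift-sdn upgrade; restart the node service." >&2
        echo "legacy"    # stand-in for: exec "$LEGACY_SCRIPT" "$@"
    else
        echo "current"
    fi
}
```

A 3.3 caller would then be routed to the old script until the node service is restarted, after which the 3.4-style arguments take the `current` path.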
Force-pushed from afbb99e to 8b5cb03
@danwinship @sdodson @dgoodwin how about something more like this, then? Except we apply it only to the release-1.4 branch, not to git master. Do we really support running a 3.3 node process against a 3.4 master process anyway? How are cluster upgrades supposed to work?
My understanding is that we do need to have an older node work against a newer master. In the context of an upgrade that should be a short-lived scenario, just for the period of time between control plane upgrade and node upgrade.
Yeah, we definitely need to support newer master and old nodes.
@dcbw that looks good to me, 👍 to only doing this on release-1.4
@danwinship @knobunc what do you guys think about this approach?
"ugh". We shouldn't be deploying new pods during an upgrade, should we? If so, we'd really only need to deal with the teardown case, which should be simpler...
I reordered the router/registry upgrade (which uses a deployer pod) to follow the node upgrade, so everything should be up and running by that point now, and we won't be running new pods during upgrade as far as I'm aware. Not to say the cluster won't be trying to, though.
Do we mark the node as unschedulable while upgrading it? (Starting before we install the new RPM and continuing until after we restart it.) If not, should we?
Indeed, we do set the node unschedulable before touching docker or our services, and only make it schedulable again when everything is completed. For masters, their node rpms can get updated before evacuation because the node rpm is pulled in as a dependency when updating the master rpm; however, we don't bounce the node services until later, when it should be safe.
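The ordering described here can be sketched with the 3.x admin CLI. This is a minimal sketch under assumptions: the node name is a placeholder, and the `run()` wrapper only prints the commands so the snippet is safe to execute outside a cluster.

```shell
#!/bin/bash
# Sketch of the cordon / upgrade / uncordon ordering described above.
# run() only echoes each command; drop the wrapper to execute for real.
run() { echo "+ $*"; }

NODE=node1.example.com   # placeholder node name

# 1. Mark the node unschedulable before touching docker or the services.
run oadm manage-node "$NODE" --schedulable=false

# 2. Evacuate pods, update RPMs, restart docker and the node service here.

# 3. Only once everything is complete, make it schedulable again.
run oadm manage-node "$NODE" --schedulable=true
```

The gap being discussed is exactly step 2 for node-on-master hosts: the RPMs land during the master upgrade, but the service restart happens later.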
@dgoodwin @danwinship hmm, if we already mark the node as unschedulable, how did we get into the situation in the RH bugzilla bug? Was it because it was a node-on-master and thus the service wasn't bounced? Couldn't we just mark the node-on-master as unschedulable when upgrading the master RPMs, since the SDN scripts get updated at the same time? Then when the node-on-master service gets updated and restarted, mark it schedulable again.
Race conditions, maybe? Blah. I don't really like shipping the 3.3 script in 3.4, but if that's the only way to make upgrades work without undesirable downtime then I guess we have to.
@dcbw @danwinship the bug was reported against a "master node", yes. I don't think we can commit to reworking upgrade to push a node restart into master upgrade. This has been a delicate balance to get everything covered without duplicating evacuations, docker restarts, and service restarts. Separate control plane + node upgrade phases is just becoming a feature in 1.4, where you can run each separately, and on sub-sets of your nodes by label.

If it were possible to have the sdn keep working until the services are restarted, that would be ideal from my perspective. It would mean carrying the previous version of the script when there has been a breaking change like this. (Only one previous version is required, as we don't upgrade beyond more than one release.) I assume there's no way for the new script to just handle the old args intelligently?

If absolutely necessary, I could go back and rework upgrade again to try to get master upgrade to include a node restart. I can't pinpoint why, but I'm fairly certain there was a blocker with this, though it's been a while. In any case it would be too destabilizing, I think, to consider now, and would have to be 1.5+.
Well, it could, but that would be a larger change with more chance of breaking the non-upgrade codepaths. Or else it would end up just having two copies of every code path and be incredibly ugly and confusing. |
@danwinship @knobunc should we just merge this one, or do you think there's a better approach?
Ideally I think we'd just return an error if you tried to deploy a pod while the node was in the middle of an upgrade, but kubernetes doesn't really recover very well from the network plugin returning errors (as seen in the original report), so that's not a useful solution. I hate this patch, but I have no better suggestion.
So I think this should be closed and reopened against OSE 3.4?
So we never dealt with this, right? And now the same problem exists for 3.5, except that instead of returning errors, it will just install OVS rules that don't have any effect (due to the table renumbering from 3.4 to 3.5).
Do we intend to do anything with this? Or are we hand-waving it as an installer problem?
@knobunc at this point, perhaps we just close and ignore. I haven't seen anyone yell about this in a long time, though I may just not be looking for complaints.

@danwinship for 3.3 -> 3.5, wouldn't this still work, more or less? We update this PR to add the 3.4 script back, and then each version calls its respective script and renumbering won't be an issue. When they restart to 3.5+, it just won't call a script anymore at all.
If we had committed it, maybe, but we didn't. |
3.5 still used a script (which is not compatible with 3.4 because of table numbering). 3.6 ships the 3.5 script but doesn't use it. 3.7 will drop the script entirely.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dcbw
No associated issue. Update the pull-request body to add a reference to an issue. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files.
@openshift/networking @sdodson