Return err when there is a failure upgrading #1056

Merged
merged 1 commit into from
Jun 18, 2020

Conversation

manuelbuil

Currently we log the error but do not return it. As a
consequence, the upgrade might fail but we still get the success
message. For example:

E0422 08:25:36.316862 19710 addons.go:334] failed to run kubectl delete: error: the path "/tmp/skuba-addon-cilium478546698/base/cilium-preflight.yaml" does not exist
E0422 08:25:36.317170 19710 addons.go:170] failed to apply "cilium" addon (exit status 1)
[apply] Successfully upgraded addons

Signed-off-by: Manuel Buil <mbuil@suse.com>

Why is this PR needed?

Fixing a bug.

What does this PR do?

Adds a `return err`. Previously we were only logging the error.
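
To make the before/after concrete, here is a minimal Go sketch of the pattern. It is not the actual skuba code: `applyAddon`, `upgradeLogOnly`, `upgradeReturnErr`, and `report` are hypothetical names, and the failure of the "cilium" addon is simulated. Only the two user-facing messages are taken from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// applyAddon is a hypothetical stand-in for the real apply step. It fails
// for the addon named "cilium" to simulate the failure shown in the log.
func applyAddon(name string) error {
	if name == "cilium" {
		return errors.New("exit status 1")
	}
	return nil
}

// upgradeLogOnly mirrors the old behavior: the error is logged but
// swallowed, so the caller always receives nil.
func upgradeLogOnly(name string) error {
	if err := applyAddon(name); err != nil {
		log.Printf("failed to apply %q addon (%v)", name, err)
	}
	return nil
}

// upgradeReturnErr mirrors the fixed behavior: the error is logged and
// then returned to the caller.
func upgradeReturnErr(name string) error {
	if err := applyAddon(name); err != nil {
		log.Printf("failed to apply %q addon (%v)", name, err)
		return err
	}
	return nil
}

// report picks the user-facing message based on the returned error.
func report(err error) string {
	if err != nil {
		return "[apply] Failed to deploy addons"
	}
	return "[apply] Successfully upgraded addons"
}

func main() {
	fmt.Println(report(upgradeLogOnly("cilium")))
	fmt.Println(report(upgradeReturnErr("cilium")))
}
```

The caller can only choose the right message if the error actually reaches it, which is why logging alone is not enough.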

Anything else a reviewer needs to know?


Status BEFORE applying the patch

Run an upgrade that fails; the success message is still printed.

Status AFTER applying the patch

If the upgrade fails, we should see:
"[apply] Failed to deploy addons"

Merge restrictions

(Please do not edit this)

We are in v4-maintenance phase, so we will restrict what can be merged to prevent unexpected surprises:

What can be merged (merge criteria):
    2 approvals:
        1 developer: code is fine
        1 QA: QA is fine
    there is a PR for updating documentation (or a statement that this is not needed)

@manuelbuil manuelbuil requested review from ereslibre, innobead and vadorovsky and removed request for ereslibre April 22, 2020 08:44
vadorovsky
vadorovsky previously approved these changes Apr 22, 2020

@vadorovsky vadorovsky left a comment

Good catch!

innobead
innobead previously approved these changes Apr 23, 2020
c3y1huang
c3y1huang previously approved these changes Apr 23, 2020

@c3y1huang c3y1huang left a comment

👍

@jcaamano

This changes the behavior. Previously even if an addon failed to apply, the remaining addons would still be applied. With this change, it stops at the first failed addon. I don't know if this would be a problem but perhaps it should be checked and mentioned?
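
The difference in which addons get attempted can be sketched roughly as follows. This is an illustration only, not the skuba implementation: `deploy`, `failOn`, and the `stopOnError` flag are hypothetical, and unlike the old code this sketch returns the first error in both modes so that only the loop behavior differs.

```go
package main

import (
	"errors"
	"fmt"
)

// failOn returns a hypothetical apply step that fails only for the
// addon named bad.
func failOn(bad string) func(string) error {
	return func(name string) error {
		if name == bad {
			return errors.New("exit status 1")
		}
		return nil
	}
}

// deploy applies addons in order and returns the names it attempted,
// together with the first error it encountered (if any). With
// stopOnError set, the loop ends at the first failed addon.
func deploy(addons []string, apply func(string) error, stopOnError bool) ([]string, error) {
	var attempted []string
	var firstErr error
	for _, name := range addons {
		attempted = append(attempted, name)
		err := apply(name)
		if err == nil {
			continue
		}
		if firstErr == nil {
			firstErr = fmt.Errorf("failed to apply %q addon: %w", name, err)
		}
		if stopOnError {
			break
		}
	}
	return attempted, firstErr
}

func main() {
	addons := []string{"psp", "dex", "gangway", "kured"}
	all, _ := deploy(addons, failOn("dex"), false)
	some, _ := deploy(addons, failOn("dex"), true)
	fmt.Println("keep going:", all)
	fmt.Println("stop early:", some)
}
```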

@innobead

ping @ereslibre

@ereslibre

> This changes the behavior. Previously even if an addon failed to apply, the remaining addons would still be applied. With this change, it stops at the first failed addon. I don't know if this would be a problem but perhaps it should be checked and mentioned?

I agree this is a change in behavior. Perhaps what we can do is enrich the type returned by DeployAddons, so that we could give feedback to the user addon by addon, something like:

- addon A: successfully upgraded from version X to version Y
- addon B: failed to upgrade from version N to version M; reason: <error reason>

This way I think we could still keep the same behavior by iterating over all addons, and still give meaningful feedback to the user. WDYT?
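
A rough sketch of what such a per-addon result could look like, assuming nothing about the real `DeployAddons` signature: `AddonResult`, `deployAll`, `summarize`, and `failOn` are hypothetical names, and version tracking is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// AddonResult is a hypothetical per-addon outcome; Err is nil on success.
type AddonResult struct {
	Name string
	Err  error
}

// failOn returns a hypothetical apply step that fails only for the
// addon named bad.
func failOn(bad string) func(string) error {
	return func(name string) error {
		if name == bad {
			return errors.New("simulated failure")
		}
		return nil
	}
}

// deployAll applies every addon without stopping early and records
// the outcome of each one.
func deployAll(addons []string, apply func(string) error) []AddonResult {
	results := make([]AddonResult, 0, len(addons))
	for _, name := range addons {
		results = append(results, AddonResult{Name: name, Err: apply(name)})
	}
	return results
}

// summarize returns a non-nil error if any addon failed, so the caller
// cannot report success after a partial failure.
func summarize(results []AddonResult) error {
	failed := 0
	for _, r := range results {
		if r.Err != nil {
			failed++
		}
	}
	if failed > 0 {
		return fmt.Errorf("%d of %d addons failed", failed, len(results))
	}
	return nil
}

func main() {
	results := deployAll([]string{"A", "B"}, failOn("B"))
	for _, r := range results {
		if r.Err != nil {
			fmt.Printf("- addon %s: failed to upgrade; reason: %v\n", r.Name, r.Err)
		} else {
			fmt.Printf("- addon %s: successfully upgraded\n", r.Name)
		}
	}
	if err := summarize(results); err != nil {
		fmt.Println("[apply] Failed to deploy addons:", err)
	}
}
```

This keeps the old iterate-over-everything behavior while still letting the command exit with an error when any addon fails.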

@jenting

jenting commented May 19, 2020

This PR is important to have. I ran into this issue myself: some addons failed to install, but the CLI still reported that bootstrapping succeeded.

[bootstrap] deploying core add-ons on node "10.84.73.156"
I0519 13:37:29.439628    9802 addons.go:346] applying "psp" addon
I0519 13:38:11.713697    9802 addons.go:168] "psp" addon correctly applied
I0519 13:38:11.713717    9802 addons.go:346] applying "dex" addon
I0519 13:38:12.254263    9802 config.go:39] loading configuration from "kubeadm-init.conf"
I0519 13:38:28.126005    9802 addons.go:168] "dex" addon correctly applied
I0519 13:38:28.126054    9802 addons.go:346] applying "gangway" addon
I0519 13:38:30.471602    9802 config.go:39] loading configuration from "kubeadm-init.conf"
E0519 13:38:33.396738    9802 addons.go:408] failed to run kubectl apply: Unable to connect to the server: x509: certificate is valid for 10.96.0.1, 10.84.73.94, 10.84.73.94, 10.84.73.94, not 10.84.72.137
E0519 13:38:33.396911    9802 addons.go:170] failed to apply "gangway" addon (exit status 1)
I0519 13:38:33.396944    9802 addons.go:346] applying "kured" addon
I0519 13:38:54.570757    9802 addons.go:168] "kured" addon correctly applied
I0519 13:38:54.570799    9802 addons.go:346] applying "metrics-server" addon
E0519 13:38:58.239828    9802 addons.go:408] failed to run kubectl apply: Unable to connect to the server: x509: certificate is valid for 10.96.0.1, 10.84.73.94, 10.84.73.94, 10.84.73.94, not 10.84.72.137
E0519 13:38:58.240532    9802 addons.go:170] failed to apply "metrics-server" addon (exit status 1)
I0519 13:38:58.240574    9802 addons.go:346] applying "cilium" addon
I0519 13:39:23.088281    9802 addons.go:168] "cilium" addon correctly applied
[bootstrap] successfully bootstrapped core add-ons on node "10.84.73.156"

@evrardjp

BTW I think it's fine to fail at the first failure. It's not perfect, but it's fine.

I think this needs to pass testing though ;)

@jenting

jenting commented Jun 10, 2020

ref to http://bugzilla.suse.com/show_bug.cgi?id=1172764

@jenting

jenting commented Jun 17, 2020

@manuelbuil Could you please rebase onto the latest master branch so that CI passes? Thanks.
Also, please consider adding bsc#1172764 to the commit message, since there is a bug related to this.

@manuelbuil manuelbuil dismissed stale reviews from c3y1huang, innobead, and vadorovsky via 9fa2912 June 17, 2020 06:48
@manuelbuil

> @manuelbuil Could you please rebase onto the latest master branch so that CI passes? Thanks.
> Also, please consider adding bsc#1172764 to the commit message, since there is a bug related to this.

done. Is it ok to merge it?


@evrardjp evrardjp left a comment

I don't see why we shouldn't do this. It leaves the cluster in a bad state, but that is better than failing silently.
Happy to discuss it.


@jenting jenting left a comment

LGTM

@innobead innobead merged commit 9a4b4e9 into SUSE:master Jun 18, 2020
mmnelemane pushed a commit to mmnelemane/skuba that referenced this pull request Jul 3, 2020
8 participants