New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
install/0000_90_cluster-version-operator_02_servicemonitor: Info-level alert for available updates #415
install/0000_90_cluster-version-operator_02_servicemonitor: Info-level alert for available updates #415
Conversation
I like the idea of adding this to alertmanager, but wouldn't it be too annoying for admin console users? We show "upgrade available" on the top navbar, on Overview page and on Cluster Settings page. This alert would show on Notifications page - isn't that enough? Lets have this approved by UI/UX team? |
In description, shouldn't this be: |
@ncameronbritt please lend us your eyes, I know you were looking at how to improve notification of updates, this seems like a good approach and can be completely orthogonal to anything that we're doing elsewhere, the main thing is that alerts are meant to feed into other systems and those systems could allow customers to choose how to handle this without being in the console. |
I think this is a good approach. And I believe we discussed that if we had alerts for available upgrades (as proposed here), then we probably don't need the indication we have in the console today, because customers could still see the alert in the console. This design shows that: https://marvelapp.com/prototype/64j205c/screen/70755417 |
c53d639
to
b397d8f
Compare
install/0000_90_cluster-version-operator_02_servicemonitor.yaml
Outdated
Show resolved
Hide resolved
…l alert for available updates The purpose of this alert is to give cluster-admins something they can subscribe to (via Alertmanager notifications) to let them know when they can initiate an update for a particular cluster. It is info severity because we want the quick push notification, but letting your cluster sit on the old version for a few days or a week or so is unlikely to cause much trouble. For things like CVE fixes, a more timely update might be advisable, but the cluster-version operator currently has no way to make the "critical CVE" vs. "no fixes, but adds a feature" distinction. So I'm punting on the warn-level or so "CVO is getting grumpy about your failure to take the long-recommended update" alert for now, and just addressing the info-level "you could update, no pressure".
b397d8f
to
ccf0882
Compare
e2e failed on /override ci/prow/e2e |
@wking: Overrode contexts on behalf of wking: ci/prow/e2e In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vrutkovs, wking The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
annotations: | ||
message: Your upstream update recommendation service recommends you update your cluster. For more information refer to 'oc adm upgrade'{{ "{{ with $console_url := \"console_url\" | query }}{{ if ne (len (label \"url\" (first $console_url ) ) ) 0}} or {{ label \"url\" (first $console_url ) }}/settings/cluster/{{ end }}{{ end }}" }}. | ||
expr: | | ||
cluster_version_available_updates > 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigh, this should be >0
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so - I'll try it out and file bug to fix it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed; filed #432 to fix.
… UpdateAvailable Fixing an off-by-one from ccf0882 (install/0000_90_cluster-version-operator_02_servicemonitor: Info-level alert for available updates, 2020-07-22, openshift#415). One available update is all we need for this to go off.
The purpose of this alert is to give cluster-admins something they can subscribe to (via Alertmanager notifications) to let them know when they can initiate an
alertupdate for a particular cluster.It is info severity because we want the quick push notification, but letting your cluster sit on the old version for a few days or a week or so is unlikely to cause much trouble. For things like CVE fixes, a more timely update might be advisable, but the cluster-version operator currently has no way to make the "critical CVE" vs. "no fixes, but adds a feature" distinction. So I'm punting on the warn-level or so "CVO is getting grumpy about your failure to take the long-recommended update" alert for now, and just addressing the info-level "you could update, no pressure".