ui: add an alert if upgrade is not finalized for a long time #66987

Closed
RaduBerinde opened this issue Jun 28, 2021 · 11 comments · Fixed by #102895 or #104688
Assignees
Labels
A-webui-warnings C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) C-escalation-improvement Having this feature would have made an escalation easier O-postmortem Originated from a Postmortem action item.

Comments

@RaduBerinde
Member

RaduBerinde commented Jun 28, 2021

We have seen at least one case where a user did not finalize a major version upgrade, and the problem only surfaced much later, at the next upgrade.

Upgrade instructions are here: https://www.cockroachlabs.com/docs/v20.2/upgrade-cockroach-version#step-3-decide-how-the-upgrade-will-be-finalized

The DB Console should alert the user if an upgrade has remained in a non-finalized state for a while, e.g. by checking whether `cluster.preserve_downgrade_option` has been set for a long time.
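
A minimal sketch of the proposed check, in TypeScript since that is what the DB Console is written in. Everything named here is hypothetical: the `DowngradeOptionState` shape, the `upgradeStalled` function, and the threshold are assumptions for illustration, and the sketch does not show how the setting value or its last-update time would be fetched. It also assumes that an empty value means the setting is not set.

```typescript
// Hypothetical shape: the current value of cluster.preserve_downgrade_option
// and the time it was last changed. Fetching these is out of scope here.
interface DowngradeOptionState {
  value: string; // assumption: an empty string means the setting is not set
  lastUpdated: Date;
}

// Returns true when the setting has been non-empty for longer than
// thresholdMs, i.e. the upgrade has been left non-finalized "for a while".
function upgradeStalled(
  state: DowngradeOptionState,
  now: Date,
  thresholdMs: number,
): boolean {
  if (state.value === "") {
    return false;
  }
  return now.getTime() - state.lastUpdated.getTime() > thresholdMs;
}
```

Passing the threshold in, rather than hard-coding it, leaves the definition of "a long time" up to the caller.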


2023-02-23 addendum from @rafiss: As we saw in this incident, it would also be useful to show if a cluster was recently upgraded. But the warning about not being finalized is much more important.


Jira issue: CRDB-8317

@RaduBerinde RaduBerinde added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-webui-warnings T-cluster-ui labels Jun 28, 2021
@thtruo
Contributor

thtruo commented Jul 9, 2021

We're tracking another issue that seems related to this one: #67330

@rafiss
Collaborator

rafiss commented Feb 23, 2023

Tagging this with O-postmortem since it came up as an action item after this incident. @jordanlewis suggested that @cockroachdb/sql-observability could be a home for this.

@rafiss
Collaborator

rafiss commented Mar 28, 2023

I see that the DB Console shows this banner:
Screenshot 2023-03-27 at 4 43 06 PM

But nothing is shown if all the nodes are running the new version and the cluster upgrade has not been finalized.

Perhaps this banner can be enhanced so that it also checks the result of `SHOW CLUSTER SETTING version` and makes sure it matches the binary version.
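
As a rough illustration of that comparison, here is a sketch in TypeScript. The function names are hypothetical, and so is the assumption about string formats: it supposes the cluster version looks like `22.2` (possibly with a suffix) and node binary versions look like `v23.1.2`, and it compares only the major.minor part. How the real DB Console obtains or compares these values is not covered here.

```typescript
// Extracts "major.minor" from strings such as "v23.1.2" or "22.2-8".
// Returns null when the string does not start with that pattern.
function majorMinor(version: string): string | null {
  const match = /^v?(\d+)\.(\d+)/.exec(version);
  return match ? `${match[1]}.${match[2]}` : null;
}

// Returns true when every node reports the same binary major.minor version
// and that version differs from the active cluster version, i.e. all nodes
// are on the new binary but the upgrade has not been finalized.
function upgradeNotFinalized(
  clusterVersion: string,
  nodeBinaryVersions: string[],
): boolean {
  const cluster = majorMinor(clusterVersion);
  const binaries = nodeBinaryVersions.map(majorMinor);
  if (cluster === null || binaries.length === 0 || binaries.some(b => b === null)) {
    return false; // unparseable input: do not raise a warning
  }
  const allSame = binaries.every(b => b === binaries[0]);
  return allSame && binaries[0] !== cluster;
}
```

When nodes report different binary versions the function returns false, on the assumption that the existing mixed-version banner already covers that state.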

@rafiss rafiss added the O-postmortem Originated from a Postmortem action item. label Apr 6, 2023
@rafiss
Collaborator

rafiss commented Apr 6, 2023

@maryliag
Contributor

An alert was added to the DB Console in 22.2: #82633

craig bot pushed a commit that referenced this issue May 8, 2023
102895: ui: show alert during upgrade r=maryliag a=maryliag

Previously, an alert was displayed on DB Console only if the upgrade had been in progress for more than 48 hours.
This commit updates the alert to show as soon as the cluster setting `cluster.preserve_downgrade_option` is set, making it easier to identify when an upgrade is happening.

Fixes #66987

<img width="1664" alt="Screenshot 2023-05-08 at 2 21 06 PM" src="https://user-images.githubusercontent.com/1017486/236901905-1a1a15ab-55ee-44e3-a633-fb8ed3a6c2ca.png">


Release note (ui change): Show alert on DB Console Overview page when the cluster setting `cluster.preserve_downgrade_option` is set, no longer waiting 48hrs to show.

Co-authored-by: maryliag <marylia@cockroachlabs.com>
@craig craig bot closed this as completed in 3f0a834 May 8, 2023
@maryliag
Contributor

Is there a specific command you use to know that "all nodes are on the same binary version"?

@rafiss
Collaborator

rafiss commented May 18, 2023

No, I don't know how that check is made, but as I pointed out in my comment above, the DB Console already has a way to check that, since it shows this warning if the nodes are not on the same binary version.

Screenshot 2023-03-27 at 4 43 06 PM

@rafiss
Collaborator

rafiss commented Jun 7, 2023

https://github.com/cockroachlabs/support/issues/2368 and https://github.com/cockroachlabs/support/issues/2295 are two more examples of support cases where this would have helped. Resolving this issue would help a lot with reducing support and customer confusion.

@maryliag maryliag self-assigned this Jun 9, 2023
maryliag added a commit to maryliag/cockroach that referenced this issue Jun 9, 2023
Display a warning on the DB Console overview page when all
the nodes are running on the new version, but the cluster
is not finalized.

Fixes cockroachdb#66987

Release note (ui change): Add warning to DB Console overview page
when all nodes are running on the new version, but the cluster
upgrade is not finalized.
craig bot pushed a commit that referenced this issue Jun 14, 2023
104688: ui: show alert on DB Console overview when upgrade is not finalized r=maryliag a=maryliag

Display a warning on the DB Console overview page when all the nodes are running on the new version, but the cluster is not finalized.

Fixes #66987

<img width="1205" alt="Screenshot 2023-06-09 at 5 29 01 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/cb6f1943-91bd-4528-98f4-bc5704a7e9dc">

Example of a cluster with a non-finalized upgrade, and the values this PR uses for the comparison:
https://www.loom.com/share/1d478f69ea0f459fbdc3d1506f5a8dfc


Release note (ui change): Add warning to DB Console overview page when all nodes are running on the new version, but the cluster upgrade is not finalized.

Co-authored-by: maryliag <marylia@cockroachlabs.com>
@craig craig bot closed this as completed in 50991ea Jun 14, 2023
blathers-crl bot pushed a commit that referenced this issue Jun 14, 2023
Display a warning on the DB Console overview page when all
the nodes are running on the new version, but the cluster
is not finalized.

Fixes #66987

Release note (ui change): Add warning to DB Console overview page
when all nodes are running on the new version, but the cluster
upgrade is not finalized.
maryliag added a commit to maryliag/cockroach that referenced this issue Jun 14, 2023
Display a warning on the DB Console overview page when all
the nodes are running on the new version, but the cluster
is not finalized.

Fixes cockroachdb#66987

Release note (ui change): Add warning to DB Console overview page
when all nodes are running on the new version, but the cluster
upgrade is not finalized.
maryliag added a commit to maryliag/cockroach that referenced this issue Jun 15, 2023
Informs #cockroachdb#66987

Add Learn More link to not finalized alert.

Release note: None
craig bot pushed a commit that referenced this issue Jun 15, 2023
104998: ui: add learn more to alert r=maryliag a=maryliag

Informs ##66987

Epic: None

Add Learn More link to not finalized alert.

<img width="1538" alt="Screenshot 2023-06-15 at 2 51 16 PM" src="https://github.com/cockroachdb/cockroach/assets/1017486/bf3fd501-df35-403b-aedd-866e1538521c">


Release note: None

Co-authored-by: maryliag <marylia@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this issue Jun 15, 2023
Informs ##66987

Add Learn More link to not finalized alert.

Release note: None
maryliag added a commit to maryliag/cockroach that referenced this issue Jun 16, 2023
Informs #cockroachdb#66987

Add Learn More link to not finalized alert.

Release note: None
@maryliag maryliag added the C-escalation-improvement Having this feature would have made an escalation easier label Jul 6, 2023
4 participants