Commit 0abdd33
Add `contact_support` on update status page (#3226)
New boolean `contact_support` field on update status added in
oxidecomputer/omicron#10271.
I tried it inside the properties table as `Contact support: Yes` and it
felt terrible.
<details>
<summary>
Robot notes on the API logic behind <code>contact_support</code>
</summary>
[omicron#10271](oxidecomputer/omicron#10271)
adds a `contact_support: bool` field to the `system/update/status` API.
It is the last piece of a minimal system health check tied to update
status, intended as a stopgap until the fault management subsystem lands
([RFD 612](https://rfd.shared.oxide.computer/rfd/0612)).
## What it means
When `contact_support` is `true`, Nexus has detected one or more known
conditions in the latest inventory collection (plus a few additional
checks) that require Oxide support to resolve. The field collapses
several sub-checks into a single boolean because none of the individual
conditions are actionable by the customer — the only action is to call
support. The detailed breakdown is logged server-side and lands in
support bundles.
The intended usage maps to two cases:
- **Before an update**: if `contact_support` is true, the customer
should not start an update — resolve the issue with support first.
- **After an update**: if `contact_support` is true, something went
wrong; the customer should call support immediately.
## Conditions that trigger `contact_support: true`
- **Unhealthy zpools** — any zpool not in `online` state (e.g.,
degraded).
- **Enabled SMF services not online** — services that should be running
but are in `maintenance`, `offline`, or `degraded`.
- **Stuck sagas** — sagas that have been running longer than ~15
minutes. (A sample of 10,000 done sagas on dogfood showed only 3
exceeded 15 minutes from creation to completion.)
- **Stale inventory collection** — no recent inventory collection (~15
min threshold), meaning Nexus has lost visibility into rack state.
- **Stalled update** — an update is supposed to be in progress but the
planner hasn't taken a step in ~30 minutes.
The list is explicitly minimal and not exhaustive — `contact_support:
false` does not guarantee the system is fully healthy.
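
As a rough illustration of how the conditions above collapse into one boolean, here is a minimal TypeScript sketch. The real checks run server-side in Nexus and are not exposed through the API, so the `HealthChecks` shape, its field names, and the `needsSupport` helper are all hypothetical. The thresholds are the approximate values quoted in the list. The stalled-update condition is left out because it depends on update state.

```typescript
// Hypothetical summary of the sub-checks; these names are invented for
// illustration and do not correspond to any real Nexus or API type.
type HealthChecks = {
  unhealthyZpools: number // zpools not in `online` state
  smfServicesNotOnline: number // enabled SMF services not online
  oldestRunningSagaMinutes: number // age of the longest-running saga
  minutesSinceLastInventory: number // age of the latest inventory collection
}

// Approximate thresholds taken from the list above (both "~15 minutes").
const STUCK_SAGA_MINUTES = 15
const STALE_INVENTORY_MINUTES = 15

// Any single failing check is enough: none of them is actionable by the
// customer, so they are reported together as one boolean.
function needsSupport(checks: HealthChecks): boolean {
  return (
    checks.unhealthyZpools > 0 ||
    checks.smfServicesNotOnline > 0 ||
    checks.oldestRunningSagaMinutes > STUCK_SAGA_MINUTES ||
    checks.minutesSinceLastInventory > STALE_INVENTORY_MINUTES
  )
}

// Example: one degraded zpool is enough to flip the result.
const exampleChecks: HealthChecks = {
  unhealthyZpools: 1,
  smfServicesNotOnline: 0,
  oldestRunningSagaMinutes: 2,
  minutesSinceLastInventory: 1,
}
console.log(needsSupport(exampleChecks))
```

Because every check is OR-ed together, a `false` result only says that none of these specific conditions fired, which is why it is not a guarantee of overall health.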
## Suppression during an active update
Health checks often fail transiently during an update, so the API
suppresses `contact_support: true` while an update is genuinely in
progress. The field only surfaces a true value when either (1) there is
no update in progress, or (2) an in-progress update has stalled past the
threshold (matching the [10–15 minute
guidance](https://github.com/oxidecomputer/omicron/blob/main/docs/reconfigurator-ops-guide.adoc#debug-stuck)
in the Reconfigurator Ops Guide for when support considers an update
stuck).
In practice this means a `true` value only ever appears in one of two
contexts: the system is idle (pre-update or post-update), or the update
has stalled long enough that the result is no longer a transient artifact.
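
The suppression rule can be sketched as a small function. This is an illustration of the behavior described in this section, not the Nexus implementation: the `UpdateState` type, the `surfacedContactSupport` name, and the way the stall is measured are assumptions, and the 30-minute constant is the approximate stalled-update threshold from the conditions list.

```typescript
// Hypothetical model of update state; the real representation is internal
// to Nexus and is an assumption here.
type UpdateState =
  | { kind: 'idle' } // no update in progress (pre- or post-update)
  | { kind: 'in_progress'; minutesSinceLastPlannerStep: number }

// Approximate threshold from the "Stalled update" condition (~30 minutes).
const STALLED_UPDATE_MINUTES = 30

// `checksFailing` stands for the combined result of the other health checks.
function surfacedContactSupport(checksFailing: boolean, update: UpdateState): boolean {
  if (update.kind === 'idle') {
    // No update running, so failing checks are reported as-is.
    return checksFailing
  }
  // While an update is making progress, failures are treated as transient
  // and suppressed. Once it stalls, the stall is itself a triggering
  // condition, so the result is true whatever the other checks say.
  return update.minutesSinceLastPlannerStep >= STALLED_UPDATE_MINUTES
}

// Failing checks during a healthy, progressing update are suppressed.
console.log(surfacedContactSupport(true, { kind: 'in_progress', minutesSinceLastPlannerStep: 5 }))
```

Modeling the update state as a discriminated union keeps the stall timer unavailable in the idle case, so the function cannot consult it by mistake.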
</details>
## Issues to resolve
- Explain the situation without overdoing it
- Tooltip looks terrible in the message box; what if we link to docs instead?
- Should probably link to a way to actually contact support, probably
the support email that goes to Zendesk
<img width="808" height="381" alt="image"
src="https://github.com/user-attachments/assets/bd8a75ed-7550-440f-81a3-6fc5319b79fb"
/>
---------
Co-authored-by: benjaminleonard <benji@oxide.computer>

1 parent aea6923 commit 0abdd33
3 files changed
Lines changed: 36 additions & 3 deletions