-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set health annotation to Inaccessible when it is not set #852
Conversation
When health status is not set by SPBM, set health annotation to Inaccessible.
jtest wcp |
jtest gc |
Started WCP block pipeline... |
jtest block-vanilla |
Started GC block pipeline... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
/ok-to-test |
@@ -351,6 +351,10 @@ func ConvertVolumeHealthStatus(volHealthStatus string) (string, error) { | |||
case string(pbmtypes.PbmHealthStatusForEntityUnknown): | |||
return string(pbmtypes.PbmHealthStatusForEntityUnknown), nil | |||
default: | |||
return "", fmt.Errorf("cannot convert invalid volume health status %s", volHealthStatus) | |||
// NOTE: volHealthStatus is not set by SPBM in this case. | |||
// This implies the volume does not exist any more. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Volume not existing is different to saying the volume health is inaccessible. Isn't this a bug in SPBM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a discussion with @subramanian-neelakantan on this. In this case, FCD is missing from inventory so SPBM will not populate the health status field. According to Subbu, this is a permanent failure case so the caller would want to do some reaction to fix it as it actually means the volume is "inaccessible". So this should be treated differently from "Unknown" which is a temporary failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SPBM is the component that should report the volume health correctly. I did not quite understand why SPBM should not be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are the notes after the offline meeting on this topic. We agreed that the volume health can be set to "inaccessible" whenever CNS says it is "red" or "" (missing). PSP operators expect the Volume health to be eventually consistent. They also have the health timestamp to allow waiting for a "long" time (like an hour) before taking corrective actions based on the Volume health. So it is a correct design choice to reflect the current CNS health snapshot on the PVCs.
Started Vanilla block pipeline... |
|
jtest block-vanilla |
Started Vanilla block pipeline... |
|
|
|
jtest gc |
Started GC block pipeline... |
jtest block-vanilla |
Started Vanilla block pipeline... |
|
jtest gc |
Started GC block pipeline... |
|
|
jtest gc |
Started GC block pipeline... |
|
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: divyenpatel, subramanian-neelakantan, xing-yang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
What this PR does / why we need it:
If HealthStatus is not set by SPBM, it implies the volume does not exist any more.
Set health annotation to "Inaccessible" so that the caller can make appropriate reactions based on this status.
Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes #Special notes for your reviewer:
Release note: