[BUG] rke2 clusters with invalid values for tolerations / affinity agent customization do not show error to user, stay in updating state on cluster create
#41606
Comments
Kicked it out to Q3 for now as it's not a critical bug applicable only if the user provides an invalid configuration and is essentially a poor UX where error root cause is not bubbled up in the UI. Adding |
referencing UI issue: rancher/dashboard#8920
We may want to follow something like Kicking out to Q1.
I was trying to take a look into this, it is impossible to change to the While testing:
Would adding validation to the CRD bubble up via rke2? The fact that rke bubbles up the error and rke2 does not may be due to how RKE2 is designed (I've run into this before). cc @jakefhyde This won't show the rke2 logs, but another option is to add a validator/regex to the rancher/pkg/systemtemplate/import.go Lines 102 to 123 in c3411e0
This approach will alert the user a lot earlier if they try to add an invalid agent customization configuration via kubectl instead of the UI. It will also not require modifying the "large and core" |
@a-blender, after talking with @snasovich we decided to add the validation in the webhook.
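For illustration only, here is a minimal sketch of the kind of early check discussed above. It assumes the offending field is meant to hold a Kubernetes label value; the regex follows the documented label-value syntax (at most 63 characters, alphanumeric at both ends, with `-`, `_` and `.` allowed in between). The helper names `validateLabelValue` and `validateLabelValues` are hypothetical and are not taken from the Rancher webhook code.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// maxLabelValueLen is the documented upper bound for a Kubernetes label value.
const maxLabelValueLen = 63

// labelValueRe follows the documented label-value syntax: empty, or
// alphanumeric at both ends with '-', '_' and '.' allowed in between.
var labelValueRe = regexp.MustCompile(`^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$`)

// validateLabelValue is a hypothetical helper (not Rancher's actual code)
// that rejects a value before it ever reaches the cluster agent manifest.
func validateLabelValue(v string) error {
	if len(v) > maxLabelValueLen {
		return fmt.Errorf("value %q is longer than %d characters", v, maxLabelValueLen)
	}
	if !labelValueRe.MatchString(v) {
		return fmt.Errorf("value %q is not a valid label value", v)
	}
	return nil
}

// validateLabelValues checks every value and joins all failures into one
// error, so the user sees every problem in a single admission response.
func validateLabelValues(values []string) error {
	var errs []error
	for _, v := range values {
		if err := validateLabelValue(v); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}
```

With this sketch, the value from the reproduction steps (`badLabel 123"[];'{}-+=`) is rejected because it contains spaces and punctuation, while a value such as `worker-1` passes. Wiring the check into an admission handler or a test is left out here.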
test plan automation considerations:
|
Rancher Server Setup
Information about the Cluster
User Information
Describe the bug
When provisioning a cluster with invalid values for tolerations or affinity agent customizations (cluster agent or fleet agent), the cluster is able to start being created. However, the cluster never shows any error state and hangs in an updating state.
To Reproduce
badLabel 123"[];'{}-+=
Result
cluster is able to be created, but stays in an updating state, see screenshots
Expected Result
cluster should error out and bubble this state up to the user on the management page
error should be shown to the user
if the user SSHes into the node and views the rke2-server logs, there is an error that repeats there every few seconds:
Screenshots
Additional context
error is bubbled up to the user for rke1