"Failed while: mgmt cluster create" when creating RKE2 cluster #6010

Closed
gaktive opened this issue May 20, 2022 · 1 comment
gaktive (Member) commented May 20, 2022

Internal reference: SURE-4658
Reported in 2.6.5

Issue description:
When provisioning an AWS cluster with the node driver, a "Failed while: mgmt cluster create" error often appears after clicking Create. The options chosen for the cluster seem to be irrelevant, and the error does not occur consistently. If you navigate back to Cluster Management, the cluster has been created and provisions successfully.

Business impact:
The error message indicates that the cluster was not created, and it is poorly worded. It may prompt users to file support cases or GitHub issues.

Repro steps:

Create an RKE2 cluster using EC2 nodes (the selected options do not seem to matter).

Workaround:
Is a workaround available and implemented? No

Actual behavior:
An error message is displayed, but the cluster does eventually provision.

Expected behavior:
No error message is displayed, or an error is displayed that properly describes the problem that was hit.

Additional notes:
Backend found the reference here:

}, `mgmt cluster create`, timeout, interval);

We need a better-worded message here, especially if the cluster was actually created properly. If this is tied to a race condition, we have to take that into account.
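
For illustration, here is a minimal sketch of how a polling helper could report a timeout differently from a real failure. It assumes the error comes from a wait that gives up before the backend finishes. The names `waitFor` and `describeWaitTimeout`, the default values, and the message text are all hypothetical; only the `label, timeout, interval` argument order mirrors the snippet above. This is not the dashboard's actual implementation.

```javascript
// Hypothetical sketch; not the Rancher dashboard's actual helper.

// Build a message that says the wait timed out, rather than that the
// operation itself failed, and point the user at where to check.
function describeWaitTimeout(label, timeoutMs) {
  const seconds = Math.round(timeoutMs / 1000);

  return `Timed out after ${ seconds }s waiting for: ${ label }. ` +
    'The resource may still be created; check Cluster Management.';
}

// Poll `check` every `intervalMs` until it returns a truthy value
// or `timeoutMs` has elapsed.
function waitFor(check, label, timeoutMs = 30000, intervalMs = 1000) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;

    const poll = async() => {
      try {
        const result = await check();

        if (result) {
          return resolve(result);
        }
      } catch (err) {
        // A real failure from the check, distinct from a timeout.
        return reject(err);
      }

      if (Date.now() >= deadline) {
        return reject(new Error(describeWaitTimeout(label, timeoutMs)));
      }

      setTimeout(poll, intervalMs);
    };

    poll();
  });
}
```

With a helper shaped like this, a slow but successful cluster creation would surface as a timeout notice instead of a "Failed while" error, and a longer `timeoutMs` would make the notice rarer.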

@Josh-Diamond

Ticket #6010 - Test Results

With Docker on a single-node instance:

Reproduced on rancher v2.6.5:

  1. Fresh install of rancher v2.6.5
  2. Provision a downstream AWS node driver RKE2 cluster (the issue is intermittent and is more easily reproduced with high-resource cluster specs)
  3. Reproduced: the error is encountered, but the cluster provisions successfully

Screenshots:

Error: [screenshot, 2022-06-14 at 9:59:07 AM]

Cluster Successfully Provisions: [screenshot, 2022-06-14 at 9:59:29 AM]


Verified on rancher v2.6-67c2c49953f11f8a39bf0231a13b521517197da3-head:

  1. Fresh install of rancher v2.6-head
  2. Provision a downstream AWS node driver RKE2 cluster (with high resource availability, in hopes of triggering the error)
  3. Verified: unable to reproduce the issue with the extended timeout fix

If encountered, the new error will read:
[screenshot of the new error message; image not preserved]

zube bot removed the [zube]: Done label on Sep 13, 2022