Consider all possible cluster states before passing them to StateChangeConf #1114
Conversation
LGTM
@furkatgofurov7 I see that you were unable to repro this bug after multiple attempts #1003 (comment), and the original author of the issue saw a 30-second timeout in `terraform apply` but the cluster still became active in about a minute. This was with the latest terraform v3.0.0.
- Is it still an issue for them?
- What evidence is there that not predefining the `expectedState` in the provider fixes this particular issue? I'm fine to have it go in as a noted enhancement, but I don't see how this explicitly resolves "Intermittently imports of EKS clusters never finish" #1003.
Hey @a-blender, thanks for the review. As per #1003 (comment), it should still be an issue.
Based on the logs from the linked issue description:
and this comment, considering all possible cluster states should not hurt, and if the provider misses catching a state of the Rancher cluster, it will still be able to move on. Since reproduction was not possible, maybe we should take this in as an improvement to the provider, WDYT?
@a-blender can we merge this? Based on #1003 (comment), it helps to fix the issue in #1003.
Issue:
#1003
rancher/eks-operator#84
Problem
When importing an EKS cluster using the terraform provider, the Rancher cluster reconciliation loop can sometimes be fast and go ahead with cluster creation (moving from `"pending"` to `"active"` too quickly), and the provider can miss it. This results in the provider waiting for the cluster to reach the `"pending"` state even though the cluster went past that state long before and the provider simply missed it.
Solution
Instead of predefining `expectedState` beforehand and overwriting it later if needed, define it as a slice literal that keeps the same state (`"active"`), append the additional states later if needed (i.e. when `cluster.Driver == clusterDriverImported || (cluster.Driver == clusterDriverEKSV2 && cluster.EKSConfig.Imported)` is `true`), and pass it to the `StateChangeConf` struct. `StateChangeConf` already expects `Target` to be a `[]string`, so nothing changes behaviour-wise; all possible cluster states are simply considered before passing them.
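
For illustration, here is a minimal, self-contained sketch of this pattern using the terraform-plugin-sdk `resource.StateChangeConf` helper. The `cluster` type, driver constants, appended states, and timeouts below are illustrative stand-ins, not the provider's actual code.

```go
package main

import (
	"fmt"
	"time"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/resource"
)

// Hypothetical stand-ins for the provider's cluster data; the real provider
// reads these from the Rancher management API.
type cluster struct {
	Driver   string
	State    string
	Imported bool
}

const (
	clusterDriverImported = "imported" // illustrative values
	clusterDriverEKSV2    = "EKS"
)

func main() {
	c := cluster{Driver: clusterDriverImported, State: "active"}

	// Keep "active" as the baseline target and append the other states an
	// imported cluster may pass through (or skip) during reconciliation.
	expectedState := []string{"active"}
	if c.Driver == clusterDriverImported || (c.Driver == clusterDriverEKSV2 && c.Imported) {
		expectedState = append(expectedState, "pending", "waiting", "provisioning")
	}

	stateConf := &resource.StateChangeConf{
		Target: expectedState, // StateChangeConf already expects Target to be a []string
		Refresh: func() (interface{}, string, error) {
			// Stub refresh function; the provider would re-read the cluster
			// from the Rancher API and return its current state.
			return c, c.State, nil
		},
		Timeout:    5 * time.Minute,
		Delay:      1 * time.Second,
		MinTimeout: 3 * time.Second,
	}

	out, err := stateConf.WaitForState()
	if err != nil {
		fmt.Println("error waiting for cluster state:", err)
		return
	}
	fmt.Printf("cluster reached one of %v: %+v\n", expectedState, out)
}
```

With several target states accepted, a cluster that moves through `"pending"` faster than the provider polls can still satisfy the wait, which is the improvement described above.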