
Tolerate InvalidInstanceID.NotFound when deleting instances #594

Merged
merged 1 commit into kubernetes:master on Oct 6, 2016

Conversation

justinsb (Member) commented Oct 5, 2016

We treat this as instance-already-deleted, i.e. not an error.

Fix #592
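
For context, a minimal sketch of the tolerance check this PR describes, assuming aws-sdk-go v1's awserr package; the function name deleteInstance is illustrative, not the actual kops code:

```go
package example

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/service/ec2"
	"github.com/aws/aws-sdk-go/service/ec2/ec2iface"
)

// deleteInstance is an illustrative sketch (not the actual kops code): it
// terminates an EC2 instance and treats InvalidInstanceID.NotFound as
// instance-already-deleted rather than as an error.
func deleteInstance(svc ec2iface.EC2API, instanceID string) error {
	request := &ec2.TerminateInstancesInput{
		InstanceIds: []*string{aws.String(instanceID)},
	}
	if _, err := svc.TerminateInstances(request); err != nil {
		if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "InvalidInstanceID.NotFound" {
			// The instance is already gone; treat it as already deleted.
			return nil
		}
		return fmt.Errorf("error deleting instance %q: %v", instanceID, err)
	}
	return nil
}
```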

justinsb merged commit 0cafb24 into kubernetes:master on Oct 6, 2016
hwoarang added a commit to hwoarang/kops that referenced this pull request Nov 25, 2020
Sometimes we see the following error at the end of a rolling update:

I1125 18:37:57.161591     165 instancegroups.go:419] Cluster validated; revalidating in 10s to make sure it does not flap.
I1125 18:38:08.536470     165 instancegroups.go:416] Cluster validated.

error deleting instance "i-XXXXXXXXXXX", node "ip-XXX-XXX-XXX-XXX.XXXXXXX.compute.internal": error deleting instance "i-XXXXXXXXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXXXX' does not exist
	status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0

It's possible for the EC2 instance to have disappeared by the time it
was detached (for example, it may have been a spot instance).

In any case, we can't do much when we do not find an instance ID, and
breaking the update because of that is not user-friendly.

As such, we can simply report and tolerate this problem instead of
exiting with a non-zero code. This is similar to how we handle missing
EC2 instances when updating an IG [1].

[1] kubernetes#594
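
A sketch of what "report and tolerate" could look like on the rolling-update side, assuming the same aws-sdk-go v1 APIs plus klog for reporting; rollingDelete is a hypothetical name, not the actual instancegroups.go implementation:

```go
package example

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/awserr"
	"github.com/aws/aws-sdk-go/service/ec2"
	"github.com/aws/aws-sdk-go/service/ec2/ec2iface"
	"k8s.io/klog/v2"
)

// rollingDelete is a hypothetical sketch of the behaviour this commit
// describes: each instance is terminated in turn, and an
// InvalidInstanceID.NotFound response is reported and tolerated instead
// of aborting the rolling update with a non-zero exit code.
func rollingDelete(svc ec2iface.EC2API, instanceIDs []string) error {
	for _, id := range instanceIDs {
		request := &ec2.TerminateInstancesInput{
			InstanceIds: []*string{aws.String(id)},
		}
		if _, err := svc.TerminateInstances(request); err != nil {
			if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "InvalidInstanceID.NotFound" {
				// The instance is already gone (e.g. a reclaimed spot
				// instance); report it and keep rolling.
				klog.Warningf("error deleting instance %q: %v; assuming already terminated", id, err)
				continue
			}
			return err
		}
	}
	return nil
}
```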
hwoarang added a commit to hwoarang/kops that referenced this pull request Nov 25, 2020
hwoarang added a commit to hwoarang/kops that referenced this pull request Nov 25, 2020
Sometimes we see the following error at the end of a rolling update:

I1125 18:12:46.467059     165 instancegroups.go:340] Draining the node: "ip-X-X-X-X.X.compute.internal".
I1125 18:12:46.473365     165 instancegroups.go:359] deleting node "ip-X-X-X-X.X.compute.internal" from kubernetes
I1125 18:12:46.476756     165 instancegroups.go:486] Stopping instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal", in group "X" (this may take a while).
E1125 18:12:46.523269     165 instancegroups.go:367] error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX", node "ip-X-X-X-X.X.compute.internal": error deleting instance "i-XXXXXXXX": InvalidInstanceID.NotFound: The instance ID 'i-XXXXXXXXX' does not exist
	status code: 400, request id: 91238c21-1caf-41eb-91d7-534d4ca67ed0

It's possible for the EC2 instance to have disappeared by the time it
was detached (it may have been a spot instance, for example).

In any case, we can't do much when we do not find an instance ID, and
throwing this error during the update is not very user-friendly.

As such, we can simply report and tolerate this problem instead of
exiting with a non-zero code. This is similar to how we handle missing
EC2 instances when updating an IG [1].

[1] kubernetes#594
hwoarang added a commit to hwoarang/kops that referenced this pull request Nov 26, 2020
hakman pushed a commit to hakman/kops that referenced this pull request Nov 26, 2020
hakman pushed a commit to hakman/kops that referenced this pull request Nov 26, 2020