improvement: Add timeout to default wait_for_cluster_cmd #791

Merged (1 commit) on Mar 17, 2020

dpiddockcmp (Contributor) commented on Mar 13, 2020

PR o'clock

Description

The current default command for wait_for_cluster will happily sit forever in a failing state. The module previously had a timeout, before the switch to the kubernetes provider in 8.0.0.

Defaults to 5 minutes of sleeps (60 attempts with a 5-second sleep after each failed one). The real wait may be a lot longer, depending on how long wget hangs on each attempt.
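To make the "may be a lot longer" caveat concrete, here is a rough sketch of the worst-case arithmetic. It assumes 60 attempts with a 5-second sleep, which is what the 5-minute figure implies, plus a hypothetical cap on how long a single probe may hang (this PR does not set one). The function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the module: upper bound on the total wait.
# $1 = attempts, $2 = seconds slept after each failed attempt,
# $3 = assumed maximum seconds a single probe may hang before failing
worst_case_seconds() {
  echo $(( $1 * ($2 + $3) ))
}

worst_case_seconds 60 5 0    # probes fail instantly: prints 300 (the 5 minutes of sleeps)
worst_case_seconds 60 5 10   # each probe hangs up to 10s: prints 900
```

Without a per-probe cap the third term is unbounded, which is why the sleeps alone only give a lower bound on the timeout.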

Have tested the logic on bash 3 (for those older Mac users), zsh, bash 5, and busybox (the default in alpine):

$ docker run --rm -it alpine /bin/sh -c 'for i in `seq 1 60`; do false && exit 0 || true; sleep 0; done; echo TIMEOUT && exit 1' ; echo $?
TIMEOUT
1
$ docker run --rm -it alpine /bin/sh -c 'for i in `seq 1 60`; do true && exit 0 || true; sleep 0; done; echo TIMEOUT && exit 1' ; echo $?
0
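The same check can be run without Docker. The sketch below wraps the loop shape from the commands above in a function so the attempt count, sleep, and probe command can be swapped out; `wait_with_timeout` is a hypothetical name, not something the module provides, and it uses `return` where the one-liner uses `exit` because it lives inside a function.

```shell
#!/bin/sh
# Hypothetical wrapper around the loop shape tested above.
# $1 = max attempts, $2 = seconds to sleep after a failed attempt,
# remaining arguments = the probe command to run
wait_with_timeout() {
  attempts=$1
  delay=$2
  shift 2
  for i in $(seq 1 "$attempts"); do
    # The first successful probe ends the wait immediately.
    "$@" >/dev/null && return 0
    sleep "$delay"
  done
  # Every attempt failed: report it and signal failure to the caller.
  echo TIMEOUT
  return 1
}

# A probe that always fails: prints TIMEOUT, then "gave up".
if wait_with_timeout 3 0 false; then echo ready; else echo "gave up"; fi

# A probe that succeeds on the first try: prints "ready".
if wait_with_timeout 3 0 true; then echo ready; else echo "gave up"; fi
```

Using `false` and `true` as stand-in probes mirrors the alpine test: the failing case must print TIMEOUT and return non-zero, and the passing case must return zero without printing it.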

Sample failure output (I lowered the seq count because ain't got all day):

module.eks.null_resource.wait_for_cluster[0]: Creating...
module.eks.null_resource.wait_for_cluster[0]: Provisioning with 'local-exec'...
module.eks.null_resource.wait_for_cluster[0] (local-exec): Executing: ["/bin/sh" "-c" "for i in `seq 1 2`; do wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1"]
module.eks.null_resource.wait_for_cluster[0]: Still creating... [10s elapsed]
module.eks.null_resource.wait_for_cluster[0] (local-exec): TIMEOUT


Error: Error running command 'for i in `seq 1 2`; do wget --no-check-certificate -O - -q $ENDPOINT/healthz >/dev/null && exit 0 || true; sleep 5; done; echo TIMEOUT && exit 1': exit status 1. Output: TIMEOUT

I don't think there are any issues directly about this but it would have helped the user in #777 for example. Or at least wasted slightly less time watching paint dry 🙂

@dpiddockcmp changed the title from "Add timeout to default wait_for_cluster_cmd" to "improvement: Add timeout to default wait_for_cluster_cmd" on Mar 17, 2020
@barryib merged commit 2c98a00 into terraform-aws-modules:master on Mar 17, 2020
@dpiddockcmp deleted the timeout branch on March 17, 2020 18:09
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 19, 2022