Add retry to autorevert to ensure correct deploy is tracked. #134

Merged 2 commits on Apr 16, 2018

Conversation

jrasell
Member

@jrasell jrasell commented Apr 9, 2018

In some situations, Levant called Nomad before a new deployment
had been started for the auto-revert, so the original failed
deployment ID was returned and checked. This change adds logic
to ensure Levant waits for an updated deployment ID before
running the auto-revert checker.

Closes #133
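
For readers following along, the retry described above could look roughly like the sketch below. It is not the code from this PR: `deploymentFetcher`, `waitForNewDeployment`, `errNoNewDeployment`, and the retry count and interval are hypothetical names and values chosen for illustration, and the fetcher stands in for whatever Nomad call returns the job's latest deployment ID.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// deploymentFetcher is a hypothetical stand-in for the Nomad call that
// returns the latest deployment ID for a job.
type deploymentFetcher func(jobID string) (string, error)

// errNoNewDeployment is returned when every attempt still sees the failed ID.
var errNoNewDeployment = errors.New("no new deployment appeared before retries were exhausted")

// waitForNewDeployment polls until the latest deployment ID differs from the
// ID of the deployment that failed, so the auto-revert checker does not
// re-inspect the original failed deployment.
func waitForNewDeployment(fetch deploymentFetcher, jobID, failedID string, retries int, interval time.Duration) (string, error) {
	for i := 0; i < retries; i++ {
		id, err := fetch(jobID)
		if err != nil {
			return "", err
		}
		// A different, non-empty ID means the revert deployment has started.
		if id != "" && id != failedID {
			return id, nil
		}
		time.Sleep(interval)
	}
	return "", errNoNewDeployment
}

func main() {
	// Fake fetcher: returns the failed ID twice, then a new one.
	calls := 0
	fetch := func(string) (string, error) {
		calls++
		if calls < 3 {
			return "failed-deployment", nil
		}
		return "revert-deployment", nil
	}

	id, err := waitForNewDeployment(fetch, "example-job", "failed-deployment", 5, 10*time.Millisecond)
	fmt.Println(id, err)
}
```

Bounding the number of attempts keeps the command from hanging if the revert deployment never appears.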

@jrasell jrasell added the bug label Apr 9, 2018
@jrasell
Member Author

jrasell commented Apr 9, 2018

tagging @wlonkly

@wlonkly

wlonkly commented Apr 10, 2018

Heya! I gave it a shot with my test case, but levant segfaulted when it tried to monitor the rollback deployment.

-- rlafferty@stg-ci-agent68:~ $ ./levant-local deploy -log-level=debug -address=http://http.nomad-cluster.service.consul:4646 microbot.hcl
2018/04/10 14:39:50 UTC [DEBUG] levant/templater: no variable file passed, trying defaults
2018/04/10 14:39:50 UTC [DEBUG] helper/files: no default var-file found
2018/04/10 14:39:50 UTC [DEBUG] levant/templater: no command line variables passed
2018/04/10 14:39:50 UTC [DEBUG] levant/templater: variable file not passed
2018/04/10 14:39:50 UTC [DEBUG] levant/deploy: running dynamic job count updater for job microbot-rlafferty
2018/04/10 14:39:50 UTC [INFO] levant/deploy: using dynamic count 1 for job microbot-rlafferty and group web
2018/04/10 14:39:50 UTC [INFO] levant/deploy: triggering a deployment of job microbot-rlafferty
2018/04/10 14:39:51 UTC [INFO] levant/deploy: evaluation 20243c42-0249-7162-ce72-3b1b4443d857 finished successfully
2018/04/10 14:39:51 UTC [INFO] levant/deploy: beginning deployment watcher for job microbot-rlafferty
2018/04/10 14:39:51 UTC [DEBUG] levant/deploy: deployment e96093c2-c9d1-7f28-7ec1-a7d6270ea38b running for 0.00s
2018/04/10 14:39:56 UTC [DEBUG] levant/deploy: deployment e96093c2-c9d1-7f28-7ec1-a7d6270ea38b running for 5.28s
2018/04/10 14:39:58 UTC [DEBUG] levant/deploy: deployment e96093c2-c9d1-7f28-7ec1-a7d6270ea38b running for 7.04s
2018/04/10 14:39:58 UTC [ERROR] levant/deploy: deployment e96093c2-c9d1-7f28-7ec1-a7d6270ea38b has status failed
2018/04/10 14:39:58 UTC [DEBUG] levant/failure_inspector: launching allocation inspector for alloc b825015f-9dfd-5dec-6ca6-963b22bee0e0
2018/04/10 14:39:58 UTC [ERROR] levant/failure_inspector: alloc b825015f-9dfd-5dec-6ca6-963b22bee0e0 incurred event driver failure because failed to initialize task "microbot" for alloc "b825015f-9dfd-5dec-6ca6-963b22bee0e0": Failed to pull `dontrebootme/microbot:honk`: API error (404): {"message":"manifest for dontrebootme/microbot:honk not found"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xacbb95]

goroutine 1 [running]:
github.com/jrasell/levant/levant.(*levantDeployment).deploy(0xc420336d50, 0xc420336d50)
	/home/rlafferty/go/src/github.com/jrasell/levant/levant/deploy.go:165 +0x745
github.com/jrasell/levant/levant.TriggerDeployment(0xc420076a80, 0xc)
	/home/rlafferty/go/src/github.com/jrasell/levant/levant/deploy.go:66 +0xad
github.com/jrasell/levant/command.(*DeployCommand).Run(0xc4202e7c20, 0xc4200b8130, 0x1, 0x3, 0xc42000cf80)
	/home/rlafferty/go/src/github.com/jrasell/levant/command/deploy.go:125 +0x438
github.com/jrasell/levant/vendor/github.com/mitchellh/cli.(*CLI).Run(0xc420233900, 0xc420233900, 0x4, 0xc42000d000)
	/home/rlafferty/go/src/github.com/jrasell/levant/vendor/github.com/mitchellh/cli/cli.go:255 +0x1eb
main.RunCustom(0xc4200b8100, 0x4, 0x4, 0xc4202e7aa0, 0xc4202e79e0)
	/home/rlafferty/go/src/github.com/jrasell/levant/main.go:49 +0x40f
main.Run(0xc4200b8100, 0x4, 0x4, 0xc42009c058)
	/home/rlafferty/go/src/github.com/jrasell/levant/main.go:17 +0x56
main.main()
	/home/rlafferty/go/src/github.com/jrasell/levant/main.go:11 +0x63

@jrasell
Member Author

jrasell commented Apr 10, 2018

hmmmm that is interesting @wlonkly and something I didn't see when I did my testing. I will take another look using the trace.

@wlonkly

wlonkly commented Apr 10, 2018

@jrasell Thanks! I should've mentioned too that this was built with Go 1.10. Also if you want to rule out something amiss in my toolchain, if you want to toss a Linux x86_64 binary my way I'd be happy to test that out too.

When the canary value is not set in a job by the user, the struct
Levant uses will have it set to null rather than 0, as it does not
go through any merging with the default config. This should be
updated at a later date.
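
As an illustration of the commit message above, a nil-safe read of an optional canary count might look like the following sketch. The `updateStrategy` type and `canaryCount` helper are hypothetical and trimmed down for the example; they are not Levant's actual code, and the only assumption carried over from the message is that an unset canary arrives as a nil pointer rather than 0.

```go
package main

import "fmt"

// updateStrategy is a hypothetical, trimmed-down stand-in for a job's update
// configuration. Canary is a pointer so "unset" (nil) differs from 0.
type updateStrategy struct {
	Canary *int
}

// canaryCount returns the canary count, treating a missing update block or an
// unset canary value as 0 instead of dereferencing a nil pointer.
func canaryCount(u *updateStrategy) int {
	if u == nil || u.Canary == nil {
		return 0
	}
	return *u.Canary
}

func main() {
	two := 2
	fmt.Println(canaryCount(nil))                           // no update block at all
	fmt.Println(canaryCount(&updateStrategy{}))             // canary not set by the user
	fmt.Println(canaryCount(&updateStrategy{Canary: &two})) // canary explicitly set
}
```

Guarding at the point of use is the short-term shape of the fix; merging the job with the default config first would remove the need for the check.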
@jrasell
Member Author

jrasell commented Apr 16, 2018

@wlonkly I believe I have now fixed it for the short term, although I will look at fixing this in a better manner in the future.

@jrasell jrasell merged commit dffbdaf into master Apr 16, 2018
@jrasell jrasell deleted the b_gh_133 branch April 16, 2018 20:37
@wlonkly

wlonkly commented Apr 23, 2018

Sorry about the silence on this one, been on vacation! I've built master and verified that it successfully tracks the auto_revert deployment. Thanks!
