Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Augur process control commands #904

Merged
merged 3 commits into from
Sep 4, 2020
Merged

Fix Augur process control commands #904

merged 3 commits into from
Sep 4, 2020

Conversation

ccarterlandis
Copy link
Contributor

At somepoint in the summer, a patch was introduced that introduced a 15 second wait time to the augur util kill command that occured after the first attempt to gracefully stop the process with SIGTERM, after which it would attempt to kill all the process with SIGKILL. While this fallback makes sense, it in inadvertently introduced this artifical 15 second wait time to the startup of the main server. This was due to poor design during the initial implementation, which copied code blocks that should have been refactored. The 15 second delay was added to both of these code blocks, one of which was used in augur run, thus causing the slowdown.

This was fixed by refactoring out the copied code blocks to a singular underlying method call, removing the 15 second wait altogether, and exposing a new command as well changing the meaning of an existing one.

The augur util stop command was added, which will attempt to gracefully terminate any existing Augur process. This was how augur util kill previously behaved; augur util kill will now send a SIGKILL signal instead of a SIGTERM, in order to forcefully halt the execution of the process. This should only be used when Augur has crashed hard, and is not responding to attempts to use augur util stop.

There is further refactoring to be done, but this is better than where it was before!

…ses to send SIGKILL after waiting 15 seconds and checking what is still running again."

This reverts commit 7826039.
Signed-off-by: Carter Landis <c@carterlandis.com>
@ccarterlandis ccarterlandis added bug Documents unexpected/wrong/buggy behavior usability Usability related issues (bugs, features, etc) CLI Related to Augur's CLI labels Sep 2, 2020
@ccarterlandis ccarterlandis self-assigned this Sep 2, 2020
Signed-off-by: Carter Landis <c@carterlandis.com>
@sgoggins
Copy link
Member

sgoggins commented Sep 3, 2020

@ccarterlandis : I did that. As long as we are using SIGKILL instead of SIGTERM, the change is fine. I assumed you switched from SIGKILL for a reason, so I let it try that first, wait a short time, then do SIGKILL … SIGTERM was not and does not kill all of the augur processes for some reason.

@sgoggins
Copy link
Member

sgoggins commented Sep 3, 2020

@ccarterlandis : SIGTERM was CONSISTENTLY not working on ALL OF THE SERVERS … we either need the pause OR we need to use SIGKILL ... basically, until I made the change, augur util kill did NOT WORK on any of the servers.

@ccarterlandis
Copy link
Contributor Author

We were never using SIGKILL, only SIGTERM. This is the first time I've heard of this version not working on the servers, but either way, neither adding a wait nor using SIGKILL is the right choice. For what it's worth, I've never run into any problems when I use augur util kill on the servers, it always works just fine for me. But, if it was actually failing on the servers, then there is a deeper reason that needs to be addressed, because it works fine elsewhere. Additionally, is it JUST the servers, or any machine running the same Linux distro version? If so, that is a BIG difference that has a lot of implications.

@sgoggins
Copy link
Member

sgoggins commented Sep 3, 2020

@ccarterlandis : OK. I get it now. As long as there's a way to trigger SIGKILL, which there is, I'm good with this.
First: augur util stop
Second, after some reasonable period, if there are still augur processes running for the instance augur util kill

THANK YOU!!!

@sgoggins sgoggins merged commit d74f60a into dev Sep 4, 2020
@ccarterlandis ccarterlandis deleted the fix-util-kill branch September 21, 2020 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Documents unexpected/wrong/buggy behavior CLI Related to Augur's CLI usability Usability related issues (bugs, features, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants