gaiad does not shutdown gracefully when run in-process with Tendermint #4323

Closed
4 tasks
ebuchman opened this issue May 10, 2019 · 0 comments · Fixed by #4324


Summary of Bug

gaiad fails to shut down Tendermint gracefully when it receives a signal from the user, such as Ctrl-C.

This seems to be the source of tendermint/tendermint#3295

On investigation, it seems that when gaiad is run in-process with Tendermint, it uses a custom version of TrapSignal rather than the one in tendermint/libs/common. The custom version has a bug: it defers the cleanup function and then calls os.Exit. Since os.Exit terminates the process immediately without running deferred functions, the cleanup function never runs:

cosmos-sdk/server/util.go

Lines 213 to 228 in 829ce17

// TrapSignal traps SIGINT and SIGTERM and terminates the server correctly.
func TrapSignal(cleanupFunc func()) {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		sig := <-sigs
		switch sig {
		case syscall.SIGTERM:
			defer cleanupFunc()
			os.Exit(128 + int(syscall.SIGTERM))
		case syscall.SIGINT:
			defer cleanupFunc()
			os.Exit(128 + int(syscall.SIGINT))
		}
	}()
}

I'm not sure if there's a reason to use this custom TrapSignal, but from testing locally it looks like just using cmn.TrapSignal instead makes the problem of tendermint/tendermint#3295 go away.

Version

v0.34.1

Steps to Reproduce

Run gaiad start with debug logs, then press Ctrl-C. The process exits immediately without printing any logs related to shutting down Tendermint. Compare with running gaiad out of process.

Reproducing tendermint/tendermint#3295 is non-deterministic but relatively easy: stop and start a synced node a few times, and soon enough it will throw the error reported there.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
alessio pushed a commit that referenced this issue May 11, 2019
Ensure gaiad shutdown Tendermint gracefully upon
receiving SIGINT and SIGTERM.

Closes: #4323