-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tendermint breaks when BeginBlock (any ABCI call?) in ABCI server is not returning #1993
Comments
One could think about guarding the ABCI app against exceptions, but then the question is what to return on ABCI calls. |
Why not? Important thing is that all instances should return error. If exception is caused by non-deterministic source, you need to eliminate it (if possible). |
@melekes Yeah, I guess I was referring to the non-deterministic part. Those probably can't be solved via ABCI. |
NOTE: there is |
@srmo thanks for raising this. I think @ethanfrey found and fixed a related issue recently but perhaps it wasn't comprehensive enough. Unfortunately, we need to de-prioritize all non-Cosmos-launch related issues right now as we try to focus on getting this thing launched, so that means issues relating to external ABCI apps will be put on our back log. Of course, we're still happy to accept PRs for fixes. Thanks for understanding. |
What abci driver are you using? My fix was only on the socket client. I'd like to see the launch scripts |
@ebuchman sure. It's not really breaking anything on our side and I agree with @melekes here. The basic use case is that the ABCI app has to handle such exceptions either way and such non-deterministic errors are usually handled manually per node/abci app. So this issue is rather a reminder to consider more convenience on tendermint side (like the nuissance of needing a sigkill) or to generally think through possible implications and how to work with them. |
@ethanfrey socket |
I believe this is fixed now (i.e. if ABCI app dies, Tendermint will exit too with the expectation that process supervisor will restart both failed processes). |
BUG REPORT
Tendermint version
0.22.4-c64a3c74 (self compiled: make dist)
ABCI app
JavaCounter - part of jABCI https://github.com/jTendermint/jabci
Environment
What happened:
I've forced the BeginBlock implementation of the ABCI app to throw an exception instead of returning a ResponseBeginBlock. This was implemented such that the exception is thrown after the third block.
After that, no more Blocks are being built, broadcastTx etc still working.
No messages in tendermint log (default loglevel) could be found, hinting at a problem with beginBlock.
Ctrl+Cing tendermint also wasn't possible, i.e. it was just hanging there after sending the signal.
What you expected to happen:
Tendermint should timeout on an ABCI request, maybe?
Expected this to be fixed via
#1891
#1890
How to reproduce it (as minimally and precisely as possible):
Just throw an exception in your ABCI app on BeginBlock or DeliverTx
Logs (you can paste a small part showing an error or link a pastebin, gist, etc. containing more of the log file):
https://gist.github.com/srmo/a647b5db2a3c1a5268edc0ec62006b08
Contains the full flow, throwing exception at height 3->4
Then doing a broadcast_tx
Then doing Ctrl+C
Then doing multiple Ctrl+C
Config (you can paste only the changes you've made):
only changed loglevel to "*:debug"
The text was updated successfully, but these errors were encountered: