Implement a lame duck mode for RPC services #4489
Labels
A-RPC
Area: rpc
C-enhancement
Category: An issue proposing an enhancement or a PR with one.
Node
Node team
T-node
Team: issues relevant to the node experience team
With #3266 done, the node handles SIGINT and SIGTERM somewhat gracefully: all the cleanup code is executed, so the result is no longer effectively a crash. However, all processing is still interrupted. Most notably, if the node receives SIGINT while a client's RPC request is in progress, it stops processing the request and closes the connection to the client.
This should be improved by adding a lame duck mode to RPC endpoints. That is, once SIGINT or SIGTERM is received, the node should stop accepting new requests on RPC endpoints, finish the requests that are already in flight, and only then exit.
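To make the intended behaviour concrete, here is a minimal, language-agnostic sketch written in Python. It is not the node's actual RPC code: `LameDuckGate`, `handle_request`, and the `"unavailable"` result are hypothetical names used only for illustration. The idea is that each request registers itself on entry and deregisters on exit, and that shutdown flips a flag and then waits for the in-flight count to reach zero.

```python
import threading


class LameDuckGate:
    """Hypothetical helper: counts in-flight requests and refuses new ones while draining."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def try_enter(self):
        """Register a new request; returns False once lame duck mode is on."""
        with self._cond:
            if self._draining:
                return False
            self._in_flight += 1
            return True

    def leave(self):
        """Deregister a finished request and wake the drainer when none remain."""
        with self._cond:
            self._in_flight -= 1
            if self._in_flight == 0:
                self._cond.notify_all()

    def drain(self, timeout=None):
        """Enter lame duck mode and wait for in-flight requests to finish.

        Returns True if everything finished, False if the timeout expired first.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)


def handle_request(gate, work):
    """Run `work` unless the gate is draining (assumed wrapper around an RPC handler)."""
    if not gate.try_enter():
        # A real endpoint would likely map this to an error such as HTTP 503.
        return "unavailable"
    try:
        return work()
    finally:
        gate.leave()
```

Under these assumptions, the SIGINT/SIGTERM handler would call `gate.drain(timeout=...)` before running the existing cleanup code. A timeout is probably worth having so that a stuck request cannot block shutdown forever.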
Note that there is a similar situation with nodes communicating with each other: a node may drop a connection to a peer while working on some request. This is less of an issue, since peers should expect and correctly handle unexpected network interruptions and peer crashes. RPC clients, however, are usually less fault tolerant, so it would be nice if the node handled all incoming requests properly.