This is now running in the OSC lab. Together with Florin, we also changed the health check inside metalnet to use the gRPC health service, and it seems to be working, so I think this feature is implemented correctly. Of course, checkpatch is complaining about the C++ code; please ignore that.
guvenc
left a comment
I think this is quite good and can be merged. It offers functionality equivalent to what we had before, now following best practice.
One thing that was missing before, and is still missing with this implementation, is that we cannot detect a hang of the worker thread in dpservice.
Some sort of timeout mechanism is needed in this busy loop to be able to detect it:
    while (cq_->Next(&tag, &ok) && ok) {
        call = static_cast<BaseCall*>(tag);
        while (call->HandleRpc() == CallState::AWAIT_MSG) {
            // wait for response from worker
        }
    }
What do you think, @PlagueCZ?
This is probably a separate thing.
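To make the timeout idea concrete, here is a minimal sketch of a bounded version of the inner polling loop. It does not depend on gRPC or dpservice: `PollWithTimeout`, the `handle_rpc` callback and the `CallState::DONE` value are hypothetical names used only for illustration, and only `CallState::AWAIT_MSG` comes from the loop quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// AWAIT_MSG mirrors the state used in the loop above; DONE is a hypothetical
// stand-in for "any other state".
enum class CallState { AWAIT_MSG, DONE };

// Polls `handle_rpc` until it stops returning AWAIT_MSG or `timeout` elapses.
// Returns true if the call left the AWAIT_MSG state in time, false if the
// deadline passed first (i.e. the worker looks hung).
bool PollWithTimeout(const std::function<CallState()>& handle_rpc,
                     std::chrono::milliseconds timeout)
{
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;

    while (handle_rpc() == CallState::AWAIT_MSG) {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;  // no response from the worker within the timeout
    }
    return true;
}
```

In the real loop, the callback would wrap `call->HandleRpc()`, and a `false` result could be used to mark the service as not serving. Which timeout value is appropriate, and how to recover from a hang, is left open here.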
Fixes #653
I implemented the service "from scratch" because I could not find a built-in one (unlike in golang, which has one). But it's not that big in the end.
One difference from the golang implementation: golang uses TCP keepalive (packets of length 0). I was unable to find such an option here (short of overriding some methods to set it for all sockets by calling `setsockopt`), so I used `GRPC_ARG_KEEPALIVE_TIME_MS`. This is layer 7 though, so instead of two 0-length packets, this uses two packets of length 17 and an additional ACK. I chose the same keepalive interval as observed with tcpdump for the golang implementation, 15s.
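For comparison, this is roughly what the kernel-level TCP keepalive that this PR does not use would look like on a single socket. It is a sketch only: `EnableTcpKeepalive` is a hypothetical helper, `TCP_KEEPIDLE` and `TCP_KEEPINTVL` are Linux-specific, and using the same value for idle time and probe interval is an assumption.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Enables kernel TCP keepalive on `fd`. `seconds` is used both as the idle
// time before the first probe and as the interval between probes (an
// assumption made for this sketch). Returns true on success.
bool EnableTcpKeepalive(int fd, int seconds)
{
    assert(seconds > 0);
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
        return false;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &seconds, sizeof(seconds)) != 0)
        return false;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &seconds, sizeof(seconds)) != 0)
        return false;
    return true;
}
```

Doing this for gRPC would mean applying it to every socket the server creates, which is the method overriding mentioned above. The channel argument avoids that at the cost of slightly larger layer-7 packets.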
This implementation uses `std::mutex` and `std::condition_variable` to avoid periodically sending the current status: it only sends a change in the service's status (that's what the lock+cv is for).
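The lock+cv pattern described above can be sketched as follows. This is a simplified illustration, not the code from this PR: `StatusWatcher`, `Set` and `WaitForChange` are hypothetical names, and the `ServingStatus` values are modeled on the gRPC health checking protocol.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Status values modeled on the gRPC health checking protocol.
enum class ServingStatus { UNKNOWN, SERVING, NOT_SERVING };

class StatusWatcher {
public:
    // Updates the status and wakes waiters only when the value really changes.
    void Set(ServingStatus status)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (status == status_)
                return;  // unchanged, so there is nothing to send
            status_ = status;
        }
        cv_.notify_all();
    }

    // Blocks until the status differs from `last_seen` or `timeout` elapses.
    // On a change, stores the new value in `last_seen` and returns true.
    bool WaitForChange(ServingStatus& last_seen, std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [&] { return status_ != last_seen; }))
            return false;
        last_seen = status_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    ServingStatus status_ = ServingStatus::UNKNOWN;
};
```

A streaming handler would call `WaitForChange` in a loop and write one message per `true` result, so a client sees a message only when the status actually changes.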