Summary:
In one of the customer environments, we saw a case where RBS threads get stuck on the client side with the following stack
```
<tr><td>rpc_tp_TabletServer-high-pri_13-231</td><td>6.940s</td><td>0.000s</td><td>0.000s</td><td rowspan="43"><pre> @ 0x7ffff6bafffc __GI_nanosleep
@ 0x3b52cca yb::RateLimiter::UpdateTimeSlotSizeAndMaybeSleep()
@ 0x37dad42 yb::tserver::RemoteBootstrapFileDownloader::DownloadFile<>()
@ 0x37d9f00 yb::tserver::RemoteBootstrapFileDownloader::DownloadFile()
@ 0x37d5dfe yb::tserver::RemoteBootstrapClient::FetchAll()
@ 0x384f845 yb::tserver::TSTabletManager::StartRemoteBootstrap()
@ 0x3837b65 yb::tserver::ConsensusServiceImpl::StartRemoteBootstrap()
@ 0x2ee5482 std::__1::__function::__func<>::operator()()
@ 0x2eec415 yb::consensus::ConsensusServiceIf::Handle()
@ 0x3584537 yb::rpc::ServicePoolImpl::Handle()
@ 0x34d2ec4 yb::rpc::InboundCall::InboundCallTask::Run()
@ 0x359351c yb::rpc::(anonymous namespace)::Worker::Execute()
@ 0x3b805d3 yb::Thread::SuperviseThread()
@ 0x7ffff66de693 start_thread
@ 0x7ffff6be041c __clone
```
```
<tr><td>rpc_tp_TabletServer-high-pri_377-12804</td><td>979.670s</td><td>0.000s</td><td>0.000s</td><td rowspan="1"><pre> @ 0x7ffff6bafffc __GI_nanosleep
@ 0x3b52cca yb::RateLimiter::UpdateTimeSlotSizeAndMaybeSleep()
```
This diff enhances the visibility for any unexpected rate limiter sleep times that could have creeped in either due to math overflow/or other clock issues. Note that we don't see any possible math overflow issues though.
Jira: DB-13252
Test Plan:
Jenkins
Manually tested the logging part, can see the following.
RBS source side logs
```
I1109 14:06:45.567804 1885761536 rate_limiter.cc:111] Current iteration stats: { start: 3903865.228s end: 3903865.228s data_size: 135993 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 1 } Last few iteration stats: []
I1109 14:06:45.569423 1893216256 rate_limiter.cc:111] Current iteration stats: { start: 3903865.229s end: 3903865.231s data_size: 16 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 0 } Last few iteration stats: [{ start:
3903865.228s end: 3903865.228s data_size: 135993 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 1 }]
I1109 14:06:45.570024 1892069376 rate_limiter.cc:111] Current iteration stats: { start: 3903865.231s end: 3903865.231s data_size: 475 target_rate: 134217728 time_slot_ms: 200 sleep_time_ms: 0 } Last few iteration stats: [{ start:
3903865.228s end: 3903865.228s data_size: 135993 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 1 }, { start: 3903865.229s end: 3903865.231s data_size: 16 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 0 }]
I1109 14:06:45.773761 1895510016 rate_limiter.cc:111] Current iteration stats: { start: 3903865.231s end: 3903865.237s data_size: 26843545 target_rate: 134217728 time_slot_ms: 200 sleep_time_ms: 194 } Last few iteration stats: [{ start:
3903865.228s end: 3903865.228s data_size: 135993 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 1 }, { start: 3903865.229s end: 3903865.231s data_size: 16 target_rate: 134217728 time_slot_ms: 100 sleep_time_ms: 0 }, { start:
3903865.231s end: 3903865.231s data_size: 475 target_rate: 134217728 time_slot_ms: 200 sleep_time_ms: 0 }]
...
```
RBS client side logs. Note that the client side starts one rate limiter per file, while the server side the same rate limiter throughout the rbs session.
```
I1109 14:06:45.568485 1914236928 rate_limiter.cc:111] Current iteration stats: { start: 3903865.228s end: 3903865.230s data_size: 136012 target_rate: 268435456 time_slot_ms: 100 sleep_time_ms: 0 } Last few iteration stats: []
I1109 14:06:45.569579 1914236928 rate_limiter.cc:111] Current iteration stats: { start: 3903865.231s end: 3903865.231s data_size: 29 target_rate: 268435456 time_slot_ms: 100 sleep_time_ms: 0 } Last few iteration stats: []
I1109 14:06:45.570200 1914236928 rate_limiter.cc:111] Current iteration stats: { start: 3903865.231s end: 3903865.232s data_size: 491 target_rate: 268435456 time_slot_ms: 100 sleep_time_ms: 0 } Last few iteration stats: []
I1109 14:06:45.866606 1914236928 rate_limiter.cc:111] Current iteration stats: { start: 3903865.232s end: 3903865.528s data_size: 26843567 target_rate: 268435456 time_slot_ms: 100 sleep_time_ms: 0 } Last few iteration stats: []
I1109 14:06:45.959517 1914236928 rate_limiter.cc:111] Current iteration stats: { start: 3903865.612s end: 3903865.621s data_size: 2275725 target_rate: 268435456 time_slot_ms: 200 sleep_time_ms: 0 } Last few iteration stats: [{ start:
3903865.232s end: 3903865.528s data_size: 26843567 target_rate: 268435456 time_slot_ms: 100 sleep_time_ms: 0 }]
```
Reviewers: amitanand, rthallam, slingam
Reviewed By: amitanand
Subscribers: svc_phabricator, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D39857