
2.25.0.0-b426

@spolitov spolitov tagged this 05 Dec 05:33
Summary:
When the BreakConnectivity functionality is used in Messenger, Messenger acquires lock_ while queueing an outbound call.
However, queueing an outbound call can be invoked from a reactor thread when a retriable RPC is used (via RpcRetries => DelayedTask).

Messenger also acquires this lock when executing DumpRunningRpcs and holds it until the dump is done.
However, the dump queries the reactors one by one and blocks until all reactors have responded.

Since a reactor thread could be waiting on lock_ at that moment, this results in a deadlock.

This is relevant only for tests, since it requires BreakConnectivity to be used.

Fixed by introducing a separate mutex for the BreakConnectivity functionality.
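
The effect of a separate mutex can be illustrated with a minimal sketch. This is not the actual Messenger code: `ToyMessenger`, `break_connectivity_mutex_`, `broken_`, and the promise/future plumbing that stands in for a reactor are all hypothetical names chosen for illustration. The dump holds `lock_` while waiting for the reactor, and the reactor can still queue a call because that path only takes the dedicated mutex.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <set>
#include <string>
#include <thread>

// Toy stand-in for Messenger; every name here is an assumption.
class ToyMessenger {
 public:
  void BreakConnectivityWith(const std::string& address) {
    std::lock_guard<std::mutex> guard(break_connectivity_mutex_);
    broken_.insert(address);
  }

  // May run on the reactor thread (for example from a retry task).
  // It takes only the dedicated mutex, never lock_.
  bool QueueOutboundCall(const std::string& address) {
    std::lock_guard<std::mutex> guard(break_connectivity_mutex_);
    return broken_.count(address) == 0;  // false means the call is rejected
  }

  // Holds lock_ for the whole dump and blocks until the reactor replies.
  bool DumpRunningRpcs(std::promise<void>& dump_holds_lock,
                       std::future<bool>& reactor_reply) {
    std::lock_guard<std::mutex> guard(lock_);
    dump_holds_lock.set_value();
    return reactor_reply.wait_for(std::chrono::seconds(5)) ==
           std::future_status::ready;
  }

 private:
  std::mutex lock_;
  std::mutex break_connectivity_mutex_;
  std::set<std::string> broken_;
};

// Returns true if the dump finishes while the reactor queues a call.
bool DumpCompletesWhileReactorQueuesCall() {
  ToyMessenger messenger;
  messenger.BreakConnectivityWith("10.0.0.1");

  std::promise<void> dump_holds_lock;
  std::future<void> lock_held = dump_holds_lock.get_future();
  std::promise<bool> reply;
  std::future<bool> reactor_reply = reply.get_future();

  std::thread reactor([&] {
    lock_held.wait();  // the dump now owns lock_
    reply.set_value(messenger.QueueOutboundCall("10.0.0.1"));
  });

  bool dumped = messenger.DumpRunningRpcs(dump_holds_lock, reactor_reply);
  reactor.join();
  return dumped && !reactor_reply.get();
}
```

If `QueueOutboundCall` took `lock_` instead, the reactor thread in this sketch would block until the dump released it, and the dump would in turn wait for the reactor until its timeout expired.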

Also fixed PerCpuRwSharedLock to use `lock_shared` instead of `lock`, which was broken by D26242.
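
To show why the distinction matters, here is a minimal sketch using `std::shared_mutex`. It does not reproduce PerCpuRwSharedLock, whose implementation is not shown here; `SharedLockView` is a hypothetical wrapper that exposes `lock()`/`unlock()` and is expected to forward them to the shared variants.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical adapter: lets std::lock_guard take a *shared* lock.
class SharedLockView {
 public:
  explicit SharedLockView(std::shared_mutex& mutex) : mutex_(mutex) {}

  // Forwarding to mutex_.lock() here would silently make every
  // reader exclusive, which is the kind of bug described above.
  void lock() { mutex_.lock_shared(); }
  void unlock() { mutex_.unlock_shared(); }

 private:
  std::shared_mutex& mutex_;
};

// Returns true if a second reader can enter while the first one
// still holds its shared lock.
bool SecondReaderCanEnter() {
  std::shared_mutex mutex;
  SharedLockView view(mutex);

  std::lock_guard<SharedLockView> first_reader(view);
  bool entered = false;
  std::thread second([&] {
    // Blocks forever if lock() is exclusive; returns at once if shared.
    std::lock_guard<SharedLockView> second_reader(view);
    entered = true;
  });
  second.join();
  return entered;
}
```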

Also added logic to ProcessSupervisor to kill the supervised process after a 10-second timeout in Stop.
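
A stop-with-timeout can be sketched with plain POSIX calls. This is an illustration only, not the ProcessSupervisor code: `StopWithTimeout` and `SpawnStubbornChild` are hypothetical helpers, and sending SIGTERM before escalating to SIGKILL is an assumption about how such a Stop might behave.

```cpp
#include <chrono>
#include <csignal>
#include <thread>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: asks the child to exit, polls until the
// timeout expires, then force-kills it. Returns the wait status.
int StopWithTimeout(pid_t pid, std::chrono::milliseconds timeout) {
  kill(pid, SIGTERM);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  int status = 0;
  while (std::chrono::steady_clock::now() < deadline) {
    if (waitpid(pid, &status, WNOHANG) == pid) {
      return status;  // the child exited on its own
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
  }
  kill(pid, SIGKILL);  // timeout expired: SIGKILL cannot be ignored
  waitpid(pid, &status, 0);
  return status;
}

// Hypothetical helper: forks a child that ignores SIGTERM.
pid_t SpawnStubbornChild() {
  // Ignore SIGTERM before fork so the child inherits the disposition
  // with no window in which the signal could still terminate it.
  signal(SIGTERM, SIG_IGN);
  pid_t pid = fork();
  if (pid == 0) {
    for (;;) pause();  // child: sleep until killed
  }
  signal(SIGTERM, SIG_DFL);  // parent: restore the default handler
  return pid;
}
```

A caller would pass the real timeout, for example `StopWithTimeout(pid, std::chrono::seconds(10))`; a much shorter value is convenient when experimenting.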
Jira: DB-14284

Test Plan: ./yb_build.sh fastdebug --gcc11 --gtest_filter PgNamespaceTest.CreateNamespaceFromTemplateLeaderFailover -n 4000 -- -p 20

Reviewers: hsunder

Reviewed By: hsunder

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D40377