[YSQL] Leader Master timed out for many parallel transactions #2940

Closed
jaki opened this issue Nov 14, 2019 · 1 comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

Comments

@jaki
Contributor

jaki commented Nov 14, 2019

Jira Link: DB-2248
Spawning many transactions in parallel (for example, 100 concurrent ysqlsh sessions) can put enough load on the master leader that RPCs to it time out.

Run the following:

./bin/ysqlsh <<EOT
CREATE TABLE p (i int PRIMARY KEY);
INSERT INTO p VALUES (1), (2), (3);
EOT
sleep 3
for _ in $(seq 100); do
  ./bin/ysqlsh -c "SELECT * FROM p;" &
done

The master, postgres, and tserver logs then fill up with timeout warnings such as the following.

master logs:

W1114 15:56:26.928185 252993536 long_operation_tracker.cc:112] Read running for 2.576s:
    @     0x7fff65c86b1c  _sigtramp
    @        0x1130bc62d  yb::Status::OK::operator ?()
    @        0x10e3f15b7  yb::docdb::IntentAwareIterator::FetchKey()
    @        0x10e37b600  yb::docdb::(anonymous namespace)::BuildSubDocument()
    @        0x10e37ad71  yb::docdb::GetSubDocument()
    @        0x10e3d16f0  yb::docdb::DocRowwiseIterator::HasNext()
    @        0x10e40237f  yb::docdb::PgsqlReadOperation::Execute()
    @        0x10d8b9dfb  yb::tablet::AbstractTablet::HandlePgsqlReadRequest()
    @        0x10d8cbe81  yb::tablet::Tablet::HandlePgsqlReadRequest()
    @        0x10cd63c4a  yb::tserver::TabletServiceImpl::DoRead()
    @        0x10cd62a4f  yb::tserver::TabletServiceImpl::CompleteRead()
    @        0x10cd60dfd  yb::tserver::TabletServiceImpl::Read()
    @        0x1123da000  yb::tserver::TabletServerServiceIf::Handle()
    @        0x113effb73  yb::rpc::ServicePoolImpl::Handle()
    @        0x113e586b9  yb::rpc::InboundCall::InboundCallTask::Run()
    @        0x113f1442f  yb::rpc::(anonymous namespace)::Worker::Execute()
W1114 15:56:27.010690 261578752 yb_rpc.cc:359] Call yb.tserver.TabletServerService.Read 127.0.0.1:62088 => 127.0.0.1:7100 (request call id 149) took 2658ms (client timeout 2500ms).

postgres logs:

W1114 15:35:51.856225 137011200 meta_cache.cc:610] 0x0000000118c71880 -> LookupByIdRpc(tablet: 00000000000000000000000000000000, num_attempts: 18): Failed to determine new Master: Timed out (yb/rpc/outbound_call.cc:510): GetMasterRegistration RPC to 127.0.0.1:7100 timed out after 2.500s
W1114 15:35:51.908969 137011200 meta_cache.cc:649] Leader Master timed out, re-trying...

tserver logs:

W1114 15:34:34.592809 60391424 heartbeater.cc:598] P 53176385ac134f86abf8882494a9eb94: Failed to heartbeat to 127.0.0.1:7100: Timed out (yb/rpc/outbound_call.cc:510): Failed to send heartbeat: TSHeartbeat RPC to 127.0.0.1:7100 timed out after 15.000s tries=0, num=1, masters=0x00000001187fb158 -> [[127.0.0.1:7100]], code=Timed out
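
To make the effect easier to quantify, the reproduction loop can be wrapped so that it waits for every client and counts how many exited with a non-zero status. This is only a minimal sketch: `run_parallel` is a hypothetical helper name, not part of YugabyteDB, and the report does not say whether any client actually fails as opposed to only logging warnings.

```sh
# Hypothetical helper (not part of YugabyteDB): start N copies of a command
# in parallel, wait for all of them, and print how many exited non-zero.
# Plain POSIX sh, so the variables below are global to the calling shell.
run_parallel() {
  n=$1; shift
  pids=""
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" >/dev/null 2>&1 &      # run the given command in the background
    pids="$pids $!"             # remember its PID so we can wait on it
    i=$((i + 1))
  done
  failed=0
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))   # wait returns that job's exit status
  done
  echo "$failed"
}

# Usage against a local cluster, with the same query as the reproduction above:
# run_parallel 100 ./bin/ysqlsh -c "SELECT * FROM p;"
```

Running it with a smaller count first (say 10, then 50) may help show roughly where the warnings start to appear.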
@jaki jaki added the area/ysql Yugabyte SQL (YSQL) label Nov 14, 2019
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 9, 2022
@m-iancu
Contributor

m-iancu commented Oct 25, 2022

This looks like it might be caused by the metadata lookups performed when starting the connections with /bin/ysqlsh.
So closing as a duplicate of #10452.

@m-iancu m-iancu closed this as completed Oct 25, 2022
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug labels Oct 25, 2022
Projects
Status: Done