Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YCQL] Querying Transaction Enabled YCL table using Rust Driver v1.8 results Seg Fault: SQL Error: Execution Error. Timed out waiting for read time #1982

Open
davidoabile opened this issue Aug 6, 2019 · 9 comments
Labels
area/ysql Yugabyte SQL (YSQL) community/request Issues created by external users kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@davidoabile
Copy link

davidoabile commented Aug 6, 2019

Jira Link: DB-2098
This is my setup

sudo $HOME/yugabyte-1.3.1.0/bin/yb-master \

  --master_addresses 139.99.221.130:7100 \

  --rpc_bind_addresses 139.99.221.130 \

  --fs_data_dirs "/home/ubuntu/disk1,/home/ubuntu/disk2" \

  >& /home/ubuntu/disk1/yb-master.out \

  --replication_factor=1 &
sudo $HOME/yugabyte-1.3.1.0/bin/yb-tserver \
  --tserver_master_addrs 139.99.221.130:7100 \
  --rpc_bind_addresses 139.99.221.130 \
  --start_pgsql_proxy \
  --pgsql_proxy_bind_address=139.99.221.130:5433 \
  --cql_proxy_bind_address=139.99.221.130:9042 \
  --fs_data_dirs "/home/ubuntu/disk1,/home/ubuntu/disk2" \
  >& /home/ubuntu/disk1/yb-tserver.out &

When I run SELECT * FROM articles WHERE key="key_id" LIMIT 1 using Cassandra driver this works fine.

But when I run SELECT * FROM articles LIMIT 1

get Seg fault from the server Logs can be found here 139.99.221.130:7100

Some error messages

Timed out (yb/tserver/tablet_service.cc:980): Failed Read(tablet: 71b4d17c06e949d7a6ea653277b7deb2, num_ops: 1, num_attempts: 47, txn: 0000000
E0806 12:00:28.461766  2095 process_context.cc:180] SQL Error: Execution Error. Timed out waiting for read time
SELECT * FROM rekisa.articles WHERE store_id = ? LIMIT 6 OFFSET 0
^^^^^^
W0806 12:00:28.462041  2097 cql_rpc.cc:268] CQL Call from 120.29.52.204:54710, stream id: 10492 took 60001ms. Details:
W0806 12:00:28.462188  2097 cql_rpc.cc:271] cql_details {
  type: "QUERY"
  call_details {
    sql_string: "SELECT * FROM rekisa.articles WHERE store_id = ? LIMIT 6 OFFSET 0  "
  }
}
Trace:
W0806 12:00:28.462211  2097 cql_rpc.cc:272]
@yugabyte-ci yugabyte-ci added the community/request Issues created by external users label Aug 6, 2019
@kmuthukk
Copy link
Collaborator

kmuthukk commented Aug 8, 2019

@davidoabile provided access to his test environment. Here's the stack trace from the core file:

(gdb) where
#0  0x00007fc6bd284c63 in tcmalloc::CentralFreeList::FetchFromOneSpans(int, void**, void**) () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libtcmalloc.so.4
#1  0x00007fc6bd284f5c in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libtcmalloc.so.4
#2  0x00007fc6bd285009 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libtcmalloc.so.4
#3  0x00007fc6bd288967 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int, void* (*)(unsigned long)) () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libtcmalloc.so.4
#4  0x00007fc6b922c0cd in google::protobuf::TextFormat::Printer::SetUseUtf8StringEscaping(bool) () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libprotobuf.so.15
#5  0x00007fc6b922dac2 in google::protobuf::TextFormat::Printer::Printer() () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libprotobuf.so.15
#6  0x00007fc6b922f59f in google::protobuf::Message::DebugString() const () from /home/centos/yugabyte-1.3.1.0/lib/yb-thirdparty/libprotobuf.so.15
#7  0x00007fc6c6f4bc6b in yb::cqlserver::CQLInboundCall::LogTrace (this=0x2361410) at ../../src/yb/yql/cql/cqlserver/cql_rpc.cc:271
#8  0x00007fc6bf3d9699 in yb::rpc::InboundCall::QueueResponse (this=this@entry=0x2361410, is_success=is_success@entry=true) at ../../src/yb/rpc/inbound_call.cc:178
#9  0x00007fc6c6f47d81 in yb::cqlserver::CQLInboundCall::RespondSuccess (this=0x2361410, buffer=..., metrics=...) at ../../src/yb/yql/cql/cqlserver/cql_rpc.cc:189
#10 0x00007fc6c6f3d6aa in yb::cqlserver::CQLProcessor::SendResponse (this=this@entry=0x286a000, response=...) at ../../src/yb/yql/cql/cqlserver/cql_processor.cc:189
#11 0x00007fc6c6f3e5cd in yb::cqlserver::CQLProcessor::StatementExecuted (this=0x286a000, s=..., result=...) at ../../src/yb/yql/cql/cqlserver/cql_processor.cc:387
#12 0x00007fc6c62e89d2 in Run (a2=..., a1=..., this=0x7fc69d4e8480) at ../../src/yb/gutil/callback.h:494
#13 yb::ql::QLProcessor::RunAsyncDone(std::string const&, yb::ql::StatementParameters const&, yb::ql::ParseTree const*, yb::Callback<void (yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>, yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&) (this=<optimized out>, stmt=..., params=..., parse_tree=<optimized out>, cb=..., s=..., result=...) at ../../src/yb/yql/cql/ql/ql_processor.cc:386
#14 0x00007fc6c62e8ccf in Run (a6=..., a5=..., a4=..., a3=<synthetic pointer>, a2=..., a1=..., object=<optimized out>, this=<synthetic pointer>) at ../../src/yb/gutil/bind_internal.h:545
#15 MakeItSo (a7=..., a6=..., a5=..., a4=<optimized out>, a3=..., a2=..., a1=<optimized out>, runnable=...) at ../../src/yb/gutil/bind_internal.h:1005
#16 yb::internal::Invoker<5, yb::internal::BindState<yb::internal::RunnableAdapter<void (yb::ql::QLProcessor::*)(std::string const&, yb::ql::StatementParameters const&, yb::ql::ParseTree const*, yb::Callback<void (yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>, yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>, void (yb::ql::QLProcessor*, std::string const&, yb::ql::StatementParameters const&, yb::ql::ParseTree const*, yb::Callback<void (yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>, yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&), void (yb::internal::UnretainedWrapper<yb::ql::QLProcessor>, yb::internal::ConstRefWrapper<std::string>, yb::internal::ConstRefWrapper<yb::ql::StatementParameters>, yb::internal::OwnedWrapper<yb::ql::ParseTree const>, yb::Callback<void (yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>)>, void (yb::ql::QLProcessor*, std::string const&, yb::ql::StatementParameters const&, yb::ql::ParseTree const*, yb::Callback<void (yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>, yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&)>::Run(yb::internal::BindStateBase*, yb::Status const&, std::shared_ptr<yb::ql::ExecutedResult> const&) (base=<optimized out>, x6=..., x7=...) at ../../src/yb/gutil/bind_internal.h:2323
#17 0x00007fc6c5e8f5db in Run (a2=..., a1=..., this=0x7fc69d4e8500) at ../../src/yb/gutil/callback.h:494
#18 yb::ql::Executor::StatementExecuted (this=this@entry=0x286a478, s=...) at ../../src/yb/yql/cql/ql/exec/executor.cc:2184
#19 0x00007fc6c5e95440 in yb::ql::Executor::ProcessAsyncResults (this=0x286a478, rescheduled=<optimized out>) at ../../src/yb/yql/cql/ql/exec/executor.cc:1524
#20 0x00007fc6bf437788 in yb::rpc::(anonymous namespace)::Worker::Execute (this=<optimized out>) at ../../src/yb/rpc/thread_pool.cc:100
#21 0x00007fc6bd825849 in operator() (this=0x11f7d08) at /home/centos/yugabyte-1.3.1.0/linuxbrew-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#22 yb::Thread::SuperviseThread (arg=<optimized out>) at ../../src/yb/util/thread.cc:690
#23 0x00007fc6b853f694 in start_thread (arg=0x7fc69d4e9700) at pthread_create.c:333
#24 0x00007fc6b7c7c41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

@kmuthukk
Copy link
Collaborator

kmuthukk commented Aug 8, 2019

Per @davidoabile -

queries like:

SELECT COUNT(id) as row FROM articles

runs fine on in cqlsh, but fails/crashes from their Rust program.

He mentioned they are using
Rust version 1.18
from https://github.com/AlexPikalov/cdrs

@kmuthukk
Copy link
Collaborator

kmuthukk commented Aug 8, 2019

Perhaps the above stack trace is unrelated to crash.

When @davidoabile ran the workload again, the corresponding core file generated had this backtrace this time:

Core was generated by `/home/centos/yugabyte-1.3.1.0/bin/yb-tserver --tserver_master_addrs 139.99.221.'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000005075b00 in ?? ()
(gdb) where
#0  0x0000000005075b00 in ?? ()
#1  0x00007f1ce5a4ad6e in MessageSize (value=...) at /n/jenkins/thirdparty/yugabyte-thirdparty-2019-04-29T16_41_21/thirdparty/installed/uninstrumented/include/google/protobuf/wire_format_lite_inl.h:1020
#2  yb::tserver::ReadResponsePB::ByteSizeLong (this=0x50a9d88) at src/yb/tserver/tserver.pb.cc:5951
#3  0x00007f1ce5571c6e in ByteSize (this=<optimized out>) at /n/jenkins/thirdparty/yugabyte-thirdparty-2019-04-29T16_41_21/thirdparty/installed/uninstrumented/include/google/protobuf/message_lite.h:346
#4  yb::rpc::RpcContext::RespondSuccess (this=this@entry=0x7f1cb6603938) at ../../src/yb/rpc/rpc_context.cc:131
#5  0x00007f1ceaed746e in yb::tserver::RpcOperationCompletionCallback<yb::tserver::ReadResponsePB>::OperationCompleted (this=this@entry=0x7f1cb6603920) at ../../src/yb/tserver/service_util.h:159
#6  0x00007f1ceaf33e43 in OperationCompleted (this=0x7f1cb6603920) at /home/centos/yugabyte-1.3.1.0/linuxbrew-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/bits/atomic_base.h:477
#7  yb::tserver::TabletServiceImpl::CompleteRead (this=this@entry=0x1e787e0, read_context=read_context@entry=0x7f1cb6603b70) at ../../src/yb/tserver/tablet_service.cc:1274
#8  0x00007f1ceaf351f2 in yb::tserver::TabletServiceImpl::Read (this=this@entry=0x1e787e0, req=req@entry=0x50a9cf0, resp=resp@entry=0x50a9d88, context=...) at ../../src/yb/tserver/tablet_service.cc:1207
#9  0x00007f1ce935317a in yb::tserver::TabletServerServiceIf::Handle (this=0x1e787e0, call=...) at src/yb/tserver/tserver_service.service.cc:154
#10 0x00007f1ce5589dc1 in yb::rpc::ServicePoolImpl::Handle (this=0x227da40, incoming=...) at ../../src/yb/rpc/service_pool.cc:243
#11 0x00007f1ce5536ef4 in yb::rpc::InboundCall::InboundCallTask::Run (this=<optimized out>) at ../../src/yb/rpc/inbound_call.cc:207
#12 0x00007f1ce5595788 in yb::rpc::(anonymous namespace)::Worker::Execute (this=<optimized out>) at ../../src/yb/rpc/thread_pool.cc:100
#13 0x00007f1ce3983849 in operator() (this=0x4478e28) at /home/centos/yugabyte-1.3.1.0/linuxbrew-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/Cellar/gcc/5.5.0_4/include/c++/5.5.0/functional:2267
#14 yb::Thread::SuperviseThread (arg=<optimized out>) at ../../src/yb/util/thread.cc:690
#15 0x00007f1cde69d694 in start_thread (arg=0x7f1cb6605700) at pthread_create.c:333
#16 0x00007f1cdddda41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

@kmuthukk kmuthukk added area/docdb YugabyteDB core features kind/bug This issue is a bug labels Aug 8, 2019
@kmuthukk
Copy link
Collaborator

UPDATE:

@davidoabile indicated that he was able to narrow this down further.

He's running into these issues when using transaction enabled YCQL tables with Rust driver. The same table, when queried from cqlsh works just fine.

@kmuthukk
Copy link
Collaborator

@srhickma was able to repro this locally using Rust driver and the sample program provided by @davidoabile . Looks like the combination of using Rust driver against a transaction enabled YCQL table is somehow causing simple selects to timeout and some time segfault/crash.

@srhickma - for now, assigning to you now to share the steps you used to reproduce the program.

After that, we can then make a call on who can help further investigate/work on the fix.

Thanks!

@kmuthukk kmuthukk changed the title Seg Fault: SQL Error: Execution Error. Timed out waiting for read time [YSCQL] Querying Transaction Enabled YCL table using Rust Driver v1.8 results Seg Fault: SQL Error: Execution Error. Timed out waiting for read time Aug 11, 2019
@srhickma
Copy link
Contributor

srhickma commented Aug 12, 2019

This can be reproduced using the following steps:

  1. Download files from https://gist.github.com/srhickma/4baf24dbbb2bce311e0550ce02e5661a and place the files in the following structure:
test_yb/Cargo.toml
test_yb/src/main.rs
  1. Build the project:
$ cd test_yb
$ cargo build
  1. Start a local yugabyte cluster using yb-ctl create, then run CREATE=1 ./target/debug/test_yb to create the test table and keyspace.

  2. Running the following will produce the error (it will hang until a timeout):

./target/debug/test_yb

I was able to fix this issue by manually setting the query consistency to QUORUM (cqlsh uses SERIAL internally):

    let params = QueryParamsBuilder::new()
        .consistency(Consistency::Quorum)
        .finalize();

    session.query_with_params("SELECT * FROM test_ks.users", params)
        .expect("Query failed");

So this could be used as a workaround for the time being.

@kmuthukk
Copy link
Collaborator

@davidoabile :

This seems to be related to issue we saw with Lua Driver (#684) which was also using a default consistency level of ONE. For YB, the recommended default consistency is QUORUM (since we use Raft for distributed consensus based replication), and as @srhickma mentioned, using that workaround should unblock you.

@davidoabile
Copy link
Author

davidoabile commented Aug 12, 2019 via email

@kmuthukk
Copy link
Collaborator

@davidoabile - you are absolutely correct. That's why we will keep this issue open, and we have re-opened #684 also.

@yugabyte-ci yugabyte-ci added the priority/medium Medium priority issue label Jun 9, 2022
@yugabyte-ci yugabyte-ci added area/ysql Yugabyte SQL (YSQL) and removed area/docdb YugabyteDB core features labels Jul 27, 2022
@yugabyte-ci yugabyte-ci changed the title [YSCQL] Querying Transaction Enabled YCL table using Rust Driver v1.8 results Seg Fault: SQL Error: Execution Error. Timed out waiting for read time [YCQL] Querying Transaction Enabled YCL table using Rust Driver v1.8 results Seg Fault: SQL Error: Execution Error. Timed out waiting for read time Sep 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) community/request Issues created by external users kind/bug This issue is a bug priority/medium Medium priority issue
Projects
None yet
Development

No branches or pull requests

4 participants