our latest ami coredump with c-s #380

Closed
asias opened this issue Sep 21, 2015 · 25 comments

@asias
Contributor

asias commented Sep 21, 2015

[fedora@ip-172-31-33-47 ~]$ rpm -qa|grep scylla
scylla-server-debuginfo-0.8-20150920.1a4c8db.fc22.x86_64
scylla-server-0.8-20150920.1a4c8db.fc22.x86_64
scylla-tools-0.8-20150920.cb4cbcb.fc22.noarch
scylla-jmx-0.8-20150920.197c8c7.fc22.noarch


scylla_2015-09-20T19-16-24Z (ami-b3abb583) Oregon

The instance has 60 GB of memory, which is too large for systemd-coredump to write the core to disk:

Sep 21 05:42:29 ip-172-31-33-48 systemd-journal[1378]: Journal started
Sep 21 05:41:14 ip-172-31-33-48 systemd-coredump[1371]: Coredump of 1173 (scylla) is larger than configured processing limit, refusing.
Sep 21 05:41:59 ip-172-31-33-48 systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
Sep 21 05:42:29 ip-172-31-33-48 audit[504]: <audit-1701> auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 pid=504 comm="systemd-journal" exe="/usr/lib/systemd/systemd-journald" sig=6
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1131> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1131> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit: <audit-1305> audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 res=1
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 systemd[1]: Starting Flush Journal to Persistent Storage...
Sep 21 05:42:29 ip-172-31-33-48 systemd[1]: Started Flush Journal to Persistent Storage.
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 systemd-coredump[1375]: Detected coredump of the journal daemon itself, diverted to /var/lib/systemd/coredump/core.systemd-journal.0.2bdb2c4069cd450fb0256b0759bd320c.504.1442814149000000.xz.
Sep 21 05:42:31 ip-172-31-33-48 systemd-coredump[1371]: Process 1173 (scylla) of user 992 dumped core.
Sep 21 05:42:31 ip-172-31-33-48 systemd[1]: scylla-server.service: main process exited, code=exited, status=134/n/a
Sep 21 05:42:31 ip-172-31-33-48 systemd[1]: Stopping User Manager for UID 992...
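The "larger than configured processing limit, refusing" message above suggests systemd-coredump's default `ProcessSizeMax` is below the scylla process size. A possible workaround sketch, with option names taken from coredump.conf(5); the 64G values are assumptions sized for this 60 GB instance, and the file is staged locally here rather than written to the real path:

```shell
# Sketch: raise systemd-coredump's size limits so a ~60 GB scylla
# process can still be dumped. Option names per coredump.conf(5);
# the 64G values are assumptions. Staged in ./coredump.conf.d here;
# on the node, install as /etc/systemd/coredump.conf.d/scylla.conf.
mkdir -p coredump.conf.d
cat > coredump.conf.d/scylla.conf <<'EOF'
[Coredump]
ProcessSizeMax=64G
ExternalSizeMax=64G
EOF
cat coredump.conf.d/scylla.conf
```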
@asias
Contributor Author

asias commented Sep 21, 2015

Script to reproduce:

. ./IPS.CONF
export IPS
rm -f bigperf.*
rm -f err.bigperf.*
KS=`pwgen -1`
echo $KS > big.date.time
date
date >> big.date.time
echo Using keyspace=$KS
pkill java
IP1=`echo $IPS| awk  '{print $1}'`
RES=r503-1
CL='cl=one -schema 'replication\(factor=1\)''   # overridden by the next line
CL='cl=TWO -schema 'replication\(factor=2\)''
cassandra-stress write n=1000 no-warmup $CL "keyspace=$KS"  -mode native cql3 -rate threads=10 -node $IP1
sleep 10
for IP in $IP1;do
    echo === start cassandra-stress against $IP ===
    for CPU in `seq 0 4 31`; do
        CPULIST="$CPU,$((CPU+1)),$((CPU+2)),$((CPU+3))"
        echo Start cassandra-stress on $CPULIST
        logfile=bigperf.$IP.cpu$CPU
        taskset -c $CPULIST cassandra-stress write duration=5m $CL "keyspace=$KS" -mode native cql3 -rate threads=900 -node $IP -log file=$logfile >err.${logfile} 2>&1 &
    done
    sleep 2
done
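For clarity, the mixed quoting in the two `CL` lines resolves to a plain string (bash concatenates the quoted and unquoted pieces, so the backslash-escaped parens become literal parens), and the second assignment overrides the first, so the run uses cl=TWO with RF=2. A quick check:

```shell
# The two CL lines as written in the script above; the second wins.
CL='cl=one -schema 'replication\(factor=1\)''
CL='cl=TWO -schema 'replication\(factor=2\)''
echo "$CL"   # -> cl=TWO -schema replication(factor=2)
```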

@asias changed the title from "HEADS UP: our latest ami coredump with c-s" to "our latest ami coredump with c-s" on Sep 21, 2015
@asias
Contributor Author

asias commented Sep 21, 2015

[New LWP 1268]
[New LWP 1176]
[New LWP 1303]
[New LWP 1282]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000000000046c013 in fetch_or (__m=std::memory_order_relaxed, __i=32768, this=<error reading variable: Asked for position 0 of stack, stack only has 0 elements on it.>) at /usr/include/c++/5.1.1/bits/atomic_base.h:544
544     /usr/include/c++/5.1.1/bits/atomic_base.h: No such file or directory.
Missing separate debuginfos, use: dnf debuginfo-install boost-filesystem-1.57.0-6.fc22.x86_64 boost-program-options-1.57.0-6.fc22.x86_64 boost-system-1.57.0-6.fc22.x86_64 boost-test-1.57.0-6.fc22.x86_64 boost-thread-1.57.0-6.fc22.x86_64 cryptopp-5.6.2-9.fc22.x86_64 glibc-2.21-7.fc22.x86_64 hwloc-libs-1.11.0-1.fc22.x86_64 jsoncpp-0.6.0-0.14.rc2.fc22.x86_64 keyutils-libs-1.5.9-4.fc22.x86_64 krb5-libs-1.13.2-5.fc22.x86_64 libaio-0.3.110-4.fc22.x86_64 libcom_err-1.42.12-4.fc22.x86_64 libgcc-5.1.1-4.fc22.x86_64 libpciaccess-0.13.3-0.3.fc22.x86_64 libselinux-2.3-10.fc22.x86_64 libstdc++-5.1.1-4.fc22.x86_64 libtool-ltdl-2.4.2-34.fc22.x86_64 libxml2-2.9.2-4.fc22.x86_64 lz4-r131-1.fc22.x86_64 numactl-libs-2.0.10-2.fc22.x86_64 openssl-libs-1.0.1k-12.fc22.x86_64 pcre-8.37-4.fc22.x86_64 snappy-1.1.1-3.fc22.x86_64 thrift-0.9.1-13.fc22.3.x86_64 xz-libs-5.2.0-2.fc22.x86_64 yaml-cpp-0.5.1-6.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64

(gdb) bt
#0  0x000000000046c013 in fetch_or (__m=std::memory_order_relaxed, __i=32768, this=<error reading variable: Asked for position 0 of stack, stack only has 0 elements on it.>) at /usr/include/c++/5.1.1/bits/atomic_base.h:544
#1  reactor::signals::action (signo=15, siginfo=0x7f05037bb570, ignore=<optimized out>) at core/reactor.cc:185
#2  <signal handler called>
#3  0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
#4  0x00000000006382fc in eal_intr_thread_main ()
#5  0x00007f0509a3a555 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f0509775b9d in clone () from /lib64/libc.so.6

(gdb) info reg
rax            0x8000   32768
rbx            0x0      0
rcx            0xf      15
rdx            0x0      0
rsi            0x7f05037bb570   139659510003056
rdi            0xf      15
rbp            0x7f05037bba90   0x7f05037bba90
rsp            0x7f05037bb438   0x7f05037bb438
r8             0x7f05037c0700   139659510023936
r9             0x7f05037c0700   139659510023936
r10            0xffffffff       4294967295
r11            0x293    659
r12            0x0      0
r13            0x7f05037c0700   139659510023936
r14            0x800000 8388608
r15            0x0      0
rip            0x46c013 0x46c013 <reactor::signals::action(int, siginfo_t*, void*)+19>
eflags         0x10206  [ PF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0


(gdb) i th
  Id   Target Id         Frame
  65   Thread 0x7f04f4fa3700 (LWP 1282) 0x00007ffdd7c7bb8d in ?? ()
  64   Thread 0x7f04ebbff700 (LWP 1303) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  63   Thread 0x7f050d5fcbc0 (LWP 1176) 0x000000000049575d in _mm_pause () at /usr/lib/gcc/x86_64-redhat-linux/5.1.1/include/xmmintrin.h:1264
  62   Thread 0x7f04fbfb1700 (LWP 1268) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  61   Thread 0x7f04fc7b2700 (LWP 1267) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  60   Thread 0x7f04fdfb5700 (LWP 1264) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  59   Thread 0x7f04fcfb3700 (LWP 1266) 0x00007ffdd7c7bb8d in ?? ()
  58   Thread 0x7f04fd7b4700 (LWP 1265) 0x00007ffdd7c7bb8d in ?? ()
  57   Thread 0x7f04fffb9700 (LWP 1260) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  56   Thread 0x7f04fe7b6700 (LWP 1263) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  55   Thread 0x7f04f3fa1700 (LWP 1284) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  54   Thread 0x7f04f7fa9700 (LWP 1276) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  53   Thread 0x7f04fefb7700 (LWP 1262) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  52   Thread 0x7f04ff7b8700 (LWP 1261) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  51   Thread 0x7f05007ba700 (LWP 1259) 0x00007ffdd7c7bb8d in ?? ()
  50   Thread 0x7f04f57a4700 (LWP 1281) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  49   Thread 0x7f04f77a8700 (LWP 1277) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  48   Thread 0x7f04f87aa700 (LWP 1275) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  47   Thread 0x7f0500fbb700 (LWP 1258) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  46   Thread 0x7f04f5fa5700 (LWP 1280) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  45   Thread 0x7f04f8fab700 (LWP 1274) 0x00007ffdd7c7bb8d in ?? ()
  44   Thread 0x7f04f47a2700 (LWP 1283) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  43   Thread 0x7f04f67a6700 (LWP 1279) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  42   Thread 0x7f04f6fa7700 (LWP 1278) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  41   Thread 0x7f04fb7b0700 (LWP 1269) smp_message_queue::process_queue<4ul, smp_message_queue::process_completions()::<lambda(smp_message_queue::work_item*)> >(smp_message_queue::lf_queue &) (q=..., this=<optimized out>,
    process=...) at core/reactor.cc:1465
  40   Thread 0x7f04f9fad700 (LWP 1272) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  39   Thread 0x7f04f97ac700 (LWP 1273) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  38   Thread 0x7f05017bc700 (LWP 1257) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  37   Thread 0x7f0501fbd700 (LWP 1256) 0x00007ffdd7c7bb8d in ?? ()
  36   Thread 0x7f04fafaf700 (LWP 1270) 0x00007ffdd7c7bb8d in ?? ()
  35   Thread 0x7f0502fbf700 (LWP 1254) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  34   Thread 0x7f04fa7ae700 (LWP 1271) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  33   Thread 0x7f05027be700 (LWP 1255) 0x00007f0509776193 in epoll_wait () from /lib64/libc.so.6
  32   Thread 0x7f04e97ff700 (LWP 1312) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  31   Thread 0x7f04f21ff700 (LWP 1291) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  30   Thread 0x7f04f03ff700 (LWP 1292) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  29   Thread 0x7f04f33ff700 (LWP 1285) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  28   Thread 0x7f04ee5ff700 (LWP 1296) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  27   Thread 0x7f04e85ff700 (LWP 1315) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  26   Thread 0x7f04ea9ff700 (LWP 1307) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  25   Thread 0x7f04eebff700 (LWP 1298) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  24   Thread 0x7f04edfff700 (LWP 1297) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  23   Thread 0x7f04f1bff700 (LWP 1288) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  22   Thread 0x7f04ec7ff700 (LWP 1302) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  21   Thread 0x7f04efdff700 (LWP 1293) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  20   Thread 0x7f04ecdff700 (LWP 1301) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  19   Thread 0x7f04ea3ff700 (LWP 1314) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  18   Thread 0x7f04e79ff700 (LWP 1316) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  17   Thread 0x7f04f15ff700 (LWP 1287) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  16   Thread 0x7f04eb5ff700 (LWP 1305) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  15   Thread 0x7f04e7fff700 (LWP 1310) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  14   Thread 0x7f04ec1ff700 (LWP 1304) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  13   Thread 0x7f04e9dff700 (LWP 1309) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  12   Thread 0x7f04ef7ff700 (LWP 1294) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  11   Thread 0x7f04ef1ff700 (LWP 1295) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  10   Thread 0x7f04ed9ff700 (LWP 1299) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  9    Thread 0x7f04ed3ff700 (LWP 1300) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  8    Thread 0x7f04eafff700 (LWP 1306) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  7    Thread 0x7f04f09ff700 (LWP 1308) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  6    Thread 0x7f04e8bff700 (LWP 1313) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  5    Thread 0x7f04e91ff700 (LWP 1311) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  4    Thread 0x7f04f2dff700 (LWP 1290) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  3    Thread 0x7f04f27ff700 (LWP 1286) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
  2    Thread 0x7f04f0fff700 (LWP 1289) 0x00007f0509a4254d in read () from /lib64/libpthread.so.0
* 1    Thread 0x7f05037c0700 (LWP 1234) 0x000000000046c013 in fetch_or (__m=std::memory_order_relaxed, __i=32768, this=<error reading variable: Asked for position 0 of stack, stack only has 0 elements on it.>)
    at /usr/include/c++/5.1.1/bits/atomic_base.h:544

@asias
Contributor Author

asias commented Sep 21, 2015

This is on the other node of the two-node cluster:

[New LWP 1600]
[New LWP 1568]
[New LWP 1620]
[New LWP 1592]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install boost-filesystem-1.57.0-6.fc22.x86_64 boost-program-options-1.57.0-6.fc22.x86_64 boost-system-1.57.0-6.fc22.x86_64 boost-test-1.57.0-6.fc22.x86_64 boost-thread-1.57.0-6.fc22.x86_64 cryptopp-5.6.2-9.fc22.x86_64 glibc-2.21-7.fc22.x86_64 hwloc-libs-1.11.0-1.fc22.x86_64 jsoncpp-0.6.0-0.14.rc2.fc22.x86_64 keyutils-libs-1.5.9-4.fc22.x86_64 krb5-libs-1.13.2-5.fc22.x86_64 libaio-0.3.110-4.fc22.x86_64 libcom_err-1.42.12-4.fc22.x86_64 libgcc-5.1.1-4.fc22.x86_64 libpciaccess-0.13.3-0.3.fc22.x86_64 libselinux-2.3-10.fc22.x86_64 libstdc++-5.1.1-4.fc22.x86_64 libtool-ltdl-2.4.2-34.fc22.x86_64 libxml2-2.9.2-4.fc22.x86_64 lz4-r131-1.fc22.x86_64 numactl-libs-2.0.10-2.fc22.x86_64 openssl-libs-1.0.1k-12.fc22.x86_64 pcre-8.37-4.fc22.x86_64 snappy-1.1.1-3.fc22.x86_64 thrift-0.9.1-13.fc22.3.x86_64 xz-libs-5.2.0-2.fc22.x86_64 yaml-cpp-0.5.1-6.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64
(gdb) bt
#0  0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6
#1  0x00007fb223fc065a in abort () from /lib64/libc.so.6
#2  0x00007fb223fb7187 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fb223fb7232 in __assert_fail () from /lib64/libc.so.6
#4  0x00000000008aa7b4 in schedule<future<T>::then(Func&&) [with Func = transport::cql_server::connection::do_flush()::<lambda()>; Result = future<>; T = {}]::<lambda(auto:1&&)> > (
    func=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x2763893, DIE 0x28e3f4c>, this=0x6000000e5c78) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:643
#5  then<transport::cql_server::connection::do_flush()::<lambda()>, future<> > (func=<optimized out>, this=0x6000000e5c78) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:785
#6  transport::cql_server::connection::do_flush (this=0x6000000e5bd0) at transport/server.cc:839
#7  0x00000000008aa88b in transport::cql_server::poll_pending_responders (this=0x6000000dceb0) at transport/server.cc:214
#8  0x00000000004950da in poll_once (this=0x600000549000) at core/reactor.cc:1272
#9  reactor::run (this=0x600000549000) at core/reactor.cc:1250
#10 0x00000000004f0704 in app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=this@entry=0x7ffda3129620, ac=ac@entry=11, av=av@entry=0x7ffda3129868,
    func=func@entry=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x850281, DIE 0x8bf9f0>) at core/app-template.cc:122
#11 0x000000000041c05c in main (ac=11, av=0x7ffda3129868) at main.cc:321

(gdb) info reg
rax            0x0      0
rbx            0x7fb227f27000   140403151106048
rcx            0x7fb223fbe9c8   140403084618184
rdx            0x6      6
rsi            0x619    1561
rdi            0x619    1561
rbp            0x10d2d49        0x10d2d49
rsp            0x7ffda3128be8   0x7ffda3128be8
r8             0xfefefefefefefeff       -72340172838076673
r9             0xffffffffffffff00       -256
r10            0x8      8
r11            0x202    514
r12            0x283    643
r13            0x11ae3c0        18539456
r14            0x0      0
r15            0x7ffda3128dc0   140727339355584
rip            0x7fb223fbe9c8   0x7fb223fbe9c8 <raise+56>
eflags         0x202    [ IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

(gdb) i th
  Id   Target Id         Frame
  65   Thread 0x7fb20f8ba700 (LWP 1592) 0x00007ffda318fa41 in ?? ()
  64   Thread 0x7fb203bff700 (LWP 1620) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  63   Thread 0x7fb21b8d2700 (LWP 1568) 0x00007ffda318fb8d in ?? ()
  62   Thread 0x7fb20cbff700 (LWP 1600) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  61   Thread 0x7fb21e0d7700 (LWP 1563) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  60   Thread 0x7fb2138c2700 (LWP 1584) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  59   Thread 0x7fb2128c0700 (LWP 1586) 0x00007ffda318fb8d in ?? ()
  58   Thread 0x7fb2120bf700 (LWP 1587) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  57   Thread 0x7fb2170c9700 (LWP 1577) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  56   Thread 0x7fb2140c3700 (LWP 1583) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  55   Thread 0x7fb2168c8700 (LWP 1578) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  54   Thread 0x7fb2160c7700 (LWP 1579) 0x00007ffda318fb8d in ?? ()
  53   Thread 0x7fb2130c1700 (LWP 1585) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  52   Thread 0x7fb2178ca700 (LWP 1576) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  51   Thread 0x7fb2150c5700 (LWP 1581) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  50   Thread 0x7fb2188cc700 (LWP 1574) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  49   Thread 0x7fb2148c4700 (LWP 1582) 0x00007ffda318fb8d in ?? ()
  48   Thread 0x7fb2180cb700 (LWP 1575) 0x00007ffda318fb8d in ?? ()
  47   Thread 0x7fb2190cd700 (LWP 1573) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  46   Thread 0x7fb21a0cf700 (LWP 1571) 0x00007ffda318fb8d in ?? ()
  45   Thread 0x7fb2158c6700 (LWP 1580) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  44   Thread 0x7fb20e8b8700 (LWP 1594) 0x00007ffda318fb8d in ?? ()
  43   Thread 0x7fb21a8d0700 (LWP 1570) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  42   Thread 0x7fb2198ce700 (LWP 1572) 0x00007ffda318fb8d in ?? ()
  41   Thread 0x7fb21b0d1700 (LWP 1569) 0x00007ffda318fb8d in ?? ()
  40   Thread 0x7fb20f0b9700 (LWP 1593) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  39   Thread 0x7fb2100bb700 (LWP 1591) 0x00007ffda318fb8d in ?? ()
  38   Thread 0x7fb21c0d3700 (LWP 1567) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  37   Thread 0x7fb2108bc700 (LWP 1590) 0x00007ffda318fb8d in ?? ()
  36   Thread 0x7fb21c8d4700 (LWP 1566) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  35   Thread 0x7fb21d0d5700 (LWP 1565) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  34   Thread 0x7fb21d8d6700 (LWP 1564) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  33   Thread 0x7fb2110bd700 (LWP 1589) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  32   Thread 0x7fb2118be700 (LWP 1588) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
  31   Thread 0x7fb20ddff700 (LWP 1595) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  30   Thread 0x7fb2065ff700 (LWP 1612) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  29   Thread 0x7fb2059ff700 (LWP 1624) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  28   Thread 0x7fb205fff700 (LWP 1611) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  27   Thread 0x7fb2029ff700 (LWP 1625) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  26   Thread 0x7fb2035ff700 (LWP 1622) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  25   Thread 0x7fb209bff700 (LWP 1603) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  24   Thread 0x7fb20bfff700 (LWP 1598) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  23   Thread 0x7fb2041ff700 (LWP 1618) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  22   Thread 0x7fb2083ff700 (LWP 1613) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  21   Thread 0x7fb20b9ff700 (LWP 1599) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  20   Thread 0x7fb206bff700 (LWP 1610) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  19   Thread 0x7fb20c5ff700 (LWP 1597) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  18   Thread 0x7fb2089ff700 (LWP 1606) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  17   Thread 0x7fb2095ff700 (LWP 1604) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  16   Thread 0x7fb20d1ff700 (LWP 1596) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  15   Thread 0x7fb20a7ff700 (LWP 1617) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  14   Thread 0x7fb2053ff700 (LWP 1615) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  13   Thread 0x7fb20a1ff700 (LWP 1602) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  12   Thread 0x7fb207dff700 (LWP 1621) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  11   Thread 0x7fb208fff700 (LWP 1605) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  10   Thread 0x7fb204dff700 (LWP 1614) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  9    Thread 0x7fb20adff700 (LWP 1601) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  8    Thread 0x7fb2023ff700 (LWP 1626) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  7    Thread 0x7fb2077ff700 (LWP 1609) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  6    Thread 0x7fb2071ff700 (LWP 1616) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  5    Thread 0x7fb202fff700 (LWP 1623) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  4    Thread 0x7fb20b3ff700 (LWP 1607) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  3    Thread 0x7fb2047ff700 (LWP 1619) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
  2    Thread 0x7fb20d7ff700 (LWP 1608) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
* 1    Thread 0x7fb227f13bc0 (LWP 1561) 0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6

@slivne
Contributor

slivne commented Sep 21, 2015

Gleb - can you please take a stab at this and see if you can find why it's breaking?

On Mon, Sep 21, 2015 at 9:24 AM, Asias He notifications@github.com wrote:

This is on another node of the two nodes cluster:

[New LWP 1600]
[New LWP 1568]
[New LWP 1620]
[New LWP 1592]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install boost-filesystem-1.57.0-6.fc22.x86_64 boost-program-options-1.57.0-6.fc22.x86_64 boost-system-1.57.0-6.fc22.x86_64 boost-test-1.57.0-6.fc22.x86_64 boost-thread-1.57.0-6.fc22.x86_64 cryptopp-5.6.2-9.fc22.x86_64 glibc-2.21-7.fc22.x86_64 hwloc-libs-1.11.0-1.fc22.x86_64 jsoncpp-0.6.0-0.14.rc2.fc22.x86_64 keyutils-libs-1.5.9-4.fc22.x86_64 krb5-libs-1.13.2-5.fc22.x86_64 libaio-0.3.110-4.fc22.x86_64 libcom_err-1.42.12-4.fc22.x86_64 libgcc-5.1.1-4.fc22.x86_64 libpciaccess-0.13.3-0.3.fc22.x86_64 libselinux-2.3-10.fc22.x86_64 libstdc++-5.1.1-4.fc22.x86_64 libtool-ltdl-2.4.2-34.fc22.x86_64 libxml2-2.9.2-4.fc22.x86_64 lz4-r131-1.fc22.x86_64 numactl-libs-2.0.10-2.fc22.x86_64 openssl-libs-1.0.1k-12.fc22.x86_64 pcre-8.37-4.fc22.x86_64 snappy-1.1.1-3.fc22.x86_64 thrift-0.9.1-13.fc22.3.x86_64 xz-libs-5.2.0-2.fc22.x86_64 yaml-cpp-0.5.1-6.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64
(gdb) bt
#0 0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6
#1 0x00007fb223fc065a in abort () from /lib64/libc.so.6
#2 0x00007fb223fb7187 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb223fb7232 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000008aa7b4 in schedule<future::then(Func&&) [with Func = transport::cql_server::connection::do_flush()::<lambda()>; Result = future<>; T = {}]::<lambda(auto:1&&)> > (
func=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x2763893, DIE 0x28e3f4c>, this=0x6000000e5c78) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:643
#5 thentransport::cql_server::connection::do_flush()::<lambda(), future<> > (func=, this=0x6000000e5c78) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:785
#6 transport::cql_server::connection::do_flush (this=0x6000000e5bd0) at transport/server.cc:839
#7 0x00000000008aa88b in transport::cql_server::poll_pending_responders (this=0x6000000dceb0) at transport/server.cc:214
#8 0x00000000004950da in poll_once (this=0x600000549000) at core/reactor.cc:1272
#9 reactor::run (this=0x600000549000) at core/reactor.cc:1250
#10 0x00000000004f0704 in app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=this@entry=0x7ffda3129620, ac=ac@entry=11, av=av@entry=0x7ffda3129868,
func=func@entry=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x850281, DIE 0x8bf9f0>) at core/app-template.cc:122
#11 0x000000000041c05c in main (ac=11, av=0x7ffda3129868) at main.cc:321

(gdb) info reg
rax 0x0 0
rbx 0x7fb227f27000 140403151106048
rcx 0x7fb223fbe9c8 140403084618184
rdx 0x6 6
rsi 0x619 1561
rdi 0x619 1561
rbp 0x10d2d49 0x10d2d49
rsp 0x7ffda3128be8 0x7ffda3128be8
r8 0xfefefefefefefeff -72340172838076673
r9 0xffffffffffffff00 -256
r10 0x8 8
r11 0x202 514
r12 0x283 643
r13 0x11ae3c0 18539456
r14 0x0 0
r15 0x7ffda3128dc0 140727339355584
rip 0x7fb223fbe9c8 0x7fb223fbe9c8 <raise+56>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0

(gdb) i th
Id Target Id Frame
65 Thread 0x7fb20f8ba700 (LWP 1592) 0x00007ffda318fa41 in ?? ()
64 Thread 0x7fb203bff700 (LWP 1620) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
63 Thread 0x7fb21b8d2700 (LWP 1568) 0x00007ffda318fb8d in ?? ()
62 Thread 0x7fb20cbff700 (LWP 1600) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
61 Thread 0x7fb21e0d7700 (LWP 1563) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
60 Thread 0x7fb2138c2700 (LWP 1584) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
59 Thread 0x7fb2128c0700 (LWP 1586) 0x00007ffda318fb8d in ?? ()
58 Thread 0x7fb2120bf700 (LWP 1587) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
57 Thread 0x7fb2170c9700 (LWP 1577) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
56 Thread 0x7fb2140c3700 (LWP 1583) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
55 Thread 0x7fb2168c8700 (LWP 1578) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
54 Thread 0x7fb2160c7700 (LWP 1579) 0x00007ffda318fb8d in ?? ()
53 Thread 0x7fb2130c1700 (LWP 1585) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
52 Thread 0x7fb2178ca700 (LWP 1576) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
51 Thread 0x7fb2150c5700 (LWP 1581) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
50 Thread 0x7fb2188cc700 (LWP 1574) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
49 Thread 0x7fb2148c4700 (LWP 1582) 0x00007ffda318fb8d in ?? ()
48 Thread 0x7fb2180cb700 (LWP 1575) 0x00007ffda318fb8d in ?? ()
47 Thread 0x7fb2190cd700 (LWP 1573) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
46 Thread 0x7fb21a0cf700 (LWP 1571) 0x00007ffda318fb8d in ?? ()
45 Thread 0x7fb2158c6700 (LWP 1580) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
44 Thread 0x7fb20e8b8700 (LWP 1594) 0x00007ffda318fb8d in ?? ()
43 Thread 0x7fb21a8d0700 (LWP 1570) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
42 Thread 0x7fb2198ce700 (LWP 1572) 0x00007ffda318fb8d in ?? ()
41 Thread 0x7fb21b0d1700 (LWP 1569) 0x00007ffda318fb8d in ?? ()
40 Thread 0x7fb20f0b9700 (LWP 1593) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
39 Thread 0x7fb2100bb700 (LWP 1591) 0x00007ffda318fb8d in ?? ()
38 Thread 0x7fb21c0d3700 (LWP 1567) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
37 Thread 0x7fb2108bc700 (LWP 1590) 0x00007ffda318fb8d in ?? ()
36 Thread 0x7fb21c8d4700 (LWP 1566) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
35 Thread 0x7fb21d0d5700 (LWP 1565) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
34 Thread 0x7fb21d8d6700 (LWP 1564) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
33 Thread 0x7fb2110bd700 (LWP 1589) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
32 Thread 0x7fb2118be700 (LWP 1588) 0x00007fb22408d193 in epoll_wait () from /lib64/libc.so.6
31 Thread 0x7fb20ddff700 (LWP 1595) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
30 Thread 0x7fb2065ff700 (LWP 1612) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
29 Thread 0x7fb2059ff700 (LWP 1624) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
28 Thread 0x7fb205fff700 (LWP 1611) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
27 Thread 0x7fb2029ff700 (LWP 1625) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
26 Thread 0x7fb2035ff700 (LWP 1622) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
25 Thread 0x7fb209bff700 (LWP 1603) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
24 Thread 0x7fb20bfff700 (LWP 1598) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
23 Thread 0x7fb2041ff700 (LWP 1618) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
22 Thread 0x7fb2083ff700 (LWP 1613) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
21 Thread 0x7fb20b9ff700 (LWP 1599) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
20 Thread 0x7fb206bff700 (LWP 1610) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
19 Thread 0x7fb20c5ff700 (LWP 1597) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
18 Thread 0x7fb2089ff700 (LWP 1606) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
17 Thread 0x7fb2095ff700 (LWP 1604) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
16 Thread 0x7fb20d1ff700 (LWP 1596) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
15 Thread 0x7fb20a7ff700 (LWP 1617) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
14 Thread 0x7fb2053ff700 (LWP 1615) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
13 Thread 0x7fb20a1ff700 (LWP 1602) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
12 Thread 0x7fb207dff700 (LWP 1621) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
11 Thread 0x7fb208fff700 (LWP 1605) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
10 Thread 0x7fb204dff700 (LWP 1614) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
9 Thread 0x7fb20adff700 (LWP 1601) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
8 Thread 0x7fb2023ff700 (LWP 1626) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
7 Thread 0x7fb2077ff700 (LWP 1609) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
6 Thread 0x7fb2071ff700 (LWP 1616) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
5 Thread 0x7fb202fff700 (LWP 1623) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
4 Thread 0x7fb20b3ff700 (LWP 1607) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
3 Thread 0x7fb2047ff700 (LWP 1619) 0x00007fb22435954d in read () from /lib64/libpthread.so.0
2 Thread 0x7fb20d7ff700 (LWP 1608) 0x00007fb22435954d in read () from /lib64/libpthread.so.0

* 1 Thread 0x7fb227f13bc0 (LWP 1561) 0x00007fb223fbe9c8 in raise () from /lib64/libc.so.6



@asias
Contributor Author

asias commented Sep 21, 2015

Reproduced with 1 node cluster:

[New LWP 8051]
[New LWP 8073]
[New LWP 8101]
[New LWP 8064]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/bin/scylla --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f33794529c8 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install boost-filesystem-1.57.0-6.fc22.x86_64 boost-program-options-1.57.0-6.fc22.x86_64 boost-system-1.57.0-6.fc22.x86_64 boost-test-1.57.0-6.fc22.x86_64 boost-thread-1.57.0-6.fc22.x86_64 cryptopp-5.6.2-9.fc22.x86_64 glibc-2.21-7.fc22.x86_64 hwloc-libs-1.11.0-1.fc22.x86_64 jsoncpp-0.6.0-0.14.rc2.fc22.x86_64 keyutils-libs-1.5.9-4.fc22.x86_64 krb5-libs-1.13.2-5.fc22.x86_64 libaio-0.3.110-4.fc22.x86_64 libcom_err-1.42.12-4.fc22.x86_64 libgcc-5.1.1-4.fc22.x86_64 libpciaccess-0.13.3-0.3.fc22.x86_64 libselinux-2.3-10.fc22.x86_64 libstdc++-5.1.1-4.fc22.x86_64 libtool-ltdl-2.4.2-34.fc22.x86_64 libxml2-2.9.2-4.fc22.x86_64 lz4-r131-1.fc22.x86_64 numactl-libs-2.0.10-2.fc22.x86_64 openssl-libs-1.0.1k-12.fc22.x86_64 pcre-8.37-4.fc22.x86_64 snappy-1.1.1-3.fc22.x86_64 thrift-0.9.1-13.fc22.3.x86_64 xz-libs-5.2.0-2.fc22.x86_64 yaml-cpp-0.5.1-6.fc22.x86_64 zlib-1.2.8-7.fc22.x86_64
(gdb) bt
#0  0x00007f33794529c8 in raise () from /lib64/libc.so.6
#1  0x00007f337945465a in abort () from /lib64/libc.so.6
#2  0x00007f337944b187 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f337944b232 in __assert_fail () from /lib64/libc.so.6
#4  0x00000000008aa7b4 in schedule<future<T>::then(Func&&) [with Func = transport::cql_server::connection::do_flush()::<lambda()>; Result = future<>; T = {}]::<lambda(auto:1&&)> > (
    func=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x2763893, DIE 0x28e3f4c>, this=0x6000006b65f8) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:643
#5  then<transport::cql_server::connection::do_flush()::<lambda()>, future<> > (func=<optimized out>, this=0x6000006b65f8) at /usr/src/debug/scylla-server-0.8/seastar/core/future.hh:785
#6  transport::cql_server::connection::do_flush (this=0x6000006b6550) at transport/server.cc:839
#7  0x00000000008aa88b in transport::cql_server::poll_pending_responders (this=0x6000000dceb0) at transport/server.cc:214
#8  0x00000000004950da in poll_once (this=0x600000549000) at core/reactor.cc:1272
#9  reactor::run (this=0x600000549000) at core/reactor.cc:1250
#10 0x00000000004f0704 in app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=this@entry=0x7ffcf9644990, ac=ac@entry=11, av=av@entry=0x7ffcf9644bd8,
    func=func@entry=<unknown type in /usr/lib/debug/usr/bin/scylla.debug, CU 0x850281, DIE 0x8bf9f0>) at core/app-template.cc:122
#11 0x000000000041c05c in main (ac=11, av=0x7ffcf9644bd8) at main.cc:321
(gdb)

(gdb) i reg
rax            0x0      0
rbx            0x7f337d3bb000   139859121123328
rcx            0x7f33794529c8   139859054635464
rdx            0x6      6
rsi            0x1f68   8040
rdi            0x1f68   8040
rbp            0x10d2d49        0x10d2d49
rsp            0x7ffcf9643f58   0x7ffcf9643f58
r8             0xfefefefefefefeff       -72340172838076673
r9             0xffffffffffffff00       -256
r10            0x8      8
r11            0x202    514
r12            0x283    643
r13            0x11ae3c0        18539456
r14            0x0      0
r15            0x7ffcf9644130   140724492583216
rip            0x7f33794529c8   0x7f33794529c8 <raise+56>
eflags         0x202    [ IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

@slivne
Contributor

slivne commented Sep 21, 2015

I'll create an AMI without enhanced networking to see whether it's related.


@slivne
Contributor

slivne commented Sep 21, 2015

AMI: ami-63d9ab06 - the same as the AMI that has the issue, but without enhanced
networking.


@gleb-cloudius
Contributor

On Sun, Sep 20, 2015 at 11:52:50PM -0700, slivne wrote:

Gleb - can you please take a stab at this and see if you can find why it's
breaking

Works for me.

        Gleb.

@slivne
Contributor

slivne commented Sep 21, 2015

Not clear.

Gleb, are you working on fixing the do_flush issue?

  • Are you able to reproduce the do_flush issue on a single server with the
    AMI with enhanced networking?
  • Are you unable to reproduce the do_flush issue on a single server with
    the AMI without enhanced networking?


@gleb-cloudius
Contributor


I am not able to reproduce the issue running Asias' script against one
server with the same AMI Asias reports the problem with. Works for me.

        Gleb.

@slivne
Contributor

slivne commented Sep 21, 2015

Instance type? Are you using the same instance type?


@gleb-cloudius
Contributor


c4.8xlarge. I have no idea what Asias reproduced it with. I do not see
anything in the bug report.

        Gleb.

@asias
Contributor Author

asias commented Sep 21, 2015


Use c3.8xlarge for both the server and c-s. If one instance of c-s cannot
reproduce the issue, start another instance for c-s. Stress scylla hard.

Asias

@asias
Contributor Author

asias commented Sep 21, 2015

When you start the scylla server, follow the instructions here:

https://github.com/cloudius-systems/scylla/wiki/Using-AWS-AMI

Choose instance store 0 and store 1 as the extra two disks.

@gleb-cloudius
Contributor

On Mon, Sep 21, 2015 at 01:23:17AM -0700, Asias He wrote:

    Use c3.8xlarge for both server and c-s.

The only difference between this and what I am using is storage (SSD/EBS).

    If one instance of c-s cannot reproduce the issue, start another
    instance for c-s. Stress scylla hard.

I ran your script. Is it hard enough?

        Gleb.

@asias
Contributor Author

asias commented Sep 21, 2015

On Mon, Sep 21, 2015 at 4:28 PM, Gleb Natapov notifications@github.com wrote:

    The only difference between this and what I am using is storage (SSD/EBS).

Yes, but we do not know if the disk difference matters.

    I ran your script. Is it hard enough?

Try to add another instance to stress the server.

Asias

@asias
Contributor Author

asias commented Sep 21, 2015

I started another 3 instances and reproduced again with 2 load + 1 server:

Sep 21 09:04:47 ip-172-31-40-176 systemd-coredump[1616]: Process 1549 (scylla) of user 992 dumped core.

                                                         Stack trace of thread 1549:
                                                         #0  0x00007f8b4f7709c8 raise (libc.so.6)
                                                         #1  0x00007f8b4f77265a abort (libc.so.6)
                                                         #2  0x00007f8b4f769187 __assert_fail_base (libc.so.6)
                                                         #3  0x00007f8b4f769232 __assert_fail (libc.so.6)
                                                         #4  0x00000000008aa7b4 schedule<future<T>::then(Func&&) [with Func = transport::cql_server::connection::do_flush()::<lambda()>; Result = future<>; T = {}]::<lambda(auto
                                                         #5  0x00000000008aa88b _ZN9transport10cql_server23poll_pending_respondersEv (scylla)
                                                         #6  0x00000000004950da _ZN7reactor9poll_onceEv (scylla)
                                                         #7  0x00000000004f0704 _ZN12app_template14run_deprecatedEiPPcOSt8functionIFvvEE (scylla)
                                                         #8  0x000000000041c05c main (scylla)
                                                         #9  0x00007f8b4f75c700 __libc_start_main (libc.so.6)
                                                         #10 0x000000000046b899 _start (scylla)
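The frame names in the journald trace above are raw mangled C++ symbols; they can be decoded with binutils' c++filt (a side note from me, not from the thread), e.g.:

```shell
# Demangle one of the frames from the stack trace above
echo '_ZN9transport10cql_server23poll_pending_respondersEv' | c++filt
# prints: transport::cql_server::poll_pending_responders()
```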

@asias
Contributor Author

asias commented Sep 21, 2015

btw, to allow the coredump to be stored on c3.8xlarge:

Modify coredump.conf: disable compression and enlarge the size limit.

[fedora@ip-172-31-40-176 ~]$ cat /etc/systemd/coredump.conf
[Coredump]
Storage=external
Compress=no
ProcessSizeMax=160G
ExternalSizeMax=160G

Bind the coredump dir to our data partition, which should have 2*320 GB.

sudo mkdir /data/coredump
sudo mount --bind /data/coredump /var/lib/systemd/coredump

@gleb-cloudius
Contributor

Can you try with this?

diff --git a/transport/server.cc b/transport/server.cc
index 6a8ccf8..cb301fb 100644
--- a/transport/server.cc
+++ b/transport/server.cc
@@ -431,13 +431,13 @@ future<> cql_server::connection::process()
         }
     }).finally([this] {
         return _pending_requests_gate.close().then([this] {
-            return std::move(_ready_to_respond).finally([this] {
-                // Remove ourselves from poll list
-                auto i = std::remove(_server._pending_responders.begin(), _server._pending_responders.end(), this);
-                if (i != _server._pending_responders.end()) {
-                    _server._pending_responders.pop_back();
-                }
-            });
+            // Remove ourselves from poll list
+            auto i = std::remove(_server._pending_responders.begin(), _server._pending_responders.end(), this);
+            if (i != _server._pending_responders.end()) {
+                _server._pending_responders.pop_back();
+                do_flush();
+            }
+            return std::move(_ready_to_respond);
         });
     });
 }

        Gleb.

@asias
Contributor Author

asias commented Sep 21, 2015

I'm trying

Author: Paweł Dziepak <pdziepak@cloudius-systems.com>
Date:   Mon Sep 21 12:15:15 2015 +0200

    probably a do_flush() fix

    Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>

diff --git a/transport/server.hh b/transport/server.hh
index ff6ceda..664b4d6 100644
--- a/transport/server.hh
+++ b/transport/server.hh
@@ -154,6 +154,7 @@ class cql_server::connection {
     service::client_state _client_state;
     std::unordered_map<uint16_t, cql_query_state> _query_states;
     bool _flush_requested = false;
+    bool _terminating = false;
 public:
     connection(cql_server& server, connected_socket&& fd, socket_address addr);
     ~connection();
diff --git a/transport/server.cc b/transport/server.cc
index 6a8ccf8..de86d5c 100644
--- a/transport/server.cc
+++ b/transport/server.cc
@@ -431,13 +431,13 @@ future<> cql_server::connection::process()
         }
     }).finally([this] {
         return _pending_requests_gate.close().then([this] {
-            return std::move(_ready_to_respond).finally([this] {
-                // Remove ourselves from poll list
-                auto i = std::remove(_server._pending_responders.begin(), _server._pending_responders.end(), this);
-                if (i != _server._pending_responders.end()) {
-                    _server._pending_responders.pop_back();
-                }
-            });
+            _terminating = true;
+            // Remove ourselves from poll list
+            auto i = std::remove(_server._pending_responders.begin(), _server._pending_responders.end(), this);
+            if (i != _server._pending_responders.end()) {
+                _server._pending_responders.pop_back();
+            }
+            return std::move(_ready_to_respond);
         });
     });
 }
@@ -825,7 +825,7 @@ future<> cql_server::connection::write_response(shared_ptr<cql_server::response>
 {
     _ready_to_respond = _ready_to_respond.then([this, response = std::move(response)] () mutable {
         return response->output(_write_buf, _version).then([this, response] {
-            if (!_flush_requested) {
+            if (!_terminating && !_flush_requested) {
                 _flush_requested = true;
                 _server._pending_responders.push_back(this);
             }

I will test yours shortly.
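The bug pattern both patches address can be sketched outside Seastar (a simplified model of my own, with hypothetical names, not the actual Scylla code): if a connection is torn down while it is still registered in the server's `_pending_responders` poll list, the poller can call `do_flush()` on it mid-teardown. Both fixes remove the connection from the list first; Paweł's additionally suppresses re-registration via a `_terminating` flag:

```cpp
#include <algorithm>
#include <vector>

// Simplified model of the cql_server poll list; names mirror the patches
// but the classes are illustrative only.
struct connection {
    bool flush_requested = false;
    bool terminating = false;
};

struct server {
    std::vector<connection*> pending_responders;

    // Models write_response() after the fix: never re-register a
    // connection that is already shutting down.
    void request_flush(connection& c) {
        if (!c.terminating && !c.flush_requested) {
            c.flush_requested = true;
            pending_responders.push_back(&c);
        }
    }

    // Models process()'s finally-block after the fix: mark the connection
    // terminating and drop it from the poll list *before* the remaining
    // responses are drained, so the poller can never see a dying connection.
    void begin_teardown(connection& c) {
        c.terminating = true;
        auto i = std::remove(pending_responders.begin(),
                             pending_responders.end(), &c);
        pending_responders.erase(i, pending_responders.end());
    }
};
```

After `begin_teardown()`, a late `request_flush()` (e.g. from a response completing during shutdown) is a no-op instead of re-adding a soon-to-be-destroyed connection to the poll list.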

@asias
Contributor Author

asias commented Sep 21, 2015

@gleb-cloudius with your patch, I still see the panic.

@pdziepak
Contributor

I am going to send a simplified version of my patch soon. Hopefully, it still will make the problem go away.

@asias
Contributor Author

asias commented Sep 21, 2015

OK


Asias

@gleb-cloudius
Contributor


And with Paweł's you do not? This is strange, since they both eventually do
the same thing.

        Gleb.

@gleb-cloudius
Contributor

This one can be closed now.

On Sun, Sep 20, 2015 at 10:53:07PM -0700, Asias He wrote:

[fedora@ip-172-31-33-47 ~]$ rpm -qa|grep scylla
scylla-server-debuginfo-0.8-20150920.1a4c8db.fc22.x86_64
scylla-server-0.8-20150920.1a4c8db.fc22.x86_64
scylla-tools-0.8-20150920.cb4cbcb.fc22.noarch
scylla-jmx-0.8-20150920.197c8c7.fc22.noarch

scylla_2015-09-20T19-16-24Z (ami-b3abb583) Oregon

The instance memory is 60 GB, which is too large to get a core dump to disk.

Sep 21 05:42:29 ip-172-31-33-48 systemd-journal[1378]: Journal started
Sep 21 05:41:14 ip-172-31-33-48 systemd-coredump[1371]: Coredump of 1173 (scylla) is larger than configured processing limit, refusing.
Sep 21 05:41:59 ip-172-31-33-48 systemd[1]: systemd-journald.service watchdog timeout (limit 1min)!
Sep 21 05:42:29 ip-172-31-33-48 audit[504]: auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 pid=504 comm="systemd-journal" exe="/usr/lib/systemd/systemd-journald" sig=6
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 audit: audit_enabled=1 old=1 auid=4294967295 ses=4294967295 subj=system_u:system_r:syslogd_t:s0 res=1
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
Sep 21 05:42:29 ip-172-31-33-48 systemd[1]: Starting Flush Journal to Persistent Storage...
Sep 21 05:42:29 ip-172-31-33-48 systemd[1]: Started Flush Journal to Persistent Storage.
Sep 21 05:42:29 ip-172-31-33-48 audit[1]: pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journal-flush comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
Sep 21 05:42:29 ip-172-31-33-48 systemd-coredump[1375]: Detected coredump of the journal daemon itself, diverted to /var/lib/systemd/coredump/core.systemd-journal.0.2bdb2c4069cd450fb0256b0759bd320c.504.1442814149000000.xz.
Sep 21 05:42:31 ip-172-31-33-48 systemd-coredump[1371]: Process 1173 (scylla) of user 992 dumped core.
Sep 21 05:42:31 ip-172-31-33-48 systemd[1]: scylla-server.service: main process exited, code=exited, status=134/n/a
Sep 21 05:42:31 ip-172-31-33-48 systemd[1]: Stopping User Manager for UID 992...



        Gleb.
