glusterfsd segfaults sporadically on 7.5 client #1232

Closed
DennoVonDiesel opened this issue May 9, 2020 · 5 comments

DennoVonDiesel commented May 9, 2020

Description of problem:

We have clients in an AWS autoscaling group that install glusterfs-client from the Ubuntu PPA:

deb http://ppa.launchpad.net/gluster/glusterfs-7/ubuntu bionic main

The version in the repository was recently bumped from 7.4 to 7.5. Clients on the new version have been crashing sporadically; I have not yet been able to associate the crashes with a specific application.
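For context, a rough sketch of how such a client might be provisioned from that PPA follows. This is an illustration only, not the actual autoscaling bootstrap (which is not included in this report); the volfile server and mount point are taken from the glusterfs mount arguments visible in the log below.

# assumed provisioning sketch, not the reporter's actual bootstrap scripts
$ sudo add-apt-repository ppa:gluster/glusterfs-7
$ sudo apt-get update
$ sudo apt-get install glusterfs-client
# volfile server and mount point as seen in the mount args in the log below
$ sudo mount -t glusterfs nomad.service.kong-us-east-2-aws-prod:/gv0 /gfs/gv0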

The full output of the command that failed:

/var/log/glusterfs/gfs-gv0.log prior and up to the crash:

[2020-05-01 00:20:02.331446] E [fuse-bridge.c:227:check_and_dump_fuse_W] (--> /usr/lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x17a)[0x7f5344e408ca] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x816a)[0x7f534281516a] (--> /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x827b)[0x7f534281527b] (--> /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f53445a86db] (--> /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f53442d188f] ))))) 0-glusterfs-fuse: writing to fuse device failed: No such file or directory
[2020-05-01 01:00:05.991184] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 01:00:05.992537] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
[2020-05-01 01:00:05.993917] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
[2020-05-01 03:00:02.732700] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
[2020-05-01 03:00:02.737097] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 03:00:02.739620] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]" repeated 12 times between [2020-05-01 03:00:02.732700] and [2020-05-01 03:00:07.245070]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]" repeated 12 times between [2020-05-01 03:00:02.739620] and [2020-05-01 03:00:07.245505]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]" repeated 12 times between [2020-05-01 03:00:02.737097] and [2020-05-01 03:00:07.246761]
[2020-05-01 04:32:14.436412] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 04:32:14.436592] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
[2020-05-01 04:32:14.437010] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]" repeated 41 times between [2020-05-01 04:32:14.436412] and [2020-05-01 04:32:16.446570]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]" repeated 41 times between [2020-05-01 04:32:14.436592] and [2020-05-01 04:32:16.446711]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]" repeated 41 times between [2020-05-01 04:32:14.437010] and [2020-05-01 04:32:16.447129]
[2020-05-01 04:32:49.046958] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 04:32:49.047344] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
[2020-05-01 04:32:49.047790] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]" repeated 5 times between [2020-05-01 04:32:49.046958] and [2020-05-01 04:32:49.815254]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]" repeated 5 times between [2020-05-01 04:32:49.047344] and [2020-05-01 04:32:49.815395]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]" repeated 5 times between [2020-05-01 04:32:49.047790] and [2020-05-01 04:32:49.815799]
[2020-05-01 04:35:42.326768] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 04:35:42.328095] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
[2020-05-01 04:35:42.330867] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]" repeated 47 times between [2020-05-01 04:35:42.326768] and [2020-05-01 04:36:17.244992]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]" repeated 47 times between [2020-05-01 04:35:42.330867] and [2020-05-01 04:36:17.245119]
The message "W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]" repeated 47 times between [2020-05-01 04:35:42.328095] and [2020-05-01 04:36:17.245526]
[2020-05-01 05:00:05.956227] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-1: remote operation failed [Directory not empty]
[2020-05-01 05:00:05.957379] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-0: remote operation failed [Directory not empty]
[2020-05-01 05:00:05.957952] W [MSGID: 114031] [client-rpc-fops_v2.c:517:client4_0_rmdir_cbk] 2-gv0-client-2: remote operation failed [Directory not empty]
pending frames:
frame : type(0) op(0)
frame : type(1) op(OPEN)
[snip - lots of this]
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash: 
2020-05-01 08:30:05
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 7.5
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x2298b)[0x7f5344e3e98b]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(gf_print_trace+0x306)[0x7f5344e49026]
/lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f53441eef20]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_mutex_lock+0x0)[0x7f53445aafa0]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/performance/open-behind.so(+0x3495)[0x7f533c8d8495]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/performance/open-behind.so(+0x3da2)[0x7f533c8d8da2]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/performance/open-behind.so(+0x4022)[0x7f533c8d9022]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/performance/open-behind.so(+0x424f)[0x7f533c8d924f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_unlink+0xc0)[0x7f5344ecae50]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/performance/md-cache.so(+0x3f0c)[0x7f533c4a8f0c]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/debug/io-stats.so(+0x5f1a)[0x7f533c277f1a]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_unlink+0xc0)[0x7f5344ecae50]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x118a9)[0x7f534281e8a9]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7d05)[0x7f5342814d05]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7ab0)[0x7f5342814ab0]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7d47)[0x7f5342814d47]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7093)[0x7f5342814093]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7758)[0x7f5342814758]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7ac0)[0x7f5342814ac0]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7d27)[0x7f5342814d27]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x7d70)[0x7f5342814d70]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x8839)[0x7f5342815839]
/usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/mount/fuse.so(+0x21ec5)[0x7f534282eec5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f53445a86db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f53442d188f]
---------
[2020-05-04 20:42:34.896007] I [MSGID: 100030] [glusterfsd.c:2867:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 7.5 (args: /usr/sbin/glusterfs --process-name fuse --volfile-server=nomad.service.kong-us-east-2-aws-prod --volfile-id=/gv0 /gfs/gv0)

gdb backtrace of the crash:


    [New LWP 2076]
    [New LWP 2067]
    [New LWP 2065]
    [New LWP 2068]
    [New LWP 2069]
    [New LWP 2070]
    [New LWP 2072]
    [New LWP 2074]
    [New LWP 2071]
    [New LWP 2077]
    [New LWP 2078]
    [New LWP 2073]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `/usr/sbin/glusterfs --process-name fuse --volfile-server=nomad.service.kong-us-'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00007f53445aafa0 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
    [Current thread is 1 (Thread 0x7f53374fd700 (LWP 2076))]
    (gdb) bt
    #0  0x00007f53445aafa0 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f533c8d8495 in ob_fd_free (ob_fd=0x7f532e953da0) at open-behind.c:198
    #2  0x00007f533c8d8da2 in ob_inode_wake (this=this@entry=0x7f533173caa0, ob_fds=ob_fds@entry=0x7f53374fbdc0)
        at open-behind.c:361
    #3  0x00007f533c8d9022 in open_all_pending_fds_and_resume (this=this@entry=0x7f533173caa0, inode=0x7f532c848fd8, 
        stub=0x7f532ec15348) at open-behind.c:447
    #4  0x00007f533c8d924f in ob_unlink (frame=frame@entry=0x7f532ebfe718, this=this@entry=0x7f533173caa0, 
        loc=loc@entry=0x7f532e796140, xflags=xflags@entry=0, xdata=xdata@entry=0x0) at open-behind.c:1001
    #5  0x00007f5344ecae50 in default_unlink (frame=frame@entry=0x7f532ebfe718, this=<optimized out>, 
        loc=loc@entry=0x7f532e796140, flags=flags@entry=0, xdata=xdata@entry=0x0) at defaults.c:2677
    #6  0x00007f533c4a8f0c in mdc_unlink (frame=0x7f532ebedd98, this=0x7f533173eba0, loc=0x7f532e796140, xflag=0, 
        xdata=0x0) at md-cache.c:1681
    #7  0x00007f533c277f1a in ?? () from /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/debug/io-stats.so
    #8  0x00007f5344ecae50 in default_unlink (frame=frame@entry=0x7f532e957338, this=this@entry=0x7f5331740ca0, 
        loc=loc@entry=0x7f532e796140, flags=flags@entry=0, xdata=0x0) at defaults.c:2677
    #9  0x00007f534281e8a9 in fuse_unlink_resume (state=0x7f532e796120) at fuse-bridge.c:2212
    #10 0x00007f5342814d05 in fuse_resolve_done (state=<optimized out>) at fuse-resolve.c:629
    #11 fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:653
    #12 0x00007f5342814ab0 in fuse_resolve (state=0x7f532e796120) at fuse-resolve.c:620
    #13 0x00007f5342814d47 in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:650
    #14 0x00007f5342814093 in fuse_resolve_continue (state=state@entry=0x7f532e796120) at fuse-resolve.c:668
    #15 0x00007f5342814758 in fuse_resolve_parent (state=state@entry=0x7f532e796120) at fuse-resolve.c:306
    #16 0x00007f5342814ac0 in fuse_resolve (state=0x7f532e796120) at fuse-resolve.c:614
    #17 0x00007f5342814d27 in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:644
    #18 0x00007f5342814d70 in fuse_resolve_and_resume (state=0x7f532e796120, fn=0x7f534281e5f0 <fuse_unlink_resume>)
        at fuse-resolve.c:680
    #19 0x00007f5342815839 in fuse_dispatch (xl=<optimized out>, async=<optimized out>) at fuse-bridge.c:5838
    #20 0x00007f534282eec5 in gf_async (cbk=0x7f5342815810 <fuse_dispatch>, xl=0x560e177ea3c0, async=<optimized out>)
        at ../../../../libglusterfs/src/glusterfs/async.h:189
    #21 fuse_thread_proc (data=0x560e177ea3c0) at fuse-bridge.c:6059
    #22 0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #23 0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6
    (gdb) t a a bt

    Thread 12 (Thread 0x7f533ed26700 (LWP 2073)):
    #0  0x00007f53442d1bb7 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007f5344e9e80e in event_dispatch_epoll_worker (data=0x560e17844aa0) at event-epoll.c:753
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 11 (Thread 0x7f53364fb700 (LWP 2078)):
    #0  0x00007f53445ae9f3 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f534281520b in notify_kernel_loop (data=<optimized out>) at fuse-bridge.c:4828
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 10 (Thread 0x7f5336cfc700 (LWP 2077)):
    #0  0x00007f53445ae9f3 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f5342815b15 in timed_response_loop (data=<optimized out>) at fuse-bridge.c:4913
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 9 (Thread 0x7f5340808700 (LWP 2071)):
    #0  0x00007f53445aef85 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f5344e7c936 in syncenv_task (proc=proc@entry=0x560e177fbbe0) at syncop.c:524
    #2  0x00007f5344e7d730 in syncenv_processor (thdata=0x560e177fbbe0) at syncop.c:591
    #3  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #4  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 8 (Thread 0x7f533e525700 (LWP 2074)):
    #0  0x00007f53442d1bb7 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007f5344e9e80e in event_dispatch_epoll_worker (data=0x560e17844c40) at event-epoll.c:753
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 7 (Thread 0x7f5340007700 (LWP 2072)):
    #0  0x00007f53442c703f in select () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007f5344eb5f9c in runner (arg=0x560e177ff7d0) at ../../contrib/timer-wheel/timer-wheel.c:186
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 6 (Thread 0x7f5341009700 (LWP 2070)):
    #0  0x00007f53445aef85 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f5344e7c936 in syncenv_task (proc=proc@entry=0x560e177fb820) at syncop.c:524
    #2  0x00007f5344e7d730 in syncenv_processor (thdata=0x560e177fb820) at syncop.c:591
    #3  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #4  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 5 (Thread 0x7f534180a700 (LWP 2069)):
    #0  0x00007f53442949d0 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007f53442948aa in sleep () from /lib/x86_64-linux-gnu/libc.so.6
    #2  0x00007f5344e67b9d in pool_sweeper (arg=<optimized out>) at mem-pool.c:444
    #3  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #4  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 4 (Thread 0x7f534200b700 (LWP 2068)):
    #0  0x00007f53441f026c in sigtimedwait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007f53445b345c in sigwait () from /lib/x86_64-linux-gnu/libpthread.so.0
    #2  0x0000560e17253e63 in glusterfs_sigwaiter (arg=<optimized out>) at glusterfsd.c:2414
    #3  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #4  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 3 (Thread 0x7f5345344480 (LWP 2065)):
    #0  0x00007f53445a9d2d in __pthread_timedjoin_ex () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f5344e9e02b in event_dispatch_epoll (event_pool=0x560e177de760) at event-epoll.c:848
    #2  0x0000560e172539e7 in main (argc=<optimized out>, argv=<optimized out>) at glusterfsd.c:2917

    Thread 2 (Thread 0x7f534280c700 (LWP 2067)):
    #0  0x00007f53445aeed9 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f5344e4d337 in gf_timer_proc (data=0x560e177fb300) at timer.c:141
    #2  0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #3  0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

    Thread 1 (Thread 0x7f53374fd700 (LWP 2076)):
    #0  0x00007f53445aafa0 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
    #1  0x00007f533c8d8495 in ob_fd_free (ob_fd=0x7f532e953da0) at open-behind.c:198
    #2  0x00007f533c8d8da2 in ob_inode_wake (this=this@entry=0x7f533173caa0, ob_fds=ob_fds@entry=0x7f53374fbdc0)
        at open-behind.c:361
    #3  0x00007f533c8d9022 in open_all_pending_fds_and_resume (this=this@entry=0x7f533173caa0, inode=0x7f532c848fd8, 
        stub=0x7f532ec15348) at open-behind.c:447
    #4  0x00007f533c8d924f in ob_unlink (frame=frame@entry=0x7f532ebfe718, this=this@entry=0x7f533173caa0, 
        loc=loc@entry=0x7f532e796140, xflags=xflags@entry=0, xdata=xdata@entry=0x0) at open-behind.c:1001
    #5  0x00007f5344ecae50 in default_unlink (frame=frame@entry=0x7f532ebfe718, this=<optimized out>, 
        loc=loc@entry=0x7f532e796140, flags=flags@entry=0, xdata=xdata@entry=0x0) at defaults.c:2677
    #6  0x00007f533c4a8f0c in mdc_unlink (frame=0x7f532ebedd98, this=0x7f533173eba0, loc=0x7f532e796140, xflag=0, 
        xdata=0x0) at md-cache.c:1681
    #7  0x00007f533c277f1a in ?? () from /usr/lib/x86_64-linux-gnu/glusterfs/7.5/xlator/debug/io-stats.so
    #8  0x00007f5344ecae50 in default_unlink (frame=frame@entry=0x7f532e957338, this=this@entry=0x7f5331740ca0, 
        loc=loc@entry=0x7f532e796140, flags=flags@entry=0, xdata=0x0) at defaults.c:2677
    #9  0x00007f534281e8a9 in fuse_unlink_resume (state=0x7f532e796120) at fuse-bridge.c:2212
    #10 0x00007f5342814d05 in fuse_resolve_done (state=<optimized out>) at fuse-resolve.c:629
    #11 fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:653
    #12 0x00007f5342814ab0 in fuse_resolve (state=0x7f532e796120) at fuse-resolve.c:620
    #13 0x00007f5342814d47 in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:650
    #14 0x00007f5342814093 in fuse_resolve_continue (state=state@entry=0x7f532e796120) at fuse-resolve.c:668
    #15 0x00007f5342814758 in fuse_resolve_parent (state=state@entry=0x7f532e796120) at fuse-resolve.c:306
    #16 0x00007f5342814ac0 in fuse_resolve (state=0x7f532e796120) at fuse-resolve.c:614
    #17 0x00007f5342814d27 in fuse_resolve_all (state=<optimized out>) at fuse-resolve.c:644
    #18 0x00007f5342814d70 in fuse_resolve_and_resume (state=0x7f532e796120, fn=0x7f534281e5f0 <fuse_unlink_resume>)
        at fuse-resolve.c:680
    #19 0x00007f5342815839 in fuse_dispatch (xl=<optimized out>, async=<optimized out>) at fuse-bridge.c:5838
    #20 0x00007f534282eec5 in gf_async (cbk=0x7f5342815810 <fuse_dispatch>, xl=0x560e177ea3c0, async=<optimized out>)
        at ../../../../libglusterfs/src/glusterfs/async.h:189
    #21 fuse_thread_proc (data=0x560e177ea3c0) at fuse-bridge.c:6059
    #22 0x00007f53445a86db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
    #23 0x00007f53442d188f in clone () from /lib/x86_64-linux-gnu/libc.so.6

- The output of the gluster volume info command:

$ sudo gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 610bc7f3-33ff-4951-8e39-4b69670c9174
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: [REDACTED IP]:/data/glusterfs/gv0/brick1/brick
Brick2: [REDACTED IP]:/data/glusterfs/gv0/brick1/brick
Brick3: [REDACTED IP]:/data/glusterfs/gv0/brick1/brick
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
features.barrier: disable

- The operating system / glusterfs version:

Server:

$ uname -a
Linux [REDACTED HOSTNAME] 4.15.0-1054-aws #56-Ubuntu SMP Thu Nov 7 16:15:59 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"
$ apt show glusterfs-server
Package: glusterfs-server
Version: 7.5-ubuntu1~bionic1
Priority: optional
Section: admin
Source: glusterfs
Maintainer: Gluster Packager <glusterpackager@gluster.org>
[snip]

Client:

$ uname -a
Linux [REDACTED HOSTNAME] 4.15.0-1054-aws #56-Ubuntu SMP Thu Nov 7 16:15:59 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.4 LTS"
$ apt show glusterfs-client
Package: glusterfs-client
Version: 7.5-ubuntu1~bionic1
Priority: optional
Section: admin
Source: glusterfs
Maintainer: Gluster Packager <glusterpackager@gluster.org>
[snip]

mohit84 (Contributor) commented May 9, 2020

This is a known issue and we are working on it. To avoid the crash, please disable the open-behind option on the volume:

gluster volume set <volname> open-behind off
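
For this report's volume (gv0, from the gluster volume info output above), the workaround would look roughly like the sketch below. This is illustrative rather than part of the original comment: the fully qualified option name is performance.open-behind, and gluster volume get can be used to confirm the new value. Run the commands on any server node.

# illustrative sketch; gv0 is the volume name shown in this report
$ sudo gluster volume set gv0 performance.open-behind off
# confirm the option is now off
$ sudo gluster volume get gv0 performance.open-behind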

xhernandez (Contributor) commented:

This seems to be the same as #1225.

DennoVonDiesel (Author) commented:

Thanks for the quick response! I will test the open-behind workaround starting today and follow up later this week with more information if needed.

DennoVonDiesel (Author) commented:

The workaround is sufficient for us at this time. Please feel free to close this issue or merge it with existing reports of the same bug. Thank you!

xhernandez (Contributor) commented:

Closing this bug because it's the same issue as #1225.
