AddressSanitizer: heap-use-after-free in GlusterFS clients #3945
Comments
Hi @mohit84, I reported this bug several months ago. Recently I have been trying to debug and fix it, but a few points still confuse me. Would you mind helping me? Thanks in advance.

First, a FUSE_OPEN is wound down to the servers. Before the reply is sent back from the servers, another fuse operation (a FUSE_RELEASE) runs on the same fd. Finally, when the replies come back and use the fd, the freed memory is accessed.

Here, I'm confused about how these two operations can run on the same fd concurrently. Do you have any idea about this? Looking forward to your help.
Can you please try to reproduce the same after disabling open-behind?
Hi @mohit84, thanks a lot for your reply. I'm continuously fuzzing it, and this bug is triggered sometimes. My key confusion about this bug now is how FUSE_OPEN and FUSE_RELEASE can happen on the same file/fd concurrently. Do you have any idea about this? Thanks in advance.
It can happen during a graph switch. During a graph switch, fuse opens a new fd on the new subvol and unrefs the old fd.
Thanks for your reply. Does "fuse" here mean gluster fuse or the kernel fuse module? And is this a case we can safely ignore?

Best,
Here "fuse" means gluster fuse. During fd_unref, if the refcount has reached 0, fuse winds a release(dir) fop during fd cleanup (fd_destroy), so it can be possible. I don't think we can ignore it. I am not sure in what scenario you faced this issue.
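To make that lifecycle concrete, here is a minimal, self-contained C sketch of the refcounting contract being described. This is not the actual GlusterFS code: the names fd_ref/fd_unref/fd_destroy mirror the real ones, but the types and the release hook are simplified stand-ins.

/* Sketch of the refcounting contract: the xlator's release callback
 * must fire exactly once, when the last reference is dropped. */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct fd {
    atomic_int refcount;
    void *ctx;                       /* per-fd context (fuse_fd_ctx_t stand-in) */
    void (*release)(struct fd *fd);  /* xlator release()/releasedir() stand-in */
} fd_t;

static void fd_destroy(fd_t *fd)
{
    /* Only reached when the refcount hit zero: it is now safe to wind
     * the release fop and free the per-fd context. */
    if (fd->release)
        fd->release(fd);
    free(fd);
}

static fd_t *fd_ref(fd_t *fd)
{
    atomic_fetch_add(&fd->refcount, 1);
    return fd;
}

static void fd_unref(fd_t *fd)
{
    if (atomic_fetch_sub(&fd->refcount, 1) == 1)
        fd_destroy(fd);
}

static void release_cbk(fd_t *fd)
{
    free(fd->ctx);                   /* the ctx outlives every reference */
    fd->ctx = NULL;
    puts("release: ctx destroyed on final unref");
}

int main(void)
{
    fd_t *fd = calloc(1, sizeof(*fd));
    atomic_init(&fd->refcount, 1);
    fd->ctx = malloc(44);            /* stands in for fuse_fd_ctx_t */
    fd->release = release_cbk;

    fd_ref(fd);     /* e.g. an in-flight fop still holds a reference */
    fd_unref(fd);   /* FUSE_RELEASE drops its ref: ctx must stay alive */
    fd_unref(fd);   /* final unref -> fd_destroy -> release -> free ctx */
    return 0;
}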
However, the memory release is done by fuse_fd_ctx_destroy() in fuse_release(), as the free stack below shows:

0x6040000949a8 is located 24 bytes inside of 44-byte region [0x604000094990,0x6040000949bc)
freed by thread T9 here:
#0 0x7ffff76a07cf in __interceptor_free (/lib/x86_64-linux-gnu/libasan.so.5+0x10d7cf)
#1 0x7ffff735be19 in __gf_free /root/glusterfs/libglusterfs/src/mem-pool.c:383
#2 0x7ffff2f2160f in fuse_fd_ctx_destroy /root/glusterfs/xlators/mount/fuse/src/fuse-bridge.c:141
#3 0x7ffff2f64205 in fuse_release /root/glusterfs/xlators/mount/fuse/src/fuse-bridge.c:3483
#4 0x7ffff2f5dad9 in fuse_dispatch /root/glusterfs/xlators/mount/fuse/src/fuse-bridge.c:6091
#5 0x7ffff2f6fd8d in gf_async ../../../../libglusterfs/src/glusterfs/async.h:187
#6 0x7ffff2f6fd8d in fuse_thread_proc /root/glusterfs/xlators/mount/fuse/src/fuse-bridge.c:6326
#7 0x7ffff71c5608 in start_thread /build/glibc-YYA7BZ/glibc-2.31/nptl/pthread_create.c:477

The gluster configuration is:

gluster volume create test-volume disperse 3 redundancy 1 $srvs force

This bug can sometimes be triggered by this PoC:

r0 = open$dir(&(0x7f0000000000)='./file0\x00', 0x40040, 0x0)
r1 = open(&(0x7f0000000040)='./file0\x00', 0x2300, 0x0)
fsetxattr$security_ima(r1, &(0x7f0000000080), 0x0, 0x0, 0x0)
r2 = open(&(0x7f00000000c0)='./file0/file0\x00', 0x100, 0x24)
write$binfmt_aout(r0, &(0x7f0000000640)={{0x108, 0xd8, 0x3, 0x350, 0x22f, 0x9, 0x245, 0x9}, "a30fc845338b1fc576d17087199eeb89296aefe77a34cf64359bf31dcbb5dab07ca85b4b01a39c76def457575040a300a6ae1b78df0ee3b72eeed79c924dfcc320631d34e006729738e0d07c2091e8b22ea5055afc5caaf4f9eb0a2472197c32634d499da949189cd13b4cea467ba55317de10e83608ee5f49821a17c67a7a67f4f87866562f6a92783556ab9cb424887d1a27", ['\x00', '\x00', '\x00', '\x00', '\x00']}, 0x5b3)
Yes, in this case FUSE_RELEASE is triggered by the kernel. Is it possible to capture a fuse dump? You need to pass a file name via the dump-fuse option to a client process.
Hi, I captured the fuse dump below. But the issue is that the execution during the dump doesn't trigger this bug, because we have no idea what the concurrency requirements are.
Found the issue:
Is this order incorrect, perhaps?

fuse_fd_ctx_destroy(this, state->fd);
fd_unref(state->fd);
Correct. I will send a fix for this. This release should happen when the xlator's release()/releasedir() are called.
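To see why that ordering is unsafe, here is a hedged sketch with simplified stand-in types, not the real fuse-bridge code, of the old code path when another holder still has a reference to state->fd:

/* Sketch of the old (buggy) ordering; in the real code another holder
 * (e.g. an in-flight fop or a graph switch) still references the fd
 * when fuse_release() runs. */
#include <stdlib.h>

typedef struct {
    void *ctx;      /* stands in for the 44-byte fuse_fd_ctx_t */
    int refcount;
} fd_t;

static void fuse_fd_ctx_destroy(fd_t *fd)
{
    free(fd->ctx);  /* frees the ctx regardless of remaining refs */
}

static void fd_unref(fd_t *fd)
{
    fd->refcount--; /* the last unref would normally destroy the fd */
}

int main(void)
{
    fd_t fd = { .ctx = malloc(44), .refcount = 2 };
    void *ctx_held_elsewhere = fd.ctx;   /* the other holder's view */

    /* old fuse_release() order: destroy the ctx first, then unref */
    fuse_fd_ctx_destroy(&fd);            /* ctx freed...               */
    fd_unref(&fd);                       /* ...but refcount is still 1 */

    /* Any dereference of ctx_held_elsewhere from here on is exactly
     * the heap-use-after-free ASan reports (left commented out to
     * keep this example well-defined):
     *
     *     char c = *(char *)ctx_held_elsewhere;
     */
    (void)ctx_held_elsewhere;
    return 0;
}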
Something like this; I need to test this patch when I get some time.

diff --git a/xlators/mount/fuse/src/fuse-bridge.c b/xlators/mount/fuse/src/fuse-bridge.c
index 05eae9439c..027dedca38 100644
--- a/xlators/mount/fuse/src/fuse-bridge.c
+++ b/xlators/mount/fuse/src/fuse-bridge.c
@@ -3452,13 +3452,8 @@ fuse_release(xlator_t *this, fuse_in_header_t *finh, void *msg,
     fd_close(state->fd);
 
-    fuse_fd_ctx_destroy(this, state->fd);
-    fd_unref(fd);
-
     gf_fdptr_put(priv->fdtable, fd);
 
-    state->fd = NULL;
-
 out:
     send_fuse_err(this, finh, 0);
@@ -3904,13 +3899,8 @@ fuse_releasedir(xlator_t *this, fuse_in_header_t *finh, void *msg,
     gf_log("glusterfs-fuse", GF_LOG_TRACE,
            "finh->unique: %" PRIu64 ": RELEASEDIR %p", finh->unique, state->fd);
 
-    fuse_fd_ctx_destroy(this, state->fd);
-    fd_unref(state->fd);
-
     gf_fdptr_put(priv->fdtable, state->fd);
 
-    state->fd = NULL;
-
 out:
     send_fuse_err(this, finh, 0);
@@ -7101,7 +7091,8 @@ struct xlator_fops fops;
 struct xlator_cbks cbks = {.invalidate = fuse_invalidate,
                            .forget = fuse_forget_cbk,
-                           .release = fuse_internal_release};
+                           .release = fuse_internal_release,
+                           .releasedir = fuse_internal_release};
Problem: fuse_fd_ctx_destroy() is being called in fuse_release()/fuse_releasedir() even before all the refs on the fd are released. This can lead to race situations where the fd_ctx is accessed after freeing.

Fix: Make fuse_release()/fuse_releasedir() do the unrefs and let the final unref call the xlator's release()/releasedir() like they are supposed to.

Fixes: gluster#3945
Change-Id: If01acae815dd7a2b99eb012fff17ce2d044aa9dc
Signed-off-by: Pranith Kumar Karampuri <pranith.karampuri@phonepe.com>
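Under the same simplified stand-in types as the sketches above, the fixed flow looks roughly like this; fuse_internal_release here only mimics the role the patch gives the real callback:

/* Sketch of the fixed flow: fuse_release()/fuse_releasedir() only drop
 * references; the ctx is freed by the release callback on final unref. */
#include <stdio.h>
#include <stdlib.h>

typedef struct fd {
    void *ctx;
    int refcount;
    void (*release)(struct fd *);
} fd_t;

static void fuse_internal_release(fd_t *fd)  /* .release/.releasedir cbk */
{
    free(fd->ctx);                  /* safe: no references remain */
    fd->ctx = NULL;
}

static void fd_unref(fd_t *fd)
{
    if (--fd->refcount == 0 && fd->release)
        fd->release(fd);            /* fd_destroy() winds release here */
}

int main(void)
{
    fd_t fd = { .ctx = malloc(44), .refcount = 2,
                .release = fuse_internal_release };

    /* fuse_release(), new order: just put back the fdtable reference */
    fd_unref(&fd);                  /* gf_fdptr_put(): ctx still valid */

    /* the other holder finishes with the fd and drops the last ref */
    fd_unref(&fd);                  /* -> fuse_internal_release(): free */

    puts("ctx freed exactly once, on the final unref");
    return 0;
}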
Description of problem:
I have hit this heap-use-after-free bug several times, but I can't reproduce it reliably because it requires exact concurrency conditions that I have not been able to determine.
The exact command to reproduce the issue:
GlusterFS cluster is configured with 3 servers and 1 client with this mode:
gluster volume create test-volume disperse 3 redundancy 1 $srvs force
This bug can sometimes be triggered by the syzkaller PoC shown in the comments above.
- Is there any crash? Provide the backtrace and coredump.
Yes; see the AddressSanitizer report quoted in the comments above.
- The operating system / glusterfs version:
Ubuntu 20.04 LTS with kernel 5.15
GlusterFS 79154ae