-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dht: fix use after free issue in dht_setxattr_mds_cbk #3744
Conversation
/run regression |
4187e6b
to
c4ea4b4
Compare
/run regression |
1 test(s) failed 0 test(s) generated core 8 test(s) needed retry 7 flaky test(s) marked as success even though they failed |
/run regression |
xlators/cluster/dht/src/dht-common.c
Outdated
STACK_WIND(frame, dht_common_mds_xattrop_cbk, local->mds_subvol, | ||
local->mds_subvol->fops->xattrop, &local->loc, | ||
GF_XATTROP_ADD_ARRAY, xattrop, NULL); | ||
STACK_WIND_COOKIE(frame, dht_common_mds_xattrop_cbk, frame, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this frame
the same that the callback receives as its first argument ? why do we need to pass it as a cookie ? we could just change the callback to use the frame instead of the cookie.
However it seems to me that the callbacks are using the cookie to get the name of the subvolume where the request was sent. In this case, not passing the cookie is the right thing to do because when STACK_WIND()
is called, it assigns the new frame created for the subvolume to the cookie, so the callback receives the frame of the subvolume, not its own.
I'm not sure what was the problem you saw, but maybe it was caused because the frame was destroyed by the subvolume itself. I don't know.
Anyway, I don't like this implicit way to passing frames that are already completed and could potentially be destroyed. Why not just pass local->mds_subvol
as the cookie ? it should be enough for the purposes of the callback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem was callback function was trying to access the volume name from the cookie but we are calling callback function through STACK_WIND not STACK_WIND_COOKIE. Yes there are both ways to solve the issue either we need to use STACK_WIND_COOKIE or change the callback function, the callback function is using a cookie to print the mds subvol name the same we can access through frame->local also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
STACK_WIND
also initializes the cookie to point to the frame created for the subvolume, so the code wasn't entirely wrong:
295 #define STACK_WIND_COMMON(frame, rfn, has_cookie, cky, obj, fn, params...) \
296 do { \
297 call_frame_t *_new = NULL; \
298 xlator_t *old_THIS = NULL; \
299 typeof(fn) next_xl_fn = fn; \
300 \
301 _new = mem_get0(frame->root->pool->frame_mem_pool); \
302 if (caa_unlikely(!_new)) { \
303 break; \
304 } \
305 typeof(fn##_cbk) tmp_cbk = rfn; \
306 _new->root = frame->root; \
307 _new->parent = frame; \
308 LOCK_INIT(&_new->lock); \
309 /* (void *) is required for avoiding gcc warning */ \
310 _new->cookie = ((has_cookie == 1) ? (void *)(cky) : (void *)_new); \
So what the code was doing was correct, cookie had the correct value. I'm not sure what codepath leads to a use after free given that we always call callbacks synchronously from the child xlator. I'll check it.
However, in your code you pass frame
as the cookie. frame
is the current frame, not the frame created for the subvolume, so when coookie->this->name is used, the name will be the dht's name, not the name of the subvolume, which was the intention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at the stack traces from the address sanitizer, I would say that the problem is that a removexattr request has been unwound, which will eventually destroy the entire stack, while a request sent to a subvolume was still pending.
However I haven't identified where this can happen yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After looking at the stack traces from the address sanitizer, I would say that the problem is that a removexattr request has been unwound, which will eventually destroy the entire stack, while a request sent to a subvolume was still pending.
However I haven't identified where this can happen yet.
Thanks for identifying it , I will check it again and update.
Hi @mohit84 , I am working on CVE-2022-48340. So I just want to check if there any update on this? Or any other commit that can fix this issue. |
Any update on this? @mohit84 |
The client is throwing below stacktrace while asan is enabled. The client is facing an issue while application is trying to call removexattr in 2x1 subvol and non-mds subvol is down. As we can see in below stacktrace dht_setxattr_mds_cbk is calling dht_setxattr_non_mds_cbk and dht_setxattr_non_mds_cbk is trying to wipe local because call_cnt is 0 but dht_setxattr_mds_cbk is trying to access frame->local that;s why it is crashed. x621000051c34 is located 1844 bytes inside of 4164-byte region [0x621000051500,0x621000052544) freed by thread T7 here: #0 0x7f916ccb9388 in __interceptor_free.part.0 (/lib64/libasan.so.8+0xb9388) GlusterFS#1 0x7f91654af204 in dht_local_wipe /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-helper.c:713 gluster#2 0x7f91654af204 in dht_setxattr_non_mds_cbk /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-common.c:3900 gluster#3 0x7f91694c1f42 in client4_0_removexattr_cbk /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:1061 gluster#4 0x7f91694ba26f in client_submit_request /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client.c:288 gluster#5 0x7f91695021bd in client4_0_removexattr /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:4480 gluster#6 0x7f91694a5f56 in client_removexattr /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client.c:1439 gluster#7 0x7f91654a1161 in dht_setxattr_mds_cbk /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-common.c:3979 gluster#8 0x7f91694c1f42 in client4_0_removexattr_cbk /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:1061 gluster#9 0x7f916cbc4340 in rpc_clnt_handle_reply /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:723 gluster#10 0x7f916cbc4340 in rpc_clnt_notify /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:890 gluster#11 0x7f916cbb7ec5 in rpc_transport_notify /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-transport.c:504 gluster#12 0x7f916a1aa5fa in socket_event_poll_in_async /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2358 gluster#13 0x7f916a1bd7c2 in gf_async ../../../../libglusterfs/src/glusterfs/async.h:187 gluster#14 0x7f916a1bd7c2 in socket_event_poll_in /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2399 gluster#15 0x7f916a1bd7c2 in socket_event_handler /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2790 gluster#16 0x7f916a1bd7c2 in socket_event_handler /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2710 gluster#17 0x7f916c946d22 in event_dispatch_epoll_handler /root/glusterfs_new/glusterfs/libglusterfs/src/event-epoll.c:614 gluster#18 0x7f916c946d22 in event_dispatch_epoll_worker /root/glusterfs_new/glusterfs/libglusterfs/src/event-epoll.c:725 gluster#19 0x7f916be8cdec in start_thread (/lib64/libc.so.6+0x8cdec) Solution: Use switch instead of using if statement to wind a operation, in case of switch the code will not try to access local after wind a operation for last dht subvol. Fixes: gluster#3732 Change-Id: I031bc814d6df98058430ef4de7040e3370d1c677 Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
c4ea4b4
to
11ff6f5
Compare
!!! Couldn't read commit file !!! |
It is a day one bug when a feature was implemented, the
issue was not caught earlier because the client crash
depends on the environment.The client was crashing because
during wind it was not passing cookie and in cbk it
was trying to access cookie.
Solution: Pass a xlator_t as a cookie during STACK_WIND_COOKIE
to avoid a crash.
Fixes: #3732
Change-Id: I5fd75e90a94d852093dfa40fb37d4211b4e9fdfb
Signed-off-by: Mohit Agrawal moagrawa@redhat.com