
Missing condition when bricks are not already mounted leads to the gluster mount failing. #13

Closed
sbauza wants to merge 1 commit

Conversation


@sbauza commented Feb 25, 2013

With XFS bricks, there is a race condition at boot: if the brick filesystems are not yet mounted, glusterd fails to start up.
By adding the "local-filesystems" event to the mounting script, the glusterfs mountpoint is scheduled to be mounted only after all bricks are available.
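
As a rough illustration, the upstart condition this change adds looks something like the stanza below; the job file path, its exact name, and the full start condition shipped by the package may differ, so treat this as a hedged sketch rather than the packaged script.

    # /etc/init/mounting-glusterfs.conf  (illustrative path and job name)
    # defer the gluster client mount until every local filesystem (the bricks)
    # has been mounted and glusterd has started
    start on (local-filesystems and started glusterd)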

Trigger the gluster client mount with the upstart event "local-filesystems" to make sure bricks are already mounted

@avati (Member) commented Sep 7, 2013

Please submit this patch through review.gluster.org. More details at http://www.gluster.org/community/documentation/index.php/Development_Work_Flow

@avati closed this Sep 7, 2013
gluster-ant pushed a commit that referenced this pull request Aug 20, 2018
…re.t

Problem:
In line #13 of the test case, it checks whether the file is present
on the first 2 bricks or not. If the file is missing from even one of
those bricks, the loop breaks and the test then checks for the dirty
marking on the parent on the 3rd brick and for the file being absent
on the 1st and 2nd bricks. The following scenario can happen in this case:
- The file gets created on the 1st and 3rd bricks
- In line #13 the test sees the file is not present on both the 1st & 2nd
  bricks and breaks the loop
- In line #51 the test fails because the file will be present on the 1st brick
- In line #53 the test fails because file creation did not fail on a quorum
  of bricks, so the dirty marking will not be there on the parent on the 3rd
  brick

Fix:
Don't break out of the loop if the file is present on either brick 1 or brick 2.

Change-Id: I918068165e4b9124c1de86cfb373801b5b432bd9
fixes: bz#1612054
Signed-off-by: karthik-us <ksubrahm@redhat.com>
gluster-ant pushed a commit that referenced this pull request Dec 17, 2018
Traceback:

Direct leak of 765 byte(s) in 9 object(s) allocated from:
    #0 0x7ffb9cad2c48 in malloc (/lib64/libasan.so.5+0xeec48)
    #1 0x7ffb9c5f8949 in __gf_malloc ./libglusterfs/src/mem-pool.c:136
    #2 0x7ffb9c5f91bb in gf_vasprintf ./libglusterfs/src/mem-pool.c:236
    #3 0x7ffb9c5f938a in gf_asprintf ./libglusterfs/src/mem-pool.c:256
    #4 0x7ffb826714ab in afr_get_heal_info ./xlators/cluster/afr/src/afr-common.c:6204
    #5 0x7ffb825765e5 in afr_handle_heal_xattrs ./xlators/cluster/afr/src/afr-inode-read.c:1481
    #6 0x7ffb825765e5 in afr_getxattr ./xlators/cluster/afr/src/afr-inode-read.c:1571
    #7 0x7ffb9c635af7 in syncop_getxattr ./libglusterfs/src/syncop.c:1680
    #8 0x406c78 in glfsh_process_entries ./heal/src/glfs-heal.c:810
    #9 0x408555 in glfsh_crawl_directory ./heal/src/glfs-heal.c:898
    #10 0x408cc0 in glfsh_print_pending_heals_type ./heal/src/glfs-heal.c:970
    #11 0x408fc5 in glfsh_print_pending_heals ./heal/src/glfs-heal.c:1012
    #12 0x409546 in glfsh_gather_heal_info ./heal/src/glfs-heal.c:1154
    #13 0x403e96 in main ./heal/src/glfs-heal.c:1745
    #14 0x7ffb99bc411a in __libc_start_main ../csu/libc-start.c:308

The dictionary is referenced by the caller to print the status.
So set the value as a dynstr; the last unref of the dictionary will free it.
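
A minimal sketch of that change, assuming the libglusterfs dict API; the key name, the counter variable and the error handling are illustrative, not the exact afr_get_heal_info() code:

    char *status = NULL;
    int ret = gf_asprintf(&status, "heal pending entries: %d", count);
    if (ret < 0)
        goto out;
    /* dict_set_dynstr() hands ownership of the allocated string to the dict,
     * so the final dict_unref() releases it and the leak goes away */
    ret = dict_set_dynstr(dict, "heal-info-status", status);
    if (ret < 0) {
        GF_FREE(status);   /* dict did not take ownership, free it here */
        goto out;
    }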

updates: bz#1633930
Change-Id: Ib5a7cb891e6f7d90560859aaf6239e52ff5477d0
Signed-off-by: Kotresh HR <khiremat@redhat.com>
gluster-ant pushed a commit that referenced this pull request Jul 24, 2019
EC doesn't allow concurrent writes on overlapping areas; they are
serialized. However, non-overlapping writes are serviced in parallel.
When a write is not aligned, EC first needs to read the entire chunk
from disk, apply the modified fragment and write it again.

The problem appears on sparse files because a write to an offset
implicitly creates data on offsets below it (so, in some way, they
are overlapping). For example, if a file is empty and we read 10 bytes
from offset 10, read() will return 0 bytes. Now, if we write one byte
at offset 1M and retry the same read, the system call will return 10
bytes (all containing 0's).
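
That behaviour is plain POSIX semantics and can be reproduced outside gluster; a small stand-alone sketch (the file path is illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[10];
        int fd = open("/tmp/sparse-demo", O_RDWR | O_CREAT | O_TRUNC, 0600);
        if (fd < 0)
            return 1;
        /* empty file: reading 10 bytes at offset 10 returns 0 bytes */
        printf("before: %zd bytes\n", pread(fd, buf, sizeof(buf), 10));
        /* writing one byte at offset 1M implicitly creates the offsets below it */
        pwrite(fd, "x", 1, 1024 * 1024);
        /* the same read now returns 10 bytes, all zeros */
        printf("after:  %zd bytes\n", pread(fd, buf, sizeof(buf), 10));
        close(fd);
        return 0;
    }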

So if we have two writes, the first one at offset 10 and the second one
at offset 1M, EC will send both in parallel because they do not overlap.
However, the first one will try to read missing data from the first chunk
(i.e. offsets 0 to 9) to recombine the entire chunk and do the final write.
This read will happen in parallel with the write to 1M. What could happen
is that half of the bricks process the write before the read, and the
other half do the read before the write. Some bricks will return 10 bytes
of data while the others will return 0 bytes (because the file on the brick
has not been expanded yet).

When EC tries to recombine the answers from the bricks, it can't, because
it needs consistent answers from more than half of the bricks to recover
the data. So the read fails with EIO. The error is propagated to the parent
write, which is aborted, and EIO is returned to the application.

The issue happened because EC assumed that a write to a given offset
implies that offsets below it exist.

This fix prevents the read of the chunk from bricks if the current size
of the file is smaller than the read chunk offset. This size is
correctly tracked, so this fixes the issue.

This patch also modifies the ec-stripe.t file for Test #13 within it.
With this patch, if the file size is less than the offset we are writing to,
we fill the head and tail with zeros and do not count it as a stripe cache
miss. That makes sense because we already know what data that part holds,
so there is no need to read it from the bricks.

Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
Fixes: bz#1730715
Signed-off-by: Xavi Hernandez <xhernandez@redhat.com>
amarts pushed a commit that referenced this pull request Sep 4, 2021
When handling the RPC_CLNT_DISCONNECT event, glustershd may already be
disconnected and removed from the list of services, and an attempt to
extract an entry from the now-empty list causes the following error:

==1364671==ERROR: AddressSanitizer: heap-buffer-overflow on address ...

READ of size 1 at 0x60d00001c48f thread T23
    #0 0x7ff1a5f6db8c in __interceptor_fopen64.part.0 (/lib64/libasan.so.6+0x53b8c)
    #1 0x7ff1a5c63717 in gf_is_service_running libglusterfs/src/common-utils.c:4180
    #2 0x7ff190178ad3 in glusterd_proc_is_running xlators/mgmt/glusterd/src/glusterd-proc-mgmt.c:157
    #3 0x7ff19017ce29 in glusterd_muxsvc_common_rpc_notify xlators/mgmt/glusterd/src/glusterd-svc-mgmt.c:440
    #4 0x7ff190176e75 in __glusterd_muxsvc_conn_common_notify xlators/mgmt/glusterd/src/glusterd-conn-mgmt.c:172
    #5 0x7ff18fee0940 in glusterd_big_locked_notify xlators/mgmt/glusterd/src/glusterd-handler.c:66
    #6 0x7ff190176ec7 in glusterd_muxsvc_conn_common_notify xlators/mgmt/glusterd/src/glusterd-conn-mgmt.c:183
    #7 0x7ff1a5b57b60 in rpc_clnt_handle_disconnect rpc/rpc-lib/src/rpc-clnt.c:821
    #8 0x7ff1a5b58082 in rpc_clnt_notify rpc/rpc-lib/src/rpc-clnt.c:882
    #9 0x7ff1a5b4da47 in rpc_transport_notify rpc/rpc-lib/src/rpc-transport.c:520
    #10 0x7ff18fba1d4f in socket_event_poll_err rpc/rpc-transport/socket/src/socket.c:1370
    #11 0x7ff18fbb223c in socket_event_handler rpc/rpc-transport/socket/src/socket.c:2971
    #12 0x7ff1a5d646ff in event_dispatch_epoll_handler libglusterfs/src/event-epoll.c:638
    #13 0x7ff1a5d6539c in event_dispatch_epoll_worker libglusterfs/src/event-epoll.c:749
    #14 0x7ff1a5917298 in start_thread /usr/src/debug/glibc-2.33-20.fc34.x86_64/nptl/pthread_create.c:481
    #15 0x7ff1a5551352 in clone (/lib64/libc.so.6+0x100352)

0x60d00001c48f is located 12 bytes to the right of 131-byte region [0x60d00001c400,0x60d00001c483)
freed by thread T19 here:
    #0 0x7ff1a5fc8647 in free (/lib64/libasan.so.6+0xae647)

Signed-off-by: Dmitry Antipov <dantipov@cloudlinux.com>
Updates: #1000
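
A minimal sketch of the kind of guard this change implies, using the list macros from libglusterfs; the structure, field and type names are illustrative, not the exact glusterd ones:

    /* glustershd may already have been disconnected and removed from the
     * list of services: bail out instead of reading a stale/freed entry */
    if (list_empty(&conf->svc_procs)) {
        ret = 0;
        goto out;
    }
    svc_proc = list_entry(conf->svc_procs.next, glusterd_svc_proc_t, svc_proc_list);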
xhernandez pushed a commit that referenced this pull request Mar 4, 2022
Unconditionally free serialized dict data
in '__glusterd_send_svc_configure_req()'.

Found with AddressSanitizer:

==273334==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 89 byte(s) in 1 object(s) allocated from:
    #0 0x7fc2ce2a293f in __interceptor_malloc (/lib64/libasan.so.6+0xae93f)
    #1 0x7fc2cdff9c6c in __gf_malloc libglusterfs/src/mem-pool.c:201
    #2 0x7fc2cdff9c6c in __gf_malloc libglusterfs/src/mem-pool.c:188
    #3 0x7fc2cdf86bde in dict_allocate_and_serialize libglusterfs/src/dict.c:3285
    #4 0x7fc2b8398843 in __glusterd_send_svc_configure_req xlators/mgmt/glusterd/src/glusterd-svc-helper.c:830
    #5 0x7fc2b8399238 in glusterd_attach_svc xlators/mgmt/glusterd/src/glusterd-svc-helper.c:932
    #6 0x7fc2b83a60f1 in glusterd_shdsvc_start xlators/mgmt/glusterd/src/glusterd-shd-svc.c:509
    #7 0x7fc2b83a5124 in glusterd_shdsvc_manager xlators/mgmt/glusterd/src/glusterd-shd-svc.c:335
    #8 0x7fc2b8395364 in glusterd_svcs_manager xlators/mgmt/glusterd/src/glusterd-svc-helper.c:143
    #9 0x7fc2b82e3a6c in glusterd_op_start_volume xlators/mgmt/glusterd/src/glusterd-volume-ops.c:2412
    #10 0x7fc2b835ec5a in gd_mgmt_v3_commit_fn xlators/mgmt/glusterd/src/glusterd-mgmt.c:329
    #11 0x7fc2b8365497 in glusterd_mgmt_v3_commit xlators/mgmt/glusterd/src/glusterd-mgmt.c:1639
    #12 0x7fc2b836ad30 in glusterd_mgmt_v3_initiate_all_phases xlators/mgmt/glusterd/src/glusterd-mgmt.c:2651
    #13 0x7fc2b82d504b in __glusterd_handle_cli_start_volume xlators/mgmt/glusterd/src/glusterd-volume-ops.c:364
    #14 0x7fc2b817465c in glusterd_big_locked_handler xlators/mgmt/glusterd/src/glusterd-handler.c:79
    #15 0x7fc2ce020ff9 in synctask_wrap libglusterfs/src/syncop.c:385
    #16 0x7fc2cd69184f  (/lib64/libc.so.6+0x5784f)

Signed-off-by: Dmitry Antipov <dantipov@cloudlinux.com>
Fixes: #1000
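
A minimal sketch of the pattern described above; "req", "send_request" and the ret/out handling are illustrative stand-ins for the real glusterd RPC path:

    char *dict_buf = NULL;
    unsigned int dict_len = 0;
    int ret = -1;

    ret = dict_allocate_and_serialize(dict, &dict_buf, &dict_len);
    if (ret)
        goto out;

    req.dict.dict_val = dict_buf;
    req.dict.dict_len = dict_len;
    ret = send_request(&req);   /* the buffer is encoded/copied during submit */

    out:
        GF_FREE(dict_buf);      /* freed on every path, fixing the reported leak */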
mohit84 added a commit that referenced this pull request Oct 23, 2023
The client throws the stack trace below when ASAN is enabled.
The issue occurs when an application calls removexattr on a 2x1 subvolume
while the non-MDS subvolume is down.
As the stack trace shows, dht_setxattr_mds_cbk calls
dht_setxattr_non_mds_cbk, which wipes local because call_cnt is 0, but
dht_setxattr_mds_cbk then still accesses frame->local, which causes the
crash.

0x621000051c34 is located 1844 bytes inside of 4164-byte region [0x621000051500,0x621000052544)
freed by thread T7 here:
    #0 0x7f916ccb9388 in __interceptor_free.part.0 (/lib64/libasan.so.8+0xb9388)
    #1 0x7f91654af204 in dht_local_wipe /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-helper.c:713
    #2 0x7f91654af204 in dht_setxattr_non_mds_cbk /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-common.c:3900
    #3 0x7f91694c1f42 in client4_0_removexattr_cbk /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:1061
    #4 0x7f91694ba26f in client_submit_request /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client.c:288
    #5 0x7f91695021bd in client4_0_removexattr /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:4480
    #6 0x7f91694a5f56 in client_removexattr /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client.c:1439
    #7 0x7f91654a1161 in dht_setxattr_mds_cbk /root/glusterfs_new/glusterfs/xlators/cluster/dht/src/dht-common.c:3979
    #8 0x7f91694c1f42 in client4_0_removexattr_cbk /root/glusterfs_new/glusterfs/xlators/protocol/client/src/client-rpc-fops_v2.c:1061
    #9 0x7f916cbc4340 in rpc_clnt_handle_reply /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:723
    #10 0x7f916cbc4340 in rpc_clnt_notify /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:890
    #11 0x7f916cbb7ec5 in rpc_transport_notify /root/glusterfs_new/glusterfs/rpc/rpc-lib/src/rpc-transport.c:504
    #12 0x7f916a1aa5fa in socket_event_poll_in_async /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2358
    #13 0x7f916a1bd7c2 in gf_async ../../../../libglusterfs/src/glusterfs/async.h:187
    #14 0x7f916a1bd7c2 in socket_event_poll_in /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2399
    #15 0x7f916a1bd7c2 in socket_event_handler /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2790
    #16 0x7f916a1bd7c2 in socket_event_handler /root/glusterfs_new/glusterfs/rpc/rpc-transport/socket/src/socket.c:2710
    #17 0x7f916c946d22 in event_dispatch_epoll_handler /root/glusterfs_new/glusterfs/libglusterfs/src/event-epoll.c:614
    #18 0x7f916c946d22 in event_dispatch_epoll_worker /root/glusterfs_new/glusterfs/libglusterfs/src/event-epoll.c:725
    #19 0x7f916be8cdec in start_thread (/lib64/libc.so.6+0x8cdec)

Solution: Use a switch statement instead of an if chain to wind the
          operation; with the switch, the code does not access local after
          winding the operation for the last dht subvolume.

Fixes: #3732
Change-Id: I031bc814d6df98058430ef4de7040e3370d1c677

Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
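
A minimal sketch of the control-flow change described in the solution above; the names are illustrative and the real dht code winds more fop types:

    /* wind the fop for the last subvolume and return immediately: with a
     * switch there is no later "else if" that re-reads frame->local, which
     * the callback may already have freed on another thread */
    switch (local->fop) {
    case GF_FOP_SETXATTR:
        STACK_WIND_COOKIE(frame, dht_setxattr_non_mds_cbk, subvol, subvol,
                          subvol->fops->setxattr, &local->loc, local->xattr,
                          local->flags, NULL);
        break;
    case GF_FOP_REMOVEXATTR:
        STACK_WIND_COOKIE(frame, dht_setxattr_non_mds_cbk, subvol, subvol,
                          subvol->fops->removexattr, &local->loc, local->key,
                          NULL);
        break;
    default:
        break;
    }
    return 0;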