configure: add an option to build with tcmalloc library #2

Closed
wants to merge 1 commit

Conversation

@amarts amarts commented Nov 14, 2019

With this patch, one can pass '--enable-tcmalloc' to configure and verify
that the resulting glusterfs binary is linked with libtcmalloc:

[atumball@local build]$ ldd /usr/local/sbin/glusterfs | grep tcmalloc
	libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007feec0b87000)

Once we establish a standard performance number with and without this option,
we will see how to make it the default.

Updates: #237
Change-Id: I3377f57bfe4e17f101a212e1914a6d3c1687d528
Signed-off-by: Amar Tumballi <amarts@redhat.com>
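
As a rough illustration of why only the link step has to change (this sketch is
not part of the patch): tcmalloc interposes the standard allocator symbols, so
existing malloc()/free() calls are served by it as soon as the binary is linked
against libtcmalloc.

    /* Minimal standalone sketch: when linked with -ltcmalloc, these plain
     * malloc()/free() calls are handled by tcmalloc with no source changes. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *p = malloc(4096); /* served by tcmalloc when linked against it */
        if (p == NULL)
            return 1;
        printf("allocated 4096 bytes at %p\n", p);
        free(p); /* returned to tcmalloc's thread-caching free lists */
        return 0;
    }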

@amarts amarts commented Nov 14, 2019

check regression

@amarts amarts commented Nov 15, 2019

check centos

@amarts amarts added the 'work-in-progress (Not ready to merge)' label Nov 15, 2019
@amarts amarts commented Nov 15, 2019

check centos

@amarts amarts closed this Nov 19, 2019
amarts pushed a commit that referenced this pull request May 5, 2020
The general idea of the changes is to prevent resetting event generation
to zero in the inode ctx, since event gen is something that should
follow 'causal order'.

Change #1:
For a read txn, in the inode refresh cbk, if event_generation is
found to be zero, we fail the read fop. This is not needed, because a
change in event gen is only a marker for the next inode refresh to
happen and should not be taken into account by the current read txn.

Change #2:
The event gen being zero above can happen if there is a racing lookup,
which resets event gen (in afr_lookup_done) if there are non-zero afr
xattrs. The resetting is done only to trigger an inode refresh and a
possible client-side heal on the next lookup. That can be achieved by
setting the need_refresh flag in the inode ctx. So all occurrences of
resetting event gen to zero are replaced with a call to
afr_inode_need_refresh_set().

Change #3:
In both the lookup and discover paths, we do an inode refresh which is
not required, since all three essentially do the same thing: update the
inode ctx with the good/bad copies from the brick replies. Inode refresh
also triggers background heals, but I think it is okay to do that when we
call refresh during the read and write txns and not in the lookup path.

The .t tests which relied on the inode refresh in the lookup path to
trigger heals are now changed to do a read txn so that the inode refresh
and the heal happen.

Change-Id: Iebf39a9be6ffd7ffd6e4046c96b0fa78ade6c5ec
Fixes: #1179
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Erik Jacobson <erik.jacobson at hpe.com>
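
To make Change #2 concrete, here is a tiny standalone sketch of the idea; the
struct and helper names below are hypothetical stand-ins, not the real afr
inode-ctx layout, and only afr_inode_need_refresh_set() is named by the patch
itself.

    #include <stdio.h>

    struct demo_inode_ctx {
        int event_gen;    /* should only move forward ("causal order") */
        int need_refresh; /* marker consumed by the next lookup/refresh */
    };

    /* New pattern: leave event_gen intact and only request a refresh,
     * instead of the old 'ctx->event_gen = 0;' which a racing read txn
     * would misread and fail on. */
    void demo_inode_need_refresh_set(struct demo_inode_ctx *ctx)
    {
        ctx->need_refresh = 1;
    }

    int main(void)
    {
        struct demo_inode_ctx ctx = { .event_gen = 42, .need_refresh = 0 };
        demo_inode_need_refresh_set(&ctx);
        printf("event_gen=%d need_refresh=%d\n", ctx.event_gen, ctx.need_refresh);
        return 0;
    }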
amarts pushed a commit that referenced this pull request Jun 22, 2020
Found with GCC ASan:

Direct leak of 202 byte(s) in 2 object(s) allocated from:
    #0 0x7fc6c6ef0667 in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb0667)
    #1 0x7fc6c6bd145b in __gf_malloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:175
    #2 0x7fc6c6bd17a3 in gf_vasprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:223
    #3 0x7fc6c6bd1993 in gf_asprintf /path/to/glusterfs/libglusterfs/src/mem-pool.c:243
    #4 0x7fc6b0dc92f6 in init /path/to/glusterfs/xlators/cluster/afr/src/afr.c:590
    ...

Change-Id: I29feb1d30a045fb70472758e6ed4e195888090b2
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1278
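
The leak itself is the usual asprintf-style pattern: a formatted string is
allocated and then never released. A generic standalone sketch of the pattern
and its fix, using plain asprintf() rather than the actual afr init() code:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>

    int demo_init(const char *name)
    {
        char *banner = NULL;

        if (asprintf(&banner, "volume %s loaded", name) < 0)
            return -1;

        puts(banner);

        free(banner); /* without this, ASan reports a direct leak */
        return 0;
    }

    int main(void)
    {
        return demo_init("test-vol");
    }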
amarts pushed a commit that referenced this pull request Jun 22, 2020
Found with GCC's address sanitizer:

==67190==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24624 byte(s) in 6 object(s) allocated from:
    #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
    #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
    #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
    #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
    #4 0x7f623e2f9ed7 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2389

Direct leak of 8208 byte(s) in 2 object(s) allocated from:
    #0 0x7f62535c0837 in __interceptor_calloc (/usr/lib64/libasan.so.6+0xb0837)
    #1 0x7f62532a1690 in __gf_default_calloc glusterfs/mem-pool.h:122
    #2 0x7f62532a20ca in __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:144
    #3 0x7f62532c8128 in gf_store_iter_new /path/to/glusterfs/libglusterfs/src/store.c:511
    #4 0x7f623e2f9cf0 in glusterd_store_retrieve_bricks /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:2363
    #5 0x7fff5cb70bcf  ([stack]+0x15bcf)
    #6 0x7f623e309113 in glusterd_store_retrieve_volumes /path/to/glusterfs/xlators/mgmt/glusterd/src/glusterd-store.c:3505
    #7 0xfffeb96e61d  (<unknown module>)
    #8 0x7f623e4586d7  (/usr/lib64/glusterfs/9dev/xlator/mgmt/glusterd.so+0x2f86d7)

Change-Id: I9b2a543dc095f4fa739cd664fd4d608bf8c87d60
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1263
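
The pattern behind this leak is an iterator that is created but not destroyed
on every exit path. A hypothetical standalone sketch of the usual 'goto out'
cleanup idiom that avoids it (demo_iter_new/demo_iter_destroy stand in for
gf_store_iter_new() and its matching destroy call):

    #include <stdlib.h>

    struct demo_iter { int pos; };

    struct demo_iter *demo_iter_new(void)
    {
        return calloc(1, sizeof(struct demo_iter));
    }

    void demo_iter_destroy(struct demo_iter *it)
    {
        free(it); /* free(NULL) is a harmless no-op */
    }

    int demo_retrieve_bricks(int brick_count)
    {
        int ret = -1;
        struct demo_iter *iter = demo_iter_new();

        if (iter == NULL)
            goto out;
        if (brick_count < 0)
            goto out; /* early error: cleanup below still runs */

        ret = 0;
    out:
        demo_iter_destroy(iter);
        return ret;
    }

    int main(void)
    {
        return demo_retrieve_bricks(2);
    }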
amarts pushed a commit that referenced this pull request Jun 22, 2020
Found with GCC ThreadSanitizer:

WARNING: ThreadSanitizer: use of an invalid mutex (e.g. uninitialized or destroyed) (pid=188590)
    #0 pthread_mutex_lock <null> (libtsan.so.0+0x528ac)
    #1 client_ctx_del /path/to/glusterfs/libglusterfs/src/client_t.c:535 (libglusterfs.so.0+0xc681a)
    #2 client_destroy_cbk /path/to/glusterfs/xlators/protocol/server/src/server.c:944 (server.so+0xaf6e)
    #3 gf_client_destroy_recursive /path/to/glusterfs/libglusterfs/src/client_t.c:295 (libglusterfs.so.0+0xc5058)
    #4 client_destroy /path/to/glusterfs/libglusterfs/src/client_t.c:330 (libglusterfs.so.0+0xc60e4)
    ...

  Location is heap block of size 272 at 0x7b440001a180 allocated by thread T7:
    #0 calloc <null> (libtsan.so.0+0x3075a)
    #1 __gf_calloc /path/to/glusterfs/libglusterfs/src/mem-pool.c:151 (libglusterfs.so.0+0x6e42b)
    #2 gf_client_get /path/to/glusterfs/libglusterfs/src/client_t.c:155 (libglusterfs.so.0+0xc571a)
    ...

The problem is that client_destroy() may call client_ctx_del() (which attempts
to lock 'scratch_ctx.lock') via the recursive deletion done by
gf_client_destroy_recursive(), so destroying the mutex before entering the
recursive deletion is an error. It should be destroyed later, just before the
client context is freed.

Change-Id: I730a628714d2b404e3f019ae552403da16b51b68
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1285
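
A minimal standalone sketch of the ordering bug and the fix (not the actual
client_t code; the struct here is hypothetical): destroying a mutex while a
later cleanup step still locks it is undefined behaviour, which is exactly
what TSan reports as "use of an invalid mutex".

    #include <pthread.h>
    #include <stdlib.h>

    struct demo_client {
        pthread_mutex_t lock; /* stands in for scratch_ctx.lock */
        void *ctx;
    };

    void demo_ctx_del(struct demo_client *c)
    {
        pthread_mutex_lock(&c->lock); /* still needs a valid mutex */
        c->ctx = NULL;
        pthread_mutex_unlock(&c->lock);
    }

    void demo_client_destroy(struct demo_client *c)
    {
        /* Buggy order would be: pthread_mutex_destroy() before demo_ctx_del() */
        demo_ctx_del(c);                 /* recursive/ctx cleanup first */
        pthread_mutex_destroy(&c->lock); /* destroy only when nothing can lock it */
        free(c);
    }

    int main(void)
    {
        struct demo_client *c = calloc(1, sizeof(*c));
        if (c == NULL)
            return 1;
        pthread_mutex_init(&c->lock, NULL);
        demo_client_destroy(c);
        return 0;
    }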
amarts pushed a commit that referenced this pull request Jun 22, 2020
Found with GCC UBsan:

rpcsvc.c:102:36: runtime error: passing zero to ctz(), which is not a valid argument
    #0 0x7fcd1ff6faa4 in rpcsvc_get_free_queue_index /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:102
    #1 0x7fcd1ff81e12 in rpcsvc_handle_rpc_call /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:837
    #2 0x7fcd1ff833ad in rpcsvc_notify /path/to/glusterfs/rpc/rpc-lib/src/rpcsvc.c:1000
    #3 0x7fcd1ff8829d in rpc_transport_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-transport.c:520
    #4 0x7fcd0dd72f16 in socket_event_poll_in_async /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2502
    #5 0x7fcd0dd8986a in gf_async ../../../../libglusterfs/src/glusterfs/async.h:189
    #6 0x7fcd0dd8986a in socket_event_poll_in /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2543
    #7 0x7fcd0dd8986a in socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2934
    #8 0x7fcd0dd8986a in socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2854
    #9 0x7fcd2048aff7 in event_dispatch_epoll_handler /path/to/glusterfs/libglusterfs/src/event-epoll.c:640
    #10 0x7fcd2048aff7 in event_dispatch_epoll_worker /path/to/glusterfs/libglusterfs/src/event-epoll.c:751
    ...

Fix, simplify, and prefer 'unsigned long' as the underlying bitmap type.

Change-Id: If3f24dfe7bef8bc7a11a679366e219a73caeb9e4
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1283
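
The undefined behaviour here comes from calling a count-trailing-zeros builtin
with an argument of zero. A standalone sketch of the guarded pattern, with the
bitmap word kept as 'unsigned long' as the commit message suggests (the
function name is a hypothetical stand-in for rpcsvc_get_free_queue_index()):

    #include <stdio.h>

    /* Return the index of the first 0 bit in 'map', or -1 if the word is full. */
    int demo_free_queue_index(unsigned long map)
    {
        unsigned long free_bits = ~map;

        if (free_bits == 0)
            return -1; /* all slots taken; never pass 0 to ctzl */

        return __builtin_ctzl(free_bits);
    }

    int main(void)
    {
        printf("%d\n", demo_free_queue_index(0x7UL)); /* prints 3 */
        printf("%d\n", demo_free_queue_index(~0UL));  /* prints -1 */
        return 0;
    }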
amarts pushed a commit that referenced this pull request Jun 22, 2020
Found with GCC ThreadSanitizer:

WARNING: ThreadSanitizer: data race (pid=287943)
  Write of size 4 at 0x00000047dfa0 by thread T4:
    #0 cli_rpc_notify /path/to/glusterfs/cli/src/cli.c:313 (gluster+0x40a6df)
    #1 rpc_clnt_handle_disconnect /path/to/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:821 (libgfrpc.so.0+0x13f04)
    #2 rpc_clnt_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-clnt.c:882 (libgfrpc.so.0+0x13f04)
    #3 rpc_transport_notify /path/to/glusterfs/rpc/rpc-lib/src/rpc-transport.c:520 (libgfrpc.so.0+0xf070)
    #4 socket_event_poll_err /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:1364 (socket.so+0x812c)
    #5 socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2958 (socket.so+0xc453)
    #6 socket_event_handler /path/to/glusterfs/rpc/rpc-transport/socket/src/socket.c:2854 (socket.so+0xc453)
    #7 event_dispatch_epoll_handler /path/to/glusterfs/libglusterfs/src/event-epoll.c:640 (libglusterfs.so.0+0xcaf23)
    #8 event_dispatch_epoll_worker /path/to/glusterfs/libglusterfs/src/event-epoll.c:751 (libglusterfs.so.0+0xcaf23)
    #9 <null> <null> (libtsan.so.0+0x2d33f)

  Previous read of size 4 at 0x00000047dfa0 by thread T3 (mutexes: write M3587):
    #0 cli_cmd_await_connected /path/to/glusterfs/cli/src/cli-cmd.c:321 (gluster+0x40ca37)
    #1 cli_cmd_process /path/to/glusterfs/cli/src/cli-cmd.c:123 (gluster+0x40cc74)
    #2 cli_batch /path/to/glusterfs/cli/src/input.c:29 (gluster+0x40c2b9)
    #3 <null> <null> (libtsan.so.0+0x2d33f)

  Location is global 'connected' of size 4 at 0x00000047dfa0 (gluster+0x00000047dfa0)

Change-Id: Ie85a8a80a2c5b82252c0c1d45e68ebe9938da2eb
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Fixes: #1311
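
The race is on the plain global 'connected' flag, written by the RPC notify
callback and read by the CLI thread waiting for a connection. One way to make
such a flag race-free is shown in the standalone sketch below (C11 atomics;
this is an illustration of the general technique, not necessarily how the
actual patch synchronizes it):

    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_int connected;

    void demo_rpc_notify_connect(void)    { atomic_store(&connected, 1); }
    void demo_rpc_notify_disconnect(void) { atomic_store(&connected, 0); }

    int demo_await_connected(void)
    {
        return atomic_load(&connected); /* replaces the unsynchronized read */
    }

    int main(void)
    {
        demo_rpc_notify_connect();
        printf("connected=%d\n", demo_await_connected());
        return 0;
    }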