Enable jemalloc in chdb shared lib for linux #20

Closed
auxten opened this issue Apr 24, 2023 · 4 comments
Labels
enhancement New feature or request test wanted Feature requires testing and validation

Comments

@auxten
Member

auxten commented Apr 24, 2023

As analyzed in #19, we need to enable jemalloc in the chdb shared library.

Following the tips in jemalloc/jemalloc#1237, I disabled the initial-exec TLS model like this:

contrib/jemalloc-cmake/include_linux_x86_64/jemalloc/internal/jemalloc_internal_defs.h.in

@@ -139,7 +139,8 @@
 /* #undef JEMALLOC_MUTEX_INIT_CB */
 
 /* Non-empty if the tls_model attribute is supported. */
-#define JEMALLOC_TLS_MODEL __attribute__((tls_model("initial-exec")))
+// #define JEMALLOC_TLS_MODEL __attribute__((tls_model("initial-exec")))
+#define JEMALLOC_TLS_MODEL
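
For readers unfamiliar with the macro, here is a minimal sketch of the pattern the diff changes. It is not jemalloc code: `DEMO_TLS_MODEL`, `USE_INITIAL_EXEC`, `demo_counter`, and `bump_demo_counter` are hypothetical names, and it assumes GCC or Clang on Linux. A thread-local variable tagged `initial-exec` needs room in the static TLS block, which may not be available when the library is loaded through dlopen. Leaving the macro empty lets the compiler pick its default model instead (normally global-dynamic for `-fPIC` code).

```cpp
#include <cassert>

// Hypothetical switch mirroring the diff: define USE_INITIAL_EXEC to get the
// old behavior, leave it undefined to fall back to the compiler default.
#ifdef USE_INITIAL_EXEC
#define DEMO_TLS_MODEL __attribute__((tls_model("initial-exec")))
#else
#define DEMO_TLS_MODEL
#endif

// One counter per thread; the macro only affects how the variable is
// addressed, not what the code does with it.
static __thread int demo_counter DEMO_TLS_MODEL = 0;

// Increments the calling thread's counter `times` times and returns it.
int bump_demo_counter(int times) {
    assert(times >= 0);
    for (int i = 0; i < times; ++i) {
        ++demo_counter;
    }
    return demo_counter;
}
```

The observable behavior is the same with either setting. The difference only shows up at load time, when the dynamic loader has to find TLS space for the shared object.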

With this change, however, dlopen crashes with a segfault under call_init:

Program received signal SIGSEGV, Segmentation fault.
0x00007f291e00fd29 in sallocx () from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
(gdb) bt
#0  0x00007f291e00fd29 in sallocx () from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#1  0x00007f2916555a36 in operator delete(void*, unsigned long) () from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#2  0x00007f291f45da44 in google::protobuf::internal::OnShutdownRun(void (*)(void const*), void const*) ()
   from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#3  0x00007f291f493172 in google::protobuf::(anonymous namespace)::GeneratedDatabase() () from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#4  0x00007f291f493370 in google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) ()
   from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#5  0x00007f291f50b937 in google::protobuf::(anonymous namespace)::AddDescriptors(google::protobuf::internal::DescriptorTable const*) ()
   from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#6  0x00007f29212d0fe2 in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fff92a04d68, env=env@entry=0x1440b90) at dl-init.c:72
#7  0x00007f29212d10e9 in call_init (env=0x1440b90, argv=0x7fff92a04d68, argc=1, l=<optimized out>) at dl-init.c:30
#8  _dl_init (main_map=0x152ea80, argc=1, argv=0x7fff92a04d68, env=0x1440b90) at dl-init.c:119
#9  0x00007f292105caed in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:182
#10 0x00007f29212d5058 in dl_open_worker (a=a@entry=0x7fff92a034f0) at dl-open.c:758
#11 0x00007f292105ca90 in __GI__dl_catch_exception (exception=0x7fff92a034d0, operate=0x7f29212d4ca0 <dl_open_worker>, args=0x7fff92a034f0)
    at dl-error-skeleton.c:208
#12 0x00007f29212d48fa in _dl_open (file=0x7f2920d4e530 "/home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so", mode=-2147483646, caller_dlopen=0x61b611, 
    nsid=-2, argc=1, argv=0x7fff92a034d0, env=0x1440b90) at dl-open.c:837
#13 0x00007f2921293258 in dlopen_doit (a=a@entry=0x7fff92a03710) at dlopen.c:66
#14 0x00007f292105ca90 in __GI__dl_catch_exception (exception=exception@entry=0x7fff92a036b0, operate=0x7f2921293200 <dlopen_doit>, args=0x7fff92a03710)
    at dl-error-skeleton.c:208
#15 0x00007f292105cb4f in __GI__dl_catch_error (objname=0x1493150, errstring=0x1493158, mallocedp=0x1493148, operate=<optimized out>, args=<optimized out>)
    at dl-error-skeleton.c:227
#16 0x00007f2921293a65 in _dlerror_run (operate=operate@entry=0x7f2921293200 <dlopen_doit>, args=args@entry=0x7fff92a03710) at dlerror.c:170
#17 0x00007f29212932e4 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#18 0x000000000061b611 in ?? ()
#19 0x000000000061a2ca in ?? ()
#20 0x00000000005298c4 in ?? ()
#21 0x0000000000517b9b in _PyEval_EvalFrameDefault ()
#22 0x00000000005106ed in ?? ()
#23 0x0000000000528d21 in _PyFunction_Vectorcall ()
#24 0x0000000000516e76 in _PyEval_EvalFrameDefault ()
#25 0x0000000000528b63 in _PyFunction_Vectorcall ()
#26 0x0000000000512192 in _PyEval_EvalFrameDefault ()
#27 0x0000000000528b63 in _PyFunction_Vectorcall ()
#28 0x0000000000511fb5 in _PyEval_EvalFrameDefault ()
#29 0x0000000000528b63 in _PyFunction_Vectorcall ()
#30 0x0000000000511fb5 in _PyEval_EvalFrameDefault ()
#31 0x0000000000528b63 in _PyFunction_Vectorcall ()
#32 0x0000000000511fb5 in _PyEval_EvalFrameDefault ()
#33 0x0000000000528b63 in _PyFunction_Vectorcall ()
#34 0x000000000052842e in ?? ()
#35 0x000000000053f559 in _PyObject_CallMethodIdObjArgs ()
#36 0x000000000053e786 in PyImport_ImportModuleLevelObject ()
#37 0x00000000005144bd in _PyEval_EvalFrameDefault ()
#38 0x00000000005106ed in ?? ()
#39 0x0000000000510497 in _PyEval_EvalCodeWithName ()
#40 0x00000000005f5be3 in PyEval_EvalCode ()
#41 0x0000000000619de7 in ?? ()
#42 0x0000000000615610 in ?? ()
#43 0x0000000000459cb3 in ?? ()
#44 0x0000000000459911 in PyRun_InteractiveLoopFlags ()
#45 0x00000000006194f5 in PyRun_AnyFileExFlags ()
#46 0x000000000044bca9 in ?? ()
#47 0x00000000005ea6e9 in Py_BytesMain ()
#48 0x00007f2920f49d0a in __libc_start_main (main=0x5ea6b0, argc=1, argv=0x7fff92a04d68, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fff92a04d58) at ../csu/libc-start.c:308
#49 0x00000000005ea5ea in _start ()

This needs further digging.

@lmangani
Contributor

Was -DENABLE_JEMALLOC=ON set for the library build during this test?

@auxten
Member Author

auxten commented Apr 24, 2023

Yes. The test code is in #22.

@auxten
Member Author

auxten commented Apr 25, 2023

With a debug build, I can see the detailed backtrace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f86d9c4d537 in __GI_abort () at abort.c:79
#2  0x00007f86d4d73af0 in sallocx (ptr=0x28ec390, flags=0) at ./contrib/jemalloc/src/jemalloc.c:3903
#3  0x00007f86c7ce9ac0 in Memory::untrackMemory<>(void*, unsigned long) (ptr=0x28ec390, size=16) at ./src/Common/memory.h:135
#4  operator delete (ptr=0x28ec390, size=16) at ./src/Common/new_delete.cpp:136
#5  0x00007f86bf0f2cfd in std::__1::__libcpp_operator_delete[abi:v15000]<void*, unsigned long>(void*, unsigned long) (__args=16, __args=16)
    at ./contrib/llvm-project/libcxx/include/new:256
#6  0x00007f86bf0f2c7d in std::__1::__do_deallocate_handle_size[abi:v15000]<>(void*, unsigned long) (__ptr=0x28ec390, __size=16)
    at ./contrib/llvm-project/libcxx/include/new:282
#7  0x00007f86bf0f2c18 in std::__1::__libcpp_deallocate[abi:v15000](void*, unsigned long, unsigned long) (__ptr=0x28ec390, __size=16, __align=8)
    at ./contrib/llvm-project/libcxx/include/new:296
#8  0x00007f86d6ce780a in std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> >::deallocate[abi:v15000](std::__1::pair<void (*)(void const*), void const*>*, unsigned long) (this=0x28ed970, __p=0x28ec390, __n=1) at ./contrib/llvm-project/libcxx/include/__memory/allocator.h:128
#9  0x00007f86d6ce75c5 in std::__1::allocator_traits<std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> > >::deallocate[abi:v15000](std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> >&, std::__1::pair<void (*)(void const*), void const*>*, unsigned long) (__a=..., __p=0x28ec390, __n=1)
    at ./contrib/llvm-project/libcxx/include/__memory/allocator_traits.h:282
#10 0x00007f86d6ce7ebe in std::__1::__split_buffer<std::__1::pair<void (*)(void const*), void const*>, std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> >&>::~__split_buffer (this=0x7ffde94dab90) at ./contrib/llvm-project/libcxx/include/__split_buffer:355
#11 0x00007f86d6ce7a2e in std::__1::vector<std::__1::pair<void (*)(void const*), void const*>, std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> > >::__push_back_slow_path<std::__1::pair<void (*)(void const*), void const*> > (this=0x28ed960, __x=...) at ./contrib/llvm-project/libcxx/include/vector:1540
#12 0x00007f86d6ce6d34 in std::__1::vector<std::__1::pair<void (*)(void const*), void const*>, std::__1::allocator<std::__1::pair<void (*)(void const*), void const*> > >::push_back[abi:v15000](std::__1::pair<void (*)(void const*), void const*>&&) (this=0x28ed960, __x=...) at ./contrib/llvm-project/libcxx/include/vector:1567
#13 0x00007f86d6ce5889 in google::protobuf::internal::OnShutdownRun (
    f=0x7f86d6d81460 <google::protobuf::internal::OnShutdownDelete<google::protobuf::EncodedDescriptorDatabase>(google::protobuf::EncodedDescriptorDatabase*)::{lambda(void const*)#1}::__invoke(void const*)>, arg=0x2904ce0) at ./contrib/protobuf/src/google/protobuf/message_lite.cc:584
#14 0x00007f86d6d81421 in google::protobuf::internal::OnShutdownDelete<google::protobuf::EncodedDescriptorDatabase> (p=0x2904ce0)
    at ./contrib/protobuf/src/google/protobuf/message_lite.h:630
#15 0x00007f86d6d3c9f4 in google::protobuf::(anonymous namespace)::GeneratedDatabase () at ./contrib/protobuf/src/google/protobuf/descriptor.cc:1303
#16 0x00007f86d6d3cbd4 in google::protobuf::DescriptorPool::InternalAddGeneratedFile (
    encoded_file_descriptor=0x7f86b9a55750 <descriptor_table_protodef_orc_5fproto_2eproto>, size=3380) at ./contrib/protobuf/src/google/protobuf/descriptor.cc:1357
#17 0x00007f86d6e18fc7 in google::protobuf::(anonymous namespace)::AddDescriptorsImpl (table=0x7f86d8f8ce00 <descriptor_table_orc_5fproto_2eproto>)
    at ./contrib/protobuf/src/google/protobuf/generated_message_reflection.cc:2767
#18 0x00007f86d6e1710e in google::protobuf::(anonymous namespace)::AddDescriptors (table=0x7f86d8f8ce00 <descriptor_table_orc_5fproto_2eproto>)
    at ./contrib/protobuf/src/google/protobuf/generated_message_reflection.cc:2778
#19 0x00007f86d6e170d9 in google::protobuf::internal::AddDescriptorsRunner::AddDescriptorsRunner (this=0x7f86d99169a8 <dynamic_init_dummy_orc_5fproto_2eproto>, 
    table=0x7f86d8f8ce00 <descriptor_table_orc_5fproto_2eproto>) at ./contrib/protobuf/src/google/protobuf/generated_message_reflection.cc:2813
#20 0x00007f86d6af9537 in __cxx_global_var_init () at ./buildlib/contrib/arrow-cmake/orc_proto.pb.cc:819
#21 0x00007f86d6af9549 in global constructors keyed to 000102 () from /home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so
#22 0x00007f86d9fd5fe2 in call_init (l=<optimized out>, argc=argc@entry=3, argv=argv@entry=0x7ffde94dc928, env=env@entry=0x28ecbf0) at dl-init.c:72
#23 0x00007f86d9fd60e9 in call_init (env=0x28ecbf0, argv=0x7ffde94dc928, argc=3, l=<optimized out>) at dl-init.c:30
#24 _dl_init (main_map=0x2923e60, argc=3, argv=0x7ffde94dc928, env=0x28ecbf0) at dl-init.c:119
#25 0x00007f86d9d61aed in __GI__dl_catch_exception (exception=<optimized out>, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:182
#26 0x00007f86d9fda058 in dl_open_worker (a=a@entry=0x7ffde94db140) at dl-open.c:758
#27 0x00007f86d9d61a90 in __GI__dl_catch_exception (exception=0x7ffde94db120, operate=0x7f86d9fd9ca0 <dl_open_worker>, args=0x7ffde94db140)
    at dl-error-skeleton.c:208
#28 0x00007f86d9fd98fa in _dl_open (file=0x7f86d9a311d0 "/home/Clickhouse/chdb/_chdb.cpython-39-x86_64-linux-gnu.so", mode=-2147483646, caller_dlopen=0x61b611, 
    nsid=-2, argc=3, argv=0x7ffde94db120, env=0x28ecbf0) at dl-open.c:837
#29 0x00007f86d9f98258 in dlopen_doit (a=a@entry=0x7ffde94db360) at dlopen.c:66
#30 0x00007f86d9d61a90 in __GI__dl_catch_exception (exception=exception@entry=0x7ffde94db300, operate=0x7f86d9f98200 <dlopen_doit>, args=0x7ffde94db360)
    at dl-error-skeleton.c:208
#31 0x00007f86d9d61b4f in __GI__dl_catch_error (objname=0x293f310, errstring=0x293f318, mallocedp=0x293f308, operate=<optimized out>, args=<optimized out>)
    at dl-error-skeleton.c:227
#32 0x00007f86d9f98a65 in _dlerror_run (operate=operate@entry=0x7f86d9f98200 <dlopen_doit>, args=args@entry=0x7ffde94db360) at dlerror.c:170
#33 0x00007f86d9f982e4 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87

I think this might be an initialization-order problem, because the protobuf library marks its initializers with __attribute__((init_priority(102))).
During dlopen, that init function calls google::protobuf::internal::OnShutdownRun, which in turn calls into libc++ (ClickHouse uses the libc++ implementation from contrib/llvm-project).
The new and delete operators are hooked to go through jemalloc (src/Common/new_delete.cpp in the backtrace), but it seems jemalloc's sallocx is not ready to be called at that point.
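
The suspected ordering problem can be illustrated in isolation. The sketch below is not chdb, protobuf, or jemalloc code: `DemoAllocator` and `EarlyUser` are hypothetical stand-ins, and it assumes GCC or Clang on Linux, where `init_priority` is available. It does not diagnose jemalloc itself. It only shows that an object with priority 102 is constructed before any default-priority global and can therefore see state that has not been set up yet.

```cpp
#include <cassert>

// Constant-initialized flags: they hold valid values before any dynamic
// initializer in this program runs.
static bool g_allocator_ready = false;
static bool g_early_user_saw_ready = false;

// Hypothetical allocator whose setup happens at the default priority.
struct DemoAllocator {
    DemoAllocator() { g_allocator_ready = true; }
};

// Hypothetical early user, standing in for a priority-102 initializer.
struct EarlyUser {
    EarlyUser() { g_early_user_saw_ready = g_allocator_ready; }
};

// Declared first, but constructed last: default priority is the lowest.
static DemoAllocator demo_allocator;
// Declared second, but constructed first: lower numbers run earlier.
__attribute__((init_priority(102))) static EarlyUser early_user;

// True if the early user found the stand-in allocator already set up.
bool early_user_saw_ready_allocator() { return g_early_user_saw_ready; }

// Checks the order this sketch expects; intended to be called from main.
void verify_observed_order() {
    assert(!g_early_user_saw_ready);  // early user ran before the allocator
    assert(g_allocator_ready);        // allocator was set up afterwards
}
```

Calling `verify_observed_order()` from `main` should pass on a toolchain that honors `init_priority`, even though `demo_allocator` appears first in the source.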

I tried removing everything in Common/new_delete.cpp and putting lib_jemalloc.a first in the link order.
But with that change jemalloc no longer takes effect, judging by the benchmark results.

I also got some interesting results comparing the performance of different allocators:

[root@ip-172-31-23-82 chdb]# LD_PRELOAD=/mnt/ClickBench/chdb/jemalloc/lib/libjemalloc.so.2 ./run.sh 
SELECT * FROM file("hits_*.parquet", Parquet) WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10;
56.50740558200005
22.836103801000263
22.74220113599995

[root@ip-172-31-23-82 chdb]# LD_PRELOAD=/mnt/ClickBench/chdb/mimalloc/out/release/libmimalloc.so.2.1 ./run.sh 
SELECT * FROM file("hits_*.parquet", Parquet) WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10;
mimalloc: warning: thread 0x7f9176777640: unable to allocate aligned OS memory directly, fall back to over-allocation (size: 0x80000000 bytes, address: 0x7f90d5754000, alignment: 0x2000000, commit: 1)
57.38256092699976
57.70822953300012
58.000434434

[root@ip-172-31-23-82 chdb]# LD_PRELOAD=/usr/lib64/libtcmalloc.so.4.5.9 ./run.sh 
SELECT * FROM file("hits_*.parquet", Parquet) WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10;
55.732121871000345
19.627927295000063
19.346901031000016

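It seems tcmalloc runs fastest (warm-run times of about 19.3 to 19.6), followed by jemalloc (about 22.7 to 22.8). mimalloc doesn't work well here: it stays at about 57 to 58 on every run and never speeds up after the first one.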

Next, I am trying to initialize jemalloc before protobuf during dlopen.
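
One way such an ordering could be forced, shown only as a minimal sketch and not necessarily what #22 does: a constructor function with priority 101 runs before priority-102 initializers in the same shared object, so it could touch the allocator first. The names `warm_up_allocator` and `EarlyUser` are hypothetical, plain `malloc`/`free` stand in for the real allocator entry points, and whether a first allocation is enough to make jemalloc fully ready is an assumption. GCC or Clang on Linux is assumed.

```cpp
#include <cassert>
#include <cstdlib>

// Constant-initialized flags, valid before any dynamic initializer runs.
static bool g_warmed_up = false;
static bool g_user_saw_warm = false;

// Priority 101 runs before init_priority(102) objects in the same module.
__attribute__((constructor(101))) static void warm_up_allocator() {
    // Assumption: one allocation/free pair triggers the allocator's lazy
    // setup. The volatile pointer keeps the pair from being optimized away.
    void* volatile p = std::malloc(1);
    assert(p != nullptr);
    std::free(p);
    g_warmed_up = true;
}

// Hypothetical stand-in for a priority-102 initializer such as protobuf's.
struct EarlyUser {
    EarlyUser() { g_user_saw_warm = g_warmed_up; }
};
__attribute__((init_priority(102))) static EarlyUser early_user;

// True if the priority-102 object was constructed after the warm-up ran.
bool early_user_ran_after_warm_up() { return g_user_saw_warm; }
```

If this pattern works for chdb, `early_user_ran_after_warm_up()` would return true inside the loaded module. It only orders initializers within one shared object or executable, not across separately loaded libraries.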

@lmangani
Contributor

lmangani commented May 5, 2023

Great work! Implemented in #22 by auxten and ready for retesting in #19

@lmangani lmangani added enhancement New feature or request test wanted Feature requires testing and validation labels May 10, 2023
@auxten auxten closed this as completed Jun 5, 2023