How would I build a shared object file? #27

Open
SamSaffron opened this issue Apr 28, 2020 · 7 comments

@SamSaffron

I'm trying to get my head around Bazel. Is there a way of building a tcmalloc.so shared object file from this project?

I know the recommendation is to compile applications with tcmalloc directly, but for my use case (https://github.com/SamSaffron/allocator_bench) I would like to do a side-by-side comparison with gperftools and jemalloc, which are LD_PRELOADed.

I'm also not against experimenting with a statically compiled Ruby that includes tcmalloc. If we can prove it is faster or better, maybe the Ruby folks would be open to adding it.
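
For readers unfamiliar with the LD_PRELOAD approach mentioned above, a minimal sketch of a side-by-side run might look like the following. The helper name `run_with_allocator`, the benchmark script `bench.rb`, and the library paths in the comments are hypothetical placeholders, not names taken from allocator_bench or this repository.

```shell
#!/bin/sh
# Run a command with an allocator preloaded via LD_PRELOAD.
# Usage: run_with_allocator LABEL LIB_PATH COMMAND [ARGS...]
# An empty LIB_PATH means "use whatever allocator the program already has".
run_with_allocator() {
  label=$1
  lib=$2
  shift 2

  # Skip allocators whose shared object is not present on this machine.
  if [ -n "$lib" ] && [ ! -f "$lib" ]; then
    echo "$label: skipped ($lib not found)"
    return 0
  fi

  echo "== $label =="
  if [ -n "$lib" ]; then
    # The dynamic loader resolves malloc/free from the preloaded library first.
    env LD_PRELOAD="$lib" "$@"
  else
    "$@"
  fi
}

# Example invocations (hypothetical script and paths, not executed here):
#   run_with_allocator built-in "" ruby bench.rb
#   run_with_allocator jemalloc /usr/lib/x86_64-linux-gnu/libjemalloc.so.2 ruby bench.rb
#   run_with_allocator google/tcmalloc "$PWD/bazel-bin/tcmalloc/libtcmalloc.so" ruby bench.rb
```

Keeping the command identical across runs and changing only the preloaded library is what makes the comparison side by side.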

@ckennelly
Collaborator

Modifying the BUILD file to no longer have the linkstatic=1 attribute should produce a libtcmalloc.so.

That said, all of TCMalloc's dependencies (in Abseil) will end up being dynamically linked in that case, so the performance overhead will be higher than necessary.

@gaffneyc

@SamSaffron I had the same idea after seeing it on Hacker News yesterday. I was able to build a shared library using the code in #16.

Benchmarks were run on Linux 5.6.7, AMD 2700, compiled with gcc 9.3.0

ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
built-in mem: 170800 duration: 4.936174764
built-in mem (MALLOC_ARENA_MAX=2): 139292 duration: 5.318339345
tcmalloc 2.7 mem: 172408 duration: 4.401484569
tcmalloc 2.7.90 mem: 173216 duration: 4.773377773
jemalloc 5.0.0 mem: 169692 duration: 4.470789708
jemalloc 5.0.1 mem: 135116 duration: 4.678362483
jemalloc 5.1.0 mem: 168668 duration: 4.454680265
jemalloc 5.2.1 mem: 139296 duration: 4.780960854
google/tcmalloc mem: 197184 duration: 6.906723601

@karanaggarwal1994

How do I link Abseil into tcmalloc.so?

@MaskRay
Contributor

MaskRay commented Oct 2, 2022

Having a documented way to build a self-contained tcmalloc.so or tcmalloc.a will be very useful.

I am playing with lld with different malloc implementations today (https://gist.github.com/MaskRay/219effe23a767b85059097f863ebc085). For many allocators, using them is as simple as:

ld.lld @response.txt -o lld.glibc
ld.lld ~/Dev/mimalloc/out/release/libmimalloc.a @response.txt -o lld.mi
ld.lld ~/Dev/jemalloc/out/release/lib/libjemalloc.a @response.txt -o lld.je
ld.lld ~/Dev/snmalloc/out/release/libsnmallocshim-static.a @response.txt /usr/lib/x86_64-linux-gnu/libatomic.so.1 -o lld.sn

It seems that if a project doesn't use Bazel, it's very difficult to plug tcmalloc into it. (Yeah I know llvm-project has an unofficial util/bazel/ but still it's very unclear how to make it use tcmalloc.)

@junyer
Contributor

junyer commented Oct 4, 2022

AFAIK, creating a shared library with Bazel requires a cc_binary() rule with, at minimum, linkshared = True; see this part of the Bazel documentation. Depending on the situation, wrangling symbol visibility might also be necessary; see this part of the pybind_extension() implementation. One could start off by cloning //tcmalloc:tcmalloc as //tcmalloc:tcmalloc.so and then tweaking the latter until it works well enough for one's purposes.

@jesHrz

jesHrz commented Jun 18, 2024

Following this instruction solves the problem.

Briefly, add a new cc_binary target called libtcmalloc.so, with linkshared = 1 and a dependency on the tcmalloc target, at the end of the BUILD file. Running bazel build //tcmalloc:libtcmalloc.so will then generate the dynamic library.

cc_binary(
    name = "libtcmalloc.so",
    deps = [":tcmalloc"],
    linkshared = 1,
    copts = TCMALLOC_DEFAULT_COPTS,
)
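
After building such a target, it can be useful to locate the output and check which shared libraries it still depends on, for example to see whether Abseil ended up inside the .so or as a separate dynamic dependency. The sketch below assumes Bazel's default `bazel-bin` convenience symlink and a label with a non-empty package; the helper names `bazel_output_path` and `deps_matching` are made up for illustration.

```shell
#!/bin/sh
# Map a Bazel label such as //tcmalloc:libtcmalloc.so to the conventional
# location of its output under the bazel-bin symlink.
bazel_output_path() {
  label=${1#//}        # drop the leading "//"
  pkg=${label%%:*}     # package part, before the colon
  name=${label##*:}    # target name, after the colon
  echo "bazel-bin/$pkg/$name"
}

# Print the dynamic dependencies of FILE whose names match PATTERN.
# Prints nothing when there are none (or when ldd cannot read FILE).
deps_matching() {
  ldd "$1" 2>/dev/null | awk -v pat="$2" '$1 ~ pat { print $1 }'
}

# Example (not executed here):
#   bazel build //tcmalloc:libtcmalloc.so
#   lib=$(bazel_output_path //tcmalloc:libtcmalloc.so)
#   deps_matching "$lib" absl   # empty output suggests no separate Abseil .so is needed
```

An empty result from `deps_matching` is only a hint; inspecting the full `ldd` output is the more reliable check.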

A new question: How can I access the headers outside of the tcmalloc project?

@lano1106

lano1106 commented Aug 7, 2024

Regarding the benchmark that @gaffneyc posted above (Linux 5.6.7, AMD 2700, gcc 9.3.0, ruby 2.7.1p83): this is an old benchmark. Is there a more recent benchmark that compares the latest TCMalloc version against other allocators? I had assumed that the latest version was at least as good as the gperftools one, and certainly better with all the new bells and whistles such as rseq, THP support and C++14 sized delete. But since I am having such a hard time with the new version, I need to make sure that the effort is worth the trouble. After spending a full day learning Bazel and getting results that do not work well, I am starting to doubt it.

I do not care if allocation is slightly slower, as long as the use of THP means fewer TLB misses and makes the overall process faster, but those things have to be measured. Given all the rich documentation that the TCMalloc team provides, I am surprised that no performance numbers are published anywhere. gperftools TCMalloc had plenty of benchmark numbers comparing it against ptmalloc2.
