Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mimalloc allocator for static build #7378

Merged
merged 4 commits into from
Mar 19, 2025

Conversation

danleh
Copy link
Contributor

@danleh danleh commented Mar 17, 2025

Which fixes performance problems, especially on heavily multi-threaded workloads (many functions, high core count machines), see #5561.

Which fixes performance problems, especially on heavily multi-threaded workloads (many functions, high core count machines), see WebAssembly#5561.
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @kripken?

@tlively tlively requested a review from kripken March 17, 2025 21:55
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I worry about this kind of thing because the pinned version can get stale, but given the huge benefit here, it seems reasonable!

@dschuff
Copy link
Member

dschuff commented Mar 18, 2025

So I tried this out on my macbook and got some odd errors (they are different depending on whether mimalloc is linked as a static archive or an object library). Haven't figured out what's going on yet but we might want to gate this behind a CMake flag for now (otherwise it may break when we try to do an emscripten-releases release build).

@danleh
Copy link
Contributor Author

danleh commented Mar 18, 2025

So I tried this out on my macbook and got some odd errors (they are different depending on whether mimalloc is linked as a static archive or an object library). Haven't figured out what's going on yet but we might want to gate this behind a CMake flag for now (otherwise it may break when we try to do an emscripten-releases release build).

My bad. Added a MIMALLOC_STATIC option, configured the mimalloc build to only produce the static library, and enabled MIMALLOC_STATIC only in the Alpine Linux build (in both ci.yml and create_release.yml). Could you rerun CI?

@arsnyder16
Copy link
Contributor

looking forward to this being pull through into emsdk

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but the CI error looks like it might be real?

@danleh
Copy link
Contributor Author

danleh commented Mar 18, 2025

https://github.com/WebAssembly/binaryen/actions/runs/13925295048/job/38975354336?pr=7378#step:10:2629

mimalloc: assertion failed: at "/src/third_party/mimalloc/src/segment.c":527, _mi_segments_collect
assertion: "tld->pages_purge.first == NULL"

Mh, it seems we are hitting microsoft/mimalloc#1031.

(@kripken Indeed. Seems we had a race with our comments.)

I'll try if a different commit/branch of mimalloc fixes the issue. Can reproduce locally (not sure why I didn't hit this yesterday locally. Maybe I only tested with LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libmimalloc.so.2 instead of static linking, not sure.)

which is the same version that Ubuntu 24.10 has in their mimalloc2.0 package, see https://launchpad.net/ubuntu/+source/mimalloc
@kripken
Copy link
Member

kripken commented Mar 18, 2025

Current error on CI looks like a warning from mimalloc. We don't want such warnings shown to our users - is there a way to disable them?

I must say, the warnings + that error do make me a little worried here. Is there perhaps a "stable" branch of mimalloc we should be using?

@danleh
Copy link
Contributor Author

danleh commented Mar 19, 2025

Is there perhaps a "stable" branch of mimalloc we should be using?

There is, sorry for not finding that earlier. The README says master: latest stable release (still based on dev2), so I checked the submodule out at the master branch now (which is essentially v2.2.2, so slightly newer than the v2.1.7 version from the Ubuntu package that I used earlier). In either case, the assertion failures are fixed.

With regards to the warning messages: Those are apparently benign/pedantic debug info and only shown because mimalloc builds with an extended debug option by default (MI_DEBUG=2). See microsoft/mimalloc#640 (comment):

Note that the warnings in this case are a bit "pedantic": they can be useful to debug performance problems but in principle the "overallocation" path works fine on Linux (just a tad slower). If Linux would have MAP_ALIGNED (as in BSD) this problem would go away.

I now set MI_DEBUG=0 which silences these messages. For interested users, those detailed messages can still be shown via the environment variable MIMALLOC_VERBOSE=1 wasm-opt ...

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@kripken kripken merged commit d79639a into WebAssembly:main Mar 19, 2025
14 checks passed
@sbc100
Copy link
Member

sbc100 commented Mar 19, 2025

looking forward to this being pull through into emsdk

emsdk currently builds binaryen dynamically linked against glibc. Converting it to use mimalloc/musl/alpine would require some changes here: https://chromium.googlesource.com/emscripten-releases/+/refs/heads/main/src/build.py#900

@sbc100
Copy link
Member

sbc100 commented Mar 19, 2025

There is, sorry for not finding that earlier. The README says master: latest stable release (still based on dev2), so I checked the submodule out at the master branch now (which is essentially v2.2.2, so slightly newer than the v2.1.7 version from the Ubuntu package that I used earlier). In either case, the assertion failures are fixed.

I would recommend using the tagged v2.2.2 release rather than some un-tagged revision of master.

The master branch of mimalloc actually looks like it currently contains some conflict markers: https://github.com/microsoft/mimalloc/compare/v2.2.2..e1123000

@sbc100
Copy link
Member

sbc100 commented Mar 19, 2025

Filed: microsoft/mimalloc#1038

@arsnyder16
Copy link
Contributor

looking forward to this being pull through into emsdk

emsdk currently builds binaryen dynamically linked against glibc. Converting it to use mimalloc/musl/alpine would require some changes here: https://chromium.googlesource.com/emscripten-releases/+/refs/heads/main/src/build.py#900

Ok so wasm-opt in emsdk will be unchanged then?

So based on the performance improvements we should see non, but theoretically if we LD_PRELOAD mimalloc we should see the improvements shown here?
image

@danleh
Copy link
Contributor Author

danleh commented Mar 19, 2025

Ok so wasm-opt in emsdk will be unchanged then?

Correct.

if we LD_PRELOAD mimalloc we should see the improvements shown here?

Also correct. At least on my Debian system this works:

$ sudo apt install libmimalloc2.0
$ LD_PRELOAD=$(echo /usr/lib/*/libmimalloc.so) MIMALLOC_VERBOSE=1 ~/emsdk/upstream/bin/wasm-opt --version
mimalloc: option 'show_errors': 0 
[...]

@dschuff
Copy link
Member

dschuff commented Mar 19, 2025

@arsnyder16 if you try this out I would be very interested to hear the result. I don't necessarily mind also adjusting our EMSDK build if there are good performance gains.

@arsnyder16
Copy link
Contributor

I will collect the numbers and get back to you

@arsnyder16
Copy link
Contributor

@dschuff @danleh @sbc100 I compared glibc, jemalloc, mimalloc using emsdk 4.0.1

allocator wall time user time max rss (KB)
glibc 61s 420s 1867320
jemalloc 51s 370s 1861916
mimalloc 44s 347s 1680188
/root/emsdk/upstream/bin/wasm-opt \
--strip-target-features --post-emscripten  -Os \
--low-memory-unused --zero-filled-memory \
--pass-arg=directize-initial-contents-immutable --no-stack-ir \
 in.wasm -o out.wasm \
--mvp-features --enable-threads --enable-bulk-memory \
--enable-bulk-memory-opt --enable-call-indirect-overlong \
--enable-multivalue --enable-mutable-globals \
--enable-nontrapping-float-to-int --enable-reference-types --enable-sign-ext

@danleh
Copy link
Contributor Author

danleh commented Mar 20, 2025

Very cool, so in this particular case, mimalloc is both a runtime and max memory usage improvement. Is the .wasm binary available? (You have to zip it, if you want to attach/upload it to GitHub.)

@arsnyder16
Copy link
Contributor

@danleh Unfortunately it is not, but i can collect more information for you if you have anything in mind

@sbc100
Copy link
Member

sbc100 commented Mar 20, 2025

@dschuff @danleh @sbc100 I compared glibc, jemalloc, mimalloc using emsdk 4.0.1

Did you run those tests using the 4.0.1 binaryen binaries with the LD_PRELOAD trick?

If so then perhaps the simplest way to allow users to take advantage of this in emsdk would be to ship the malloc shared libraries and have emcc.py inject LD_PRELOAD when running wasm-opt?

@arsnyder16
Copy link
Contributor

Did you run those tests using the 4.0.1 binaryen binaries with the LD_PRELOAD trick?

Yes that is exactly what i did.

If so then perhaps the simplest way to allow users to take advantage of this in emsdk would be to ship the malloc shared libraries and have emcc.py inject LD_PRELOAD when running wasm-opt?

This seems reasonable.

@dschuff
Copy link
Member

dschuff commented Mar 20, 2025

I think just adding it to our existing static build would be much easier: https://chromium-review.googlesource.com/c/emscripten-releases/+/6377785

It means that only the LTO release builds will have it, but I think that's fine, the asserts builds are optimized more for package size.

@sbc100
Copy link
Member

sbc100 commented Mar 20, 2025

I think just adding it to our existing static build would be much easier: https://chromium-review.googlesource.com/c/emscripten-releases/+/6377785

It means that only the LTO release builds will have it, but I think that's fine, the asserts builds are optimized more for package size.

But that existing "static" build is not really static, in that it still links dynamically against glibc right? Its not static in the alpine linux sense is it? Converting to be actually static would involve either using musl libc, or trying to get a glibc to link statically, which I believe is not recommend.

@arsnyder16
Copy link
Contributor

But that existing "static" build is not really static, in that it still links dynamically against glibc right? Its not static in the alpine linux sense is it? Converting to be actually static would involve either using musl libc, or trying to get a glibc to link statically, which I believe is not recommend.

The documentation indicates this is possible when linking against the mimalloc object file https://github.com/microsoft/mimalloc?tab=readme-ov-file#static-override

@sbc100
Copy link
Member

sbc100 commented Mar 21, 2025

But that existing "static" build is not really static, in that it still links dynamically against glibc right? Its not static in the alpine linux sense is it? Converting to be actually static would involve either using musl libc, or trying to get a glibc to link statically, which I believe is not recommend.

The documentation indicates this is possible when linking against the mimalloc object file https://github.com/microsoft/mimalloc?tab=readme-ov-file#static-override

The terminology is little confusing here though. This static overrides does not require static linking of the program right? i.e. including mimalloc.o in your program is enough, even with dynamic linking? I think that means that we can just do this for all our binaryen builds, not just the LTO ones... if we want.

@arsnyder16
Copy link
Contributor

The terminology is little confusing here though. This static overrides does not require static linking of the program right? i.e. including mimalloc.o in your program is enough, even with dynamic linking? I think that means that we can just do this for all our binaryen builds, not just the LTO ones... if we want.

Agreed, I have never done this method of overriding but the explanation makes sense, but I can't confirm for sure it works.

@dschuff
Copy link
Member

dschuff commented Mar 21, 2025

Yes, you are right that we don't have a fully static build, and that we could use mimalloc in all of them. It would require changing the CMake code from this PR (to allow linking mimalloc into the dynamic-libbinaryen build) or using static libbinaryen in our asserts builds (both of which would make them a bit bigger) but I don't think it would be hard. I had just figured we could continue to optimize those builds for a smaller package size, since they are mostly unused other than for testing.
One argument against that would be that we should test as close to the production code as possible.

@dschuff
Copy link
Member

dschuff commented Mar 22, 2025

#7391 would enable it for dynamic linking too. We could just configure the emscripten-releases builder to use mimalloc unconditionally. It would increase the size of the asserts build by ~200-300k, but it would mean we'd be using the same allocator for those builds as for the release build, which I think would make me feel better about testing what our users are using.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants