Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use mimalloc as global memory allocator #749

Merged
merged 1 commit into from
Mar 12, 2024

Conversation

aaronmondal
Copy link
Member

@aaronmondal aaronmondal commented Mar 10, 2024

This should make allocation behavior more consistent across different builds and reduce memory fragmentation issues.

The allocation behavior of the nativelink executable is now configurable with MIMALLOC_* environment variables as described here:

https://microsoft.github.io/mimalloc/environment.html


This change is Reviewable

Copy link
Member Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@allada +@adam-singer +@MarcusSorealheis +@zbirenbaum

@allada and I talked at length about how to fix the fragmentation issues that @chrisstaite-menlo brought up. Initially we attempted to fix this withjemalloc in #738, but the additional build requirements associated with it (notably requiring make and autotools) made this infeasible.

As future improvements, there is an additional patch that we could add to potentially improve fragmentation behavior, and we can also go down the dragonflydb route of implementing active defragmentation directly against mimalloc's API similar to this: dragonflydb/dragonfly#523

Reviewable status: 0 of 4 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, publish-image, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04 (waiting on @adam-singer, @allada, @MarcusSorealheis, and @zbirenbaum)

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: 1 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @adam-singer, @MarcusSorealheis, and @zbirenbaum)


flake.nix line 43 at r1 (raw file):

        ...
      }: let
        stable-rust = pkgs.pkgsMusl.rust-bin.stable."1.76.0";

that's it?!?!?! This is all it took!?!!

I'm so confused on nix 😜

Copy link
Collaborator

@chrisstaite-menlo chrisstaite-menlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow... Nice and easy swap. Want me to run this and see how my memory performs?

Reviewable status: 1 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @adam-singer, @MarcusSorealheis, and @zbirenbaum)

Copy link
Member

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed all commit messages.
Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisstaite-menlo that'd be great, but you'll need to build it with nix to see what this PR is really doing.

The redistribute binary we use is built with nix which uses musl. We found that musl was causing significant performance issues when we use it's malloc, so we wanted to use musl for everything but let another allacator deal with memory management. In this case we found jemalloc and mimalloc reduced fragmentation significantly and we choose mimalloc because it can produce reproducible binaries without much (if any) work, unlike jemalloc.

In the end, yes if you could test it, it'd be awesome!

to build it, @aaronmondal correct me if I'm wrong, but I believe you need to install nix then run nix build .#

If you want to just test mimalloc that'd also be useful and can use your current way of building, however, it is possible you need to play with environs to get it to be less fragmented found here: https://microsoft.github.io/mimalloc/environment.html

Once we it landed we'll likely tune it in the code manually, but this is one step closer.

Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)

Copy link
Collaborator

@chrisstaite-menlo chrisstaite-menlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current build uses the Docker container with Bazel: https://github.com/TraceMachina/nativelink/blob/main/deployment-examples/docker-compose/Dockerfile is there an easy way to adapt that use use the build as you suggest?

Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)

Copy link
Member Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisstaite-menlo The build via the Dockerfile should work as well.

To give a bit more insight:

  • The Bazel build uses whatever libc it finds on the system. Since the Dockerfile is based on an Ubuntu base image it'll default to linking glibc and its built-in malloc. This means you're currently using glibc's implementation. After testing we found that this allocator doesn't free memory the way we want which causes the issue you described.
  • At the moment the same is true for the "regular" Cargo build. Some CI workflows are currently failing because I'm trying to completely switch that to a static musl build to bring it closer to what the nix build does.
  • The nix build already runs Cargo with a special configuration to statically link musl. The container image built from that is a minimal image with an overall closure size of ~28MB. The images we publish from that only contain that closure, i.e. a static nativelink executable and some certificates. In comparison, the Dockerfile-based image is ~4x larger at ~120MB.
  • The nix/musl build doesn't show the same fragmentation behavior that we see with glibc. That is, technically it runs more stable and stays within expected memory bounds. However with musl's default allocator this comes at the cost of a fairly significant slowdown (~2x IIRC).

In conclusion:

  • You can test this with the Dockerfile as before. What you'll get is nativelink linked against Ubuntu's glibc but with mimalloc as allocator rather than glibc's default allocator.
  • The nix build would be nativelink+musl+mimalloc. I'd expect it to now behave fairly similar in performance to the nativelink+glibc+mimalloc builds.

For future work I'll try to get the Bazel build up to parity with the nix/cargo/musl build so that we have less variation and better portability between the builds.

Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)

Copy link
Member Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 4 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04 (waiting on @MarcusSorealheis and @zbirenbaum)


flake.nix line 43 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

that's it?!?!?! This is all it took!?!!

I'm so confused on nix 😜

There is a fair bit of nontrivial magic going on here. What we're doing is:

  1. Take nixpkgs
  2. Extend it with the rust-overlay
  3. nixpkgs has cross-compilation toolchains built-in. That is, you can use things like pkgs.pkgsStatic to get a fully static nixpkgs, pkgsMusl for musl-based stuff and a large number of pkgsCross.xxx toolchains for crosscompilation: https://nix.dev/tutorials/cross-compilation.html
  4. Take that cross-transitioned variant of the added rust toolchain
  5. Wrap it in a specialized builder (crane) which invokes cargo and executes the build.
  6. Re-export the cargo toolchain in the local devShell for development. This is still slighty broken in CI because even on musl-based systems some rust tools (e.g. rustdoc and clippy) are dynamically linked against libgcc_s. Shouldn't be too hard to work around though.

The original breakage with missing symbols came from the rust toolchains using a glibc-configured linker. The pkgsMusl change swaps that to a linker that uses Musl instead.

Technically, we could use pkgsStatic everywhere and things would almost always be guaranteed to be perfectly portable. The reason we're not doing this (yet) is because the static packageset isn't prebuilt by nixpkgs. So if we used that every user would have to wait for like 10 years for their local systems to build static GCC and LLVM toolchains lol

This should make allocation behavior more consistent across different
builds and reduce memory fragmentation issues.

The allocation behavior of the `nativelink` executable is now
configurable with `MIMALLOC_*` environment variables as described here:

https://microsoft.github.io/mimalloc/environment.html
Copy link
Member

@adam-singer adam-singer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: 2 of 4 LGTMs obtained (waiting on @MarcusSorealheis and @zbirenbaum)

Copy link
Member Author

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-@MarcusSorealheis -@zbirenbaum

Reviewable status: :shipit: complete! 2 of 2 LGTMs obtained

@aaronmondal aaronmondal merged commit 6c647d6 into TraceMachina:main Mar 12, 2024
25 checks passed
@chrisstaite-menlo
Copy link
Collaborator

I've had this running from the docker-compose for 16 hours in prod now and memory usage is at 5GB and holding. Hopefully sorted things out, will check again in a day or two. CPU usage appears to be significantly reduced also which is weird, but may be just my sampling.

@MarcusSorealheis
Copy link
Collaborator

MarcusSorealheis commented Mar 14, 2024

Wow, that is fantastic news @chrisstaite-menlo

What operating system/version so that we can check another?

@chrisstaite-menlo
Copy link
Collaborator

That's Ubuntu 20.04 with 4.15 kernel. I'm due to upgrade all of the build machines to 22.04 soon, so will hopefully be able to test there also.

@MarcusSorealheis
Copy link
Collaborator

MarcusSorealheis commented Mar 14, 2024 via email

@chrisstaite-menlo
Copy link
Collaborator

45 hours up and at 7GB RAM usage.

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 2 LGTMs obtained

a discussion (no related file):
@chrisstaite-menlo This is good, but not as good as I was hoping. When you have time, could you run nativelink with:

MIMALLOC_PAGE_RESET=1

I believe this is the flag that does most of the fragmentation reduction that we'll be setting, but we need to do more testing. As I mentioned before, we still need to tune mimalloc.


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants