[PoC] Jemalloc profiler by squadgazzz · Pull Request #3533 · cowprotocol/services

squadgazzz · 2025-07-31T13:24:41Z

Description

This is a proof of concept for integrating the Jemalloc memory profiler into various services to collect memory dumps on demand. Profiling with the Jemalloc allocator is currently considered the most resource-efficient option (though this still needs to be validated in prod) and doesn’t require running additional applications, which is a major advantage.

The implementation is based on various open-source projects (e.g. https://github.com/tikv/tikv/blob/master/components/tikv_alloc/src/jemalloc.rs#L327).

Changes

Allocator selection: Mimalloc is currently used across all services and provides the best performance, so Jemalloc is only considered for collecting memory dumps. The allocator is selected at compile time through a feature, and the Dockerfile is updated accordingly. This allows granular selection of the Jemalloc allocator. This PR contains the implementation only for autopilot; other crates will be supported in follow-up PRs. The major disadvantage of this approach is that the binary needs to be recompiled, since selecting the memory allocator after compilation seems to be impossible.
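A minimal sketch of how compile-time allocator selection can look. The feature name `jemalloc-profiling` is an assumption (the description does not name the feature), and the fallback branch uses `std::alloc::System` as a stand-in for mimalloc so that the snippet builds without extra crates.

```rust
use std::alloc::System;

// Hypothetical feature name; the actual feature used in the PR may differ.
// This branch needs the `tikv-jemallocator` crate and is only compiled when
// the feature is enabled.
#[cfg(feature = "jemalloc-profiling")]
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

// Stand-in for the default allocator. In the services this branch would
// register mimalloc instead of the system allocator.
#[cfg(not(feature = "jemalloc-profiling"))]
#[global_allocator]
static GLOBAL: System = System;

/// Reports which branch was compiled in, handy for a startup log line.
pub fn allocator_name() -> &'static str {
    if cfg!(feature = "jemalloc-profiling") {
        "jemalloc"
    } else {
        "default"
    }
}
```

Because the choice is made by `cfg`, only one `#[global_allocator]` exists in any given build, which is why switching allocators requires a rebuild with a different feature set.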
Profiler activation: The Jemalloc profiler only records allocations that occur while profiling is active, and profiling is disabled by default. This is useful when allocations during service warm-up consume most of the memory and the leak only grows slowly afterwards.
Profiler control: ~~This is implemented using a combination of a USR2 signal and environment variables.~~ TODO: update it. This approach is much easier than introducing an HTTP API with auth, etc:
- Set the MEM_DUMP_PATH ENV param to specify the dump output directory.
- ~~Set the PROFILER_COMMAND ENV param with one of the values:~~ TODO: update it
  - enable - activates profiling
  - disable - disables profiling
  - dump - stores the recorded dump
  - run_for(<Duration, e.g. 1h>) - automatically activates profiling, records the dump for the provided duration, stores the dump, and disables profiling.
- Send USR2 using kill -USR2 <pid> to execute the command specified in the previous step.
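The command vocabulary listed above could be parsed roughly as follows. This is a sketch only: the type and function names are hypothetical, the accepted duration suffixes (`s`, `m`, `h`) are an assumption based on the `1h` example, and how the command is delivered to the process is left out.

```rust
use std::time::Duration;

/// The four commands from the list above.
#[derive(Debug, PartialEq)]
pub enum ProfilerCommand {
    Enable,
    Disable,
    Dump,
    RunFor(Duration),
}

/// Parses values such as "30s", "15m" or "1h". Only these three suffixes
/// are handled in this sketch.
fn parse_duration(input: &str) -> Option<Duration> {
    let input = input.trim();
    let split = input.len().checked_sub(1)?;
    // Reject inputs whose last character is not a single byte.
    if !input.is_char_boundary(split) {
        return None;
    }
    let (value, unit) = input.split_at(split);
    let value: u64 = value.parse().ok()?;
    let seconds = match unit {
        "s" => value,
        "m" => value.checked_mul(60)?,
        "h" => value.checked_mul(3600)?,
        _ => return None,
    };
    Some(Duration::from_secs(seconds))
}

/// Returns `None` for anything that is not one of the known commands.
pub fn parse_command(raw: &str) -> Option<ProfilerCommand> {
    let raw = raw.trim();
    match raw {
        "enable" => Some(ProfilerCommand::Enable),
        "disable" => Some(ProfilerCommand::Disable),
        "dump" => Some(ProfilerCommand::Dump),
        _ => {
            let inner = raw.strip_prefix("run_for(")?.strip_suffix(')')?;
            parse_duration(inner).map(ProfilerCommand::RunFor)
        }
    }
}
```

Returning `None` for unknown input keeps a typo in the command from silently toggling the profiler.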

Further automation

Because the profiler can be controlled at runtime, some infra automation can be built on top of resource consumption. For example, start profiling once memory usage reaches 50% and record a dump when it reaches 90%.
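The threshold idea can be sketched as a small state machine. All names here are hypothetical, and memory usage is assumed to arrive as a fraction between 0.0 and 1.0 from whatever monitoring is in place.

```rust
/// What the automation should do after a memory usage sample.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Action {
    Nothing,
    StartProfiling,
    Dump,
}

/// Tracks which thresholds have already fired so that each action
/// is emitted at most once.
pub struct MemoryWatcher {
    start_at: f64,
    dump_at: f64,
    profiling: bool,
    dumped: bool,
}

impl MemoryWatcher {
    pub fn new(start_at: f64, dump_at: f64) -> Self {
        Self { start_at, dump_at, profiling: false, dumped: false }
    }

    /// Feeds one usage sample (0.0 to 1.0) and returns the action to take.
    pub fn observe(&mut self, usage: f64) -> Action {
        if !self.profiling && usage >= self.start_at {
            self.profiling = true;
            return Action::StartProfiling;
        }
        if self.profiling && !self.dumped && usage >= self.dump_at {
            self.dumped = true;
            return Action::Dump;
        }
        Action::Nothing
    }
}
```

With `MemoryWatcher::new(0.5, 0.9)`, the first sample at or above 50% starts profiling and the first later sample at or above 90% triggers the dump.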

How to test

Try running it locally and in staging. Staging would require an infra change to create a PVC (this was already done for the Heaptrack profiler).

Copilot

Pull Request Overview

This PR introduces jemalloc memory profiling capabilities to replace the existing mimalloc allocator. The implementation adds signal-based memory profiling that can be triggered via SIGUSR2 to collect heap dumps for performance analysis.

- Replaces mimalloc with jemalloc as the global allocator
- Adds a JemallocMemoryProfiler that responds to SIGUSR2 signals to collect heap dumps
- Integrates the profiler into the autopilot service startup process

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| crates/shared/src/lib.rs | Adds new alloc module to shared library |
| crates/shared/src/alloc.rs | Implements JemallocMemoryProfiler with signal handling and dump functionality |
| crates/shared/Cargo.toml | Adds tikv-jemalloc-ctl dependency for profiling controls |
| crates/autopilot/src/run.rs | Integrates memory profiler into autopilot startup |
| crates/autopilot/src/main.rs | Switches global allocator from mimalloc to jemalloc |
| crates/autopilot/Cargo.toml | Replaces mimalloc with tikv-jemallocator dependency |
| Dockerfile | Adds make package for build dependencies |


github-actions · 2025-08-09T00:09:33Z

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.


# Description
Even after #3499, a memory leak [was noticed](#3554) when the UniV3 liquidity fetching is enabled in the Baseline solver. Looking at the code, I don't see an obvious reason for it other than the UniswapV3QuoterV2 contract instance being extensively cloned on each `/solve` request.

# Changes
- Use an `Arc`'ed UniswapV3QuoterV2 contract instance.

## How to test
I've tried to resurrect [this](a9ff88f) e2e test, but the liquidity fetching from the subgraph sometimes takes a very long time, so the test is too flaky to enable.

## Follow-ups
In any case, I need to find a way to make the Jemalloc profiler work (#3533) to collect memory dumps later in case the memory leak happens again.

## Related Issues
#3554

github-actions · 2025-09-04T00:08:34Z

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.


MartinquaXD · 2025-09-09T08:37:25Z

Given that you are waiting for the maintainers to unblock you, can this PR and the other one temporarily be closed?

squadgazzz added 7 commits July 31, 2025 16:24

Jemalloc profiler

878d3ca

Better file name

055764e

A bit safer approach

959c955

Naming

de3e72d

Init tracing

89a12a6

Missing binary

0adc8ba

Profiling duration

46d214a

squadgazzz requested a review from Copilot July 31, 2025 20:15

Copilot AI reviewed Jul 31, 2025


squadgazzz added 3 commits August 1, 2025 09:28

Missing return statement

131f753

System temp dir

bf57bba

Just in case

c8bf6cd

github-actions bot added the stale label Aug 9, 2025

github-actions bot closed this Aug 17, 2025

squadgazzz mentioned this pull request Aug 22, 2025

Avoid redundant UniswapV3QuoterV2 cloning #3578

Merged

squadgazzz removed the stale label Aug 22, 2025

squadgazzz reopened this Aug 22, 2025

squadgazzz added 4 commits August 22, 2025 17:22

Commands

23c1880

Merge branch 'main' into jemalloc-profiler

4a1c24b

# Conflicts: # Cargo.lock # crates/autopilot/src/run.rs

Process name

ef8eeec

Redundant Arc

3415c98

squadgazzz changed the title ~~Jemalloc profiler~~ [PoC] Jemalloc profiler Aug 22, 2025

squadgazzz added 3 commits August 22, 2025 19:18

Features

b1f262e

Rework dir path

76d6086

Naming

727e5f1

squadgazzz added 3 commits August 25, 2025 12:32

Default feature

14b1084

More granular allocator selection

9e001f4

Should be reverted

4a5a8e3

squadgazzz added 7 commits August 25, 2025 13:12

Formatting

d7024e3

Switch to sockets

b09e166

Merge branch 'main' into jemalloc-profiler

bf8cdb1

Merge branch 'main' into jemalloc-profiler

55d470f

Comments

02b23b9

Connection log

9ef421c

Merge branch 'main' into jemalloc-profiler

c8bdc0e

github-actions bot added the stale label Sep 4, 2025

squadgazzz closed this Sep 9, 2025

github-actions bot locked and limited conversation to collaborators Sep 9, 2025

squadgazzz wants to merge 27 commits into main from jemalloc-profiler


3 participants
