
Add periodic malloc_trim to prevent unbounded RSS growth in API workers #7481

Open

amasolov wants to merge 1 commit into pulp:main from amasolov:fix/api-worker-memory-trim

Conversation

@amasolov commented Mar 18, 2026

Summary

PulpApiWorker (gunicorn SyncWorker) exhibits unbounded RSS growth over time due to glibc heap fragmentation. Django's per-request allocation pattern creates and destroys many small C-level objects (ORM compilers, SQL strings, psycopg cursor state), causing glibc's malloc to retain freed pages rather than returning them to the OS.

This PR adds periodic gc.collect() + malloc_trim(0) calls in PulpApiWorker.handle_request() every N requests (default 1024, configurable via PULP_MEMORY_TRIM_INTERVAL env var, set to 0 to disable).

The fix is Linux-only (glibc malloc_trim), graceful no-op on other platforms. No new dependencies.
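The mechanism described above can be sketched as follows. This is a minimal, hypothetical illustration of the approach, not the PR's actual code (which lives in pulpcore/app/entrypoint.py and may differ in names and details):

```python
import ctypes
import ctypes.util
import gc
import sys


def _load_malloc_trim():
    """Return glibc's malloc_trim if available, else None (no-op elsewhere)."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
        trim = libc.malloc_trim
        trim.argtypes = [ctypes.c_size_t]
        trim.restype = ctypes.c_int
        return trim
    except (OSError, AttributeError):
        # musl and other non-glibc libcs have no malloc_trim
        return None


_malloc_trim = _load_malloc_trim()


def trim_memory():
    """Collect cyclic garbage, then ask glibc to return free heap pages to the OS."""
    gc.collect()
    if _malloc_trim is not None:
        _malloc_trim(0)  # returns 1 if memory was released, 0 otherwise
```

Because the function pointer resolves to None off Linux (or on non-glibc systems), the call degrades to a plain gc.collect() there, matching the "graceful no-op" behaviour described above.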

Problem

Observed in Ansible Automation Platform 2.6 deployments running pulpcore 3.49 on OpenShift: hub-api worker RSS grows ~1 kB/request even with zero user activity (liveness/readiness probes alone drive growth). Over hours this leads to OOM kills and pod restarts.

Profiling on a live cluster confirmed:

  • Python object counts are completely stable (gc.get_objects() delta ~0)
  • gc.collect() recovers 0 bytes (no reference cycles)
  • malloc_trim(0) recovers ~2 MB immediately (heap fragmentation confirmed)
  • RSS grows linearly without trimming, stabilizes completely with trimming

The root cause is glibc's default malloc behavior: small allocations spread across many arenas cause heap fragmentation, and freed blocks are not returned to the OS until malloc_trim is explicitly called.

Changes

pulpcore/app/entrypoint.py:

  • At module load: detect Linux, load libc.malloc_trim via ctypes
  • PulpApiWorker.handle_request(): after each request, increment counter; every PULP_MEMORY_TRIM_INTERVAL requests (default 1024), call gc.collect() then malloc_trim(0)
  • Log at worker init when trimming is enabled
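The per-request counting described in these bullets might look roughly like the following. Names here are hypothetical; the real change hooks PulpApiWorker.handle_request() in entrypoint.py:

```python
import ctypes
import gc
import sys

# Resolve glibc's malloc_trim once at import; None on non-Linux/non-glibc.
try:
    _malloc_trim = ctypes.CDLL(None).malloc_trim if sys.platform.startswith("linux") else None
except (OSError, AttributeError):
    _malloc_trim = None


class TrimmingRequestCounter:
    """Counts handled requests and trims the heap every `interval` requests (0 disables)."""

    def __init__(self, interval=1024):
        self.interval = interval
        self.count = 0

    def after_request(self):
        self.count += 1
        if self.interval > 0 and self.count % self.interval == 0:
            gc.collect()           # clear any cyclic garbage first
            if _malloc_trim is not None:
                _malloc_trim(0)    # hand freed heap pages back to the OS
            return True            # trimmed on this request
        return False
```

With the default interval of 1024 the trim cost is amortized to a fraction of a request's work, and interval 0 short-circuits the whole path.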

Configuration

| Env var | Default | Description |
| --- | --- | --- |
| PULP_MEMORY_TRIM_INTERVAL | 1024 | Run trim every N requests. Set to 0 to disable. |

Test plan

  • Verify workers start normally with default settings (trim enabled)
  • Verify PULP_MEMORY_TRIM_INTERVAL=0 disables trimming (no log message)
  • Verify RSS growth stabilizes under sustained probe/request load
  • Verify no functional regression on macOS (trim is a no-op, no errors)
  • Run existing unit/functional test suite

📜 Checklist

  • Commits are cleanly separated with meaningful messages (simple features and bug fixes should be squashed to one commit)
  • A changelog entry or entries has been added for any significant changes
  • Follows the Pulp policy on AI Usage
  • (For new features) - User documentation and test coverage has been added

See: Pull Request Walkthrough

@dralley (Contributor) commented Mar 18, 2026

I think a safer and more practical approach might be to just configure a maximum number of requests for a Gunicorn worker to handle before being rebooted.

We expose the options to do so on the API entrypoint already: https://github.com/pulp/pulpcore/blob/main/pulpcore/app/entrypoint.py#L153-L154

https://gunicorn.org/guides/docker/?h=memory#out-of-memory
https://gunicorn.org/reference/settings/#worker_connections
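For reference, the worker-recycling approach suggested here is driven by Gunicorn's max_requests and max_requests_jitter settings, e.g. in a gunicorn.conf.py (the values below are illustrative, not recommendations):

```python
# gunicorn.conf.py -- recycle workers as a safety net against memory growth
max_requests = 1000        # restart a worker after it has handled this many requests
max_requests_jitter = 50   # randomize the limit so workers don't all recycle at once
```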

@amasolov (Author)

> I think a safer and more practical approach might be to just configure a maximum number of requests for a Gunicorn worker to handle before being rebooted.

I agree that --max-requests is a practical safety net and should be a part of the story. However I think these two approaches are complementary rather than alternatives.

--max-requests masks the symptom by recycling workers periodically, but each worker still grows until it's replaced. Under heavier load in enterprise environments the recycling becomes more frequent and adds brief latency during worker replacement.

malloc_trim addresses the root cause: glibc retains freed pages in the process heap, and calling malloc_trim(0) returns them to the OS, so RSS stabilises and never grows further. No worker restart is needed.

Utilising --max-requests also requires changes in the end products (for example, AAP has no option to set it and keep it persistent), whereas malloc_trim would just work.

@dralley (Contributor) commented Mar 18, 2026

Can you briefly try using https://docs.python.org/3/library/tracemalloc.html (or something similar, like memray) to get a report on what is allocating memory during a standard liveness probe request?

I understand that would measure what is going on with Python's own allocators rather than libc malloc, but still, I wouldn't expect much fragmentation to accumulate on a service where the same endpoint is merely being called over and over, using and then releasing approximately the same amount of memory every time. So there is probably fragmentation, but it may also be triggered by other misbehavior.
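A minimal tracemalloc snapshot-diff along the lines suggested here could look like this (the workload line is a stand-in for real probe requests against the worker):

```python
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation for useful tracebacks
baseline = tracemalloc.take_snapshot()

# Stand-in workload; in a live worker this would be repeated liveness-probe requests.
workload = [bytearray(256) for _ in range(2000)]

after = tracemalloc.take_snapshot()
stats = after.compare_to(baseline, "lineno")
for stat in stats[:10]:
    print(stat)  # top allocation deltas, grouped by source line
tracemalloc.stop()
```

Positive size_diff entries that keep growing across runs point at a Python-level leak; stable or negative diffs alongside rising RSS point back at the allocator, which is what the data below shows.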

@pedro-psb pedro-psb linked an issue Mar 18, 2026 that may be closed by this pull request

logger = getLogger(__name__)

_MEMORY_TRIM_INTERVAL = int(os.environ.get("PULP_MEMORY_TRIM_INTERVAL", "1024"))
@pedro-psb (Member)

I guess there is some discussion about this, but in any case, this setting should be defined in settings.py as the others and documented in settings.md. Settings defined there are automatically overridable via PULP_{NAME} envvar.

@amasolov (Author)

@pedro-psb Good call, updated in the latest push:

  • MEMORY_TRIM_INTERVAL = 1024 added to settings.py (so it picks up PULP_MEMORY_TRIM_INTERVAL via dynaconf automatically)
  • entrypoint.py now reads from settings.MEMORY_TRIM_INTERVAL in init_process() instead of os.environ.get()
  • Documented in docs/admin/reference/settings.md
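A stdlib-only stand-in for the env-override behaviour described above (pulpcore itself resolves PULP_* variables through settings.py and dynaconf; the get_setting helper here is purely illustrative):

```python
import os


def get_setting(name, default):
    """Illustrative lookup: a PULP_<NAME> env var overrides the coded default."""
    raw = os.environ.get(f"PULP_{name}")
    return type(default)(raw) if raw is not None else default


os.environ["PULP_MEMORY_TRIM_INTERVAL"] = "2048"
MEMORY_TRIM_INTERVAL = get_setting("MEMORY_TRIM_INTERVAL", 1024)
```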

@amasolov (Author)

> Can you briefly try using https://docs.python.org/3/library/tracemalloc.html (or something similar, like memray) to get a report on what is allocating memory during a standard liveness probe request?

@dralley Sure

Here's tracemalloc data from a live AAP 2.6 cluster (pulpcore 3.49.49, Django 4.2.27, Python 3.12, glibc 2.34, OpenShift).

Baseline snapshot taken after lazy init settled, then 200 sequential curl requests to /pulp/api/v3/status/ from inside the pod, then diff snapshot.

RSS vs Python allocations (PID 2, 200 requests):

| Metric | Baseline | After 200 reqs | Delta |
| --- | --- | --- | --- |
| VmRSS | 168,932 kB | 181,028 kB | +12,096 kB |
| tracemalloc traced | 6,837,468 B | ~6,851,690 B | ~ +14 kB |
| gc.get_objects() | 306,151 | ~306,000 | ~0 |

tracemalloc top diffs (all negative = Python freeing, not leaking):
  • rest_framework/fields.py:625 -10,208 B (-81 objs)
  • rest_framework/fields.py:341 -9,968 B (-64 objs)
  • psycopg/_adapters_map.py:181 -9,376 B (-4 objs)
  • psycopg/_adapters_map.py:156 -9,376 B (-4 objs)
  • psycopg/_typeinfo.py:339 -9,304 B (-2 objs)
  • rest_framework/fields.py:381 -7,488 B (-86 objs)
  • django/utils/deconstruct.py:18 +6,664 B (+119 objs) <- largest positive, one-time lazy init

Earlier malloc_trim validation (same cluster, 4000 requests):

  • gc.collect() -> 0 bytes recovered (no reference cycles)
  • malloc_trim(0) -> ~2 MB recovered immediately
  • With periodic malloc_trim(0) every 1024 reqs: RSS stabilised at ~140 MB; one worker decreased 224 kB between req 2000 and 4000

12 MB of RSS growth with only 14 kB of Python allocation growth: the gap is glibc holding freed pages, and malloc_trim returns them to the OS.
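The RSS-versus-traced comparison behind those numbers can be reproduced with a short Linux-only snippet like this (the allocation loop stands in for the 200 status requests):

```python
import tracemalloc


def vmrss_kb():
    """Resident set size of this process in kB, read from /proc (Linux only)."""
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return 0


tracemalloc.start()
rss0 = vmrss_kb()
traced0, _ = tracemalloc.get_traced_memory()

# Stand-in workload; on the cluster this was 200 requests to /pulp/api/v3/status/.
blob = [bytes(1024) for _ in range(1024)]

rss1 = vmrss_kb()
traced1, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

rss_growth_kb = rss1 - rss0
traced_growth_kb = (traced1 - traced0) // 1024
fragmentation_gap_kb = rss_growth_kb - traced_growth_kb
```

A large positive fragmentation_gap_kb sustained over many iterations is the signature reported above: memory the allocator holds that Python never asked to keep.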

Gunicorn API workers exhibit unbounded RSS growth over time due to glibc
heap fragmentation. Django's per-request allocation pattern creates and
destroys many small C-level objects (ORM compilers, SQL strings, psycopg
cursor state) which causes glibc's malloc to retain freed pages in the
process heap rather than returning them to the OS.

Profiling on a live Ansible Automation Platform 2.6 deployment
(pulpcore 3.49.49, Django 4.2.27, Python 3.12) confirmed:
- Python object counts are completely stable (no object leak)
- gc.collect() recovers 0 bytes (no reference cycles)
- malloc_trim(0) recovers ~2 MB immediately (fragmentation confirmed)
- RSS grows ~1 kB/request without trimming

This adds periodic gc.collect() + malloc_trim(0) calls in
PulpApiWorker.handle_request() every MEMORY_TRIM_INTERVAL requests
(default 1024, configurable via PULP_MEMORY_TRIM_INTERVAL through
the standard Django/dynaconf settings, set 0 to disable).

The fix is Linux-only (glibc malloc_trim), graceful no-op on other
platforms. Testing shows RSS stabilizes completely after one-time lazy
initialization, eliminating unbounded growth.

closes pulp#7482

Assisted-by: Claude (Anthropic) - investigation, profiling, and code
Made-with: Cursor
@amasolov force-pushed the fix/api-worker-memory-trim branch from 4279638 to 86faabc on March 18, 2026 at 20:03


Development

Successfully merging this pull request may close these issues.

PulpApiWorker RSS grows unbounded due to glibc heap fragmentation

3 participants