fix memleak when input contain large image data by grimoire · Pull Request #4610 · InternLM/lmdeploy

grimoire · 2026-05-21T12:54:00Z

This is a temporary workaround.
A better fix would be to create a shared-memory pool and pass data handles to the engine.

Copilot

Pull request overview

This PR introduces a temporary workaround to mitigate host-memory growth when serving requests that include large multimodal (image) inputs, by periodically forcing Python GC + libc malloc trimming after a configurable number of multimodal sessions end. It also adds a small quality-of-life change to rename the ZMQ MP engine process for easier identification.

Changes:

* Add the `LMDEPLOY_MULTIMODAL_SESSION_TRIM_COUNT` env var to control how often memory trimming runs.
* Track ended multimodal sessions in the PyTorch Engine and trigger `gc.collect()` + `malloc_trim()` after the configured threshold.
* Route disaggregation “cache free” teardown through `Engine.end_session()` (instead of directly calling `scheduler.end_session`) so trimming logic can run; rename the ZMQ MP engine process via `prctl`.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `lmdeploy/pytorch/envs.py` | Adds env var for controlling multimodal-session trim frequency. |
| `lmdeploy/pytorch/engine/engine.py` | Implements multimodal session detection/counter and memory trimming via GC + `malloc_trim`. |
| `lmdeploy/pytorch/engine/mp_engine/zmq_engine.py` | Attempts to rename the MP process using `prctl` for easier debugging. |
| `lmdeploy/pytorch/disagg/conn/engine_conn.py` | Uses `Engine.end_session()` so teardown can trigger the new trim behavior. |

Comments suppressed due to low confidence (3)

lmdeploy/pytorch/engine/engine.py:350

gc.collect()/malloc_trim() are invoked synchronously in the request-processing path (e.g., end_session() and ZMQ free handling). Full GC + malloc trimming can stall the asyncio loop and introduce latency spikes. Consider offloading the trim to a background thread/task (or rate-limit it with time-based throttling) so session teardown doesn’t block the engine loop.

    def _maybe_trim_multimodal_session(self, has_multimodal: bool):
        """Trim host memory after enough multimodal sessions have ended."""
        if not has_multimodal or self._multimodal_session_trim_count <= 0:
            return

        self._multimodal_session_end_count += 1
        if self._multimodal_session_end_count < self._multimodal_session_trim_count:
            return

        self._multimodal_session_end_count = 0
        self._try_mem_trim()
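
One way to act on the suggestion above is to push the trim onto the event loop's default executor. The sketch below is illustrative only: `trim_host_memory` and `end_session_with_background_trim` are hypothetical names, not lmdeploy APIs, and it assumes a glibc-based Linux host. Note that `gc.collect()` still holds the GIL while it runs, so offloading reduces the stall rather than eliminating it; only the `malloc_trim` call, made through `ctypes.CDLL`, releases the GIL.

```python
import asyncio
import ctypes
import gc


def trim_host_memory() -> bool:
    """Run a full GC pass, then ask glibc to return free heap pages to the OS."""
    gc.collect()
    try:
        # malloc_trim is glibc-specific; other libcs may not provide it.
        return bool(ctypes.CDLL('libc.so.6').malloc_trim(0))
    except (OSError, AttributeError):
        # Library or symbol missing: treat the trim as a no-op.
        return False


async def end_session_with_background_trim() -> bool:
    """Hypothetical teardown hook that keeps the trim off the event-loop thread."""
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, trim_host_memory)
```

A time-based throttle (for example, skipping the trim if one ran within the last few seconds) could be layered on top of the same helper.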

lmdeploy/pytorch/engine/engine.py:337

_has_multimodal_session() relies on HistoryMultiModals.empty(), but that method currently iterates over the dict keys (see HistoryMultiModals.empty in lmdeploy/pytorch/messages.py) and can report non-empty incorrectly. That can cause _maybe_trim_multimodal_session to trigger trims unexpectedly or miss them in edge cases. Either fix HistoryMultiModals.empty() to check self.multimodals.values() or avoid using it here by checking for any non-empty modal data directly.

    def _has_multimodal_session(session) -> bool:
        """Check whether session has multimodal history."""
        return any(not seq.history_multimodals.empty() for seq in session.sequences.values())
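
To make the keys-versus-values concern concrete, here is a minimal sketch. `HistoryMultiModalsSketch` is a stand-in written for illustration, not the lmdeploy class; it assumes `multimodals` maps a modality name to a list of items, and `empty_by_keys` is only a guess at the shape of the problem described above.

```python
from typing import Any, Dict, List, Optional


class HistoryMultiModalsSketch:
    """Illustrative stand-in; the real class lives in lmdeploy/pytorch/messages.py."""

    def __init__(self, multimodals: Optional[Dict[str, List[Any]]] = None):
        self.multimodals = multimodals or {}

    def empty_by_keys(self) -> bool:
        # Iterating a dict yields its keys (modality names), which are
        # non-empty strings, so any registered modality looks "non-empty".
        return all(len(key) == 0 for key in self.multimodals)

    def empty_by_values(self) -> bool:
        # Inspect the stored items instead of the modality names.
        return all(len(items) == 0 for items in self.multimodals.values())


history = HistoryMultiModalsSketch({'image': []})
keys_result = history.empty_by_keys()      # False: 'image' is a non-empty key
values_result = history.empty_by_values()  # True: no image data is stored
```

Under these assumptions, a session whose history holds an `image` entry with no data would be counted as multimodal by the key-based check but not by the value-based one.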

lmdeploy/pytorch/engine/engine.py:349

This introduces new behavior gated by LMDEPLOY_MULTIMODAL_SESSION_TRIM_COUNT, but there’s no unit test ensuring the counter resets and _try_mem_trim is invoked only after the configured number of multimodal session endings. Since there are existing Engine unit tests under tests/pytorch/engine/, consider adding a focused test around _maybe_trim_multimodal_session (mocking _try_mem_trim).

    def _maybe_trim_multimodal_session(self, has_multimodal: bool):
        """Trim host memory after enough multimodal sessions have ended."""
        if not has_multimodal or self._multimodal_session_trim_count <= 0:
            return

        self._multimodal_session_end_count += 1
        if self._multimodal_session_end_count < self._multimodal_session_trim_count:
            return

        self._multimodal_session_end_count = 0
        self._try_mem_trim()
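
A focused test along the lines suggested could look like the sketch below. `TrimCounterSketch` is a hypothetical stand-in that copies the counter logic quoted above so the example stays self-contained; a real test would instead construct or patch the actual `Engine` and mock its `_try_mem_trim`.

```python
from unittest.mock import MagicMock


class TrimCounterSketch:
    """Stand-in holding only the state the quoted method touches."""

    def __init__(self, trim_count: int):
        self._multimodal_session_trim_count = trim_count
        self._multimodal_session_end_count = 0
        self._try_mem_trim = MagicMock()  # replaces the real GC + malloc_trim call

    def _maybe_trim_multimodal_session(self, has_multimodal: bool):
        if not has_multimodal or self._multimodal_session_trim_count <= 0:
            return
        self._multimodal_session_end_count += 1
        if self._multimodal_session_end_count < self._multimodal_session_trim_count:
            return
        self._multimodal_session_end_count = 0
        self._try_mem_trim()


def test_trim_fires_after_configured_count():
    engine = TrimCounterSketch(trim_count=3)
    engine._maybe_trim_multimodal_session(False)  # text-only sessions are ignored
    for _ in range(2):
        engine._maybe_trim_multimodal_session(True)
    engine._try_mem_trim.assert_not_called()
    engine._maybe_trim_multimodal_session(True)   # third multimodal session
    engine._try_mem_trim.assert_called_once()
    assert engine._multimodal_session_end_count == 0


def test_trim_disabled_when_count_is_zero():
    engine = TrimCounterSketch(trim_count=0)
    for _ in range(5):
        engine._maybe_trim_multimodal_session(True)
    engine._try_mem_trim.assert_not_called()
```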

+    def _try_mem_trim():
+        """Try to trim memory."""
+        try:
+            gc.collect()
+            ctypes.CDLL('libc.so.6').malloc_trim(0)
+        except Exception as e:
+            logger.debug(f'Memory trim failed: {e}')


+        # try rename the process
+        try:
+            import ctypes
+            ctypes.CDLL(None).prctl(15, b'ZMQMPEngine', 0, 0, 0)
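
For readers unfamiliar with the magic number in the fragment above: `15` is `PR_SET_NAME` from `<linux/prctl.h>`, which renames the calling thread. The sketch below wraps the same call in a hypothetical helper, `set_process_name`, that is not part of lmdeploy; it assumes Linux and simply reports failure elsewhere.

```python
import ctypes

PR_SET_NAME = 15  # value defined in <linux/prctl.h>


def set_process_name(name: bytes) -> bool:
    """Best-effort rename of the calling thread; returns True on success."""
    try:
        libc = ctypes.CDLL(None)
        # The kernel stores at most 15 bytes plus a terminating NUL,
        # so longer names are truncated here explicitly.
        return libc.prctl(PR_SET_NAME, name[:15], 0, 0, 0) == 0
    except (OSError, AttributeError, TypeError):
        # Non-Linux platform or prctl unavailable.
        return False


renamed = set_process_name(b'ZMQMPEngine')
```

When called from the main thread on Linux, the new name is what tools such as `top` and `/proc/<pid>/comm` report.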


* fix memleak when input contain large image data * fix sleep

fix memleak when input contain large image data

76594f5

Copilot AI review requested due to automatic review settings May 21, 2026 12:54

Copilot started reviewing on behalf of grimoire May 21, 2026 12:54

Copilot AI reviewed May 21, 2026

fix sleep

9302687

lvhan028 requested a review from CUHKSZzxy May 25, 2026 04:03

lvhan028 added the Bug:P1 label May 25, 2026

CUHKSZzxy approved these changes May 26, 2026

lvhan028 merged commit f2f4bc2 into InternLM:main May 26, 2026
5 of 6 checks passed

lvhan028 pushed a commit to lvhan028/lmdeploy that referenced this pull request May 27, 2026

fix memleak when input contain large image data (InternLM#4610)

e8fb43e

* fix memleak when input contain large image data * fix sleep

lvhan028 merged 2 commits into InternLM:main from grimoire:fix-largedata-memleak
