Memory leak in threadpool #8839

@fangmartin

What did you do?

I wrote a request handler in which each request supplies an image that the backend processes with PIL. To isolate the problem, I reduced the code to the thread-pool reproduction below. Memory usage keeps increasing as the number of requests grows.

import base64
import gc
import io
import os
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import loguru
import psutil
from PIL import Image, ImageEnhance


def process_image(image_base64):
    """process image"""
    img_bytes = base64.b64decode(image_base64)

    with io.BytesIO(img_bytes) as img_stream:
        pil_image = Image.open(img_stream)
        pil_image.load()
        enhancer = ImageEnhance.Contrast(pil_image)
        image = enhancer.enhance(1.5)

        enhancer = ImageEnhance.Sharpness(image)
        image = enhancer.enhance(1.3)
        image = image.convert("RGB")

        image.close()

    del img_bytes
    return "processed"


def threadpool_infer(image_path):
    with Image.open(image_path) as image:
        with io.BytesIO() as output:
            image.save(output, format="JPEG")
            jpeg_bytes = output.getvalue()
            image_b64 = base64.b64encode(jpeg_bytes).decode("utf-8")

    total_requests = 100
    concurrent_requests = 10
    batch_size = concurrent_requests

    process = psutil.Process(os.getpid())
    start_time = time.time()
    start_mem = process.memory_info().rss / 1024 / 1024

    try:
        with ThreadPoolExecutor(max_workers=concurrent_requests) as pool:
            for i in range(0, total_requests, batch_size):
                current_batch = min(batch_size, total_requests - i)
                batch_num = i // batch_size + 1

                pre_batch_mem = process.memory_info().rss / 1024 / 1024
                loguru.logger.info(f"Batch {batch_num} starting. Memory: {pre_batch_mem:.2f}MB")

                futures = []
                for _ in range(current_batch):
                    future = pool.submit(process_image, image_b64)
                    futures.append(future)

                results = []
                for future in as_completed(futures):
                    results.append(future.result())

                gc.collect()

                post_batch_mem = process.memory_info().rss / 1024 / 1024
                batch_mem_diff = post_batch_mem - pre_batch_mem
                loguru.logger.info(
                    f"Batch {batch_num} completed. "
                    f"Processed {len(results)} images. "
                    f"Memory: {post_batch_mem:.2f}MB "
                    f"(batch diff: {batch_mem_diff:+.2f}MB)"
                )

    finally:
        end_time = time.time()
        end_mem = process.memory_info().rss / 1024 / 1024

        duration = end_time - start_time
        mem_diff = end_mem - start_mem

        loguru.logger.error(
            f"\nTotal Time: {duration:.2f}s, Memory: {mem_diff:+.2f}MB ({start_mem:.1f}MB -> {end_mem:.1f}MB)"
        )
        gc.collect()


if __name__ == "__main__":
    threadpool_infer("my_image.jpg")
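
To check whether the growth depends on the Pillow work or on the thread pool itself, the batching loop above can be separated from the work function. The following is only a minimal sketch under that assumption: `run_batches` and its `read_memory_mb` parameter are hypothetical names, not part of the original script or of any library.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_batches(work, arg, total_requests, batch_size, read_memory_mb):
    """Run work(arg) in batches; return (batch_num, before_mb, after_mb) tuples.

    read_memory_mb is a caller-supplied callable (hypothetical), for example
    lambda: psutil.Process().memory_info().rss / 1024 / 1024
    """
    readings = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, total_requests, batch_size):
            current_batch = min(batch_size, total_requests - start)
            before = read_memory_mb()
            futures = [pool.submit(work, arg) for _ in range(current_batch)]
            for future in as_completed(futures):
                future.result()  # re-raises any exception from the worker
            after = read_memory_mb()
            readings.append((start // batch_size + 1, before, after))
    return readings
```

With the names from the script above, a call such as `run_batches(process_image, image_b64, 100, 10, lambda: process.memory_info().rss / 1024 / 1024)` would repeat the same measurement, and passing a trivial function instead of `process_image` gives a baseline for comparison.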

Here are the results of one run (10 batches of 10 images):

2025-03-28 18:24:49.837 | INFO     | __main__:threadpool_infer:55 - Batch 1 starting. Memory: 93.73MB
2025-03-28 18:24:50.393 | INFO     | __main__:threadpool_infer:70 - Batch 1 completed. Processed 10 images. Memory: 434.20MB (batch diff: +340.47MB)
2025-03-28 18:24:50.393 | INFO     | __main__:threadpool_infer:55 - Batch 2 starting. Memory: 434.20MB
2025-03-28 18:24:50.893 | INFO     | __main__:threadpool_infer:70 - Batch 2 completed. Processed 10 images. Memory: 696.03MB (batch diff: +261.83MB)
2025-03-28 18:24:50.893 | INFO     | __main__:threadpool_infer:55 - Batch 3 starting. Memory: 696.03MB
2025-03-28 18:24:51.368 | INFO     | __main__:threadpool_infer:70 - Batch 3 completed. Processed 10 images. Memory: 738.77MB (batch diff: +42.73MB)
2025-03-28 18:24:51.368 | INFO     | __main__:threadpool_infer:55 - Batch 4 starting. Memory: 738.77MB
2025-03-28 18:24:51.849 | INFO     | __main__:threadpool_infer:70 - Batch 4 completed. Processed 10 images. Memory: 739.19MB (batch diff: +0.42MB)
2025-03-28 18:24:51.849 | INFO     | __main__:threadpool_infer:55 - Batch 5 starting. Memory: 739.19MB
2025-03-28 18:24:52.347 | INFO     | __main__:threadpool_infer:70 - Batch 5 completed. Processed 10 images. Memory: 739.36MB (batch diff: +0.17MB)
2025-03-28 18:24:52.348 | INFO     | __main__:threadpool_infer:55 - Batch 6 starting. Memory: 739.36MB
2025-03-28 18:24:52.845 | INFO     | __main__:threadpool_infer:70 - Batch 6 completed. Processed 10 images. Memory: 739.56MB (batch diff: +0.20MB)
2025-03-28 18:24:52.846 | INFO     | __main__:threadpool_infer:55 - Batch 7 starting. Memory: 739.56MB
2025-03-28 18:24:53.326 | INFO     | __main__:threadpool_infer:70 - Batch 7 completed. Processed 10 images. Memory: 754.36MB (batch diff: +14.80MB)
2025-03-28 18:24:53.326 | INFO     | __main__:threadpool_infer:55 - Batch 8 starting. Memory: 754.36MB
2025-03-28 18:24:53.808 | INFO     | __main__:threadpool_infer:70 - Batch 8 completed. Processed 10 images. Memory: 757.69MB (batch diff: +3.33MB)
2025-03-28 18:24:53.808 | INFO     | __main__:threadpool_infer:55 - Batch 9 starting. Memory: 757.69MB
2025-03-28 18:24:54.315 | INFO     | __main__:threadpool_infer:70 - Batch 9 completed. Processed 10 images. Memory: 764.31MB (batch diff: +6.62MB)
2025-03-28 18:24:54.315 | INFO     | __main__:threadpool_infer:55 - Batch 10 starting. Memory: 764.31MB
2025-03-28 18:24:54.815 | INFO     | __main__:threadpool_infer:70 - Batch 10 completed. Processed 10 images. Memory: 767.67MB (batch diff: +3.36MB)
2025-03-28 18:24:54.816 | ERROR    | __main__:threadpool_infer:84 - 
Total Time: 4.98s, Memory: +673.62MB (93.7MB -> 767.4MB)

What did you expect to happen?

The memory used by the PIL image copies should be released after each request.

What actually happened?

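The memory isn't released. Over the run, RSS grew from 93.7MB to 767.4MB, even though gc.collect() is called after every batch.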
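
The per-batch readings in the log can be summarised with a few lines of arithmetic. The sketch below copies the "Memory:" values from the log; `growth_summary` is a hypothetical helper, and the choice of three batches as the comparison point is an assumption made for illustration. It only restates the logged figures and does not identify the cause of the growth.

```python
# RSS in MB, copied from the log: the "Batch 1 starting" reading,
# then the "completed" reading of each of the 10 batches.
start_mb = 93.73
post_batch_mb = [434.20, 696.03, 738.77, 739.19, 739.36,
                 739.56, 754.36, 757.69, 764.31, 767.67]


def growth_summary(start, readings, first_n):
    """Return (total growth, growth over the first first_n batches, its share)."""
    total = readings[-1] - start
    early = readings[first_n - 1] - start
    return total, early, early / total


total, early, share = growth_summary(start_mb, post_batch_mb, 3)
print(f"total growth: {total:.2f}MB")
print(f"growth in first 3 batches: {early:.2f}MB ({share:.0%} of total)")
```

On these figures, most of the increase is recorded in the first three batches, and the remaining seven batches add about 29MB between them.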

What are your OS, Python and Pillow versions?

  • OS: macOS 15.3.2
  • Python: 3.11.9
  • Pillow: 10.4.0
--------------------------------------------------------------------
Pillow 10.4.0
Python 3.11.9 (main, Apr 19 2024, 11:43:47) [Clang 14.0.6 ]
--------------------------------------------------------------------
Python executable is /opt/anaconda3/envs/llm-composition-score/bin/python3
System Python files loaded from /opt/anaconda3/envs/llm-composition-score
--------------------------------------------------------------------
Python Pillow modules loaded from /opt/anaconda3/envs/llm-composition-score/lib/python3.11/site-packages/PIL
Binary Pillow modules loaded from /opt/anaconda3/envs/llm-composition-score/lib/python3.11/site-packages/PIL
--------------------------------------------------------------------
--- PIL CORE support ok, compiled for 10.4.0
--- TKINTER support ok, loaded 8.6
--- FREETYPE2 support ok, loaded 2.13.2
--- LITTLECMS2 support ok, loaded 2.16
--- WEBP support ok, loaded 1.4.0
--- WEBP Transparency support ok
--- WEBPMUX support ok
--- WEBP Animation support ok
--- JPEG support ok, compiled for libjpeg-turbo 3.0.3
--- OPENJPEG (JPEG2000) support ok, loaded 2.5.2
--- ZLIB (PNG/ZIP) support ok, loaded 1.3.1
--- LIBTIFF support ok, loaded 4.6.0
*** RAQM (Bidirectional Text) support not installed
*** LIBIMAGEQUANT (Quantization method) support not installed
--- XCB (X protocol) support ok
--------------------------------------------------------------------
