[https://nvbugs/6064029][perf] Use fast PNG compression for visual gen serving by karljang · Pull Request #13074 · NVIDIA/TensorRT-LLM

karljang · 2026-04-15T07:07:32Z

Summary

Replace optimize=True with compress_level=1 in PIL PNG encoding for visual gen serving. PNG is lossless at all compression levels — compress_level only trades encode speed for file size.

Follow-up to #12903 which eliminated the double PNG encode. This PR addresses the remaining encode overhead.
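The losslessness claim in the Summary can be checked directly. The following is a minimal sketch, not code from this PR: the helper names (`make_test_image`, `encode_png`, `decode_rgb`) and the synthetic image are assumptions made for illustration. It encodes the same image with both settings and confirms that the decoded pixels are identical.

```python
from io import BytesIO

from PIL import Image


def make_test_image(width: int = 64, height: int = 48) -> Image.Image:
    """Build a small deterministic RGB image (hypothetical test data)."""
    pixels = bytes(
        (x * 3 + y * 5 + c * 85) % 256
        for y in range(height)
        for x in range(width)
        for c in range(3)
    )
    return Image.frombytes("RGB", (width, height), pixels)


def encode_png(image: Image.Image, **save_kwargs) -> bytes:
    """Encode `image` as PNG in memory, forwarding PIL save options."""
    buf = BytesIO()
    image.save(buf, format="PNG", **save_kwargs)
    return buf.getvalue()


def decode_rgb(png_bytes: bytes) -> bytes:
    """Decode PNG bytes back to raw RGB pixel data."""
    with Image.open(BytesIO(png_bytes)) as decoded:
        return decoded.convert("RGB").tobytes()


image = make_test_image()
before = encode_png(image, optimize=True)     # old setting
after = encode_png(image, compress_level=1)   # new setting

# Both encodings must decode to exactly the original pixels.
assert decode_rgb(before) == image.tobytes()
assert decode_rgb(after) == image.tobytes()
print(f"optimize=True: {len(before)} bytes, compress_level=1: {len(after)} bytes")
```

According to the Pillow documentation, `optimize=True` overrides `compress_level` and sets it to 9, which is why the two options are swapped rather than combined.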

Benchmark

PNG encode microbenchmark (B200, FLUX.2-dev 1280x720 image):

Method	Time	Size
`optimize=True` (before)	0.816s	1,249 KB
`compress_level=1` (after)	0.075s	1,358 KB

10.9x faster, +8.8% file size. PNG is lossless at all levels.
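A microbenchmark along these lines could be reproduced with a short script. This is a sketch under stated assumptions, not the script used for the numbers above: `make_gradient_image` and `time_png_encode` are hypothetical helpers, and the synthetic gradient stands in for a real FLUX.2-dev output, so absolute times and sizes will differ from the table.

```python
import time
from io import BytesIO

from PIL import Image


def make_gradient_image(width: int = 1280, height: int = 720) -> Image.Image:
    """Synthetic stand-in image (assumption); real model output compresses differently."""
    rows = [bytes((x + y) % 256 for x in range(width)) for y in range(height)]
    return Image.frombytes("L", (width, height), b"".join(rows)).convert("RGB")


def time_png_encode(image: Image.Image, repeats: int = 3, **save_kwargs):
    """Return (best_seconds, size_bytes) for encoding `image` as PNG."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        buf = BytesIO()
        start = time.perf_counter()
        image.save(buf, format="PNG", **save_kwargs)
        best = min(best, time.perf_counter() - start)
        size = len(buf.getvalue())
    return best, size


if __name__ == "__main__":
    image = make_gradient_image()
    for label, kwargs in [
        ("optimize=True", {"optimize": True}),
        ("compress_level=1", {"compress_level": 1}),
    ]:
        seconds, size = time_png_encode(image, **kwargs)
        print(f"{label:18s} {seconds:.3f}s  {size / 1024:,.0f} KB")
```

Taking the best of several repeats reduces noise from other processes; for a fair comparison, both settings should be measured on the same image and the same machine.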

End-to-end b64_json serving (A/B on same B200 node, FLUX.2-dev, 3 runs each):

Configuration	Run 1	Run 2	Run 3	Avg
Before (`optimize=True`)	12.44s	12.73s	12.96s	12.71s
After (`compress_level=1`)	10.75s	10.78s	10.81s	10.78s

1.93s saved per request (15.2% faster).
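The averages and the headline figure follow from the per-run timings in the table. As a quick check of the arithmetic (variable names are illustrative only):

```python
from statistics import mean

# Per-run end-to-end latencies in seconds, copied from the table above.
before = [12.44, 12.73, 12.96]  # optimize=True
after = [10.75, 10.78, 10.81]   # compress_level=1

avg_before = mean(before)
avg_after = mean(after)
saved = avg_before - avg_after
pct = saved / avg_before * 100  # reduction relative to the old latency

print(f"{avg_before:.2f}s -> {avg_after:.2f}s, saved {saved:.2f}s ({pct:.1f}%)")
# prints "12.71s -> 10.78s, saved 1.93s (15.2%)"
```

Note that the 15.2% figure is the reduction in per-request latency relative to the old average, not a throughput ratio.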

Test plan

# Start server
trtllm-serve <model> --port 8190

# Warmup
curl -s -o /dev/null http://localhost:8190/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cat", "response_format": "b64_json"}'

# Timed request
curl -s -o /tmp/resp.json -w "Total: %{time_total}s\n" \
  http://localhost:8190/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over mountains", "response_format": "b64_json"}'

# Verify valid image
python -c "
import json, base64
from PIL import Image
from io import BytesIO
data = json.load(open('/tmp/resp.json'))
img = Image.open(BytesIO(base64.b64decode(data['data'][0]['b64_json'])))
print(f'Valid: {img.size}, {img.mode}')
"

🤖 Generated with Claude Code

Summary by CodeRabbit

Refactor
- Made the PNG compression level configurable, giving explicit control over the trade-off between file size and encode speed (image quality is unaffected because PNG is lossless).

…n serving

Replace optimize=True with compress_level=1 in PIL PNG encoding. PNG is lossless at all compression levels; this only trades file size for encode speed.

Benchmark on B200 with FLUX.2-dev (1280x720 image):
optimize=True: 0.816s
compress_level=1: 0.075s (10.9x faster, +8.8% file size)

End-to-end b64_json serving (A/B on same node):
Before: 12.71s avg
After: 10.78s avg (1.93s saved, 15.2% faster)

Signed-off-by: Kanghwan Jang <kanghwanj@nvidia.com>
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>

coderabbitai · 2026-04-15T07:11:22Z

📝 Walkthrough

Walkthrough

The _save_pil_image() function in the media storage module has been extended to accept a configurable png_compress_level parameter, replacing the hardcoded optimize=True behavior with explicit compression level control passed to PIL's image save method.

Changes

Cohort / File(s)	Summary
PNG Compression Configuration `tensorrt_llm/serve/media_storage.py`	Added `png_compress_level: int = 1` parameter to `_save_pil_image()` method; replaced `optimize=True` with `compress_level=png_compress_level` in PNG save operation; updated docstring to document the new parameter.
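The shape of the change described in the table can be sketched as follows. The parameter list mirrors the signature quoted in the review of this PR, but the function name `save_pil_image_sketch` and the whole body are assumptions made for illustration; the real `_save_pil_image()` in `tensorrt_llm/serve/media_storage.py` is not reproduced here and may handle formats differently.

```python
from io import BytesIO
from typing import Any

from PIL import Image


def save_pil_image_sketch(
    pil_image: Image.Image,
    output: Any,  # path string or file-like object such as BytesIO
    format: str,
    quality: int,
    png_compress_level: int = 1,
) -> None:
    """Hypothetical stand-in for _save_pil_image(); the body is an assumption."""
    fmt = format.upper()
    if fmt == "PNG":
        # Previously optimize=True; now an explicit zlib level (1 = fastest).
        pil_image.save(output, format="PNG", compress_level=png_compress_level)
    elif fmt in ("JPG", "JPEG"):
        # JPEG has no alpha channel, so convert before saving.
        pil_image.convert("RGB").save(output, format="JPEG", quality=quality)
    else:
        pil_image.save(output, format=fmt)


# Usage: encode to memory with the fast default, then with maximum compression.
image = Image.new("RGB", (8, 8), (10, 20, 30))
fast_buf, small_buf = BytesIO(), BytesIO()
save_pil_image_sketch(image, fast_buf, "png", quality=95)
save_pil_image_sketch(image, small_buf, "png", quality=95, png_compress_level=9)
```

Keeping the level as a defaulted parameter leaves existing callers untouched while letting a caller that prefers smaller files opt into a higher level.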

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	Title follows the required template format with NVBugs ID, type [perf], and clearly summarizes the main change: optimizing PNG compression for visual generation serving.
Description check	✅ Passed	Description includes all major sections (Summary, Benchmark with detailed results, Test plan with reproducible steps) and clearly explains the change rationale, trade-offs, and performance impact.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tensorrt_llm/serve/media_storage.py (1)
602-608: ⚠️ Potential issue | 🟡 Minor

Add explicit return type annotation for _save_pil_image.

This touched function returns no value, so annotate it as -> None to satisfy repo typing rules.
Proposed fix
     def _save_pil_image(
         pil_image: Image.Image,
         output: Any,  # Can be path string or BytesIO
         format: str,
         quality: int,
         png_compress_level: int = 1,
-    ):
+    ) -> None:
As per coding guidelines: "Use static type annotations for all functions; always annotate return types including None if no return value."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/serve/media_storage.py` around lines 602 - 608, The function
_save_pil_image is missing an explicit return type; annotate its signature with
-> None to satisfy typing rules. Locate the _save_pil_image definition and add
the return annotation (def _save_pil_image(... ) -> None:) and ensure no return
values are introduced elsewhere in that function (keep behavior unchanged). This
targets the function named _save_pil_image in media_storage.py.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tensorrt_llm/serve/media_storage.py`:
- Around line 616-617: Add an explicit validation that the png_compress_level
parameter is an integer between 0 and 9 and raise a ValueError if it is not;
specifically, before any PNG write/save code that uses the png_compress_level
parameter, check "if not isinstance(png_compress_level, int) or not (0 <=
png_compress_level <= 9): raise ValueError(f'png_compress_level must be an int
in 0..9, got {png_compress_level!r}')" and apply the same guard to the other
function/location that accepts png_compress_level so invalid inputs fail fast
with a clear error.
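The guard suggested in this comment could look like the sketch below. The helper name `validate_png_compress_level` is hypothetical, and the extra rejection of `bool` goes beyond what the comment asks for (in Python, `True` is an `int`); whether the PR adopted any such check is not shown in this thread.

```python
def validate_png_compress_level(png_compress_level: object) -> int:
    """Return the level unchanged if it is an int in 0..9, else raise ValueError."""
    if (
        isinstance(png_compress_level, bool)  # assumption: treat True/False as invalid
        or not isinstance(png_compress_level, int)
        or not (0 <= png_compress_level <= 9)
    ):
        raise ValueError(
            f"png_compress_level must be an int in 0..9, got {png_compress_level!r}"
        )
    return png_compress_level


# Usage: valid levels pass through, invalid ones fail fast with a clear message.
print(validate_png_compress_level(1))
try:
    validate_png_compress_level(10)
except ValueError as exc:
    print(exc)
```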


📥 Commits

Reviewing files that changed from the base of the PR and between 4825da7 and b7cd725.

📒 Files selected for processing (1)

tensorrt_llm/serve/media_storage.py

karljang · 2026-04-15T07:16:51Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-15T07:22:45Z

PR_Github #43422 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

tensorrt-cicd · 2026-04-15T15:53:26Z

PR_Github #43422 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #33954 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

karljang · 2026-04-17T01:22:39Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-17T01:28:22Z

PR_Github #43869 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

tensorrt-cicd · 2026-04-17T05:37:31Z

PR_Github #43869 [ run ] completed with state FAILURE. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34321 completed with status: 'FAILURE'


karljang · 2026-04-17T06:33:46Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-17T06:39:37Z

PR_Github #43987 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

venkywonka

Amazing such high !/$ change!

tensorrt-cicd · 2026-04-18T06:09:46Z

PR_Github #43987 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34426 completed with status: 'FAILURE'


karljang · 2026-04-19T05:58:25Z

/bot run --disable-fail-fast --reuse-test

tensorrt-cicd · 2026-04-19T06:04:17Z

PR_Github #44161 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

tensorrt-cicd · 2026-04-19T07:31:08Z

PR_Github #44161 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34588 completed with status: 'FAILURE'


karljang · 2026-04-19T22:24:56Z

/bot run --disable-fail-fast --reuse-test

tensorrt-cicd · 2026-04-19T22:32:01Z

PR_Github #44229 [ run ] triggered by Bot. Commit: c7e3f35 Link to invocation

tensorrt-cicd · 2026-04-20T03:42:45Z

PR_Github #44229 [ run ] completed with state FAILURE. Commit: c7e3f35
/LLM/main/L0_MergeRequest_PR pipeline #34651 completed with status: 'FAILURE'


karljang · 2026-04-20T06:03:45Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-20T06:11:31Z

PR_Github #44345 [ run ] triggered by Bot. Commit: c7e3f35 Link to invocation

tensorrt-cicd · 2026-04-20T06:27:15Z

PR_Github #44345 [ run ] completed with state FAILURE. Commit: c7e3f35
/LLM/main/L0_MergeRequest_PR pipeline #34765 completed with status: 'FAILURE'


karljang · 2026-04-21T18:25:07Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-21T18:31:29Z

PR_Github #44787 [ run ] triggered by Bot. Commit: 9bc6cdc Link to invocation

mzweilz · 2026-04-22T09:50:15Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-22T10:00:33Z

PR_Github #44787 [ run ] completed with state SUCCESS. Commit: 9bc6cdc
/LLM/main/L0_MergeRequest_PR pipeline #35141 completed with status: 'FAILURE'


tensorrt-cicd · 2026-04-22T10:15:54Z

PR_Github #44953 [ run ] triggered by Bot. Commit: 9bc6cdc Link to invocation

karljang · 2026-04-24T16:37:06Z

/bot kill

karljang · 2026-04-24T16:37:34Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-24T16:43:12Z

PR_Github #45418 [ kill ] triggered by Bot. Commit: d0ead79 Link to invocation

tensorrt-cicd · 2026-04-24T16:43:15Z

PR_Github #45418 [ kill ] completed with state SUCCESS. Commit: d0ead79
Successfully killed previous jobs for commit d0ead79

Link to invocation

tensorrt-cicd · 2026-04-24T16:44:11Z

PR_Github #45419 [ run ] triggered by Bot. Commit: d0ead79 Link to invocation

tensorrt-cicd · 2026-04-25T15:06:31Z

PR_Github #45419 [ run ] completed with state SUCCESS. Commit: d0ead79
/LLM/main/L0_MergeRequest_PR pipeline #35654 completed with status: 'FAILURE'


karljang · 2026-04-25T18:01:39Z

/bot run --disable-fail-fast

karljang · 2026-04-27T06:39:53Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-27T06:46:55Z

PR_Github #45673 [ run ] triggered by Bot. Commit: d0ead79 Link to invocation

tensorrt-cicd · 2026-04-27T10:13:52Z

PR_Github #45673 [ run ] completed with state SUCCESS. Commit: d0ead79
/LLM/main/L0_MergeRequest_PR pipeline #35882 completed with status: 'SUCCESS'

CI Report

Link to invocation

karljang requested a review from a team as a code owner April 15, 2026 07:07

karljang requested a review from venkywonka April 15, 2026 07:07

github-actions Bot assigned karljang Apr 15, 2026

coderabbitai Bot reviewed Apr 15, 2026


Comment thread tensorrt_llm/serve/media_storage.py

venkywonka approved these changes Apr 17, 2026


Merge branch 'main' into fix/png-compress-level

c7e3f35

karljang requested a review from a team as a code owner April 19, 2026 22:24

karljang requested review from moraxu and tijyojwad April 19, 2026 22:24

moraxu approved these changes Apr 20, 2026


Merge branch 'main' into fix/png-compress-level

9bc6cdc

Merge branch 'main' into fix/png-compress-level

d0ead79

karljang enabled auto-merge (squash) April 23, 2026 23:18

karljang merged commit 2ea0e63 into NVIDIA:main Apr 27, 2026
5 checks passed

karljang deleted the fix/png-compress-level branch April 27, 2026 15:49
