[https://nvbugs/6064029][perf] Use fast PNG compression for visual gen serving #13074

Merged
karljang merged 4 commits into NVIDIA:main from karljang:fix/png-compress-level
Apr 27, 2026

Conversation

@karljang
Collaborator

@karljang karljang commented Apr 15, 2026

Summary

Replace optimize=True with compress_level=1 in PIL PNG encoding for visual gen serving. PNG is lossless at all compression levels — compress_level only trades encode speed for file size.
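
The lossless claim can be checked directly. The sketch below is illustrative and not part of the PR: it encodes a small synthetic image both ways and compares the decoded pixels. The helpers `encode_png` and `make_test_image` are hypothetical names. Per the Pillow documentation, `optimize=True` makes the PNG writer ignore `compress_level` and use level 9, which is why the two options are compared as alternatives.

```python
from io import BytesIO

from PIL import Image


def encode_png(image: Image.Image, **save_kwargs) -> bytes:
    """Encode an image to PNG bytes with the given Pillow save options."""
    buf = BytesIO()
    image.save(buf, format="PNG", **save_kwargs)
    return buf.getvalue()


def make_test_image(width: int = 256, height: int = 144) -> Image.Image:
    """Build a deterministic RGB pattern (a stand-in for a generated image)."""
    pixels = bytes(
        (x * 7 + y * 3 + c * 50) % 256
        for y in range(height)
        for x in range(width)
        for c in range(3)
    )
    return Image.frombytes("RGB", (width, height), pixels)


image = make_test_image()
slow = encode_png(image, optimize=True)     # old setting
fast = encode_png(image, compress_level=1)  # new setting

# Decode both files and compare raw pixel data against the source image.
decoded_slow = Image.open(BytesIO(slow)).convert("RGB")
decoded_fast = Image.open(BytesIO(fast)).convert("RGB")
pixels_match = decoded_slow.tobytes() == decoded_fast.tobytes() == image.tobytes()

print(f"optimize=True: {len(slow)} bytes, compress_level=1: {len(fast)} bytes")
print(f"Decoded pixels identical: {pixels_match}")
```

Only the encoded byte counts should differ between the two settings; the decoded pixel data is the same.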

Follow-up to #12903 which eliminated the double PNG encode. This PR addresses the remaining encode overhead.

Benchmark

PNG encode microbenchmark (B200, FLUX.2-dev 1280x720 image):

| Method | Time | Size |
| --- | --- | --- |
| optimize=True (before) | 0.816s | 1,249 KB |
| compress_level=1 (after) | 0.075s | 1,358 KB |

10.9x faster, +8.8% file size. PNG is lossless at all levels.
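
A microbenchmark of this shape could look like the sketch below. It is an assumption about how such a measurement might be set up, not the script used for the numbers above: the helper `time_png_encode` is a hypothetical name, and the image is synthetic Gaussian noise at 1280x720 rather than a FLUX.2-dev output, so absolute times and sizes will differ from the table.

```python
import time
from io import BytesIO

from PIL import Image


def time_png_encode(image, repeats=3, **save_kwargs):
    """Return (best_seconds, size_bytes) for encoding `image` as PNG."""
    best_seconds = float("inf")
    size_bytes = 0
    for _ in range(repeats):
        buf = BytesIO()
        start = time.perf_counter()
        image.save(buf, format="PNG", **save_kwargs)
        best_seconds = min(best_seconds, time.perf_counter() - start)
        size_bytes = len(buf.getvalue())
    return best_seconds, size_bytes


# Synthetic stand-in image (assumption): grayscale noise converted to RGB.
image = Image.effect_noise((1280, 720), 32).convert("RGB")

results = {
    "optimize=True": time_png_encode(image, optimize=True),
    "compress_level=1": time_png_encode(image, compress_level=1),
}
for label, (seconds, size_bytes) in results.items():
    print(f"{label:<18} {seconds:.3f}s  {size_bytes / 1024:,.0f} KB")
```

Taking the best of a few repeats reduces noise from other processes on the node; real generated images compress differently from noise, so the ratio is worth re-measuring on representative outputs.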

End-to-end b64_json serving (A/B on same B200 node, FLUX.2-dev, 3 runs each):

| | Run 1 | Run 2 | Run 3 | Avg |
| --- | --- | --- | --- | --- |
| Before (optimize=True) | 12.44s | 12.73s | 12.96s | 12.71s |
| After (compress_level=1) | 10.75s | 10.78s | 10.81s | 10.78s |

1.93s saved per request (15.2% faster).
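
For reference, the derived figures can be recomputed from the reported numbers. This is a plain arithmetic check using only values from the tables above; the variable names are illustrative.

```python
from statistics import mean

# End-to-end latencies from the A/B runs above, in seconds.
before = [12.44, 12.73, 12.96]
after = [10.75, 10.78, 10.81]

avg_before = mean(before)                       # about 12.71
avg_after = mean(after)                         # about 10.78
saved = avg_before - avg_after                  # about 1.93
reduction_pct = 100 * saved / avg_before        # about 15.2

# Microbenchmark figures from the encode table.
encode_speedup = 0.816 / 0.075                  # about 10.9
size_growth_pct = 100 * (1358 / 1249 - 1)       # about 8.7 from the rounded KB values

print(f"avg before {avg_before:.2f}s, avg after {avg_after:.2f}s")
print(f"saved {saved:.2f}s ({reduction_pct:.1f}% lower latency)")
print(f"encode speedup {encode_speedup:.1f}x, size growth {size_growth_pct:.1f}%")
```

The rounded KB sizes give roughly +8.7%; the +8.8% quoted above is consistent with the same ratio computed from unrounded byte counts.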

Test plan

# Start server
trtllm-serve <model> --port 8190

# Warmup
curl -s -o /dev/null http://localhost:8190/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A cat", "response_format": "b64_json"}'

# Timed request
curl -s -o /tmp/resp.json -w "Total: %{time_total}s\n" \
  http://localhost:8190/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over mountains", "response_format": "b64_json"}'

# Verify valid image
python -c "
import json, base64
from PIL import Image
from io import BytesIO
data = json.load(open('/tmp/resp.json'))
img = Image.open(BytesIO(base64.b64decode(data['data'][0]['b64_json'])))
print(f'Valid: {img.size}, {img.mode}')
"

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor
    • Improved PNG encoding with a configurable compression level for better control over file size and encode speed.

…n serving

Replace optimize=True with compress_level=1 in PIL PNG encoding.
PNG is lossless at all compression levels — this only trades file size
for encode speed.

Benchmark on B200 with FLUX.2-dev (1280x720 image):
  optimize=True:   0.816s
  compress_level=1: 0.075s  (10.9x faster, +8.8% file size)

End-to-end b64_json serving (A/B on same node):
  Before: 12.71s avg
  After:  10.78s avg  (1.93s saved, 15.2% faster)

Signed-off-by: Kanghwan Jang <kanghwanj@nvidia.com>
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
@karljang karljang requested a review from a team as a code owner April 15, 2026 07:07
@karljang karljang requested a review from venkywonka April 15, 2026 07:07
@coderabbitai
Contributor

coderabbitai Bot commented Apr 15, 2026

Walkthrough

The _save_pil_image() function in the media storage module has been extended to accept a configurable png_compress_level parameter, replacing the hardcoded optimize=True behavior with explicit compression level control passed to PIL's image save method.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| PNG Compression Configuration (tensorrt_llm/serve/media_storage.py) | Added png_compress_level: int = 1 parameter to _save_pil_image() method; replaced optimize=True with compress_level=png_compress_level in PNG save operation; updated docstring to document the new parameter. |
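
The change described in the table can be sketched as follows. This is a hypothetical reconstruction based only on the signature quoted in this review, not the actual body of `_save_pil_image()` in `media_storage.py`: the function name `save_pil_image_sketch` and the non-PNG branch are assumptions for illustration.

```python
from io import BytesIO
from typing import Any

from PIL import Image


def save_pil_image_sketch(
    pil_image: Image.Image,
    output: Any,  # path string or file-like object such as BytesIO
    format: str,
    quality: int,
    png_compress_level: int = 1,
) -> None:
    """Save an image, using a fast zlib level for PNG (hypothetical sketch)."""
    fmt = format.upper()
    if fmt == "PNG":
        # Previously optimize=True; compress_level=1 favors encode speed.
        pil_image.save(output, format="PNG", compress_level=png_compress_level)
    else:
        # Assumed handling for lossy formats, where `quality` applies.
        pil_image.save(output, format=fmt, quality=quality)


# Usage: encode a small solid-color image into memory.
buf = BytesIO()
save_pil_image_sketch(Image.new("RGB", (64, 64), "navy"), buf, "png", quality=95)
print(f"PNG bytes written: {len(buf.getvalue())}")
```

Defaulting `png_compress_level` to 1 keeps existing callers on the fast path while still letting a caller opt into a higher level when file size matters more than latency.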

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title follows the required template format with NVBugs ID, type [perf], and clearly summarizes the main change: optimizing PNG compression for visual generation serving. |
| Description check | ✅ Passed | Description includes all major sections (Summary, Benchmark with detailed results, Test plan with reproducible steps) and clearly explains the change rationale, trade-offs, and performance impact. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tensorrt_llm/serve/media_storage.py (1)

602-608: ⚠️ Potential issue | 🟡 Minor

Add explicit return type annotation for _save_pil_image.

This touched function returns no value, so annotate it as -> None to satisfy repo typing rules.

Proposed fix
     def _save_pil_image(
         pil_image: Image.Image,
         output: Any,  # Can be path string or BytesIO
         format: str,
         quality: int,
         png_compress_level: int = 1,
-    ):
+    ) -> None:

As per coding guidelines: "Use static type annotations for all functions; always annotate return types including None if no return value."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/serve/media_storage.py` around lines 602 - 608, The function
_save_pil_image is missing an explicit return type; annotate its signature with
-> None to satisfy typing rules. Locate the _save_pil_image definition and add
the return annotation (def _save_pil_image(... ) -> None:) and ensure no return
values are introduced elsewhere in that function (keep behavior unchanged). This
targets the function named _save_pil_image in media_storage.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tensorrt_llm/serve/media_storage.py`:
- Around line 616-617: Add an explicit validation that the png_compress_level
parameter is an integer between 0 and 9 and raise a ValueError if it is not;
specifically, before any PNG write/save code that uses the png_compress_level
parameter, check "if not isinstance(png_compress_level, int) or not (0 <=
png_compress_level <= 9): raise ValueError(f'png_compress_level must be an int
in 0..9, got {png_compress_level!r}')" and apply the same guard to the other
function/location that accepts png_compress_level so invalid inputs fail fast
with a clear error.

---

Outside diff comments:
In `@tensorrt_llm/serve/media_storage.py`:
- Around line 602-608: The function _save_pil_image is missing an explicit
return type; annotate its signature with -> None to satisfy typing rules. Locate
the _save_pil_image definition and add the return annotation (def
_save_pil_image(... ) -> None:) and ensure no return values are introduced
elsewhere in that function (keep behavior unchanged). This targets the function
named _save_pil_image in media_storage.py.
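
The range check suggested in the inline comment could be factored into a small helper, sketched below. The helper name `validate_png_compress_level` is hypothetical, and the extra `bool` exclusion is an addition beyond the suggested condition: in Python, `bool` is a subclass of `int`, so `True` would otherwise pass as level 1.

```python
def validate_png_compress_level(png_compress_level: object) -> int:
    """Return the level unchanged if it is an int in 0..9, else raise ValueError."""
    if (
        isinstance(png_compress_level, bool)  # reject True/False explicitly
        or not isinstance(png_compress_level, int)
        or not (0 <= png_compress_level <= 9)
    ):
        raise ValueError(
            f"png_compress_level must be an int in 0..9, got {png_compress_level!r}"
        )
    return png_compress_level


# Usage: valid levels pass through, invalid ones fail fast with a clear message.
print(validate_png_compress_level(1))
try:
    validate_png_compress_level(12)
except ValueError as exc:
    print(exc)
```

Calling such a helper once at the top of each function that accepts the parameter keeps the error message consistent across call sites.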

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ea92379a-f2a3-4f4f-a317-1e4eb3e00db9

📥 Commits

Reviewing files that changed from the base of the PR and between 4825da7 and b7cd725.

📒 Files selected for processing (1)
  • tensorrt_llm/serve/media_storage.py

Comment thread tensorrt_llm/serve/media_storage.py
@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43422 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43422 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #33954 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43869 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43869 [ run ] completed with state FAILURE. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34321 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43987 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

Collaborator

@venkywonka venkywonka left a comment

Amazing, such a high !/$ change!

@tensorrt-cicd
Collaborator

PR_Github #43987 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34426 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast --reuse-test

@tensorrt-cicd
Collaborator

PR_Github #44161 [ run ] triggered by Bot. Commit: b7cd725 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44161 [ run ] completed with state SUCCESS. Commit: b7cd725
/LLM/main/L0_MergeRequest_PR pipeline #34588 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang karljang requested a review from a team as a code owner April 19, 2026 22:24
@karljang karljang requested review from moraxu and tijyojwad April 19, 2026 22:24
@karljang
Collaborator Author

/bot run --disable-fail-fast --reuse-test

@tensorrt-cicd
Collaborator

PR_Github #44229 [ run ] triggered by Bot. Commit: c7e3f35 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44229 [ run ] completed with state FAILURE. Commit: c7e3f35
/LLM/main/L0_MergeRequest_PR pipeline #34651 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44345 [ run ] triggered by Bot. Commit: c7e3f35 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44345 [ run ] completed with state FAILURE. Commit: c7e3f35
/LLM/main/L0_MergeRequest_PR pipeline #34765 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44787 [ run ] triggered by Bot. Commit: 9bc6cdc Link to invocation

@mzweilz
Collaborator

mzweilz commented Apr 22, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44787 [ run ] completed with state SUCCESS. Commit: 9bc6cdc
/LLM/main/L0_MergeRequest_PR pipeline #35141 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44953 [ run ] triggered by Bot. Commit: 9bc6cdc Link to invocation

@karljang karljang enabled auto-merge (squash) April 23, 2026 23:18
@karljang
Collaborator Author

/bot kill

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #45418 [ kill ] triggered by Bot. Commit: d0ead79 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45418 [ kill ] completed with state SUCCESS. Commit: d0ead79
Successfully killed previous jobs for commit d0ead79

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45419 [ run ] triggered by Bot. Commit: d0ead79 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45419 [ run ] completed with state SUCCESS. Commit: d0ead79
/LLM/main/L0_MergeRequest_PR pipeline #35654 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #45673 [ run ] triggered by Bot. Commit: d0ead79 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45673 [ run ] completed with state SUCCESS. Commit: d0ead79
/LLM/main/L0_MergeRequest_PR pipeline #35882 completed with status: 'SUCCESS'

CI Report

Link to invocation

@karljang karljang merged commit 2ea0e63 into NVIDIA:main Apr 27, 2026
5 checks passed
@karljang karljang deleted the fix/png-compress-level branch April 27, 2026 15:49

5 participants