[https://nvbugs/6064029][perf] Use fast PNG compression for visual gen serving #13074
karljang merged 4 commits into NVIDIA:main
…n serving

Replace optimize=True with compress_level=1 in PIL PNG encoding. PNG is lossless at all compression levels; this only trades file size for encode speed.

Benchmark on B200 with FLUX.2-dev (1280x720 image):
optimize=True: 0.816s
compress_level=1: 0.075s (10.9x faster, +8.8% file size)

End-to-end b64_json serving (A/B on same node):
Before: 12.71s avg
After: 10.78s avg (1.93s saved, 15.2% faster)

Signed-off-by: Kanghwan Jang <kanghwanj@nvidia.com>
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tensorrt_llm/serve/media_storage.py (1)

602-608: ⚠️ Potential issue | 🟡 Minor

Add explicit return type annotation for `_save_pil_image`.

This touched function returns no value, so annotate it as `-> None` to satisfy repo typing rules.

Proposed fix

     def _save_pil_image(
         pil_image: Image.Image,
         output: Any,  # Can be path string or BytesIO
         format: str,
         quality: int,
         png_compress_level: int = 1,
    -):
    +) -> None:

As per coding guidelines: "Use static type annotations for all functions; always annotate return types including `None` if no return value."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/serve/media_storage.py` around lines 602 - 608, The function _save_pil_image is missing an explicit return type; annotate its signature with -> None to satisfy typing rules. Locate the _save_pil_image definition and add the return annotation (def _save_pil_image(... ) -> None:) and ensure no return values are introduced elsewhere in that function (keep behavior unchanged). This targets the function named _save_pil_image in media_storage.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/serve/media_storage.py`:
- Around line 616-617: Add an explicit validation that the png_compress_level
parameter is an integer between 0 and 9 and raise a ValueError if it is not;
specifically, before any PNG write/save code that uses the png_compress_level
parameter, check "if not isinstance(png_compress_level, int) or not (0 <=
png_compress_level <= 9): raise ValueError(f'png_compress_level must be an int
in 0..9, got {png_compress_level!r}')" and apply the same guard to the other
function/location that accepts png_compress_level so invalid inputs fail fast
with a clear error.
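The guard described in the inline comment can be sketched as a small standalone helper. The name `validate_png_compress_level` is hypothetical (it is not taken from `media_storage.py`); the condition and error message follow the reviewer's suggestion.

```python
def validate_png_compress_level(png_compress_level: int) -> int:
    """Return the level unchanged if it is an int in 0..9, else raise ValueError.

    Hypothetical helper for illustration; the check mirrors the review comment.
    """
    if not isinstance(png_compress_level, int) or not (0 <= png_compress_level <= 9):
        raise ValueError(
            f"png_compress_level must be an int in 0..9, got {png_compress_level!r}"
        )
    return png_compress_level


def is_rejected(value: object) -> bool:
    """Report whether the guard rejects `value` (used only to demonstrate the helper)."""
    try:
        validate_png_compress_level(value)  # type: ignore[arg-type]
    except ValueError:
        return True
    return False
```

One caveat of this exact check: `bool` is a subclass of `int` in Python, so `True` and `False` pass as levels 1 and 0 unless they are excluded separately.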
/bot run --disable-fail-fast
PR_Github #43422 [ run ] triggered by Bot. Commit:
PR_Github #43422 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #43869 [ run ] triggered by Bot. Commit:
PR_Github #43869 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #43987 [ run ] triggered by Bot. Commit:
venkywonka left a comment:
Amazing such high !/$ change!
PR_Github #43987 [ run ] completed with state
/bot run --disable-fail-fast --reuse-test
PR_Github #44161 [ run ] triggered by Bot. Commit:
PR_Github #44161 [ run ] completed with state
/bot run --disable-fail-fast --reuse-test
PR_Github #44229 [ run ] triggered by Bot. Commit:
PR_Github #44229 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #44345 [ run ] triggered by Bot. Commit:
PR_Github #44345 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #44787 [ run ] triggered by Bot. Commit:
/bot run --disable-fail-fast
PR_Github #44787 [ run ] completed with state
PR_Github #44953 [ run ] triggered by Bot. Commit:
/bot kill
/bot run --disable-fail-fast
PR_Github #45418 [ kill ] triggered by Bot. Commit:
PR_Github #45418 [ kill ] completed with state
PR_Github #45419 [ run ] triggered by Bot. Commit:
PR_Github #45419 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #45673 [ run ] triggered by Bot. Commit:
PR_Github #45673 [ run ] completed with state
Summary
Replace `optimize=True` with `compress_level=1` in PIL PNG encoding for visual gen serving. PNG is lossless at all compression levels; `compress_level` only trades encode speed for file size.

Follow-up to #12903, which eliminated the double PNG encode. This PR addresses the remaining encode overhead.
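PNG stores pixel data with DEFLATE (zlib), so the losslessness claim can be illustrated at that layer with the standard library alone. This is a minimal sketch on synthetic bytes, not the FLUX.2-dev image from the benchmark; `deflate_roundtrip` is a name introduced here for illustration.

```python
import zlib


def deflate_roundtrip(data: bytes, level: int) -> tuple[int, bytes]:
    """Compress `data` at the given zlib level; return (compressed size, restored bytes)."""
    compressed = zlib.compress(data, level)
    return len(compressed), zlib.decompress(compressed)


# Synthetic stand-in for raw pixel bytes (assumption: real images compress differently).
raw = bytes((x * 7 + y * 13) % 256 for y in range(256) for x in range(256))

size_fast, restored_fast = deflate_roundtrip(raw, 1)  # fastest level
size_best, restored_best = deflate_roundtrip(raw, 9)  # strongest level

# Both levels restore the input byte for byte; only the compressed size differs.
print(f"level 1: {size_fast} bytes, level 9: {size_best} bytes")
print("lossless:", restored_fast == raw and restored_best == raw)
```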
Benchmark
PNG encode microbenchmark (B200, FLUX.2-dev 1280x720 image):

| | `optimize=True` (before) | `compress_level=1` (after) |
|---|---|---|
| Encode time | 0.816s | 0.075s |

10.9x faster, +8.8% file size. PNG is lossless at all levels.
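A comparison along these lines can be reproduced with Pillow's `Image.save`, which accepts both `optimize` and `compress_level` for PNG. The sketch below uses a small synthetic gradient rather than a model-generated image, so its timings and sizes will not match the B200 figures; `encode_png` and `decode_pixels` are helper names introduced here, not functions from `media_storage.py`.

```python
import io
import time

from PIL import Image


def encode_png(image: Image.Image, **save_kwargs) -> tuple[bytes, float]:
    """Encode `image` as PNG in memory; return (png_bytes, seconds_taken)."""
    buffer = io.BytesIO()
    start = time.perf_counter()
    image.save(buffer, format="PNG", **save_kwargs)
    return buffer.getvalue(), time.perf_counter() - start


def decode_pixels(png_bytes: bytes) -> bytes:
    """Decode PNG bytes back to raw RGB pixel bytes."""
    return Image.open(io.BytesIO(png_bytes)).convert("RGB").tobytes()


# Synthetic RGB gradient (assumption: stands in for a generated image).
width, height = 320, 180
pixels = bytes(
    (x + y + c * 85) % 256
    for y in range(height)
    for x in range(width)
    for c in range(3)
)
image = Image.frombytes("RGB", (width, height), pixels)

png_small, t_small = encode_png(image, optimize=True)    # smallest output, slower
png_fast, t_fast = encode_png(image, compress_level=1)   # faster, larger output

print(f"optimize=True:    {len(png_small)} bytes in {t_small:.4f}s")
print(f"compress_level=1: {len(png_fast)} bytes in {t_fast:.4f}s")
print("pixels identical:", decode_pixels(png_small) == decode_pixels(png_fast) == pixels)
```

Per the Pillow documentation, `optimize=True` overrides `compress_level` (it is set to 9), which is why the two options are compared as alternatives rather than combined.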
End-to-end `b64_json` serving (A/B on same B200 node, FLUX.2-dev, 3 runs each):

| | Avg latency |
|---|---|
| Before (`optimize=True`) | 12.71s |
| After (`compress_level=1`) | 10.78s |

1.93s saved per request (15.2% faster).
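The headline ratios follow directly from the timings quoted in the commit message (0.816s and 0.075s per encode; 12.71s and 10.78s average end to end). A short check of that arithmetic, with variable names chosen here for illustration:

```python
# Timings as reported in the commit message.
encode_before_s = 0.816   # optimize=True
encode_after_s = 0.075    # compress_level=1
e2e_before_s = 12.71      # average end-to-end latency before
e2e_after_s = 10.78       # average end-to-end latency after

encode_speedup = encode_before_s / encode_after_s
e2e_saved_s = e2e_before_s - e2e_after_s
e2e_improvement_pct = 100 * e2e_saved_s / e2e_before_s

print(f"encode speedup: {encode_speedup:.1f}x")
print(f"end-to-end saved: {e2e_saved_s:.2f}s ({e2e_improvement_pct:.1f}% faster)")
```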
Test plan
🤖 Generated with Claude Code