[None][doc] Gemma 4: usage examples by Hudayday · Pull Request #14303 · NVIDIA/TensorRT-LLM

Hudayday · 2026-05-19T10:27:50Z

Summary

Append a new "Run Gemma 4" section to examples/models/core/gemma/README.md
covering the PyTorch backend flow that we actually develop and test against:

variant table (E2B / E4B / 26B-A4B / 31B with modalities)
trtllm-serve (OpenAI-compatible REST) launch + curl /v1/chat/completions
quickstart_multimodal.py for image and audio inputs (E2B / E4B only)
trtllm-eval mmmu and trtllm-eval covost2 entry points

Gemma 4 runs on the PyTorch backend with HF checkpoints loaded directly;
the legacy convert_checkpoint.py / trtllm-build flow is intentionally
omitted since it does not apply to Gemma 4.

Also registers the new section and its three subsections in the
top-of-file Table Of Contents.

Summary by CodeRabbit

Documentation
- Added detailed Gemma 4 documentation including deployment setup, OpenAI-compatible API integration with usage examples, multimodal inference capabilities for image and audio processing, accuracy benchmarking and evaluation tools with sample commands, practical quick-start examples, and guidance on supported model variants.

…lm-eval) Append a "Run Gemma 4" section to examples/models/core/gemma/README.md covering the PyTorch backend flow: - variant table (E2B / E4B / 26B-A4B / 31B with modalities) - trtllm-serve (OpenAI-compatible API) + curl example - quickstart_multimodal.py for image and audio inputs - trtllm-eval mmmu and covost2 entry points Gemma 4 runs on the PyTorch backend with HF checkpoints loaded directly; the legacy convert_checkpoint.py / trtllm-build flow is intentionally omitted since it does not apply. Signed-off-by: Hudayday <tianruih@nvidia.com>

Follow-up to de5e5b4: register the new "Run Gemma 4" section and its three subsections in the Table Of Contents at the top of examples/models/core/gemma/README.md. Signed-off-by: Hudayday <tianruih@nvidia.com>

coderabbitai · 2026-05-19T10:29:03Z

📝 Walkthrough

Walkthrough

This PR adds documentation for Gemma 4 model inference and evaluation. A new "Run Gemma 4" section describes launching trtllm-serve with Gemma 4 checkpoints, querying the OpenAI-compatible API, running multimodal inference, and evaluating accuracy using trtllm-eval on MMMU and CoVoST 2 benchmarks. The Table of Contents is updated to reflect these subsections.

Changes

Gemma 4 Documentation

Layer / File(s)	Summary
Gemma 4 server and API documentation `examples/models/core/gemma/README.md`	Table of Contents updated with new "Run Gemma 4" entry and sub-links. New section documents supported Gemma 4 HuggingFace variants/modalities, provides `trtllm-serve` launch examples using PyTorch backend, `curl` chat-completions query examples with internal chat-template handling, multimodal inference commands for image/audio via quickstart script, and `trtllm-eval` benchmark commands for MMMU and CoVoST 2 evaluation.

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description explains the change clearly (new README section for Gemma 4) and provides sufficient detail about what is being added.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The title '[None][doc] Gemma 4: usage examples' is directly related to the main change, which is adding usage examples for Gemma 4 to the README documentation.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/models/core/gemma/README.md`:
- Around line 853-854: The README's MMMU support line incorrectly lists
text-only checkpoints (26B-A4B and 31B) as supported for this vision
multiple-choice task; update the MMMU support matrix so it only lists
multimodal-capable checkpoints (e.g., E2B / E4B) and remove or mark 26B-A4B and
31B as "text-only / not supported" in the MMMU entry so the modality claim is
consistent with the earlier note about text-only checkpoints.
- Around line 833-845: The commands in the README invoke
quickstart_multimodal.py using a repo-root-relative path which will fail when
run from examples/models/core/gemma/; update the examples to either prepend the
correct relative path to quickstart_multimodal.py or add an explicit instruction
to cd to the repository root before running. Specifically edit the README
entries that reference quickstart_multimodal.py (the multimodal example lines)
to use a correct relative path from examples/models/core/gemma/ to the script or
add a preceding "cd" instruction so the quickstart_multimodal.py invocation
resolves correctly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bbb42d75-0cae-47d7-8bfb-b07cfcff52dd

📥 Commits

Reviewing files that changed from the base of the PR and between 989671b and a704288.

📒 Files selected for processing (1)

examples/models/core/gemma/README.md

…age+video) The previous variant table mislabeled 26B-A4B and 31B as text-only. All four Gemma 4 checkpoints ship a vision tower covering image and video; only E2B and E4B additionally ship an audio tower. Update the modality table, the MM intro paragraph, the inference subsection, the table-of-contents anchor, and add a video example to quickstart_multimodal.py invocations. Signed-off-by: Tianrui Hu <tianruih@nvidia.com> Signed-off-by: Hudayday <tianruih@nvidia.com>

The quickstart_multimodal.py invocation in the previous revision hits `AttributeError: 'TrtllmAttentionMetadata' object has no attribute 'kv_layout'` on the current Gemma 4 flow, so the example is not safe to ship as-is. Remove the multimodal subsection and its table-of-contents entry; serve and trtllm-eval examples remain unchanged. Signed-off-by: Tianrui Hu <tianruih@nvidia.com> Signed-off-by: Hudayday <tianruih@nvidia.com>

- Drop --backend pytorch from trtllm-serve / trtllm-eval examples; pytorch is the default backend now. - Reword the apply_chat_template explanation per reviewer suggestion. Signed-off-by: Hudayday <tianruih@nvidia.com>

Hudayday · 2026-05-20T05:37:49Z

/bot skip --comment "document updates, all commands tested"

tensorrt-cicd · 2026-05-20T05:43:11Z

PR_Github #49345 [ skip ] triggered by Bot. Commit: f5383b8 Link to invocation

tensorrt-cicd · 2026-05-20T05:49:18Z

PR_Github #49345 [ skip ] completed with state SUCCESS. Commit: f5383b8
Skipping testing for commit f5383b8

Link to invocation

Hudayday added 2 commits May 19, 2026 03:04

[None][doc] Gemma 4: add Table Of Contents entries

a704288

Follow-up to de5e5b4: register the new "Run Gemma 4" section and its three subsections in the Table Of Contents at the top of examples/models/core/gemma/README.md. Signed-off-by: Hudayday <tianruih@nvidia.com>

Hudayday requested review from a team as code owners May 19, 2026 10:27

Hudayday requested review from Shixiaowei02, Wanli-Jiang and arysef May 19, 2026 10:27

github-actions Bot assigned Hudayday May 19, 2026

coderabbitai Bot reviewed May 19, 2026

View reviewed changes

Comment thread examples/models/core/gemma/README.md Outdated

Comment thread examples/models/core/gemma/README.md

Hudayday added 2 commits May 19, 2026 03:31

Hudayday changed the title ~~[None][doc] Gemma 4: usage examples (trtllm-serve / multimodal / trtllm-eval)~~ [None][doc] Gemma 4: usage examples May 19, 2026

nv-guomingz reviewed May 20, 2026

View reviewed changes

Comment thread examples/models/core/gemma/README.md Outdated

nv-guomingz reviewed May 20, 2026

View reviewed changes

Comment thread examples/models/core/gemma/README.md Outdated

nv-guomingz reviewed May 20, 2026

View reviewed changes

Comment thread examples/models/core/gemma/README.md Outdated

nv-guomingz approved these changes May 20, 2026

View reviewed changes

Hudayday enabled auto-merge (squash) May 20, 2026 05:18

[None][doc] Gemma 4: address review feedback

f5383b8

- Drop --backend pytorch from trtllm-serve / trtllm-eval examples; pytorch is the default backend now. - Reword the apply_chat_template explanation per reviewer suggestion. Signed-off-by: Hudayday <tianruih@nvidia.com>

Hudayday merged commit aac0d65 into NVIDIA:main May 20, 2026
7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[None][doc] Gemma 4: usage examples#14303

[None][doc] Gemma 4: usage examples#14303
Hudayday merged 5 commits into
NVIDIA:mainfrom
Hudayday:docs/gemma4-examples-readme

Hudayday commented May 19, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 19, 2026 •

edited

Loading

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hudayday commented May 20, 2026

Uh oh!

tensorrt-cicd commented May 20, 2026

Uh oh!

tensorrt-cicd commented May 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Hudayday commented May 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hudayday commented May 20, 2026

Uh oh!

tensorrt-cicd commented May 20, 2026

Uh oh!

tensorrt-cicd commented May 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Hudayday commented May 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 19, 2026 •

edited

Loading