[None][doc] Gemma 4: usage examples#14303
Conversation
…lm-eval) Append a "Run Gemma 4" section to examples/models/core/gemma/README.md covering the PyTorch backend flow: - variant table (E2B / E4B / 26B-A4B / 31B with modalities) - trtllm-serve (OpenAI-compatible API) + curl example - quickstart_multimodal.py for image and audio inputs - trtllm-eval mmmu and covost2 entry points Gemma 4 runs on the PyTorch backend with HF checkpoints loaded directly; the legacy convert_checkpoint.py / trtllm-build flow is intentionally omitted since it does not apply. Signed-off-by: Hudayday <tianruih@nvidia.com>
Follow-up to de5e5b4: register the new "Run Gemma 4" section and its three subsections in the Table Of Contents at the top of examples/models/core/gemma/README.md. Signed-off-by: Hudayday <tianruih@nvidia.com>
📝 WalkthroughWalkthroughThis PR adds documentation for Gemma 4 model inference and evaluation. A new "Run Gemma 4" section describes launching ChangesGemma 4 Documentation
🎯 1 (Trivial) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@examples/models/core/gemma/README.md`:
- Around line 853-854: The README's MMMU support line incorrectly lists
text-only checkpoints (26B-A4B and 31B) as supported for this vision
multiple-choice task; update the MMMU support matrix so it only lists
multimodal-capable checkpoints (e.g., E2B / E4B) and remove or mark 26B-A4B and
31B as "text-only / not supported" in the MMMU entry so the modality claim is
consistent with the earlier note about text-only checkpoints.
- Around line 833-845: The commands in the README invoke
quickstart_multimodal.py using a repo-root-relative path which will fail when
run from examples/models/core/gemma/; update the examples to either prepend the
correct relative path to quickstart_multimodal.py or add an explicit instruction
to cd to the repository root before running. Specifically edit the README
entries that reference quickstart_multimodal.py (the multimodal example lines)
to use a correct relative path from examples/models/core/gemma/ to the script or
add a preceding "cd" instruction so the quickstart_multimodal.py invocation
resolves correctly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: bbb42d75-0cae-47d7-8bfb-b07cfcff52dd
📒 Files selected for processing (1)
examples/models/core/gemma/README.md
…age+video) The previous variant table mislabeled 26B-A4B and 31B as text-only. All four Gemma 4 checkpoints ship a vision tower covering image and video; only E2B and E4B additionally ship an audio tower. Update the modality table, the MM intro paragraph, the inference subsection, the table-of-contents anchor, and add a video example to quickstart_multimodal.py invocations. Signed-off-by: Tianrui Hu <tianruih@nvidia.com> Signed-off-by: Hudayday <tianruih@nvidia.com>
The quickstart_multimodal.py invocation in the previous revision hits `AttributeError: 'TrtllmAttentionMetadata' object has no attribute 'kv_layout'` on the current Gemma 4 flow, so the example is not safe to ship as-is. Remove the multimodal subsection and its table-of-contents entry; serve and trtllm-eval examples remain unchanged. Signed-off-by: Tianrui Hu <tianruih@nvidia.com> Signed-off-by: Hudayday <tianruih@nvidia.com>
- Drop --backend pytorch from trtllm-serve / trtllm-eval examples; pytorch is the default backend now. - Reword the apply_chat_template explanation per reviewer suggestion. Signed-off-by: Hudayday <tianruih@nvidia.com>
|
/bot skip --comment "document updates, all commands tested" |
|
PR_Github #49345 [ skip ] triggered by Bot. Commit: |
|
PR_Github #49345 [ skip ] completed with state |
Summary
Append a new "Run Gemma 4" section to
examples/models/core/gemma/README.mdcovering the PyTorch backend flow that we actually develop and test against:
trtllm-serve(OpenAI-compatible REST) launch +curl /v1/chat/completionsquickstart_multimodal.pyfor image and audio inputs (E2B / E4B only)trtllm-eval mmmuandtrtllm-eval covost2entry pointsGemma 4 runs on the PyTorch backend with HF checkpoints loaded directly;
the legacy
convert_checkpoint.py/trtllm-buildflow is intentionallyomitted since it does not apply to Gemma 4.
Also registers the new section and its three subsections in the
top-of-file Table Of Contents.
Summary by CodeRabbit