Add Gemma 4 text-only inference support to TileGym transformers examples


Hi TileGym team,

I’d like to propose adding initial **Gemma 4 text-only inference support** to the `modeling/transformers` path.

## Motivation

TileGym already has a Gemma integration path in `infer.py`, where `"gemma"` models are routed through `apply_tilegym_kernel_to_gemma3(...)` with existing kernel toggles for `rope`, `rms_norm`, `mlp`, and `attn`.

The repository also already includes **Gemma-3-4B-IT** in the supported transformer benchmark examples, so there is a clear precedent for Gemma-family model integration in the current workflow.

In the roadmap, **“More LLM models”** is explicitly marked as **Help Wanted**, and the maintainers recommend opening an issue first for significant end-to-end model support work to coordinate scope and avoid duplicate effort.

Gemma 4 also looks like a practical next target: Google positions it as a new Gemma family with strong reasoning and coding capability and efficient deployment. Even so, I would keep the first contribution intentionally narrow and leave multimodal support out of the initial PR.

## Proposed scope for v1

I propose a small first step:

- add **Gemma 4 text-only** support in `modeling/transformers`
- reuse the existing Gemma patch path where possible
- add a **Gemma 4 benchmark script**
- validate with the existing **profile** and **kernel coverage report** flow

This stays aligned with the current transformer example stack, which already supports benchmarking, Torch profiler output, and Nsight Systems-based cuTile kernel coverage reporting.

## Out of scope for the first PR

To keep the first contribution reviewable, I would exclude:

- multimodal image/audio/video support
- agent/function-calling workflows
- any broader API/runtime integration outside the existing `modeling/transformers` flow

Gemma 4 does advertise multimodal and agentic capabilities, but I think the first upstreamable version should focus on **text generation parity**.

## Implementation sketch

Tentative plan:

1. identify a Gemma 4 model variant that fits the existing TileGym benchmark path
2. verify whether the current Gemma patch path can be reused directly or needs a small Gemma 4 specific adapter
3. add/update benchmark entry points
4. run:
    - baseline generation
    - TileGym generation
    - profiler output
    - kernel coverage report
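
To make the baseline versus TileGym comparison in step 4 concrete, here is a minimal sketch of a parity check. It is not TileGym code: `GenerateFn`, `compare_generation`, and the two `fake_*` functions are hypothetical names used only for illustration, and the stand-in generators return fixed token IDs instead of calling a model. Exact token parity is itself an assumption; fused kernels can change numerics slightly, so a real check may need a tolerance on logits instead.

```python
import time
from typing import Callable, List, Optional, Sequence

# Hypothetical type: maps prompt token IDs to the full generated token ID list.
GenerateFn = Callable[[Sequence[int]], List[int]]


def compare_generation(baseline: GenerateFn, patched: GenerateFn,
                       prompt: Sequence[int]) -> dict:
    """Run both paths on the same prompt and report parity and wall-clock time."""
    t0 = time.perf_counter()
    ref = baseline(prompt)
    t1 = time.perf_counter()
    out = patched(prompt)
    t2 = time.perf_counter()

    # Index of the first differing token within the shared prefix, if any.
    first_diff: Optional[int] = next(
        (i for i, (a, b) in enumerate(zip(ref, out)) if a != b), None
    )
    # If the shared prefix agrees but the lengths differ, the divergence
    # starts where the shorter sequence ends.
    if first_diff is None and len(ref) != len(out):
        first_diff = min(len(ref), len(out))

    return {
        "match": ref == out,
        "first_diff": first_diff,
        "baseline_s": t1 - t0,
        "patched_s": t2 - t1,
    }


# Stand-in generators (assumptions, not real model calls).
def fake_baseline(prompt: Sequence[int]) -> List[int]:
    return list(prompt) + [1, 2, 3]


def fake_patched(prompt: Sequence[int]) -> List[int]:
    return list(prompt) + [1, 2, 4]


result = compare_generation(fake_baseline, fake_patched, [10, 11])
print(result["match"], result["first_diff"])
```

In an actual run, the two callables would wrap greedy generation with and without the TileGym patch applied, so that any mismatch points at a kernel difference and not at sampling noise.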

## Validation expectations

I expect the PR to include:

- a runnable Gemma 4 example in `modeling/transformers`
- benchmark results against the baseline path
- profiling / kernel coverage output
- any minimal README updates needed for reproducibility
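
As a rough sketch of how the benchmark results could be summarized in the PR description, the snippet below reduces repeated latency measurements to a median, a tokens-per-second figure, and a speedup ratio. The helper names `summarize` and `speedup` are hypothetical, and the latency values are placeholders chosen for illustration, not measurements from any Gemma model or GPU.

```python
from statistics import median
from typing import Dict, List


def summarize(label: str, latencies_s: List[float], new_tokens: int) -> Dict[str, object]:
    """Reduce repeated end-to-end latencies to a median and a throughput figure."""
    med = median(latencies_s)
    return {"label": label, "median_s": med, "tokens_per_s": new_tokens / med}


def speedup(baseline: Dict[str, object], candidate: Dict[str, object]) -> float:
    """Ratio of median latencies; values above 1.0 mean the candidate is faster."""
    return float(baseline["median_s"]) / float(candidate["median_s"])


# Placeholder numbers for illustration only (not real benchmark data).
base = summarize("baseline", [2.0, 2.2, 2.1], new_tokens=100)
tile = summarize("tilegym", [1.0, 1.1, 1.05], new_tokens=100)

# Emit a small Markdown table that can be pasted into the PR.
print("| path | median latency (s) | tokens/s |")
print("| --- | --- | --- |")
for row in (base, tile):
    print(f"| {row['label']} | {row['median_s']:.3f} | {row['tokens_per_s']:.1f} |")
print(f"speedup: {speedup(base, tile):.2f}x")
```

Reporting the median over several runs, with the first warm-up run excluded, keeps one-off compilation or cache effects from skewing the comparison.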

## Questions for maintainers

Before I start, I’d appreciate guidance on two points:

1. Would you prefer Gemma 4 support to extend the existing Gemma path, or would you rather have a dedicated Gemma 4 patch entry point?
2. Is there a preferred Gemma 4 model size to target first for the initial contribution?
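
To illustrate the trade-off behind question 1, here is a small sketch of the two routing options. It assumes routing is keyed on a substring of the model identifier, which I have not confirmed against `infer.py`. The functions `patch_gemma3`, `patch_gemma4`, and `select_patch` are hypothetical stand-ins that do not match TileGym's real signatures, and `example/gemma-4-placeholder` is a made-up identifier.

```python
from typing import Callable, Dict, Optional

PatchFn = Callable[[str], str]


# Hypothetical stand-ins; the real patch functions take different arguments.
def patch_gemma3(model_id: str) -> str:
    return f"gemma3 patch applied to {model_id}"


def patch_gemma4(model_id: str) -> str:
    return f"gemma4 patch applied to {model_id}"


# Option A: extend the existing path. One generic key covers every Gemma model.
SHARED_PATH: Dict[str, PatchFn] = {"gemma": patch_gemma3}

# Option B: dedicated entry point. The more specific key is listed first so it
# wins over the generic fallback (dicts preserve insertion order).
DEDICATED_PATH: Dict[str, PatchFn] = {"gemma-4": patch_gemma4, "gemma": patch_gemma3}


def select_patch(model_id: str, table: Dict[str, PatchFn]) -> Optional[PatchFn]:
    """Return the first patch function whose key occurs in the model identifier."""
    lowered = model_id.lower()
    for key, fn in table.items():
        if key in lowered:
            return fn
    return None


gemma4_id = "example/gemma-4-placeholder"  # made-up identifier
print(select_patch(gemma4_id, SHARED_PATH).__name__)
print(select_patch(gemma4_id, DEDICATED_PATH).__name__)
```

Under this assumption, option A would silently send Gemma 4 models through the Gemma 3 patch, which is only safe if the architectures line up for every patched module. Option B makes the split explicit at the cost of one more entry point to maintain.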

If this direction looks good, I can start with a narrow text-only PR and keep the first version focused on compatibility, benchmarking, and coverage reporting.

---
## Sources

- [infer.py](https://github.com/NVIDIA/TileGym/blob/main/modeling/transformers/infer.py)
- [README.md](https://github.com/NVIDIA/TileGym/blob/main/modeling/transformers/README.md)
- [ROADMAP.md](https://github.com/NVIDIA/TileGym/blob/main/ROADMAP.md)
- [Gemma 4 — Google DeepMind](https://deepmind.google/models/gemma/gemma-4/)
