
Conversation

@jainapurva (Contributor) commented on Oct 6, 2025:

This pull request adds support for benchmarking the Gemma 3 12B and 27B models with TorchAO quantization (FP8 and INT4) across latency, throughput, and serving tests on CUDA devices. It also updates the workflow to install the required torchao and fbgemm-gpu-genai dependencies when running on CUDA devices, and bumps the torch version to 2.9.0 for compatibility.
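For context, here is a minimal sketch of what TorchAO's FP8 and INT4 quantization modes look like at the API level before a model is benchmarked. The toy model, shapes, and settings below are illustrative assumptions, not the PR's actual code; the PR wires these modes up through the benchmark workflow config instead.

```python
import torch
import torch.nn as nn
from torchao.quantization import (
    quantize_,
    float8_dynamic_activation_float8_weight,
    int4_weight_only,
)

# Toy stand-in for the Gemma 3 checkpoints the benchmarks actually load.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.ReLU(),
    nn.Linear(4096, 4096),
)
model = model.to(device="cuda", dtype=torch.bfloat16)

# FP8 mode: dynamic quantization of activations and weights to float8.
# (Requires FP8-capable hardware, e.g. SM89+; fbgemm-gpu-genai provides
# the fused kernels on supported GPUs.)
quantize_(model, float8_dynamic_activation_float8_weight())

# INT4 mode (alternative): weight-only quantization; group_size=128 is an
# assumed, commonly used setting, not necessarily what the benchmarks use.
# quantize_(model, int4_weight_only(group_size=128))
```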

The meta-cla bot added the `cla signed` label on Oct 6, 2025.
@jainapurva changed the title from "Add latency benchmarks for pytorch models" to "Add benchmarks for pytorch models" on Nov 3, 2025.
@huydhn (Contributor) left a comment:


LGTM! Just FYI, I'm working on a fix for HPU at #100, but it will need some help from the Intel team who maintain the runner.

@huydhn merged commit 81c4dc6 into main on Nov 3, 2025 (34 of 38 checks passed).