[GuideLLM Refactor] Reenablement of scenarios and fixes for benchmark package and CLI pathways #414

markurtz · 2025-10-16T03:14:26Z

Summary

Changed the benchmarking entrypoint to take an Args object, which is now also used to load scenarios. This provides a single source of truth and allows the exact configuration to be saved in the report output.
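
As a rough illustration of the single-source-of-truth idea (not the actual GuideLLM code), the sketch below passes one args object to an entrypoint and serializes that same object into the report. `ExampleArgs`, `run_benchmark`, and every field name are hypothetical and chosen only for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ExampleArgs:
    """Hypothetical stand-in for a benchmark args object (fields are assumed)."""

    target: str
    data: list = field(default_factory=list)
    rate: Optional[list] = None


def run_benchmark(args: ExampleArgs) -> dict:
    """Hypothetical entrypoint: takes one args object and echoes it in the report."""
    results = {"requests_completed": 0}  # placeholder; no real benchmarking happens
    # The report stores the exact configuration that produced the results.
    return {"args": asdict(args), "results": results}


report = run_benchmark(
    ExampleArgs(target="http://localhost:8000", data=["prompts.jsonl"])
)
print(json.dumps(report["args"], sort_keys=True))
```

Because the report embeds the serialized args, a run could in principle be reproduced by loading that section back into the same object.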

Details

[ ]

Test Plan

Related Issues

Resolves #

"I certify that all code in this PR is my own, except as noted below."

Use of AI

Includes AI-assisted code completion
Includes code generated by an AI application
Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Copilot

Pull Request Overview

This PR refactors the benchmarking system to use a unified BenchmarkGenerativeTextArgs configuration object for managing benchmark execution parameters. The key changes enable scenario-based configuration loading, improve the separation between internal benchmarking state and external configuration, and simplify the CLI/API interface.

Key Changes:

Introduced BenchmarkGenerativeTextArgs as the single source of truth for benchmark configuration, replacing scattered parameters
Renamed BenchmarkArgs to BenchmarkerArgs to distinguish internal runtime state from external configuration
Added scenario loading capabilities with built-in scenario discovery
Updated CLI to load configuration from scenario files with command-line overrides
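
One way to picture scenario loading with command-line overrides is the sketch below. It is an assumption-laden illustration rather than the PR's code: `load_with_overrides` is a hypothetical helper, the scenario keys are invented, and treating `None` as "not passed on the command line" is a convention picked for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def load_with_overrides(scenario_path: Path, overrides: dict) -> dict:
    """Load a JSON scenario file, then let explicitly set CLI values win.

    Hypothetical helper: overrides whose value is None are treated as
    "not provided on the command line" and are skipped.
    """
    config: dict = json.loads(scenario_path.read_text())
    config.update({key: value for key, value in overrides.items() if value is not None})
    return config


# Write a throwaway scenario file so the example runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    scenario = Path(tmp) / "example_scenario.json"
    scenario.write_text(json.dumps({"data": ["prompt_tokens=512"], "rate": 1.0}))
    merged: Any = load_with_overrides(scenario, {"rate": 4.0, "data": None})

print(merged)
```

Here the scenario supplies `data`, while the explicitly passed `rate` replaces the scenario's value.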

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/guidellm/utils/cli.py	Enhanced `parse_json` to handle key=value pairs and plain strings in addition to JSON
src/guidellm/presentation/data_models.py	Added null checks for model name and request/output fields during report generation
src/guidellm/data/deserializers/deserializer.py	Modified deserializer to prioritize non-HuggingFace deserializers, saving HuggingFace as fallback
src/guidellm/benchmark/types.py	Removed legacy type aliases file (moved to entrypoints.py)
src/guidellm/benchmark/schemas.py	Added `BenchmarkGenerativeTextArgs` class, renamed `BenchmarkArgs` to `BenchmarkerArgs`, enhanced documentation
src/guidellm/benchmark/scenarios/rag.json	Wrapped `data` field value in array for consistency
src/guidellm/benchmark/scenarios/chat.json	Wrapped `data` field value in array for consistency
src/guidellm/benchmark/scenarios/__init__.py	New module for built-in scenario discovery and loading
src/guidellm/benchmark/scenario.py	Removed legacy scenario management code (replaced by new Args-based approach)
src/guidellm/benchmark/profile.py	Updated documentation, fixed rate handling for single values vs lists
src/guidellm/benchmark/entrypoints.py	Refactored `benchmark_generative_text` to accept `BenchmarkGenerativeTextArgs`, moved type aliases here
src/guidellm/benchmark/benchmarker.py	Updated to use `BenchmarkerArgs` instead of `BenchmarkArgs`
src/guidellm/benchmark/__init__.py	Updated exports to reflect renamed classes and new scenario functions
src/guidellm/__main__.py	Major CLI refactor to support scenario loading and simplified option handling
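
The `parse_json` row above describes accepting JSON, key=value pairs, and plain strings. A minimal sketch of that kind of tolerant parsing follows; `parse_json_like` is a hypothetical function written for illustration, and the comma separator and string-only values are assumptions, not the behavior of the real `parse_json`.

```python
import json
from typing import Any


def parse_json_like(value: str) -> Any:
    """Try JSON first, then comma-separated key=value pairs, else return the string."""
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        pass  # not valid JSON; fall through to the other formats

    if "=" in value:
        pairs = {}
        for item in value.split(","):
            key, sep, raw = item.partition("=")
            if not sep:
                # A piece without "=": treat the whole input as a plain string.
                return value
            pairs[key.strip()] = raw.strip()  # values stay strings in this sketch
        return pairs

    return value


print(parse_json_like('{"a": 1}'))
print(parse_json_like("prompt_tokens=256,output_tokens=128"))
print(parse_json_like("some/path.jsonl"))
```

A limitation of this sketch is that any plain string containing `=` (a URL with a query string, for instance) would be read as key=value pairs, so a real implementation may need a stricter rule.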

src/guidellm/benchmark/schemas.py

src/guidellm/benchmark/profile.py

src/guidellm/__main__.py

sjmonson

Few issues, but otherwise looks good. Tested with a synchronous text run.

src/guidellm/benchmark/scenarios/__init__.py

src/guidellm/benchmark/schemas.py

jaredoconnell

A lot of these changes are ones that I've wanted for a while. It's also nice to see the additional documentation.
Tested a scenario and a non-scenario run, and it works.

src/guidellm/benchmark/schemas.py

Updates for scenarios and benchmarking entrypoints to reenable them for the latest state of refactor

44051aa

markurtz requested review from Copilot, jaredoconnell and sjmonson October 16, 2025 03:14

Copilot AI reviewed Oct 16, 2025

src/guidellm/benchmark/schemas.py (outdated, resolved)

src/guidellm/benchmark/profile.py (resolved)

src/guidellm/__main__.py (resolved)

src/guidellm/__main__.py (resolved)

sjmonson requested changes Oct 16, 2025

src/guidellm/benchmark/scenarios/__init__.py (resolved)

src/guidellm/benchmark/schemas.py (outdated, resolved)

markurtz force-pushed the features/refactor/scenarios_reenablement branch from 7e83e07 to 44051aa October 16, 2025 16:38

markurtz added 2 commits October 16, 2025 12:52

fixes from review

cc69203

fix for scenarios not passing data correctly

c4917bd

sjmonson approved these changes Oct 16, 2025

jaredoconnell approved these changes Oct 16, 2025

src/guidellm/benchmark/schemas.py (resolved)

Fixes from review

a375f6c

markurtz merged commit 57683a2 into features/refactor/base Oct 16, 2025
5 of 16 checks passed

markurtz deleted the features/refactor/scenarios_reenablement branch October 16, 2025 19:42
