Skip to content

Conversation

markurtz
Copy link
Collaborator

Summary

Changed the benchmarking entrypoint to take in an Args object which is now used to load scenarios. It enables a single source of truth in addition to being able to save the exact configurations in the report output.

Details

  • [ ]

Test Plan

Related Issues

  • Resolves #

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the benchmarking system to use a unified BenchmarkGenerativeTextArgs configuration object for managing benchmark execution parameters. The key changes enable scenario-based configuration loading, improve the separation between internal benchmarking state and external configuration, and simplify the CLI/API interface.

Key Changes:

  • Introduced BenchmarkGenerativeTextArgs as the single source of truth for benchmark configuration, replacing scattered parameters
  • Renamed BenchmarkArgs to BenchmarkerArgs to distinguish internal runtime state from external configuration
  • Added scenario loading capabilities with built-in scenario discovery
  • Updated CLI to load configuration from scenario files with command-line overrides

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/guidellm/utils/cli.py Enhanced parse_json to handle key=value pairs and plain strings in addition to JSON
src/guidellm/presentation/data_models.py Added null checks for model name and request/output fields during report generation
src/guidellm/data/deserializers/deserializer.py Modified deserializer to prioritize non-HuggingFace deserializers, saving HuggingFace as fallback
src/guidellm/benchmark/types.py Removed legacy type aliases file (moved to entrypoints.py)
src/guidellm/benchmark/schemas.py Added BenchmarkGenerativeTextArgs class, renamed BenchmarkArgs to BenchmarkerArgs, enhanced documentation
src/guidellm/benchmark/scenarios/rag.json Wrapped data field value in array for consistency
src/guidellm/benchmark/scenarios/chat.json Wrapped data field value in array for consistency
src/guidellm/benchmark/scenarios/init.py New module for built-in scenario discovery and loading
src/guidellm/benchmark/scenario.py Removed legacy scenario management code (replaced by new Args-based approach)
src/guidellm/benchmark/profile.py Updated documentation, fixed rate handling for single values vs lists
src/guidellm/benchmark/entrypoints.py Refactored benchmark_generative_text to accept BenchmarkGenerativeTextArgs, moved type aliases here
src/guidellm/benchmark/benchmarker.py Updated to use BenchmarkerArgs instead of BenchmarkArgs
src/guidellm/benchmark/init.py Updated exports to reflect renamed classes and new scenario functions
src/guidellm/main.py Major CLI refactor to support scenario loading and simplified option handling

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Collaborator

@sjmonson sjmonson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few issues, but otherwise looks good. Tested with a synchronous text run.

@markurtz markurtz force-pushed the features/refactor/scenarios_reenablement branch from 7e83e07 to 44051aa Compare October 16, 2025 16:38
Copy link
Collaborator

@jaredoconnell jaredoconnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these changes are ones that I've wanted for a while. It's also nice to see the additional documentation.
Tested a scenario and a non-scenario run, and it works.

@markurtz markurtz merged commit 57683a2 into features/refactor/base Oct 16, 2025
5 of 16 checks passed
@markurtz markurtz deleted the features/refactor/scenarios_reenablement branch October 16, 2025 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants