Fix boolean nested params, add dict format support, and enhance plotting for vllm bench sweep #29025

Copilot · 2025-11-19T18:38:52Z

Purpose

This PR implements key enhancements to vllm bench sweep serve and plotting:

1. Fixed Boolean Nested Dot-Notation Parameters

Before: compilation_config.use_inductor_graph_partition: false → --no-compilation-config.use_inductor_graph_partition ❌
After: compilation_config.use_inductor_graph_partition: false → --compilation-config.use_inductor_graph_partition=false ✓
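
The rule in the Before/After pair above can be sketched roughly as follows. This is an illustrative sketch, not the code in param_sweep.py: the helper name bool_param_to_cli_args is hypothetical, enable_prefix_caching is only an example of a top-level boolean, and the choice to hyphenate only the part before the first dot is inferred from the example output.

```python
def bool_param_to_cli_args(key: str, value: bool) -> list[str]:
    """Hypothetical helper: turn one boolean sweep parameter into CLI arguments."""
    if "." in key:
        # Nested dot-notation key: a --no- prefix cannot express this,
        # so emit an explicit =true/=false value instead.
        top, rest = key.split(".", 1)
        literal = "true" if value else "false"
        return [f"--{top.replace('_', '-')}.{rest}={literal}"]

    # Top-level boolean: keep the existing --flag / --no-flag convention.
    flag = key.replace("_", "-")
    return [f"--{flag}" if value else f"--no-{flag}"]


nested = bool_param_to_cli_args(
    "compilation_config.use_inductor_graph_partition", False
)
top_level = bool_param_to_cli_args("enable_prefix_caching", False)
print(nested)
print(top_level)
```

Top-level booleans keep the --no- prefix, which is what keeps existing parameter files working unchanged.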

2. Fixed Nested Dictionary Values

Dict values are now serialized as JSON instead of Python repr strings
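
To see why this matters, compare Python's repr of a dict with its JSON form. The sketch below assumes a hypothetical helper value_to_cli_str; the real serialization lives in param_sweep.py and may differ in detail.

```python
import json


def value_to_cli_str(value: object) -> str:
    """Hypothetical helper: render a parameter value for the command line."""
    if isinstance(value, dict):
        # json.dumps emits double quotes and lowercase true/false,
        # which a JSON parser on the receiving side can read back.
        return json.dumps(value)
    return str(value)


config = {"cudagraph_mode": "full", "use_inductor_graph_partition": False}

repr_form = str(config)                # single quotes, capitalized False
json_form = value_to_cli_str(config)   # valid JSON

print(repr_form)
print(json_form)
```

The repr form is not valid JSON, so a server that parses the argument as JSON would reject it.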

3. Dict Format Support

Parameter files can now use dict format where keys are experiment names
Uses _benchmark_name field internally (not passed as CLI parameter)
Validates that all _benchmark_name values are unique
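
A minimal sketch of the dict format and the uniqueness check, under stated assumptions: the function names records_from_dict, validate_unique_names, and cli_params are hypothetical, and max_num_seqs is just an example parameter. Only the _benchmark_name field and the ValueError on duplicates come from this PR.

```python
from collections import Counter


def records_from_dict(named_params: dict[str, dict]) -> list[dict]:
    """Turn {experiment_name: params} into records carrying _benchmark_name."""
    return [
        {"_benchmark_name": name, **params}
        for name, params in named_params.items()
    ]


def validate_unique_names(records: list[dict]) -> None:
    """Raise ValueError if any _benchmark_name appears more than once."""
    names = [r["_benchmark_name"] for r in records if "_benchmark_name" in r]
    duplicates = sorted(n for n, count in Counter(names).items() if count > 1)
    if duplicates:
        raise ValueError(f"Duplicate _benchmark_name values: {duplicates}")


def cli_params(record: dict) -> dict:
    """Drop the internal name so it is never passed as a CLI parameter."""
    return {k: v for k, v in record.items() if k != "_benchmark_name"}


records = records_from_dict(
    {"baseline": {"max_num_seqs": 64}, "tuned": {"max_num_seqs": 128}}
)
validate_unique_names(records)  # passes: names are unique

# Duplicates can only arise in the list-of-records format.
try:
    validate_unique_names(records + [{"_benchmark_name": "baseline"}])
except ValueError as exc:
    duplicate_error = str(exc)
    print(duplicate_error)
```

JSON object keys are already unique once parsed, so the check mainly guards the list-of-records format, where the same name can be written twice.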

4. Enhanced Plot Options

--no-error-bars flag to disable error bars (default: enabled)
--fig-height to control subplot height in inches (default: 6.4)
--fig-dpi to control figure resolution (default: 300)
--fig-name parameter to customize figure prefix (default: "FIGURE")
Curve names are now alphabetically sorted in legends
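
For illustration, the options listed above could be declared with argparse along these lines. The flag names and defaults are taken from the list; the parser layout, dest names, and help strings are assumptions and not the code in plot.py.

```python
import argparse


def build_plot_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the four new plot options."""
    parser = argparse.ArgumentParser(prog="plot-options-sketch")
    parser.add_argument(
        "--no-error-bars",
        dest="error_bars",
        action="store_false",  # error bars stay enabled unless the flag is given
        help="Disable error bars.",
    )
    parser.add_argument("--fig-height", type=float, default=6.4,
                        help="Subplot height in inches.")
    parser.add_argument("--fig-dpi", type=int, default=300,
                        help="Figure resolution.")
    parser.add_argument("--fig-name", type=str, default="FIGURE",
                        help="Prefix for the output figure name.")
    return parser


defaults = build_plot_parser().parse_args([])
custom = build_plot_parser().parse_args(
    ["--fig-name", "custom_plot", "--fig-height", "8", "--no-error-bars"]
)

# Alphabetical legend order amounts to sorting the curve names.
curve_names = sorted(["tuned", "baseline", "eager"])
print(defaults, custom, curve_names)
```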

5. Improved Data Handling

String values "inf", "-inf", and "nan" are automatically converted to floats when loading data
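
The conversion can be sketched as below. The helper names and the sample record are hypothetical; the only behavior taken from this PR is that the strings "inf", "-inf", and "nan" become floats at load time.

```python
import math

SPECIAL_FLOAT_STRINGS = {"inf", "-inf", "nan"}


def coerce_special_floats(value):
    """Convert "inf", "-inf", and "nan" strings to floats; leave the rest alone."""
    if isinstance(value, str) and value in SPECIAL_FLOAT_STRINGS:
        return float(value)
    return value


def coerce_record(record: dict) -> dict:
    """Apply the conversion to every value of one loaded result record."""
    return {key: coerce_special_floats(val) for key, val in record.items()}


# Hypothetical result row in which an unbounded rate was stored as a string.
raw = {"request_rate": "inf", "mean_ttft_ms": 41.5, "model": "llama"}
clean = coerce_record(raw)
print(clean, math.isinf(clean["request_rate"]))
```

With the values stored as floats, filters can compare against float('inf') directly, which is why the string-based filter test cases were later removed.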

Backward Compatibility

✓ All changes are backward compatible - existing list format continues to work

Test Plan

# Run parameter sweep unit tests
pytest tests/benchmarks/test_param_sweep.py

# Run plot filter unit tests
pytest tests/benchmarks/test_plot_filters.py

# Test with example configs
vllm bench sweep serve \
  --serve-cmd "vllm serve meta-llama/Llama-3.2-1B-Instruct" \
  --bench-cmd "vllm bench serve --model meta-llama/Llama-3.2-1B-Instruct" \
  --serve-params serve_params_dict.json \
  --bench-params bench_params_dict.json \
  --dry-run

# Test custom figure options
vllm bench sweep plot results --fig-name custom_plot --fig-height 8 --fig-dpi 300 --no-error-bars

Test Result

All unit tests pass. Verified:

Nested boolean params generate correct =true/false syntax
Non-nested booleans still use --no- prefix (backward compatible)
Dict values with complex nested structures are properly JSON serialized
Dict format loads and uses names in output paths
List format continues working unchanged
_benchmark_name uniqueness validation works correctly (raises ValueError on duplicates)
Plot filter tests validate all comparison operators work correctly
Figure naming, height, DPI, and error bar options work correctly
Curve names are alphabetically sorted in legends


- Fix boolean nested dot-notation params to use =true/false syntax
- Fix nested dict parameter values to be JSON serialized
- Add support for dict format in --bench-params and --serve-params
- Use name field from dict format as benchmark name

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

The 'name' field is used for identification in dict format but should not be passed as a command-line parameter.

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

ProExpertProg

@copilot

tests/benchmarks/test_param_sweep.py

vllm/benchmarks/sweep/param_sweep.py

vllm/benchmarks/sweep/serve.py

ProExpertProg · 2025-11-19T19:34:14Z

@copilot also use _benchmark_name to store the name instead of name, and assert they are all unique (if provided via the list of records format).

- Converted boolean and string/numeric tests to parametrized tests
- Simplified dict comprehension in read_from_dict
- Extracted normalize_cmd_kv_pair method
- Added name() method to ParameterSweepItem
- Updated serve.py to use name() method

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

ProExpertProg · 2025-11-19T19:44:27Z

@copilot also use _benchmark_name to store the name instead of name, and assert they are all unique (if provided via the list of records format).

- Changed field name from 'name' to '_benchmark_name' throughout codebase
- Added validation to ensure all _benchmark_name values are unique
- Updated name() method to use _benchmark_name
- Updated apply_to_cmd() and as_text() to skip _benchmark_name
- Added comprehensive tests for uniqueness validation
- Updated all existing tests to use _benchmark_name

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

Copilot · 2025-11-19T19:49:13Z

@copilot also use _benchmark_name to store the name instead of name, and assert they are all unique (if provided via the list of records format).

Done in cb81309. Changed to use _benchmark_name and added validation to ensure all values are unique when provided via list or dict format. Duplicates now raise a descriptive ValueError.

ProExpertProg

@copilot

tests/benchmarks/test_param_sweep.py

- Changed test_nested_dict_value to use compilation_config instead of env
- Made it parametrized with multiple field combinations
- Tests cudagraph_mode (full/piecewise), mode (2/3), use_inductor_graph_partition (True/False)

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>

ProExpertProg · 2025-11-19T20:14:18Z

@copilot add tests for filtering including the "inf" string edge case

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


vllm/benchmarks/sweep/plot.py

ProExpertProg

@copilot

ProExpertProg · 2025-11-28T14:51:58Z

tests/benchmarks/test_plot_filters.py

+        })
+
+        # DataFrame with "inf" as string (edge case)
+        self.df_inf_string = pd.DataFrame({


@copilot remove the string test cases as we now coerce the inf/nan to float

Done in 59d9982. Removed the string test cases (df_inf_string, test_equal_to_inf_string, test_not_equal_to_inf_string, test_inf_string_mixed_dataframe) since inf/nan values are now coerced to floats when loading data.

… fig_height

- Removed try/except import guard for pytest and pandas in test_plot_filters.py
- Renamed fig_width to fig_height to correctly reflect that it controls the height parameter of seaborn's FacetGrid
- Updated help text from "Width" to "Height"

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

Since inf/nan string values are now converted to floats when loading data, the string-based test cases are no longer needed. Updated tests to only use float('inf') and removed:

- df_inf_string DataFrame
- test_equal_to_inf_string
- test_not_equal_to_inf_string
- test_inf_string_mixed_dataframe

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

tests/benchmarks/test_param_sweep.py

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

ProExpertProg

@copilot

tests/benchmarks/test_param_sweep.py

- Refactored test_unique_benchmark_names_validation to use pytest.raises
- Refactored test_unique_benchmark_names_multiple_duplicates to use pytest.raises
- Cleaner and more idiomatic pytest error testing

Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

vllm/benchmarks/sweep/serve.py

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>

…ing for vllm bench sweep (vllm-project#29025)

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Xingyu Liu <charlotteliu12x@gmail.com>

…ing for vllm bench sweep (vllm-project#29025)

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>
Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>
Co-authored-by: Luka Govedič <luka.govedic@gmail.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

Initial plan

7bde782

Copilot AI assigned Copilot and ProExpertProg Nov 19, 2025

Copilot started work on behalf of ProExpertProg November 19, 2025 18:39

mergify bot added the "performance" (Performance-related issues) label Nov 19, 2025

Exclude 'name' field from command generation and as_text output

96ed69e

The 'name' field is used for identification in dict format but should not be passed as a command-line parameter. Co-authored-by: ProExpertProg <11367180+ProExpertProg@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Add support for bench and serve params as dict" to "Fix boolean nested params and add dict format support for vllm bench sweep serve" Nov 19, 2025

Copilot finished work on behalf of ProExpertProg November 19, 2025 19:00

Copilot AI requested a review from ProExpertProg November 19, 2025 19:00

ProExpertProg requested changes Nov 19, 2025

Copilot started work on behalf of ProExpertProg November 19, 2025 19:32

Copilot AI requested a review from ProExpertProg November 19, 2025 19:41

Copilot finished work on behalf of ProExpertProg November 19, 2025 19:41

Copilot stopped work on behalf of ProExpertProg due to an error November 19, 2025 19:44
Copilot has encountered an error. See logs for additional details.

Copilot started work on behalf of ProExpertProg November 19, 2025 19:45

Copilot finished work on behalf of ProExpertProg November 19, 2025 19:51

ProExpertProg reviewed Nov 19, 2025

tests/benchmarks/test_param_sweep.py (outdated, resolved)

Copilot started work on behalf of ProExpertProg November 19, 2025 19:51

Copilot AI requested a review from ProExpertProg November 19, 2025 19:55

Copilot finished work on behalf of ProExpertProg November 19, 2025 19:55

ProExpertProg added 2 commits November 19, 2025 15:08

format

61ee392

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>

Add support for !=

f8e5e9a

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>

chatgpt-codex-connector bot reviewed Nov 27, 2025

vllm/benchmarks/sweep/plot.py (resolved)

DarkLight1337 reviewed Nov 27, 2025

vllm/benchmarks/sweep/plot.py (outdated, resolved)

Copilot started work on behalf of ProExpertProg November 28, 2025 14:48

ProExpertProg reviewed Nov 28, 2025


Copilot finished work on behalf of ProExpertProg November 28, 2025 14:54

Copilot AI requested a review from ProExpertProg November 28, 2025 14:54

Copilot started work on behalf of ProExpertProg November 28, 2025 14:54

Copilot finished work on behalf of ProExpertProg November 28, 2025 15:50

ProExpertProg approved these changes Dec 1, 2025


ProExpertProg reviewed Dec 2, 2025

tests/benchmarks/test_param_sweep.py (outdated, resolved)

Apply suggestions from code review

1de67c5

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

ProExpertProg reviewed Dec 2, 2025

tests/benchmarks/test_param_sweep.py (outdated, resolved)

tests/benchmarks/test_param_sweep.py (outdated, resolved)

Copilot started work on behalf of ProExpertProg December 2, 2025 04:36

Copilot finished work on behalf of ProExpertProg December 2, 2025 04:39

Copilot AI requested a review from ProExpertProg December 2, 2025 04:39

Merge branch 'main' into copilot/add-bench-params-dict-support

6c6bc34

ProExpertProg reviewed Dec 2, 2025

vllm/benchmarks/sweep/serve.py (outdated, resolved)

Apply suggestion from @ProExpertProg

a371c9b

Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com>

DarkLight1337 approved these changes Dec 2, 2025


DarkLight1337 enabled auto-merge (squash) December 2, 2025 04:51

github-actions bot added the "ready" (ONLY add when PR is ready to merge/full CI is needed) label Dec 2, 2025

Add support for !=

9d1e39e

Signed-off-by: Luka Govedič <luka.govedic@gmail.com>

ProExpertProg approved these changes Dec 2, 2025


DarkLight1337 merged commit 1c593e1 into main Dec 2, 2025
48 checks passed

DarkLight1337 deleted the copilot/add-bench-params-dict-support branch December 2, 2025 20:40
