Fix B200 Dynamo vLLM recipe concurrencies#1305
Conversation
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
3c67942 to
328272a
Compare
|
Claude finished @Oseltamivir's task in 47s —— View job Reviewing PR #1305
LGTM — no blocking issues found. All three recipe concurrency values now match the
The |
There was a problem hiding this comment.
LGTM — straightforward concurrency sync between recipe YAMLs and the master config.
Extended reasoning...
Overview
This PR updates the benchmark.concurrencies field in three B200 Dynamo vLLM recipe YAMLs (low-latency, high-tpt-megamoe, max-tpt-megamoe) under benchmarks/multi_node/srt-slurm-recipes/vllm/deepseek-v4/8k1k/ so they match the search space declared in .github/configs/nvidia-master.yaml, and appends a corresponding entry to perf-changelog.yaml.
Security risks
None. These are benchmark configuration values (numeric concurrency points) and a changelog text entry — no executable code, no auth/crypto/permissions surface.
Level of scrutiny
Low. The change is purely a values update to YAML files used by the benchmarking sweep system, with a clearly stated root cause (recipe-local concurrencies override the master config because the Slurm launcher applies the recipe directly). The diff is mechanical and small (3 single-line value changes plus a 7-line changelog append).
Other factors
The author validated via the documented sweep config generators and changelog processor. No prior reviewer feedback is outstanding, and no bugs were flagged by the bug hunting system.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25608571677 |
|
@claude rebase and resolve conflicts |
|
Claude finished @functionstackx's task in 1m 49s —— View job Rebase and resolve conflicts
Rebased onto |
328272a to
5371f4a
Compare
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25616590246 |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25617236455 |
Summary
.github/configs/nvidia-master.yaml.perf-changelog.yamlentry fordsv4-fp4-b200-dynamo-vllmso the PR sweep targets this config.Root Cause
nvidia-master.yamlexportedCONC_LIST, but the B200 Slurm launcher applies the recipe file directly. The recipe-localbenchmark.concurrenciesvalues therefore controlled the actual benchmark run and only covered a subset of the configured concurrencies.Validation
python utils/matrix_logic/generate_sweep_configs.py full-sweep --config-files .github/configs/nvidia-master.yaml --framework dynamo-vllm --model-prefix dsv4 --runner-type b200-multinode --multi-nodepython utils/matrix_logic/generate_sweep_configs.py test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b200-dynamo-vllmpython utils/process_changelog.py --base-ref origin/main --head-ref HEAD --changelog-file perf-changelog.yamlgit diff --check origin/main..HEAD