[None][test] Decrease P1 models number and merge sanity test list into core#14952
Walkthrough
This PR updates the LLM performance test configurations by revising `llm_perf_core.yml` test entries across multiple GPU condition sections and removing `llm_perf_sanity.yml` entirely. The changes consolidate llama model coverage, add new model variants (gpt_oss_20b_fp4, nemotron_nano_12b_v2), and reduce timeout-prone deepseek test configurations.
Changes
Performance Test Configuration Updates
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 4 passed, 1 failed (inconclusive)
…aced repo in quantization scripts Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
…IA#14925) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
…est_run_with_different_env (NVIDIA#14939) Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
…NVIDIA#14900) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
…y.yml file - Added new performance tests for various models including qwen3 and llama_v3.1. - Removed the llm_perf_sanity.yml file as it is no longer needed. Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
….5 8-GPU perf (NVIDIA#14613) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
…ter MLIR elementwise fusion (NVIDIA#14795) Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
/bot skip --comment "only test list modify"
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/integration/test_lists/qa/llm_perf_core.yml`:
- Line 28: The test matrix entry sets maxnt:2048 which is smaller than the
declared input_output_len (8000,1000) and will cause the workload to exceed the
token budget; update the perf/test_perf.py::test_perf[...] case so maxnt
(max_num_tokens) is increased to cover the larger sequence (e.g., >=8000 or
remove the explicit maxnt to rely on the default 8192) ensuring the token budget
matches input_output_len, and keep the test name/identifier intact.
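The check described in this comment can be sketched as a small script that compares the `maxnt` value in a test identifier against the input length in `input_output_len`. This is only an illustration under stated assumptions: the test ID below is hypothetical, the `input_output_len:<in>,<out>` field syntax is assumed by analogy with `maxnt:2048`, and the 8192 default is taken from the comment above rather than verified against the code.

```python
import re

# Default token budget cited in the review comment; not verified here.
DEFAULT_MAX_NUM_TOKENS = 8192

# Hypothetical test ID, built only to illustrate the mismatch described above.
HYPOTHETICAL_ID = (
    "perf/test_perf.py::test_perf"
    "[some_model-maxnt:2048-input_output_len:8000,1000]"
)


def token_budget_issue(test_id, default_maxnt=DEFAULT_MAX_NUM_TOKENS):
    """Return a message if maxnt is smaller than the input length, else None."""
    lens = re.search(r"input_output_len:(\d+),(\d+)", test_id)
    if lens is None:
        # No sequence lengths declared, so there is nothing to compare.
        return None
    input_len = int(lens.group(1))

    maxnt = re.search(r"maxnt:(\d+)", test_id)
    # Fall back to the assumed default when maxnt is not set explicitly.
    budget = int(maxnt.group(1)) if maxnt else default_maxnt

    if budget < input_len:
        return f"maxnt {budget} < input length {input_len}"
    return None


if __name__ == "__main__":
    print(token_budget_issue(HYPOTHETICAL_ID))
```

Running a check like this over every entry in the test list before merging would flag the same kind of mismatch without relying on a reviewer to spot it; whether a smaller `maxnt` is actually an error depends on how the benchmark handles long inputs.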
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0f13f275-b6d1-4113-aeb0-8579272e1f17
📒 Files selected for processing (2)
- tests/integration/test_lists/qa/llm_perf_core.yml
- tests/integration/test_lists/qa/llm_perf_sanity.yml
💤 Files with no reviewable changes (1)
- tests/integration/test_lists/qa/llm_perf_sanity.yml
PR_Github #52037 [ skip ] triggered by Bot. Commit:
PR_Github #52037 [ skip ] completed with state
/bot skip --comment "only test list modify"
PR_Github #52044 [ skip ] triggered by Bot. Commit:
PR_Github #52044 [ skip ] completed with state
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Summary by CodeRabbit
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions).
- If PR introduces API changes, an appropriate PR label is added: either `api-compatible` or `api-breaking`. For `api-breaking`, include `BREAKING` in the PR title.
- Any new dependencies have been scanned for license and vulnerabilities.
- CODEOWNERS updated if ownership changes.
- Documentation updated as needed.
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.