[https://nvbugs/6221483][fix] AutoDeploy: Fix Eagle metadata host syncs#14714
[https://nvbugs/6221483][fix] AutoDeploy: Fix Eagle metadata host syncs#14714govind-ramnarayan wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughThis PR enables selective host-mirror D2H synchronization in KV-cache inference by adding optional ChangesHost-mirror selective synchronization
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.py (2)
378-460: QA list update is not needed for this PR scope.This change is unit-test-only (
tests/unittest/...), so notests/integration/test_lists/qa/*update is required.As per coding guidelines "If the PR only touches unittest/ or narrow unit scope, say explicitly whether QA list updates are unnecessary or optional."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.py` around lines 378 - 460, The PR only modifies unit tests under tests/unittest/... (specifically test_switch_to_generate_inplace.py and functions like test_out_of_scope_active_host_mirror_not_synced_by_switch_to_generate and related tests), so explicitly state in the PR description that no QA list update is necessary per the guideline for changes limited to unittest/ or narrow unit scope; update the PR description (or add a short note to the PR checklist) saying "QA list update not required for unit-test-only changes" so reviewers see this decision without changing tests.
378-394: ⚡ Quick winAdd positive scoped-sync cases for
active_args_override.These additions validate the “excluded host arg stays stale” path, but they don’t explicitly validate the complementary “included host arg is synced” path when override contains
<arg>_host. Please add one positive scoped test for each API (switch_to_generate_,offset_pos_and_cache_) to lock down both sides of the contract.As per coding guidelines "Coverage expectations: Assess whether new/changed tests cover happy path, important edge cases, and failure modes relevant to the feature or fix."
Also applies to: 443-460
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.py` around lines 378 - 394, The test currently only checks that excluded host args remain stale after calling switch_to_generate_; add a complementary positive scoped-sync test that passes an active_args_override including the host placeholder name (e.g., "input_ids_host") and asserts the host mirror is synced to staging values (mirror in si._input_buffer.get_host_view("cu_seqlen") matches expected), and likewise add a matching positive case for offset_pos_and_cache_ to verify that when the override contains the host placeholder the host mirror is synchronized; locate and modify the test functions around test_out_of_scope_active_host_mirror_not_synced_by_switch_to_generate and the analogous block for lines ~443-460 to add these assertions calling si.switch_to_generate_ and si.offset_pos_and_cache_ with active_args_override that includes "<arg>_host" and assert the host view equals the expected staged values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@tests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.py`:
- Around line 378-460: The PR only modifies unit tests under tests/unittest/...
(specifically test_switch_to_generate_inplace.py and functions like
test_out_of_scope_active_host_mirror_not_synced_by_switch_to_generate and
related tests), so explicitly state in the PR description that no QA list update
is necessary per the guideline for changes limited to unittest/ or narrow unit
scope; update the PR description (or add a short note to the PR checklist)
saying "QA list update not required for unit-test-only changes" so reviewers see
this decision without changing tests.
- Around line 378-394: The test currently only checks that excluded host args
remain stale after calling switch_to_generate_; add a complementary positive
scoped-sync test that passes an active_args_override including the host
placeholder name (e.g., "input_ids_host") and asserts the host mirror is synced
to staging values (mirror in si._input_buffer.get_host_view("cu_seqlen") matches
expected), and likewise add a matching positive case for offset_pos_and_cache_
to verify that when the override contains the host placeholder the host mirror
is synchronized; locate and modify the test functions around
test_out_of_scope_active_host_mirror_not_synced_by_switch_to_generate and the
analogous block for lines ~443-460 to add these assertions calling
si.switch_to_generate_ and si.offset_pos_and_cache_ with active_args_override
that includes "<arg>_host" and assert the host view equals the expected staged
values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d2e01141-fcbf-410b-9fdf-cfbcbef6a2f8
📒 Files selected for processing (3)
tensorrt_llm/_torch/auto_deploy/custom_ops/attention_interface.pytensorrt_llm/_torch/auto_deploy/models/custom/modeling_eagle.pytests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.py
|
Note: fold in an explicit error when trying to run flashinfer + Eagle + cudagraph (instead of silently changing to torch-simple) |
8b58b36 to
fde6af6
Compare
9c31278 to
7cff219
Compare
|
Review follow-up pushed in Updates:
QA list update not required: this PR adds focused unittest coverage only and does not add, remove, or rename integration tests/test-list entries. |
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
7cff219 to
d8b9ad6
Compare
|
/bot run --stage-list "A10-Build_Docs, A10-PackageSanityCheck-PY310-UB2204, A100X-PackageSanityCheck-PY312-UB2404, A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, A100X-PyTorch-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-Post-Merge-1" |
|
PR_Github #51100 [ run ] triggered by Bot. Commit: |
|
PR_Github #51100 [ run ] completed with state
|
|
/bot run --stage-list "A10-Build_Docs, A10-PackageSanityCheck-PY310-UB2204, A100X-PackageSanityCheck-PY312-UB2404, A30-AutoDeploy-1, H100_PCIe-AutoDeploy-1, DGX_B200-AutoDeploy-1, A100X-PyTorch-1, DGX_H100-4_GPUs-AutoDeploy-1, DGX_B200-4_GPUs-AutoDeploy-1, DGX_H100-4_GPUs-AutoDeploy-Post-Merge-1" |
|
PR_Github #51129 [ run ] triggered by Bot. Commit: |
|
PR_Github #51129 [ run ] completed with state
|
Summary
active_args_overridefor metadata mutation helpers so callers can narrow host mirroring to the next consumer's graph inputs.Validation
CUDA_VISIBLE_DEVICES=0 PYTHONPATH=$PWD pytest -sv tests/unittest/auto_deploy/singlegpu/custom_ops/test_switch_to_generate_inplace.pyPYTHONPATH=$PWD LLM_MODELS_ROOT=/lustre/fs1/portfolios/coreai/projects/coreai_comparch_autodeploy/autodeploy_data/llm-models-fake CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 pytest -sv tests/integration/defs/accuracy/test_llm_api_autodeploy.py::TestNemotronSuperV3::test_mtp[fp8_ws8_80gb-trtllm]indexSelectSmallIndex/CUDA error: device-side assert triggeredon the same full repro.Summary by CodeRabbit
Refactor
Tests