[https://nvbugs/6094066][fix] Skip Qwen3 skip-softmax on low-memory GPUs #13581
xxi-nv merged 1 commit into NVIDIA:main
Walkthrough
This change updates a test's hardware targeting configuration, replacing Hopper-specific hardware gating with Blackwell hardware gating and adding a device-memory threshold constraint for lower-memory devices.
Estimated code review effort: 1 (Trivial), ~2 minutes. Pre-merge checks: 5 passed.
Signed-off-by: xxi <xxi@nvidia.com>
b4174a5 to af09721
/bot run --disable-fail-fast
PR_Github #46015 [ run ] triggered by Bot. Commit:
PR_Github #46015 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #46141 [ run ] triggered by Bot. Commit:
PR_Github #46141 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #46228 [ run ] triggered by Bot. Commit:
PR_Github #46228 [ run ] completed with state
…PUs (NVIDIA#13581) Signed-off-by: xxi <xxi@nvidia.com>
Description
- Skip `TestQwen3_30B_A3B_Instruct_2507::test_skip_softmax_attention` on pre-Blackwell GPUs.
- Add a `skip_less_device_memory(140000)` guard so low-memory GPUs do not try this Qwen3 30B skip-softmax case.
- Remove the `waives.txt` entry for the same target test so B200 can run it instead of being waived.

Validation
- `git diff --check origin/main...HEAD`
- `git commit -s --amend --no-edit`: pre-commit hooks passed after the `waives.txt` cleanup
- … `flashinfer`/NVRTC `cuda.h` include setup); no OOM observed.
- … `batch`, `qos=short`, 2h: source build/import succeeded on GB200, pytest collection selected the target test and did not skip it with the 140GB guard. The target then failed at model loading because OCI model shards for `Qwen3/Qwen3-30B-A3B-Instruct-2507` are Git LFS pointer files, not real safetensors; no OOM observed.

NVBug: https://nvbugs/6094066
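
For readers unfamiliar with this kind of gating, the combined effect of the two guards from the Description can be sketched in plain Python. This is an illustration only, not the repository's implementation: `GpuInfo`, `skip_reason`, and the example device figures are hypothetical, the unit of the `140000` threshold is assumed to be MiB (consistent with the "140GB guard" wording above), and "pre-Blackwell" is assumed to mean a compute-capability major version below 10.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: Blackwell data-center GPUs such as B200 report compute capability 10.x.
MIN_SM_MAJOR = 10
# Threshold value taken from the PR; the MiB unit is an assumption.
MIN_MEMORY_MIB = 140000


@dataclass(frozen=True)
class GpuInfo:
    """Hypothetical description of a device, for illustration only."""
    name: str
    sm_major: int
    total_memory_mib: int


def skip_reason(gpu: GpuInfo,
                min_sm_major: int = MIN_SM_MAJOR,
                min_memory_mib: int = MIN_MEMORY_MIB) -> Optional[str]:
    """Return a skip message if either guard applies, else None (test runs)."""
    if gpu.sm_major < min_sm_major:
        return f"{gpu.name}: pre-Blackwell (SM major {gpu.sm_major} < {min_sm_major})"
    if gpu.total_memory_mib < min_memory_mib:
        return f"{gpu.name}: {gpu.total_memory_mib} MiB < {min_memory_mib} MiB"
    return None


if __name__ == "__main__":
    # Memory figures below are illustrative, not measured.
    devices = [
        GpuInfo("hopper-80g", sm_major=9, total_memory_mib=81000),
        GpuInfo("blackwell-small", sm_major=10, total_memory_mib=96000),
        GpuInfo("blackwell-large", sm_major=10, total_memory_mib=180000),
    ]
    for gpu in devices:
        reason = skip_reason(gpu)
        print(gpu.name, "->", "run" if reason is None else f"skip ({reason})")
```

Checking the architecture first and memory second keeps the skip message specific: a pre-Blackwell device is reported as such even when it would also fail the memory threshold.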