fix(cli): fix --task flag concatenation bug and three other issues #31
Merged
Bug 1 (Critical): --task flag produced `find_task.pycd` due to a missing `&&` separator between pre_cmd and `cd /client`. Every `run --task` invocation since v0.4.2 silently failed. Fixed by adding `&&`.
Bug 2: --num-tasks defaulted to 1, silently limiting runs. Changed default to None (all tasks).
Bug 3: probe --wait timeout of 1200s was too short for first boot (OOBE takes 18-22 min). Increased to 1800s.
Bug 4: Default VM size (D4ds_v4, 16GB) OOMs with navi agent's GroundingDINO + SoM models. Changed default to D8ds_v5 (32GB). Added warning when standard mode is used explicitly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
D4ds_v4 (16GB) OOMs with navi agent's GroundingDINO + SoM models. Standardize on D8ds_v5 across all commands; no more --fast/--standard flags.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes four bugs reported in the WAA benchmarks CLI:
Bug 1 (Critical): `--task` flag produces `find_task.pycd`
A missing `&&` separator between `pre_cmd` and `cd /client` caused the shell to receive the two commands fused into a single token, `find_task.pycd`. Every `run --task <uuid>` invocation since v0.4.2 silently failed. Fixed by adding `&&` after `find_task.py`.
Bug 2: `--num-tasks` defaults to 1
Changed the default from `1` to `None` (all tasks). Previously, `run` without task filtering silently ran only 1 task.
Bug 3: `probe --wait` timeout too short for first boot
Increased the default from 1200s (20 min) to 1800s (30 min). Windows OOBE + WAA startup routinely takes 18-22 min on first boot.
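The new defaults for Bugs 2 and 3 can be illustrated with a minimal sketch. It assumes an argparse-based CLI, which this PR does not confirm. The parser layout, the `cli-sketch` program name, and the `describe_num_tasks` helper are hypothetical; only the default values (`None` and 1800) and the "all tasks" display text come from this PR.

```python
import argparse

# 30 minutes; first boot reportedly takes 18-22 minutes.
DEFAULT_PROBE_TIMEOUT_S = 1800


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser layout; only the two defaults come from this PR.
    parser = argparse.ArgumentParser(prog="cli-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run")
    # None means "no limit", so an unfiltered run covers every task.
    run.add_argument("--num-tasks", type=int, default=None)

    probe = sub.add_parser("probe")
    probe.add_argument("--wait", action="store_true")
    probe.add_argument("--timeout", type=int, default=DEFAULT_PROBE_TIMEOUT_S)
    return parser


def describe_num_tasks(num_tasks):
    # Display helper (hypothetical): show "all tasks" when no limit was given.
    return "all tasks" if num_tasks is None else f"{num_tasks} task(s)"
```

Using `None` as the sentinel keeps "no limit" distinct from any integer the user might pass, so an explicit `--num-tasks 1` still behaves as before.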
Bug 4: Default VM size OOMs with navi agent
Changed the default VM from `Standard_D4ds_v4` (16GB) to `Standard_D8ds_v5` (32GB). The navi agent's GroundingDINO + SoM models exhaust 16GB of RAM, triggering the OOM killer on QEMU. Added a runtime warning when standard mode is used explicitly.
Test plan
- `find_task.py &&` separator verified in code
- `--num-tasks` default is None, display shows "all tasks"
- `--timeout` default is 1800
🤖 Generated with Claude Code
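The first test-plan item could also be covered by a small regression check. The sketch below reproduces the Bug 1 failure mode and one way to avoid it. The `join_commands` helper and the exact `pre_cmd` string are hypothetical, since the real arguments passed to `find_task.py` are not shown in this PR.

```python
def join_commands(*parts: str) -> str:
    """Chain shell commands with '&&', skipping empty parts."""
    return " && ".join(p.strip() for p in parts if p and p.strip())


# Hypothetical pre-command; the real invocation of find_task.py is not shown here.
pre_cmd = "python find_task.py"

# Plain concatenation fuses the script name with the next command.
buggy = pre_cmd + "cd /client"
# Joining through a helper always inserts the separator.
fixed = join_commands(pre_cmd, "cd /client")

print(buggy)  # python find_task.pycd /client
print(fixed)  # python find_task.py && cd /client
```

Building the command line from a list of parts, rather than concatenating strings by hand, makes it harder to drop a separator when a new optional step such as `pre_cmd` is added.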