fix(mcp/mobile): learn-tap-module Branch-B scoping + adb-restart pre-step ordering#355

Merged
jjackson merged 1 commit into main from emdash/e2e-malaria-fgd-7yoah
May 19, 2026

Conversation

@jjackson
Owner

Summary

Two orthogonal bugs surfaced on malaria-itn-fgd/20260515-1645 Phase 6 attempt 12. Both are unrelated to PR #354's Learn-Deliver chain scope.

Bug A — learn-tap-module.yaml Branch-B over-fires when form-name ≠ module-name

Branch-B (CommCare's same-name-suppressed-auto-skip case) was guarded only on screen_suite_menu_list visible AND nav_btn_next NOT visible, and then re-tapped text:${MODULE_NAME}. On J1 (module "Briefing Acknowledgement" → form "Acknowledge Readiness"), the form-list screen's toolbar still reads ${MODULE_NAME}, so Branch-B fired and re-tapped the non-tappable toolbar TextView instead of the form row. The downstream extendedWaitUntil nav_btn_next visible then timed out because the form-list screen never changed.

Fix: wrap the re-tap in a nested runFlow whose inner guard is visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list, so the text match is scoped to the list body. When form-name ≠ module-name, no body row matches, Branch-B skips, and the caller's next learn-tap-module(FORM_NAME) drills the form row by its own label.
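
For illustration only, here is a minimal sketch of how a static check for the scoped guard might look. The recipe text is an assumed shape for Branch-B after the fix, not the actual contents of learn-tap-module.yaml. The helper names `collapse` and `hasScopedInnerGuard` are hypothetical and are not taken from static-recipe-invariants.test.ts.

```typescript
// Hypothetical recipe text: an assumed shape for Branch-B after the fix,
// not the real learn-tap-module.yaml.
const recipe: string = [
  "- runFlow:",
  "    when:",
  "      visible:",
  "        id: screen_suite_menu_list",
  "      notVisible:",
  "        id: nav_btn_next",
  "    commands:",
  "      - runFlow:",
  "          when:",
  "            visible:",
  "              text: ${MODULE_NAME}",
  "              below:",
  "                id: screen_suite_menu_list",
  "          commands:",
  "            - tapOn:",
  "                text: ${MODULE_NAME}",
].join("\n");

// Collapse whitespace runs so the check ignores indentation and newlines.
function collapse(yamlText: string): string {
  return yamlText.replace(/\s+/g, " ").trim();
}

// True when a visible-guard pairs the module-name text match with a
// `below: id: screen_suite_menu_list` constraint.
function hasScopedInnerGuard(yamlText: string): boolean {
  const scopedGuard =
    /visible: text: \$\{MODULE_NAME\} below: id: screen_suite_menu_list/;
  return scopedGuard.test(collapse(yamlText));
}

console.log(hasScopedInnerGuard(recipe)); // prints true
```

A text-level check like this is deliberately crude: it only confirms that the guard is present in the file, not that Maestro resolves `below` the way the recipe intends.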

Bug B — sweepStaleEmulatorState ordering (PR #349 follow-up)

PR #349 wired up the orphan-qemu sweep and adb-server restart but:

  • (a) used a conservative qemuPids.length >= liveCount + 2 kill threshold that left the attempt-12 signature (2 orphan qemu + 1 stale adb-devices entry: 2 NOT >= 1+2) below the kill bar;
  • (b) ran the adb-restart immediately after the kill without waiting for the kernel to release the emulator-NNNN TCP sockets — letting the freshly-restarted daemon adopt the wedged-port state.

Result: 2 of the next 3 ensureAvdRunning calls still failed with "package service did not bind" until a second manual adb kill-server/start-server fired inside the dispatch.

Fix:

  1. Loosen the orphan kill to fire whenever qemuPids.length > liveCount.
  2. Add a ~500ms socket-release wait between the last orphan SIGKILL and adb kill-server.
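
The two changes above can be sketched as follows. This is a rough illustration under stated assumptions, not the real sweepStaleEmulatorState. Only the two threshold comparisons and the 500ms wait come from this PR. The `SweepDeps` interface, the helper names, killing every listed PID, and restarting adb unconditionally are all simplifying assumptions.

```typescript
// Hypothetical dependency bundle; the real function's signature is not shown in this PR.
interface SweepDeps {
  listQemuPids(): number[];
  countLiveDevices(): number;
  killPid(pid: number): void;        // stands in for SIGKILL on an orphan qemu
  restartAdbServer(): void;          // stands in for adb kill-server / start-server
  sleep(ms: number): Promise<void>;
}

const SOCKET_RELEASE_WAIT_MS = 500;

// Threshold from PR #349: misses the attempt-12 signature (2 >= 1 + 2 is false).
function shouldKillOld(qemuPidCount: number, liveCount: number): boolean {
  return qemuPidCount >= liveCount + 2;
}

// Loosened threshold: fires whenever there are more qemu PIDs than live devices.
function shouldKillNew(qemuPidCount: number, liveCount: number): boolean {
  return qemuPidCount > liveCount;
}

// Returns an event log so the ordering can be inspected.
async function sweepSketch(deps: SweepDeps): Promise<string[]> {
  const log: string[] = [];
  const pids = deps.listQemuPids();
  if (shouldKillNew(pids.length, deps.countLiveDevices())) {
    for (const pid of pids) {
      deps.killPid(pid);
      log.push(`kill ${pid}`);
    }
    // Let the kernel release the emulator TCP sockets before adb restarts.
    await deps.sleep(SOCKET_RELEASE_WAIT_MS);
    log.push(`wait ${SOCKET_RELEASE_WAIT_MS}`);
  }
  deps.restartAdbServer();
  log.push("adb-restart");
  return log;
}

console.log(shouldKillOld(2, 1)); // prints false
console.log(shouldKillNew(2, 1)); // prints true
```

Injecting `sleep` and the process helpers keeps the ordering testable without a real emulator or adb daemon.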

Tests

  • test/mcp/mobile/static-recipe-invariants.test.ts asserts Branch-B's inner runFlow has visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list.
  • test/mcp/mobile/avd.test.ts asserts orphan-qemu kill precedes adb kill-server AND there's a ≥400ms gap between them (we wait 500ms; 100ms scheduling slack). A second test asserts the loosened threshold fires kills on the attempt-12 2-PID + 1-live-device signature.
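
As a rough sketch of the ordering-plus-gap assertion, the check can be computed from a log of timestamped events. The `TimedEvent` shape, the event names, and the helper names below are hypothetical and are not taken from avd.test.ts. Only the 400ms floor comes from this PR.

```typescript
// Hypothetical event record; the real test's bookkeeping is not shown in this PR.
interface TimedEvent {
  name: "orphan-kill" | "adb-kill-server";
  atMs: number;
}

const MIN_GAP_MS = 400; // 500ms wait minus 100ms scheduling slack

// Milliseconds between the last orphan kill and the first adb kill-server.
// A negative result means adb kill-server ran before the last kill.
function gapAfterLastKill(events: TimedEvent[]): number {
  const killTimes = events
    .filter((e) => e.name === "orphan-kill")
    .map((e) => e.atMs);
  const restart = events.find((e) => e.name === "adb-kill-server");
  if (killTimes.length === 0 || restart === undefined) {
    throw new Error("expected at least one orphan-kill and one adb-kill-server event");
  }
  return restart.atMs - Math.max(...killTimes);
}

// Ordering and gap both hold exactly when the gap reaches the floor.
function orderingHolds(events: TimedEvent[]): boolean {
  return gapAfterLastKill(events) >= MIN_GAP_MS;
}

const goodRun: TimedEvent[] = [
  { name: "orphan-kill", atMs: 0 },
  { name: "orphan-kill", atMs: 5 },
  { name: "adb-kill-server", atMs: 510 },
];
console.log(gapAfterLastKill(goodRun)); // prints 505
console.log(orderingHolds(goodRun));    // prints true
```

Measuring from the last kill rather than the first matters when several orphans are killed in sequence, since the socket-release wait only starts once the final SIGKILL has been sent.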

Live reproducer: ~/.maestro/tests/2026-05-19_151650/screenshot-❌-1779218294468-(chunk-0.yaml).png.

Test plan

  • npm test — 1238 passed, 43 skipped (incl. new regression tests above)
  • Phase 6 attempt 13 on malaria-itn-fgd/20260515-1645: J1 module/form drill completes with Branch-B behaving correctly; sweepStaleEmulatorState clears orphan qemu without operator intervention

🤖 Generated with Claude Code

…step ordering

Two orthogonal bugs surfaced on malaria-itn-fgd/20260515-1645 Phase 6
attempt 12. Both unrelated to PR #354's Learn-Deliver chain scope.

Bug A — learn-tap-module.yaml Branch-B over-fires when form-name != module-name:
The same-name-suppressed-auto-skip branch guarded only on
`screen_suite_menu_list visible AND nav_btn_next NOT visible`, then re-tapped
`text:${MODULE_NAME}`. On J1 (module "Briefing Acknowledgement" → form
"Acknowledge Readiness"), the form-list screen's toolbar still reads
${MODULE_NAME}, so Branch-B fired and re-tapped the non-tappable toolbar
TextView instead of the form row. Fix: nested runFlow with inner guard
`visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list` so the
text-match is scoped to the list body. When form-name != module-name no
body row matches, Branch-B skips, caller's next learn-tap-module(FORM_NAME)
drills the form row by its own label.

Bug B — sweepStaleEmulatorState ordering (PR #349 follow-up):
PR #349 wired up the orphan-qemu sweep and adb-server restart but
(a) used a conservative `qemuPids.length >= liveCount + 2` kill threshold
that left the attempt-12 signature (2 orphan qemu + 1 stale adb-devices
entry, 2 NOT >= 1+2) below the kill bar, and (b) ran the adb-restart
immediately after the kill without waiting for the kernel to release the
emulator-NNNN TCP sockets — letting the freshly-restarted daemon adopt
the wedged-port state. Result: 2 of the next 3 ensureAvdRunning calls
still failed with "package service did not bind" until a second manual
adb kill-server/start-server fired inside the dispatch. Fix:
  1. Loosen the orphan kill to fire whenever qemuPids.length > liveCount.
  2. Add a 500ms socket-release wait between the last orphan SIGKILL and
     adb kill-server.

Tests:
  - static-recipe-invariants.test.ts asserts Branch-B's inner runFlow has
    `visible: text: ${MODULE_NAME}, below: id: screen_suite_menu_list`.
  - avd.test.ts asserts orphan-qemu kill precedes adb kill-server AND
    there's a ≥400ms gap between them (we wait 500ms, 100ms scheduling
    slack). Second test asserts the loosened threshold actually fires
    kills on the attempt-12 2-PID + 1-live-device signature.

Live reproducer: /Users/jjackson/.maestro/tests/2026-05-19_151650/
  screenshot-❌-1779218294468-(chunk-0.yaml).png.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jjackson jjackson enabled auto-merge May 19, 2026 19:47
@jjackson jjackson merged commit 9e38308 into main May 19, 2026
2 checks passed