feat: classify adb failures with actionable hints at the executor level#1067
Conversation
Size Report
Startup median (7 runs, lower is better):
Top changed chunks:
|
|
CI needs attention before review.
Please run |
3d6e239 to
5438520
Compare
|
I found two classifier coverage gaps that still leave common adb failures on the generic path.
Refs:
|
|
Addressed both classifier coverage gaps in fa3d70d: 1. 2. Semantic provider methods. Enrichment moved into Tests added:
Verified: format/typecheck/lint clean, |
Android COMMAND_FAILED errors surfaced raw adb stderr with only the generic retry-with-debug hint; the sole stderr classification lived in snapshot.ts and was retry-only. Add a central classifier in adb-executor.ts that recognizes the common adb failure families (device offline/unauthorized/not found, more than one device, no devices, server version mismatch, connection drops, INSTALL_FAILED_* variants) and attaches the resolved hint — plus retriable for clearly transient families — to the AppError details in both executor funnels (local serial exec and provider-scoped exec), so every adb call site benefits without per-site changes. The adb devices -l discovery call is wrapped too, where server-mismatch/no-devices failures actually surface. snapshot.ts now consumes the shared classifier for its retry decision, keeping only its snapshot-specific dump-race patterns. normalizeError lifts details.retriable to the top-level error (mirroring the hint lift) and the daemon typed-error graft lets a throw-site classification win over the code-level policy. Also fix the port-reverse removal path throwing a bare Error that surfaced as UNKNOWN. Update session-appstate-input-perf.test.ts: its Android sampling-failure fixture uses stderr "error: device offline", which the classifier now recognizes, so error.hint carries the actionable adb-reconnect hint instead of the generic "retry with --debug". The result shape (hint string, details.metric/package, code) is unchanged — only the hint text improved, which is this change's intent — so the test expectation moves to the new hint rather than reverting production.
|
Rechecked Focused validation passed locally: pnpm exec vitest run src/platforms/android/__tests__/adb-executor.test.ts src/platforms/android/__tests__/device-input-state.test.tsStill not ready for a final |
fa3d70d to
a302af7
Compare
Review follow-up closing two classifier coverage gaps: 1. allowFailure paths bypassed the classifier: the executor wrapper only enriches THROWN errors, but many Android flows run adb with allowFailure, inspect the nonzero result, and throw a fresh AppError built from its stdout/stderr — so e.g. "device offline" during uninstall still surfaced the generic hint. There was no common construction point for these errors, so androidAdbResultError in adb-executor.ts now IS that point: it builds the COMMAND_FAILED error from a tolerated result and runs it through the classifier, keeping hint strings central. The result-inspecting throw sites (app launch/uninstall, appearance read, keyboard/clipboard queries, package listing, logcat capture, helper APK installs, heap-dump pull, uiautomator dump) route through it; a site hint (heap-dump pull) still wins because attachAdbFailureHint never overwrites. Composes with #1072 rather than replacing it: androidAdbResultError builds its details via execFailureDetails for nonzero exits, so normalizeError still suffixes the curated message with the first stderr line while the classified hint/retriable/adbFailure ride along. Semantic failures thrown at exit 0 (the am start error-on-stdout path) stay unflagged, matching #1072's deliberate exit-0 exclusion. 2. Provider-scoped enrichment only wrapped exec, but the transfer helpers prefer the semantic provider methods, so a provider whose install rejects with INSTALL_FAILED output kept the generic hint. Enrichment now lives in normalizeAndroidAdbProvider — the single funnel every provider passes through (scope installation and explicitly passed providers alike) — wrapping exec plus pull/install/installBundle; a WeakSet keeps repeated normalization from stacking wrappers. The local provider needs no wrap since its methods delegate to the already enriched serial executor. Tests: androidAdbResultError classification, hint-plus-excerpt composition through normalizeError, exit-0 no-excerpt guard, site-hint precedence, and provider install/pull classification in adb-executor.test.ts; an allowFailure regression on a real production path (keyboard state query with a nonzero device-offline result) in device-input-state.test.ts.
a302af7 to
ff45566
Compare
|
Final refresh on |
|
What
Adds a central adb-stderr → hint classifier in
src/platforms/android/adb-executor.tsand attaches the classified hint (plusretriablefor clearly transient families) toAppErrordetails at the executor level, so all ~65 AndroidCOMMAND_FAILEDsites surface actionable guidance instead of raw adb stderr — with no per-site changes.Why
Previously the only adb stderr classification lived in
snapshot.ts(RETRYABLE_ADB_STDERR_PATTERNS), was snapshot-local, and was retry-only. Every other adb failure reached the agent as a raw stderr line with the generic "retry with --debug" hint — a wasted round-trip for failures with a well-known fix (accept the USB-debugging prompt, pass--serial, uninstall before install, etc.).How
classifyAdbFailure(stderr, stdout)recognizes the common failure families, each mapped to an actionable hint and a machine-readableadbFailurereason:device offline,device unauthorized(USB-debugging authorization prompt),device not foundmore than one device/emulator(points at--serial),no devices/emulators foundINSTALL_FAILED_*: insufficient storage, update-incompatible (uninstall first), version downgrade, plus a generic fallbackretriable: true.pmputs them); transport families match stderr only, so arbitraryadb shelloutput can't be misread as a transport failure.withAndroidAdbProvider(tunnel-backed adb gets the same hints). An existing site-provided hint is never overwritten. The directadb devices -ldiscovery call is wrapped too, where server-mismatch/no-devices failures actually surface.snapshot.tsconsumes the shared classifier for its retry decision, keeping only its snapshot-specific dump-race patterns (timed out,no such file or directory). Retry behavior unchanged.retriablereaches the wire:normalizeErrorliftsdetails.retriableto the top-level error (mirroring thedetails.hintlift, stripped from details), and the daemon's Phase-2 typed-error graft lets a throw-site classification win over the conservative code-level policy. Wire shape is unchanged when nothing classified ('retriable' in errorstays false).Errorin the port-reverse removal path that surfaced asUNKNOWN— now a classifiedCOMMAND_FAILEDAppError with stdout/stderr/exitCode details.Notes for reviewers
docs/adr/0010-error-system.mdis referenced by the July error audit but was never committed to any branch; this follows the kernel's observable conventions instead (details-hint lift, additive typed-error signals).fillAndroidprocess-spawn tests tripped once under load during verification; per-test timings against a clean tree are equivalent and five consecutive runs of that file are green — no regression.Testing
pnpm check:quick(oxlint + tsc) passes.