Android CI: harden retry against PackageManager indexing race + reduce decode flake #4858
Closed
shai-almog wants to merge 1 commit into master
The instrumentation test runner already retries decode-only failures (logcat occasionally drops a chunk line, breaking PNG reassembly) by restarting the app and re-emitting from the on-device suite. The retry itself was failing in two ways:

1. After `adb install -r`, `am start -W -a MAIN -c LAUNCHER -p <pkg>` returned "Activity not started, unable to resolve Intent" because PackageManager hadn't finished indexing the freshly-installed APK. The script gave up immediately and skipped the 10-minute retry wait, so the failed test never got a second chance.
2. The original logcat capture used the device's default ring buffer (256K-1M), which can wrap mid-suite when 90+ tests each emit ~70 chunk lines. That's the root cause of the decode flakes the retry was supposed to recover from.

Changes:

- Bump the device-side logcat ring buffer to 16M with `adb logcat -G` before clearing it. Mitigates buffer wrap during long suites.
- After `adb install`, poll `cmd package resolve-activity --brief` (max 30s) until pm reports the launcher activity is registered.
- Retry `am start` up to 3 times with a 2s backoff to absorb residual indexing race.
- Fall back to `monkey -p <pkg> -c LAUNCHER 1` if `am start` still refuses to resolve the Intent.

`pidof` after launch remains the source of truth for whether the app actually came up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
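As a rough illustration of the relaunch flow listed under "Changes", here is a minimal shell sketch. It is not the PR's actual script: the function names `wait_for_launcher` and `launch_app` are hypothetical, the full intent names (`android.intent.action.MAIN`, `android.intent.category.LAUNCHER`) are an assumed expansion of the `MAIN` / `LAUNCHER` shorthand used in the description, and treating a `<pkg>/` component in the `resolve-activity` output as the success signal is also an assumption.

```shell
# Illustrative sketch only: function names are hypothetical, and the full
# intent action/category names are an assumed expansion of the PR's shorthand.
ACTION=android.intent.action.MAIN
CATEGORY=android.intent.category.LAUNCHER

# Poll PackageManager until the launcher activity resolves.
# $1 = package name, $2 = max attempts (default 30, roughly one second apart).
wait_for_launcher() {
  local pkg="$1" tries="${2:-30}" i=0 out
  while [ "$i" -lt "$tries" ]; do
    out="$(adb shell cmd package resolve-activity --brief \
      -a "$ACTION" -c "$CATEGORY" "$pkg" 2>/dev/null || true)"
    # Assumed success signal: a "<pkg>/<activity>" component in the output.
    case "$out" in
      *"$pkg/"*) return 0 ;;
    esac
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Try `am start` up to 3 times, fall back to monkey, then let pidof decide.
launch_app() {
  local pkg="$1" attempt=1 started=no out
  while [ "$attempt" -le 3 ]; do
    out="$(adb shell am start -W -a "$ACTION" -c "$CATEGORY" -p "$pkg" 2>&1 || true)"
    case "$out" in
      *Error:*) sleep 2 ;;             # still racing the indexer; back off
      *) started=yes; break ;;
    esac
    attempt=$((attempt + 1))
  done
  if [ "$started" = no ]; then
    # Fallback launch path when am start never resolved the Intent.
    adb shell monkey -p "$pkg" -c "$CATEGORY" 1 >/dev/null 2>&1 || true
  fi
  # Succeeds only if a process for the package is actually running.
  [ -n "$(adb shell pidof "$pkg" 2>/dev/null | tr -d '\r')" ]
}
```

A caller might run `wait_for_launcher "$PKG" && launch_app "$PKG"` right after `adb install -r`. A real script would probably also give the process a moment to appear before the final `pidof` check; that delay is left out here for brevity.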
Folding into #4856 per request.
Compared 86 screenshots: 86 matched.
✅ Native Android screenshot tests passed.
Summary
Two complementary mitigations for the Android instrumentation test flake observed on PR #4856 (and several recent master runs):
1. Logcat ring buffer bumped to 16 MiB at startup (`adb logcat -G 16M`). The default 256K-1M is too small for our 90+ test suite where each screenshot emits ~70 chunk lines: the buffer wraps mid-suite, dropping a chunk and making `Cn1ssChunkTools` fail reassembly with a gap error. This is the root cause of the decode flake the existing retry block tries to recover from.
2. Retry's `am start` step now waits for PackageManager indexing. After `adb install -r` reports Success, the launcher Intent isn't immediately resolvable because pm needs a moment to register every activity in the manifest. The previous version raced and got "Activity not started, unable to resolve Intent" on the first call, then skipped the 10-minute retry wait. New flow:
   - Poll `cmd package resolve-activity --brief -a MAIN -c LAUNCHER <pkg>` for up to 30s until pm reports the component.
   - Retry `am start` up to 3× with a 2s backoff if the first call still races.
   - Fall back to `monkey -p <pkg> -c LAUNCHER 1` (different code path inside pm) if all `am start` retries fail.
   - `pidof` remains the source of truth for whether the app actually launched.

Why two separate fixes
The first reduces the probability of needing the retry (smaller chance of a chunk drop in the first place). The second makes sure the retry actually works when the drop still slips through. Together they should turn the flake from "fails outright" into "occasionally takes longer to pass."
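To put the buffer-size argument in numbers, here is a back-of-the-envelope estimate in shell arithmetic. It uses the `total_b64_len=34064` figure this PR reports for `FlipTransitionTest` and assumes, as a simplification, that each of the 90 tests emits about the same amount of base64. Real usage will be higher, since logcat adds per-line metadata and other processes log into the same buffer.

```shell
# Back-of-the-envelope estimate; assumes every test emits about as much
# base64 as FlipTransitionTest, which is a simplification.
tests=90                          # suite size quoted in this PR ("90+")
b64_per_test=34064                # total_b64_len reported for FlipTransitionTest
old_max=$((1024 * 1024))          # upper end of the default 256K-1M buffer
new_size=$((16 * 1024 * 1024))    # size requested with `adb logcat -G 16M`

payload=$((tests * b64_per_test))
echo "estimated chunk payload: $payload bytes"
echo "fits in default buffer: $([ "$payload" -le "$old_max" ] && echo yes || echo no)"
echo "fits in 16M buffer:     $([ "$payload" -le "$new_size" ] && echo yes || echo no)"
```

Under these assumptions the chunk payload alone comes to roughly 3 MB, which is more than the largest default buffer and well under 16 MiB. On a connected device, `adb logcat -g` prints the current buffer sizes and can be used to confirm that the bump took effect.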
Observed failure that prompted this
PR #4856 build https://github.com/codenameone/CodenameOne/actions/runs/25282838085/job/74122673963: `FlipTransitionTest` emitted the same bytes (`png_bytes=25546, chunks=69, total_b64_len=34064`) as the most recent successful master run, but the original decode failed (chunk drop), and the retry's `am start` returned `Error: Activity not started, unable to resolve Intent`.

Test plan
- Look for `STAGE:RETRY -> ... am start exit=0 ... PackageManager resolved launcher` lines in the log.
- On `am start` failures after the resolve-activity wait, the monkey fallback should fire and bring the app up.
- Check the retry log (`connectedAndroidTest-retry.log`) for retry-emitted CN1SS chunks.

🤖 Generated with Claude Code