[tests] Verbose apkdiff output for BuildReleaseArm64 regressions#11759
[tests] Verbose apkdiff output for BuildReleaseArm64 regressions#11759simonrozsival wants to merge 1 commit into
Conversation
When the `BuildReleaseArm64` APK size regression check fails, the test
only reported:
apkdiff regression test failed with exit code: 3. See test attachments.
Expected: True
But was: False
...but the referenced `apkdiff.log` was written to the build directory
and never actually added as a test attachment, so diagnosing a
regression meant digging through CI build artifacts.
This makes the failure self-explanatory and copy-paste-ready:
* `RunApkDiffCommand` now adds `apkdiff.log` (the full apkdiff report)
as a real test attachment.
* On failure, `BuildReleaseArm64` now fails with a detailed message
containing the full apkdiff output (stdout + stderr) and a per-entry
`.apkdesc` diff (reference -> current) with `"Size"` +/- lines and
byte deltas, so the regression is obvious at a glance.
* The freshly-generated `.apkdesc` is attached too, so the updated
reference can be grabbed straight from the test results.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Improves diagnosability of APK size regression failures in BuildReleaseArm64 by ensuring apkdiff output and generated .apkdesc data are surfaced directly in NUnit test results (attachments + detailed failure text), reducing reliance on CI artifact digging.
Changes:
- Attach
apkdiff.logas an NUnit test attachment fromRunApkDiffCommand. - On
apkdiffregression failures, failBuildReleaseArm64with full stdout/stderr plus a per-entry.apkdescsize diff, and attach the newly generated.apkdesc. - Add helpers to parse
.apkdescJSON and compute a readable size-delta summary.
Show a summary per file
| File | Description |
|---|---|
| src/Xamarin.Android.Build.Tasks/Tests/Xamarin.Android.Build.Tests/Utilities/BaseTest.cs | Adds apkdiff.log as a test attachment when running apkdiff. |
| src/Xamarin.Android.Build.Tasks/Tests/Xamarin.Android.Build.Tests/BuildTest2.cs | Enhances BuildReleaseArm64 failure diagnostics: attaches generated .apkdesc and includes verbose output + computed .apkdesc diffs in the assertion failure. |
Copilot's findings
- Files reviewed: 2/2 changed files
- Comments generated: 1
|
/review |
|
✅ Android PR Reviewer completed successfully! |
There was a problem hiding this comment.
🤖 Code Review — ⚠️ Needs Changes
Nice quality-of-life improvement. Turning the opaque exit code: 3 failure into a self-contained, copy-paste-ready report (apkdiff output + per-entry .apkdesc diff + an accept-the-change hint) is genuinely useful, and registering apkdiff.log as a real NUnit attachment in RunApkDiffCommand is the right call. The Assert.IsTrue (code == 0) → if (code != 0) { ... Assert.Fail (...) } refactor preserves the original pass/fail behavior, and scope is test-only with no product impact.
Findings
⚠️ 1 warning —GetApkDescDiff/ReadApkDescEntriescan throw while building the failure message (existing-but-malformed/unreadable.apkdesc), which would mask the very diagnostics this PR adds — including the apkdiffstdOut/stdErralready captured — becauseAssert.Failis never reached. The PR description claims graceful degradation, but only missing files are handled. See inline comment on theGetApkDescDiffcall.- 💡 1 suggestion — culture-dependent byte-delta formatting in the diff output. See inline comment.
CI
Not green yet: the Linux/macOS/Windows Azure DevOps legs for 369c61e show cancelled (superseded) and a fresh dotnet-android run is queued. Please confirm the internal pipeline goes green before merge.
The happy path looks correct — the warning is about hardening the failure path so the new diagnostics can't be swallowed by an exception. Once that's addressed (and CI is green), this is good to go.
Generated by Android PR Reviewer for issue #11759 · 507.9 AIC · ⌖ 47.9 AIC · ⊞ 40K
Comment /review to run again
| } | ||
| message.AppendLine (); | ||
| message.AppendLine ("== .apkdesc diff (reference -> current) =="); | ||
| message.AppendLine (GetApkDescDiff (apkDescReferencePath, apkDescPath)); |
There was a problem hiding this comment.
🤖 GetApkDescDiff → ReadApkDescEntries runs JsonDocument.Parse (File.ReadAllText (path)) with no try/catch. If either .apkdesc exists but is unreadable or malformed JSON, the exception is thrown here — before Assert.Fail (...) on line 337 — so the test fails with a raw JsonException/IOException and the entire rich message (including the apkdiff stdOut/stdErr already appended just above) is discarded, making diagnostics worse than the original one-line assert. This is exactly the failure path the feature targets: when apkdiff exits non-zero it may have written a truncated/partial current .apkdesc. The PR description says the helper "degrades gracefully ... instead of throwing," but only missing files are guarded — not unreadable/malformed ones. Suggest wrapping the body of GetApkDescDiff in a try/catch that returns a placeholder, e.g. $"(failed to compute apkdesc diff: {ex.Message})".
Rule: Diagnostics must degrade gracefully, not throw
| if (oldSize != newSize) { | ||
| sb.AppendLine ($" {key}"); | ||
| sb.AppendLine ($"- \"Size\": {oldSize}"); | ||
| sb.AppendLine ($"+ \"Size\": {newSize} ({(newSize - oldSize):+#,0;-#,0;0} bytes)"); |
There was a problem hiding this comment.
🤖 💡 Testing — The byte-delta format {(newSize - oldSize):+#,0;-#,0;0} uses the current culture's group separator, so the same regression prints +1,024 on an en-US agent but +1.024 on a de-DE one. For stable, copy-pasteable diagnostics regardless of the CI machine's locale, format with CultureInfo.InvariantCulture — e.g. (newSize - oldSize).ToString ("+#,0;-#,0;0", CultureInfo.InvariantCulture) (needs using System.Globalization;). Optional, since CI is typically invariant/en-US.
Rule: Deterministic, locale-independent test output
Why
The
BuildReleaseArm64test guards against APK size regressions by runningapkdiffagainst a checked-in.apkdescreference. When a regression trips the check, the test used to fail with nothing but an exit code:The "test attachments" it pointed at didn't actually exist: the
apkdiff.logwas written to the build directory but never registered as an NUnit test attachment, and the freshly-generated.apkdescwasn't surfaced either. In practice, diagnosing a size regression meant SSH-ing into the build, hunting through CI artifacts to find which entry grew, by how much, and then manually regenerating the reference. That's slow and error-prone — especially for first-time contributors who just bumped a dependency.This makes the failure self-explanatory and copy-paste-ready straight from the test output.
What
BaseTest.RunApkDiffCommandnow registers theapkdiff.logit writes as a real test attachment, so the full apkdiff report is always available in the test results (pass or fail).BuildTest2.BuildReleaseArm64now, on failure (code != 0):.apkdescso the updated reference can be downloaded directly from the test results.Assert.Failwith a rich message that includes:.apkdescdiff (reference → current) showing"Size"-/+lines with signed, thousands-separated byte deltas, and[added]/[removed]markers for entries that appear or disappear,.apkdesc/ runbuild-tools/scripts/UpdateApkSizeReference.sh).The diff is computed by parsing the
Entriesobject of both.apkdescJSON files (GetApkDescDiff/ReadApkDescEntries) and comparing theSizeof each entry, so only entries that actually changed are shown.Example of the new failure output
Attachments on the failed test:
apkdiff.log(full report) and the current*.apkdesc(the new reference candidate).Notes
Checklist
BuildReleaseArm64regression test; no new behavior to cover beyond the richer failure reporting.