
[tests] Verbose apkdiff output for BuildReleaseArm64 regressions #11759

Closed
simonrozsival wants to merge 1 commit into dotnet:main from simonrozsival:android-improve-trimmable-typemap-incrementality

Conversation

@simonrozsival

@simonrozsival simonrozsival commented Jun 26, 2026

Member

Why

The BuildReleaseArm64 test guards against APK size regressions by running apkdiff against a checked-in .apkdesc reference. When a regression trips the check, the test used to fail with nothing but an exit code:

apkdiff regression test failed with exit code: 3. See test attachments.
Expected: True
But was:  False

The "test attachments" it pointed at didn't actually exist: apkdiff.log was written to the build directory but never registered as an NUnit test attachment, and the freshly generated .apkdesc wasn't surfaced either. In practice, diagnosing a size regression meant SSH-ing into the build, hunting through CI artifacts to find which entry grew and by how much, and then manually regenerating the reference. That's slow and error-prone, especially for first-time contributors who just bumped a dependency.

This makes the failure self-explanatory and copy-paste-ready straight from the test output.

What

BaseTest.RunApkDiffCommand now registers the apkdiff.log it writes as a real test attachment, so the full apkdiff report is always available in the test results (pass or fail).

BuildTest2.BuildReleaseArm64 now, on failure (code != 0):

  • Attaches the freshly-generated .apkdesc so the updated reference can be downloaded directly from the test results.
  • Fails via Assert.Fail with a rich message that includes:
    • the full apkdiff output (stdout + stderr),
    • a per-entry .apkdesc diff (reference → current) showing "Size" -/+ lines with signed, thousands-separated byte deltas, and [added]/[removed] markers for entries that appear or disappear,
    • a one-line hint on how to accept the change (update the reference .apkdesc / run build-tools/scripts/UpdateApkSizeReference.sh).

The diff is computed by parsing the Entries object of both .apkdesc JSON files (GetApkDescDiff / ReadApkDescEntries) and comparing the Size of each entry, so only entries that actually changed are shown.
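As a rough illustration of the comparison described above, here is a minimal sketch. It is not the code from this PR: the class and method names are hypothetical, and the .apkdesc layout (a top-level "Entries" object whose values each carry a numeric "Size") is an assumption inferred from the description and the sample output.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Text.Json;

// Hypothetical sketch; names and the assumed JSON layout are not taken from the PR.
static class ApkDescDiffSketch
{
	// Assumed layout: { "Entries": { "<entry name>": { "Size": <bytes> }, ... } }
	public static Dictionary<string, long> ReadEntries (string json)
	{
		var entries = new Dictionary<string, long> (StringComparer.Ordinal);
		using var doc = JsonDocument.Parse (json);
		if (!doc.RootElement.TryGetProperty ("Entries", out var entriesElement) ||
		    entriesElement.ValueKind != JsonValueKind.Object)
			return entries;
		foreach (var entry in entriesElement.EnumerateObject ()) {
			// Skip anything that does not look like { "Size": <number> }.
			if (entry.Value.ValueKind == JsonValueKind.Object &&
			    entry.Value.TryGetProperty ("Size", out var size) &&
			    size.ValueKind == JsonValueKind.Number &&
			    size.TryGetInt64 (out long bytes))
				entries [entry.Name] = bytes;
		}
		return entries;
	}

	// Emits only entries whose size changed, plus added/removed entries.
	public static string Diff (string referenceJson, string currentJson)
	{
		var oldEntries = ReadEntries (referenceJson);
		var newEntries = ReadEntries (currentJson);
		var sb = new StringBuilder ();
		var keys = oldEntries.Keys.Union (newEntries.Keys).OrderBy (k => k, StringComparer.Ordinal);
		foreach (var key in keys) {
			bool hasOld = oldEntries.TryGetValue (key, out long oldSize);
			bool hasNew = newEntries.TryGetValue (key, out long newSize);
			if (hasOld && hasNew) {
				if (oldSize == newSize)
					continue;
				string delta = (newSize - oldSize).ToString ("+#,0;-#,0;0", CultureInfo.InvariantCulture);
				sb.AppendLine ($"  {key}");
				sb.AppendLine ($"-     \"Size\": {oldSize}");
				sb.AppendLine ($"+     \"Size\": {newSize}   ({delta} bytes)");
			} else if (hasNew) {
				sb.AppendLine ($"+ {key} (\"Size\": {newSize}) [added]");
			} else {
				sb.AppendLine ($"- {key} (\"Size\": {oldSize}) [removed]");
			}
		}
		return sb.Length == 0 ? "(no per-entry size differences)" : sb.ToString ();
	}
}
```

The sketch takes JSON text rather than file paths so that it can be exercised without touching the file system; a real helper would read both files first and would also need to handle missing or unreadable files.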

Example of the new failure output

apkdiff regression test failed with exit code: 3.

== apkdiff output ==
Size difference in bytes ([reference] [current] [difference]):
  + assemblies/UnnamedProject.dll : 18432 19456 1024
  + lib/arm64-v8a/libxamarin-app.so : 1359872 1372160 12288
Total size difference: +13312 bytes
Error: size increased by 13312 bytes, which is a regression.

== .apkdesc diff (reference -> current) ==
  assemblies/UnnamedProject.dll
-     "Size": 18432
+     "Size": 19456   (+1,024 bytes)
  lib/arm64-v8a/libxamarin-app.so
-     "Size": 1359872
+     "Size": 1372160   (+12,288 bytes)
+ lib/arm64-v8a/libnew-feature.so ("Size": 4096) [added]

If this change is intended, update the reference 'UnnamedProject.apkdesc' with the current '.apkdesc' attached to this test (or run build-tools/scripts/UpdateApkSizeReference.sh).

Attachments on the failed test: apkdiff.log (full report) and the current *.apkdesc (the new reference candidate).

Notes

  • Pure test-infrastructure change — no product code is touched, so there is no runtime/behavioral impact on shipped builds.
  • When the .apkdesc files are missing (reference or current), or when there are no per-entry size differences, the diff helper degrades gracefully with an explanatory placeholder instead of throwing.

Checklist

  • Useful description of why the change is necessary.
  • Links to issues fixed
  • Unit tests — improves the diagnostics of the existing BuildReleaseArm64 regression test; no new behavior to cover beyond the richer failure reporting.

When the `BuildReleaseArm64` APK size regression check fails, the test
only reported:

    apkdiff regression test failed with exit code: 3. See test attachments.
    Expected: True
    But was:  False

...but the referenced `apkdiff.log` was written to the build directory
and never actually added as a test attachment, so diagnosing a
regression meant digging through CI build artifacts.

This makes the failure self-explanatory and copy-paste-ready:

* `RunApkDiffCommand` now adds `apkdiff.log` (the full apkdiff report)
  as a real test attachment.
* On failure, `BuildReleaseArm64` now fails with a detailed message
  containing the full apkdiff output (stdout + stderr) and a per-entry
  `.apkdesc` diff (reference -> current) with `"Size"` +/- lines and
  byte deltas, so the regression is obvious at a glance.
* The freshly-generated `.apkdesc` is attached too, so the updated
  reference can be grabbed straight from the test results.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 26, 2026 14:58

Copilot AI left a comment

Contributor


Pull request overview

Improves diagnosability of APK size regression failures in BuildReleaseArm64 by ensuring apkdiff output and generated .apkdesc data are surfaced directly in NUnit test results (attachments + detailed failure text), reducing reliance on CI artifact digging.

Changes:

  • Attach apkdiff.log as an NUnit test attachment from RunApkDiffCommand.
  • On apkdiff regression failures, fail BuildReleaseArm64 with full stdout/stderr plus a per-entry .apkdesc size diff, and attach the newly generated .apkdesc.
  • Add helpers to parse .apkdesc JSON and compute a readable size-delta summary.
| File | Description |
| --- | --- |
| src/Xamarin.Android.Build.Tasks/Tests/Xamarin.Android.Build.Tests/Utilities/BaseTest.cs | Adds apkdiff.log as a test attachment when running apkdiff. |
| src/Xamarin.Android.Build.Tasks/Tests/Xamarin.Android.Build.Tests/BuildTest2.cs | Enhances BuildReleaseArm64 failure diagnostics: attaches generated .apkdesc and includes verbose output + computed .apkdesc diffs in the assertion failure. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

@simonrozsival

Member Author

/review

@github-actions

github-actions Bot commented Jun 26, 2026


Android PR Reviewer completed successfully!

@github-actions github-actions Bot left a comment


🤖 Code Review — ⚠️ Needs Changes

Nice quality-of-life improvement. Turning the opaque exit code: 3 failure into a self-contained, copy-paste-ready report (apkdiff output + per-entry .apkdesc diff + an accept-the-change hint) is genuinely useful, and registering apkdiff.log as a real NUnit attachment in RunApkDiffCommand is the right call. The Assert.IsTrue (code == 0) → if (code != 0) { ... Assert.Fail (...) } refactor preserves the original pass/fail behavior, and scope is test-only with no product impact.

Findings

  • ⚠️ 1 warning: GetApkDescDiff/ReadApkDescEntries can throw while building the failure message (existing-but-malformed/unreadable .apkdesc), which would mask the very diagnostics this PR adds, including the apkdiff stdOut/stdErr already captured, because Assert.Fail is never reached. The PR description claims graceful degradation, but only missing files are handled. See inline comment on the GetApkDescDiff call.
  • 💡 1 suggestion: culture-dependent byte-delta formatting in the diff output. See inline comment.

CI

Not green yet: the Linux/macOS/Windows Azure DevOps legs for 369c61e show cancelled (superseded) and a fresh dotnet-android run is queued. Please confirm the internal pipeline goes green before merge.

The happy path looks correct — the warning is about hardening the failure path so the new diagnostics can't be swallowed by an exception. Once that's addressed (and CI is green), this is good to go.

Generated by Android PR Reviewer for issue #11759 · 507.9 AIC · ⌖ 47.9 AIC · ⊞ 40K
Comment /review to run again

}
message.AppendLine ();
message.AppendLine ("== .apkdesc diff (reference -> current) ==");
message.AppendLine (GetApkDescDiff (apkDescReferencePath, apkDescPath));


🤖 ⚠️ Error handling: On failure, GetApkDescDiff → ReadApkDescEntries runs JsonDocument.Parse (File.ReadAllText (path)) with no try/catch. If either .apkdesc exists but is unreadable or malformed JSON, the exception is thrown here, before Assert.Fail (...) on line 337, so the test fails with a raw JsonException/IOException and the entire rich message (including the apkdiff stdOut/stdErr already appended just above) is discarded, making diagnostics worse than the original one-line assert. This is exactly the failure path the feature targets: when apkdiff exits non-zero it may have written a truncated/partial current .apkdesc. The PR description says the helper "degrades gracefully ... instead of throwing," but only missing files are guarded, not unreadable/malformed ones. Suggest wrapping the body of GetApkDescDiff in a try/catch that returns a placeholder, e.g. $"(failed to compute apkdesc diff: {ex.Message})".

Rule: Diagnostics must degrade gracefully, not throw
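The suggested hardening could look roughly like the sketch below. It is illustrative only: the class and method names are hypothetical, and the diff computation is passed in as a delegate so the example stays self-contained. Only the placeholder text and the section headings are taken from this thread.

```csharp
using System;
using System.Text;

// Hypothetical sketch of the try/catch suggestion; not the PR's actual code.
static class SafeDiffSketch
{
	// Runs the diff computation and turns any exception into a placeholder,
	// so that a malformed .apkdesc cannot discard the rest of the message.
	public static string TryComputeDiff (Func<string> computeDiff)
	{
		try {
			return computeDiff ();
		} catch (Exception ex) {
			// Broad catch is deliberate here: this is diagnostics-only code.
			return $"(failed to compute apkdesc diff: {ex.Message})";
		}
	}

	// Builds the failure text; the apkdiff output is appended before the diff
	// is attempted, and the diff step can no longer throw.
	public static string BuildFailureMessage (int exitCode, string apkDiffOutput, Func<string> computeDiff)
	{
		var message = new StringBuilder ();
		message.AppendLine ($"apkdiff regression test failed with exit code: {exitCode}.");
		message.AppendLine ();
		message.AppendLine ("== apkdiff output ==");
		message.AppendLine (apkDiffOutput);
		message.AppendLine ();
		message.AppendLine ("== .apkdesc diff (reference -> current) ==");
		message.AppendLine (TryComputeDiff (computeDiff));
		return message.ToString ();
	}
}
```

With this shape, the caller can always reach Assert.Fail with the captured apkdiff output, even when the diff section collapses to the placeholder.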

if (oldSize != newSize) {
	sb.AppendLine ($" {key}");
	sb.AppendLine ($"- \"Size\": {oldSize}");
	sb.AppendLine ($"+ \"Size\": {newSize} ({(newSize - oldSize):+#,0;-#,0;0} bytes)");


🤖 💡 Testing — The byte-delta format {(newSize - oldSize):+#,0;-#,0;0} uses the current culture's group separator, so the same regression prints +1,024 on an en-US agent but +1.024 on a de-DE one. For stable, copy-pasteable diagnostics regardless of the CI machine's locale, format with CultureInfo.InvariantCulture — e.g. (newSize - oldSize).ToString ("+#,0;-#,0;0", CultureInfo.InvariantCulture) (needs using System.Globalization;). Optional, since CI is typically invariant/en-US.

Rule: Deterministic, locale-independent test output
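To make the locale dependence concrete, here is a small sketch using the same format string. The helper name is hypothetical. Instead of relying on a de-DE culture being installed on the machine, it simulates one with a NumberFormatInfo whose group separator is "." (an assumption made purely for the demonstration).

```csharp
using System;
using System.Globalization;

// Hypothetical helper; only the format string comes from the reviewed code.
static class DeltaFormatSketch
{
	// Formats a signed byte delta with thousands separators taken from
	// the supplied provider.
	public static string FormatDelta (long delta, IFormatProvider provider)
		=> delta.ToString ("+#,0;-#,0;0", provider);

	// Stand-in for a culture that groups digits with '.', such as de-DE.
	public static NumberFormatInfo DotGrouping ()
	{
		var info = (NumberFormatInfo) NumberFormatInfo.InvariantInfo.Clone ();
		info.NumberGroupSeparator = ".";
		return info;
	}
}
```

FormatDelta (1024, CultureInfo.InvariantCulture) yields "+1,024", while FormatDelta (1024, DeltaFormatSketch.DotGrouping ()) yields "+1.024", which is the discrepancy the comment describes. Passing the invariant culture explicitly pins the output regardless of the agent's locale.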


2 participants