Skip to content

Release Android X509 chain certificate GREFs#128284

Open
simonrozsival wants to merge 5 commits into
dotnet:mainfrom
simonrozsival:dev/simonrozsival/android-x509-gref-leak-fix
Open

Release Android X509 chain certificate GREFs#128284
simonrozsival wants to merge 5 commits into
dotnet:mainfrom
simonrozsival:dev/simonrozsival/android-x509-gref-leak-fix

Conversation

@simonrozsival
Copy link
Copy Markdown
Member

@simonrozsival simonrozsival commented May 16, 2026

Fixes a JNI global-reference leak in the Android X509 chain PAL.

AndroidCryptoNative_X509ChainGetCertificates returns caller-owned JNI global refs for the certificates in the built chain. Constructing X509Certificate2 from those pointers duplicates the refs for managed ownership, but the original returned refs were never released. This releases those temporary refs after the managed certificates have been created.

Regression test

BuildChainRepeatedly_DoesNotExhaustGlobalReferences is a new [OuterLoop] Android-only test that builds 8,600 6-certificate PKI chains via CertificateAuthority.BuildPrivatePki. Without this PR each successful build leaks 6 JNI global references, so 8,600 iterations would leak 51,600 — past Android's default 51,200 entry limit.

Validation

Run on a local Android emulator (API 36, arm64-v8a, Apple Silicon host, Mono interpreter):

  • With this PR's fix: ✅ passes (xharness exit 0) in ~9.5 minutes.
  • Without the fix (verified by reverting the new try/finally block in Interop.X509Chain.cs): ❌ crashes between iter ~7,750 and ~8,500 with JNI ERROR (app bug): global reference table overflow (max=51200) and a tombstone; xharness exit 80 (APP_CRASH).

Copilot AI review requested due to automatic review settings May 16, 2026 13:01
@github-actions github-actions Bot added the area-crossgen2-coreclr only use for closed issues label May 16, 2026
@simonrozsival simonrozsival marked this pull request as draft May 16, 2026 13:03
The Android X509 chain PAL returns JNI global references for chain certificates. Creating X509Certificate2 from those pointers duplicates the references for managed ownership, leaving the native-returned references caller-owned. Release those temporary references after conversion so repeated chain builds do not exhaust ART global refs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@simonrozsival simonrozsival force-pushed the dev/simonrozsival/android-x509-gref-leak-fix branch from 5853fff to cce39fd Compare May 16, 2026 13:05
@simonrozsival simonrozsival reopened this May 16, 2026
@simonrozsival simonrozsival force-pushed the dev/simonrozsival/android-x509-gref-leak-fix branch from 5853fff to cce39fd Compare May 16, 2026 13:10
@simonrozsival simonrozsival removed the request for review from MichalStrehovsky May 16, 2026 13:10
@simonrozsival simonrozsival requested a review from vcsjones May 16, 2026 13:11
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @dotnet/ncl, @bartonjs, @vcsjones
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…ref limit

Adds [OuterLoop] regression test
BuildChainRepeatedly_DoesNotExhaustGlobalReferences that builds a
6-certificate chain (root + 4 intermediates + endCert) 8,600 times.

Without the gref-release fix in PR dotnet#128284,
AndroidCryptoNative_X509ChainGetCertificates returns caller-owned JNI
global references that X509Certificate2 then duplicates, leaving the
native-returned refs orphaned. A 6-cert chain leaks 6 grefs per
successful build; 8,600 builds therefore leak 51,600 references, which
exceeds Android's default global-reference table limit (51,200) and
aborts the process with 'global reference table overflow (max=51200)'.

Threshold validation against the unfixed code path:
  - 8,400 iterations completed successfully
  - 8,500 iterations crashed with the gref-table overflow

With the managed try/finally cleanup in place
(Interop.X509Chain.X509ChainGetCertificates), 8,600 iterations complete
cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
simonrozsival added a commit to simonrozsival/runtime that referenced this pull request May 18, 2026
…ref limit

Adds [OuterLoop] companion test
BuildChainRepeatedlyOnFailure_DoesNotExhaustGlobalReferences that builds
a 6-certificate chain and forces it to fail (VerificationTime far in the
future, well outside cert validity) 60,000 times.

The success-path test (BuildChainRepeatedly_DoesNotExhaustGlobalReferences)
exercises X509ChainGetCertificates, which PR dotnet#128284 fixed. The failure
path is a different code path: managed code skips GetCertificates entirely
when chain.Build returns false, and instead invokes X509ChainGetErrors
and walks the resulting error list. The pal_x509chain.c side allocates
xmalloc'd UTF-16 message buffers per error; the managed side is expected
to Marshal.FreeHGlobal each one.

The failure path doesn't currently leak — the test is defensive against
future regressions that introduce a per-iteration leak on this path
(e.g., a forgotten Marshal.FreeHGlobal, a missing ReleaseGRef on a
throwable, or a regression in X509ChainBuild's exception-handling flow).
60,000 iterations is chosen to overflow the 51,200-entry JNI global
reference table even under a worst-case 1-gref-per-iteration regression.
The 6-cert chain mirrors the success-path test so the two tests cover
the same chain shape.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 13:05
@simonrozsival simonrozsival force-pushed the dev/simonrozsival/android-x509-gref-leak-fix branch from 6b38047 to c36b4e7 Compare May 18, 2026 13:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread src/libraries/System.Security.Cryptography/tests/X509Certificates/ChainTests.cs Outdated
@simonrozsival
Copy link
Copy Markdown
Member Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@simonrozsival
Copy link
Copy Markdown
Member Author

Failing tests in runtime-extra-platforms are unrelated

Copilot AI review requested due to automatic review settings May 19, 2026 14:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

}
}

[PlatformSpecific(TestPlatforms.Android)]
// PAL cleanup, so 8,600 builds would leak 51,600 certificate refs. 8,400 iterations
// completed without the fix during threshold testing, while 8,500 iterations crashed
// with "global reference table overflow (max=51200)".
// This tests runs for ~10 minutes on an Android emulator.
This was referenced May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants