Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

android: more detail for test config verify exception #75

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

cpu
Copy link
Member

@cpu cpu commented Mar 12, 2024

Recently CI started failing the real world verification suite on Android (#68). The root cause was the vendored Let's Encrypt end entity certificate expiring. Figuring this out was hard for two main reasons:

  1. CI was eating the Android logs we needed to see the error message. This has since been fixed (Update Android test action and fix outstanding CI issues #72).

  2. The Android verifier was surfacing the expiration as an unknown issuer error with a generic "Chain validation failed" message.

This branch attempts to fix item 2 by surfacing a more relevant error message.

Unfortunately doing so is a little bit annoying. The checkServerTrusted call throws an exception with ~3 layers of wrapping before you get at the CertificateExpiredException that's the root cause.

Since non-test configurations should never have an expired exception thrown at this stage (recall we check this explicitly earlier in processing), we gate the more complicated exception "digging" based on BuildConfig.TEST.

This should be a good balance of:

  • not changing existing behaviour, or introducing complexity for normal validation
  • surfacing a better error message when test certificates expire and start to break CI.

Before this fix running the verification test suite back at 6cd0232 before 15d487a landed would produce output like:

03-12 15:05:38.900 18229 18250 W rustls_platform_verif..: certificate was not trusted: java.security.cert.CertificateException: Chain validation failed
03-12 15:05:38.900 18229 18250 E rustls_platform_verif..: failed to verify TLS certificate: invalid peer certificate: UnknownIssuer
03-12 15:05:38.900 18229 18250 E rustls_platform_verif..: test panic: assertion `left == right` failed
03-12 15:05:38.900 18229 18250 E rustls_platform_verif..:   left: Err(InvalidCertificate(UnknownIssuer))
03-12 15:05:38.900 18229 18250 E rustls_platform_verif..:  right: Ok(())
03-12 15:05:38.901 18229 18248 E rustls_platform_verif..: real world: test failed

Afterwards it logs output like:

03-12 15:06:05.297 18482 18503 E rustls_platform_verif..: failed to verify TLS certificate: invalid peer certificate: Expired
03-12 15:06:05.297 18482 18503 E rustls_platform_verif..: test panic: assertion `left == right` failed
03-12 15:06:05.297 18482 18503 E rustls_platform_verif..:   left: Err(InvalidCertificate(Expired))
03-12 15:06:05.297 18482 18503 E rustls_platform_verif..:  right: Ok(())
03-12 15:06:05.298 18482 18501 E rustls_platform_verif..: real world: test failed

@cpu cpu mentioned this pull request Mar 12, 2024
@cpu cpu force-pushed the cpu-obvious-ci-expiry-err_dev branch from f5d114b to 75964ec Compare March 15, 2024 13:51
@cpu
Copy link
Member Author

cpu commented Mar 15, 2024

I'm starting to wonder if instead of these gymnastics we should just back out the time argument being fed through to the CertificateVerifier until we work out a broader solution to #59. That would let the early checkValidity call surface the better error early on.

Recently CI started failing the real world verification suite on
Android. The root cause was the vendored Let's Encrypt end entity
certificate expiring. Figuring this out was hard for two main reasons:

1. CI was eating the Android logs we needed to see the error message.
   This has since been fixed.

2. The Android verifier was surfacing the expiration as an unknown
   issuer error with a generic "Chain validation failed" message.

This branch attempts to fix item 2 by surfacing a more relevant error
message.

Unfortunately doing so is a little bit annoying. The
`checkServerTrusted` call throws an exception with ~3 layers of wrapping
before you get at the `CertificateExpiredException` that's the root
cause.

Since non-test configurations should never have an expired exception
thrown at this stage (recall we check this explicitly earlier in
processing), we gate the more complicated excepting "digging" based on
`BuildConfig.TEST`.

This should be a good balance of:

* not changing existing behaviour, or introducing complexity for normal
  validation
* surfacing a better error message when test certificates expire and
  start to break CI.
@cpu cpu force-pushed the cpu-obvious-ci-expiry-err_dev branch from 75964ec to 0bd1947 Compare March 15, 2024 15:03
@cpu cpu self-assigned this Mar 25, 2024
@cpu
Copy link
Member Author

cpu commented Mar 29, 2024

I'm starting to wonder if instead of these gymnastics we should just back out the time argument being fed through to the CertificateVerifier until we work out a broader solution to #59.

@complexspaces WDYT ☝️ vs this branch? In a perfect world we'd fix #59 instead of doing either but I don't think I'll personally be able to find time to work on that solution and it would be nice to have CI report the correct failure before it happens on April 26th (and every 90d afterwards).

@complexspaces
Copy link
Collaborator

WDYT ☝️ vs this branch?

I'm not pushed either way, honestly. If we backed the time parameter out we should keep the root error "digging" because I think no matter the route, we should try and return a clear expiry error as the main priority. What do you think about just merging this to keep CI nicer and maintain the "good" errors most end users will see?

@cpu
Copy link
Member Author

cpu commented Apr 10, 2024

What do you think about just merging this to keep CI nicer and maintain the "good" errors most end users will see?

That works for me :-) If you have a chance to put a review on this branch I'll merge when ready.

@cpu cpu merged commit 7d334a7 into rustls:main Apr 10, 2024
15 checks passed
@cpu cpu deleted the cpu-obvious-ci-expiry-err_dev branch April 10, 2024 21:43
@cpu
Copy link
Member Author

cpu commented Apr 10, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants