crypto/x509: "certificate is not standards compliant" on MacOS #51991

Open
dims opened this issue Mar 28, 2022 · 23 comments
Labels
NeedsInvestigation OS-Darwin
Milestone

Comments

@dims dims commented Mar 28, 2022

We hit an error with a unit test we had in Kubernetes and started looking at the impact on end users of Kubernetes if the problem is not resolved by the time Kubernetes 1.24 is released. For more context, please see the Kubernetes issue: kubernetes/kubernetes#108956

What version of Go are you using (go version)?

$ go version
go version go1.18 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/dims/Library/Caches/go-build"
GOENV="/Users/dims/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/dims/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/dims/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.18/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.18/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/qw/pkzvlrfs7rn7h6r1x7r57_rw0000gn/T/go-build1513460199=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Please see https://go.dev/play/p/w4rr43vQv7d

What did you expect to see?

success

What did you see instead?

error: x509: “no-sct.badssl.com” certificate is not standards compliant

Author

@dims dims commented Mar 28, 2022

The error seems to be introduced here: feb024f#diff-9e2a37df9605e8b207365b51999e6b14e1f5db72b27ad33514dbac502d477c25R212

@liggitt summarized the ask here: kubernetes/kubernetes#108956 (comment) and kubernetes/kubernetes#108956 (comment). The question is what impact this will have on Kubernetes users who had no issues with go1.17 builds when they try the kubectl command built using go1.18.

Worst case, we would like to document the scenarios under which users will hit the "certificate is not standards compliant" error that they were not hitting before.

thanks!

@seankhliao seankhliao added the NeedsInvestigation label Mar 28, 2022
Member

@seankhliao seankhliao commented Mar 28, 2022

cc @golang/security

Contributor

@FiloSottile FiloSottile commented Mar 28, 2022

Can you share the actual affected certificate? You are unlikely to be hitting the same root cause as no-sct.badssl.com in a unit test, because SCT rules are only enforced for WebPKI certificates.

Contributor

@liggitt liggitt commented Mar 28, 2022

We only noticed the issue in a unit test checking a cert we expected to be considered invalid (it was a negative test of a cert not signed by a trusted root), so at first we thought we just needed to make the unit test more tolerant of various error messages.

In spelunking around the change in the message, we also ran across https://groups.google.com/g/golang-nuts/c/RGghq2gTWss/m/7GsudTfCAgAJ which indicated requests that previously succeeded were now failing.

Before papering over the change in our unit test by tolerating more validation error messages, I wanted to understand more about which certificates the go validator considers valid that the macOS validator does not. (kubernetes/kubernetes#108956 (comment))

Member

@rolandshoemaker rolandshoemaker commented Mar 28, 2022

Apple enforces their SCT requirements on all publicly trusted certificates as part of its base TLS policy (which we use via SecPolicyCreateSSL, since we are generally targeting the web PKI.) Publicly trusted certificates that lack embedded SCTs are very rare, making up something like 0.01% of all publicly trusted certs, but they are out there (the AWS example being probably the most common.)

This will only affect users who are using the bare system certificate pool and are validating certificates which totally lack embedded SCTs, with the server providing them via a TLS extension.

A TODO in crypto/x509/root_darwin.go notes that we may want to support passing along SCTs delivered this way, since we have no way of telling Apple to disable this particular policy. (I would need to double check, but it's possible these requirements are not enforced if you use SecPolicyCreateBasicX509; that would likely also disable all of the other web PKI policies that we want applied.) How to do that is rather nuanced, though, since we'd only do anything with them on macOS.
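
To tell which certificates fall into this category, it can help to check whether a leaf carries the embedded SCT list extension at all. The following is a minimal sketch, not part of crypto/x509: `hasEmbeddedSCTs`, `sctSource`, and `newTestCert` are hypothetical helpers, and the check only looks for the presence of the RFC 6962 extension OID (1.3.6.1.4.1.11129.2.4.2) without validating any SCT.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
	"math/big"
	"time"
)

// oidEmbeddedSCTList is the X.509 extension OID for embedded SCTs (RFC 6962).
var oidEmbeddedSCTList = asn1.ObjectIdentifier{1, 3, 6, 1, 4, 1, 11129, 2, 4, 2}

// hasEmbeddedSCTs reports whether cert carries the embedded SCT list
// extension. It checks presence only and does not validate the SCTs.
func hasEmbeddedSCTs(cert *x509.Certificate) bool {
	for _, ext := range cert.Extensions {
		if ext.Id.Equal(oidEmbeddedSCTList) {
			return true
		}
	}
	return false
}

// sctSource (hypothetical helper) classifies where SCTs for a leaf came from.
// tlsSCTs is the list delivered in the TLS handshake, if any.
func sctSource(leaf *x509.Certificate, tlsSCTs [][]byte) string {
	switch {
	case hasEmbeddedSCTs(leaf):
		return "embedded"
	case len(tlsSCTs) > 0:
		return "tls-extension-only"
	default:
		return "none"
	}
}

// newTestCert builds a throwaway self-signed certificate for illustration,
// optionally with a placeholder SCT list extension.
func newTestCert(withSCTExtension bool) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "sct-demo.example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	if withSCTExtension {
		// Placeholder value: an OCTET STRING holding an empty SCT list.
		tmpl.ExtraExtensions = []pkix.Extension{
			{Id: oidEmbeddedSCTList, Value: []byte{0x04, 0x02, 0x00, 0x00}},
		}
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	for _, withSCT := range []bool{true, false} {
		cert, err := newTestCert(withSCT)
		if err != nil {
			panic(err)
		}
		fmt.Println(sctSource(cert, nil))
	}
}
```

Against a live connection, the two inputs would come from `tls.ConnectionState`: the leaf is `PeerCertificates[0]` and the handshake-delivered list is `SignedCertificateTimestamps`.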

Member

@rolandshoemaker rolandshoemaker commented Mar 28, 2022

Slight side note: the AWS case is a weird one, because I don't think they are sending SCTs at all, despite using publicly trusted certs, so even implementations that know how to pipe SCTs passed via TLS extensions wouldn't work on macOS 🤷.

Contributor

@FiloSottile FiloSottile commented Mar 29, 2022

I don't think the SCT policy explains the new error in kubernetes/kubernetes#108956 though, because that's not a publicly trusted certificate. If you want to extract that certificate and share it with us, we can tell you why that one is failing, too.

> Before papering over the change in our unit test by tolerating more validation error messages, I wanted to understand more about which certificates the go validator considers valid that the macOS validator does not.

We can't really answer this exhaustively, because the macOS verifier has a number of evolving policies that change between OS versions. Note that the platform verifier is only used when the system roots are involved, so behaving like the system is what's expected. I assume k8s clusters use private CAs configured through config.RootCAs for most purposes, which would be unaffected by this.
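
For illustration, here is a minimal sketch of the private-CA case: a leaf issued by a throwaway CA is verified against a pool that contains only that CA, so the system roots are not involved. `newPrivateCA`, `issueLeaf`, and `verifyWithPrivateRoots` are hypothetical helpers, and the hostnames are placeholders.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newPrivateCA creates a throwaway CA certificate and key for illustration.
func newPrivateCA() (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example private CA"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueLeaf signs a server certificate for host with the given CA.
func issueLeaf(ca *x509.Certificate, caKey *ecdsa.PrivateKey, host string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyWithPrivateRoots verifies leaf for host against a pool that holds
// only ca. Because Roots is set explicitly, system roots play no part.
func verifyWithPrivateRoots(leaf, ca *x509.Certificate, host string) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots, DNSName: host})
	return err
}

func main() {
	ca, caKey, err := newPrivateCA()
	if err != nil {
		panic(err)
	}
	leaf, err := issueLeaf(ca, caKey, "api.cluster.example")
	if err != nil {
		panic(err)
	}
	fmt.Println("verify error:", verifyWithPrivateRoots(leaf, ca, "api.cluster.example"))
}
```

In a TLS client the same pool would be assigned to `tls.Config.RootCAs`.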

Contributor

@FiloSottile FiloSottile commented Mar 29, 2022

> Slight side note: the AWS case is a weird one, because I don't think they are sending SCTs at all, despite using publicly trusted certs, so even implementations that know how to pipe SCTs passed via TLS extensions wouldn't work on macOS 🤷.

Customers should probably reach out to AWS about this. As a short-term workaround, it should be possible to add the Amazon root CAs to an x509.SystemCertPool() and use it as config.RootCAs so that the Go verifier is used as well as the system one. (Don't start with an empty pool, so that if the root changes you have a chance at not breaking.)
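
A minimal sketch of that workaround follows. `rootsWithExtras` and `clientConfig` are hypothetical helper names, and `exampleRootPEM` generates a throwaway root that stands in for the real Amazon root CA PEM data, which you would load from a file you trust.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// rootsWithExtras starts from the system pool (not an empty one) and adds
// the PEM-encoded roots in extraPEM.
func rootsWithExtras(extraPEM []byte) (*x509.CertPool, error) {
	pool, err := x509.SystemCertPool()
	if err != nil {
		return nil, fmt.Errorf("loading system roots: %w", err)
	}
	if !pool.AppendCertsFromPEM(extraPEM) {
		return nil, errors.New("no certificates found in extra PEM data")
	}
	return pool, nil
}

// clientConfig returns a TLS config whose RootCAs is the augmented pool.
func clientConfig(extraPEM []byte) (*tls.Config, error) {
	pool, err := rootsWithExtras(extraPEM)
	if err != nil {
		return nil, err
	}
	return &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12}, nil
}

// exampleRootPEM generates a throwaway root certificate in PEM form.
// It is a stand-in for illustration only.
func exampleRootPEM() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "stand-in root"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	extra, err := exampleRootPEM()
	if err != nil {
		panic(err)
	}
	cfg, err := clientConfig(extra)
	if err != nil {
		panic(err)
	}
	fmt.Println("custom root pool set:", cfg.RootCAs != nil)
}
```

The resulting config can be passed to `tls.Dial` or set as `TLSClientConfig` on an `http.Transport`.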

Contributor

@liggitt liggitt commented Mar 29, 2022

It looks like this change means we no longer get typed TLS errors (e.g. x509.UnknownAuthorityError) when validating using system roots.

That means that special handling of those errors (logging or other fallback paths) that previously worked no longer works in go1.18.

edit: I'll open a separate issue for that, since that is distinct from the "certificate considered valid in go1.17" → "certificate considered invalid in go1.18" issue
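
As a sketch of the kind of special handling that is affected: code that matches typed errors with `errors.As` needs a fallback branch for errors that arrive untyped. `classifyVerifyError`, `selfSignedLeaf`, and `verifyUntrusted` are hypothetical helpers; the typed result below comes from the Go verifier, which is used here because an explicit (empty) root pool is supplied.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// classifyVerifyError maps a verification error to a coarse category,
// falling back to "untyped" when no known x509 error type matches.
func classifyVerifyError(err error) string {
	if err == nil {
		return "ok"
	}
	var unknownAuthority x509.UnknownAuthorityError
	var hostname x509.HostnameError
	var invalid x509.CertificateInvalidError
	switch {
	case errors.As(err, &unknownAuthority):
		return "unknown authority"
	case errors.As(err, &hostname):
		return "hostname mismatch"
	case errors.As(err, &invalid):
		return "invalid certificate"
	default:
		return "untyped"
	}
}

// selfSignedLeaf builds a throwaway self-signed certificate for illustration.
func selfSignedLeaf() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "untrusted.example"},
		DNSNames:     []string{"untrusted.example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyUntrusted verifies cert against an explicit empty pool, so no root
// can match and the Go verifier reports the failure.
func verifyUntrusted(cert *x509.Certificate) error {
	_, err := cert.Verify(x509.VerifyOptions{Roots: x509.NewCertPool()})
	return err
}

func main() {
	cert, err := selfSignedLeaf()
	if err != nil {
		panic(err)
	}
	fmt.Println(classifyVerifyError(verifyUntrusted(cert)))
	// An error carrying only a message falls through to the default branch.
	fmt.Println(classifyVerifyError(errors.New("x509: certificate is not standards compliant")))
}
```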

Contributor

@liggitt liggitt commented Mar 29, 2022

opened #52010 for the untyped error issue

Contributor

@liggitt liggitt commented Mar 29, 2022

> I assume k8s clusters use private CAs configured through config.RootCAs for most purposes, which would be unaffected by this.

I also expect that to be true in most scenarios (and in scenarios where it isn't, I'd expect the certs issued by public CAs to be compatible with system roots, though the referenced AWS issue is evidence that not all certs issued by public CAs are valid).

For k8s' use, I don't think this issue is very significant.

Member

@rolandshoemaker rolandshoemaker commented Mar 29, 2022

Oh, that particular test certificate (in TestTLSConfig) is non-compliant in a handful of ways. It's self-signed, but isCA is false, it is missing the cert sign key usage, and its validity period is likely too long (although I'm not sure if macOS enforces that for self-signed certs.)
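
The three properties listed above can be checked mechanically before handing a test certificate to a verifier. This is a minimal sketch: `isSelfSigned`, `complianceProblems`, and `newSelfSigned` are hypothetical helpers, and `assumedMaxValidity` (398 days) is an assumption for illustration, not a confirmed macOS limit for self-signed certificates.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// assumedMaxValidity is an assumption for illustration only.
const assumedMaxValidity = 398 * 24 * time.Hour

// isSelfSigned reports whether cert names itself as issuer and its signature
// verifies under its own public key.
func isSelfSigned(cert *x509.Certificate) bool {
	if !bytes.Equal(cert.RawIssuer, cert.RawSubject) {
		return false
	}
	return cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature) == nil
}

// complianceProblems lists which of the three properties apply to cert.
func complianceProblems(cert *x509.Certificate, maxValidity time.Duration) []string {
	var problems []string
	if isSelfSigned(cert) {
		if !cert.BasicConstraintsValid || !cert.IsCA {
			problems = append(problems, "self-signed but isCA is false")
		}
		if cert.KeyUsage&x509.KeyUsageCertSign == 0 {
			problems = append(problems, "missing cert sign key usage")
		}
	}
	if cert.NotAfter.Sub(cert.NotBefore) > maxValidity {
		problems = append(problems, "validity period exceeds the assumed limit")
	}
	return problems
}

// newSelfSigned builds a throwaway self-signed certificate. With compliant
// set to false it has all three properties described above.
func newSelfSigned(compliant bool) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "selfsigned.example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(10 * 365 * 24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	if compliant {
		tmpl.NotAfter = time.Now().Add(90 * 24 * time.Hour)
		tmpl.KeyUsage |= x509.KeyUsageCertSign
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := newSelfSigned(false)
	if err != nil {
		panic(err)
	}
	for _, p := range complianceProblems(cert, assumedMaxValidity) {
		fmt.Println(p)
	}
}
```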

@calvinbui calvinbui commented Mar 30, 2022

We've had the same issue with connecting to AWS Elasticache Redis servers. Amazon will not support SCTs to avoid publishing customer cluster names in a public log. The connection previously worked fine in 1.17.

@jimidle jimidle commented Mar 31, 2022

A little more background about AWS, or at least how we were connecting to the Neptune graph database service. In case it helps anyone.

Because Neptune is a little "light" on security, you can only connect to it through local/private VPC. This isn't very useful for developers, so we have a VPN to a bastion host for a development only instance of Neptune (Neptune does not have any local installation - it is an AWS service only).

It seems that AWS did not feel the need to put any SCTs into the Neptune cert, thinking it would only see connections from the secured VPC, and so our connections via Go will fail (it is fine from Java, for instance).

We have raised a ticket with AWS about this. There isn't much that can be done about that in Go.

As this is a developer only connection, we have created a reverse proxy with a local CA root. This allows the connection for developers. Hokey, but does what we want for a developer connection. The real solution is of course for AWS to re-issue their certificate, however they say they don't want SCTs in order to avoid placing customer cluster names in a public log (see #51991 (comment) )

@bcmills bcmills changed the title from crypto/x509: "certificate is not standards compliant" on MacOS only with golang 1.18 to crypto/x509: "certificate is not standards compliant" on MacOS Apr 20, 2022
@bcmills bcmills added this to the Go1.19 milestone Apr 20, 2022
Member

@rolandshoemaker rolandshoemaker commented Apr 20, 2022

As far as I can tell (this is unbearably painful to diagnose), this seems possibly to be an issue with 10.15.1, which is what the darwin-amd64-10_15 builder is running. I suspect that updating the builder to use 10.15.6 would fix this, but I have absolutely no clue how viable that is.

Member

@bcmills bcmills commented Apr 20, 2022

> I suspect that updating the builder to use 10.15.6 would fix this, but I have absolutely no clue how viable that is.

@golang/release, can you weigh in on that? (How hard is the macOS 10.15 image to update?)

Contributor

@heschi heschi commented Apr 20, 2022

For amd64 I think it's maybe a day's work, if we're willing to cut over all the builders at once. Rolling it out gradually will be more unpleasant. I haven't read this issue to judge whether it's a good use of time.

Member

@rolandshoemaker rolandshoemaker commented Apr 21, 2022

I don't think there is really any other way to address this issue: given how deeply integrated the TLS client is in the toolchain, there isn't really any (safe) way of silently handling/skipping these failures. It's not a high frequency flake though (it seems somewhat correlated with when new certificates are issued) so probably not super high priority.

Contributor

@FiloSottile FiloSottile commented Apr 21, 2022

> (it seems somewhat correlated with when new certificates are issued)

Can the machine reach the internet? That sounds consistent with a bloom filter window miss on the Apple Valid system, which leads to an OCSP connection to the CA. If that fails, I could see it leading to a vague error like this.

@jbg jbg commented May 11, 2022

In some cases certificates may be deliberately excluded from CT logs to avoid publishing a detailed map of internal infrastructure. (In our case, the certs are associated with DNS names that are only resolvable internally, and which resolve to private IPs.)

e.g. AWS ACM allows disabling CT logging for this purpose, which will result in a valid certificate issued by a trusted CA but not listed in CT logs.

When trying to access a service with such a cert from Go (in our case, using a Terraform provider) on developer (darwin_arm64) machines, we get this "certificate is not standards compliant" error.

Is there any solution other than logging the certs? Is there any knob in the Go TLS client for turning off the check, which the TF provider could provide a config option to turn?

@jimidle jimidle commented May 11, 2022

Contributor

@neild neild commented May 11, 2022

Split off the builder flakes into a separate issue: #52854

@neild neild removed this from the Go1.19 milestone May 11, 2022
@bcmills bcmills added this to the Backlog milestone May 11, 2022
jedisct1 added a commit to DNSCrypt/dnscrypt-resolvers that referenced this issue Jun 20, 2022
…standards compliant"

This reverts commit 186c57d.

Apparently, the issue only happens on macOS

golang/go#51991