Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Prometheus Remote Write Exporter for Cortex] Fix Panic Issue in MutualTLS Test #315

Merged
merged 10 commits into from
Sep 3, 2020

Conversation

ercl
Copy link
Contributor

@ercl ercl commented Aug 28, 2020

This PR resolves the test panic described in #310.

The panic occurs because the functions for generating certificates are slower on the 386 build and can cause the MutualTLS test to timeout. This PR updates the test to make certificate generation optional, and adds a testdata folder with static certificates and keys as well as a bash script for generating new certificates.

@ercl
Copy link
Contributor Author

ercl commented Aug 29, 2020

I changed the key generation from rsa 4096 to ecdsa p256, which improved the performance substantially. This may be because Go provides an assembly implementation of P256: See golang/go/src/crypto/elliptic.

Here are the results of running GOARCH=386 go test -run MutualTLS -cpuprofile cpu.prof -v and go tool pprof cpu.prof

With RSA 4096 (original code)
Duration: 31.97s, Total samples = 31.79s (99.44%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top 20
Showing nodes accounting for 31.26s, 98.33% of 31.79s total
Dropped 73 nodes (cum <= 0.16s)
Showing top 20 nodes out of 26
      flat  flat%   sum%        cum   cum%
    28.15s 88.55% 88.55%     28.15s 88.55%  math/big.addMulVVW
     2.85s  8.97% 97.51%     31.09s 97.80%  math/big.nat.montgomery
     0.18s  0.57% 98.08%      0.18s  0.57%  math/big.subVV
     0.05s  0.16% 98.24%      0.35s  1.10%  math/big.nat.divBasic
     0.03s 0.094% 98.33%     31.13s 97.92%  math/big.nat.expNNMontgomery
         0     0% 98.33%     31.36s 98.65%  crypto/rand.Prime
         0     0% 98.33%      0.32s  1.01%  crypto/rsa.(*PrivateKey).Sign
         0     0% 98.33%     31.36s 98.65%  crypto/rsa.GenerateKey (inline)
         0     0% 98.33%     31.36s 98.65%  crypto/rsa.GenerateMultiPrimeKey
         0     0% 98.33%      0.19s   0.6%  crypto/rsa.SignPKCS1v15
         0     0% 98.33%      0.31s  0.98%  crypto/rsa.decrypt
         0     0% 98.33%      0.32s  1.01%  crypto/rsa.decryptAndCheck
         0     0% 98.33%      0.19s   0.6%  crypto/x509.CreateCertificate
         0     0% 98.33%     31.55s 99.25%  go.opentelemetry.io/contrib/exporters/metric/cortex.TestMutualTLS.func1
         0     0% 98.33%      3.12s  9.81%  go.opentelemetry.io/contrib/exporters/metric/cortex.generateCACertFiles
         0     0% 98.33%     31.55s 99.25%  go.opentelemetry.io/contrib/exporters/metric/cortex.generateCertFiles
         0     0% 98.33%     19.24s 60.52%  go.opentelemetry.io/contrib/exporters/metric/cortex.generateClientCertFiles
         0     0% 98.33%      9.19s 28.91%  go.opentelemetry.io/contrib/exporters/metric/cortex.generateServingCertFiles
         0     0% 98.33%      0.32s  1.01%  math/big.(*Int).Exp
         0     0% 98.33%     31.34s 98.58%  math/big.(*Int).ProbablyPrime
With RSA 1024
Duration: 503.57ms, Total samples = 350ms (69.50%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top 20
Showing nodes accounting for 350ms, 100% of 350ms total
Showing top 20 nodes out of 79
 flat  flat%   sum%        cum   cum%
     200ms 57.14% 57.14%      200ms 57.14%  math/big.addMulVVW
      70ms 20.00% 77.14%      270ms 77.14%  math/big.nat.montgomery
      10ms  2.86% 80.00%       10ms  2.86%  encoding/asn1.MarshalWithParams
      10ms  2.86% 82.86%       10ms  2.86%  internal/bytealg.CountString
      10ms  2.86% 85.71%       10ms  2.86%  math/big.mulAddVWW
      10ms  2.86% 88.57%       10ms  2.86%  math/big.nat.cmp
      10ms  2.86% 91.43%      280ms 80.00%  math/big.nat.expNNMontgomery
      10ms  2.86% 94.29%       10ms  2.86%  runtime.(*mspan).sweep
      10ms  2.86% 97.14%       10ms  2.86%  runtime.stkbucket
      10ms  2.86%   100%       10ms  2.86%  runtime/internal/atomic.Cas
         0     0%   100%      300ms 85.71%  crypto/rand.Prime
         0     0%   100%       20ms  5.71%  crypto/rsa.(*PrivateKey).Sign
         0     0%   100%      300ms 85.71%  crypto/rsa.GenerateKey
         0     0%   100%      300ms 85.71%  crypto/rsa.GenerateMultiPrimeKey
         0     0%   100%       10ms  2.86%  crypto/rsa.SignPKCS1v15
         0     0%   100%       10ms  2.86%  crypto/rsa.SignPSS
         0     0%   100%       20ms  5.71%  crypto/rsa.decrypt
         0     0%   100%       20ms  5.71%  crypto/rsa.decryptAndCheck
         0     0%   100%       10ms  2.86%  crypto/rsa.signPSSWithSalt
         0     0%   100%       10ms  2.86%  crypto/tls.(*Conn).Handshake
With ECDSA P256
Duration: 201.08ms, Total samples = 30ms (14.92%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top 20
Showing nodes accounting for 30ms, 100% of 30ms total
Showing top 20 nodes out of 37
      flat  flat%   sum%        cum   cum%
      20ms 66.67% 66.67%       20ms 66.67%  crypto/elliptic.p256ReduceDegree
      10ms 33.33%   100%       10ms 33.33%  vendor/golang.org/x/crypto/cryptobyte.(*Builder).addLengthPrefixed
         0     0%   100%       20ms 66.67%  crypto/ecdsa.Verify
         0     0%   100%       20ms 66.67%  crypto/ecdsa.VerifyASN1
         0     0%   100%       20ms 66.67%  crypto/ecdsa.verify (inline)
         0     0%   100%       20ms 66.67%  crypto/ecdsa.verifyGeneric
         0     0%   100%       20ms 66.67%  crypto/elliptic.p256Curve.ScalarMult
         0     0%   100%       20ms 66.67%  crypto/elliptic.p256Mul
         0     0%   100%       10ms 33.33%  crypto/elliptic.p256PointAdd
         0     0%   100%       10ms 33.33%  crypto/elliptic.p256PointDouble
         0     0%   100%       20ms 66.67%  crypto/elliptic.p256ScalarMult
         0     0%   100%       30ms   100%  crypto/tls.(*Conn).Handshake
         0     0%   100%       10ms 33.33%  crypto/tls.(*Conn).clientHandshake
         0     0%   100%       20ms 66.67%  crypto/tls.(*Conn).serverHandshake
         0     0%   100%       10ms 33.33%  crypto/tls.(*Conn).verifyServerCertificate
         0     0%   100%       10ms 33.33%  crypto/tls.(*certificateRequestMsgTLS13).marshal
         0     0%   100%       10ms 33.33%  crypto/tls.(*certificateRequestMsgTLS13).marshal.func1
         0     0%   100%       10ms 33.33%  crypto/tls.(*certificateRequestMsgTLS13).marshal.func1.1
         0     0%   100%       10ms 33.33%  crypto/tls.(*certificateRequestMsgTLS13).marshal.func1.1.3
         0     0%   100%       10ms 33.33%  crypto/tls.(*certificateRequestMsgTLS13).marshal.func1.1.3.1

The build doesn't pass currently, but it is not due to the MutualTLS test.

EDIT: Now passes CI tests after running them a second time. The previous commit didn't pass, but it wasn't due to the MutualTLS test:

Log from CI
go test ./... + race in ./instrumentation/host
=== RUN   TestHostCPU
--- PASS: TestHostCPU (1.02s)
=== RUN   TestHostMemory
    host_test.go:183: 
        	Error Trace:	host_test.go:183
        	Error:      	Relative error is too high: 0.05 (expected)
        	            	        < 0.14240003154297343 (actual)
        	Test:       	TestHostMemory
--- FAIL: TestHostMemory (3.29s)
=== RUN   TestHostNetwork
--- PASS: TestHostNetwork (1.02s)
FAIL
FAIL	go.opentelemetry.io/contrib/instrumentation/host	5.388s

@MrAlias
Copy link
Contributor

MrAlias commented Aug 30, 2020

With the discovery of how to stop certificate creation from completely stalling the build, do you think we should still check in certificates?

It seems like if we are able to generate this for the test it will avoid the issue of the certificates needing to be updated in the future. Though, I, myself, am a bit undecided as this approach with static certificates will remove some uncertainty from what is being tested.

I'm interested what you think @ercl and @evantorrie.

@ercl
Copy link
Contributor Author

ercl commented Aug 31, 2020

I think that it would be better to remove the static files. It would remove the need to regenerate the certificates like you said and would make the MutualTLS test more consistent with other tests, which also create files when needed.

@evantorrie
Copy link
Contributor

I'm happy either way, but since we assume (?) that the Go crypto and tls cert libraries are well-tested, I think it's fine for us to generate the keys/certs on the fly and use them rather than checking in an additional 6-7 testdata/ files.

@MrAlias
Copy link
Contributor

MrAlias commented Sep 3, 2020

@ercl this looks good to go. Can you sync it with the master branch?

@ercl
Copy link
Contributor Author

ercl commented Sep 3, 2020

Merged!

@MrAlias MrAlias added this to Needs Triage in GA Burndown Sep 3, 2020
@MrAlias MrAlias added bug Something isn't working release:required-for-ga labels Sep 3, 2020
@MrAlias MrAlias moved this from Needs Triage to Reviewer approved in GA Burndown Sep 3, 2020
@MrAlias MrAlias merged commit 6f61c54 into open-telemetry:master Sep 3, 2020
GA Burndown automation moved this from Reviewer approved to Done Sep 3, 2020
@MrAlias MrAlias added this to Done in OpenTelemetry Go RC Oct 13, 2020
plantfansam referenced this pull request in plantfansam/opentelemetry-go-contrib Mar 18, 2022
* change trace.WithAttributes to append values instead of replacing

* improve doc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants