Http client certificate authorization regression in .NET 5 #47580

Closed
SukharevAndrey opened this issue Jan 28, 2021 · 9 comments · Fixed by #47729

Description

After updating our project to .NET 5, we found a serious regression when sending HTTP requests that use client certificate authorization through HttpClient to our partner's API. We were forced to roll back to .NET Core 3.1, where everything works normally.

The problem is a little tricky to reproduce. It reproduces only when the client runs on Linux (we use an official Docker image based on Debian Buster). It does not reproduce on Windows 10.

What it looked like:
After we released an updated version of our service, we noticed that HTTP requests with certificate authorization started to fail (status code 401) after several successful attempts and then never succeeded again until the service was redeployed. We then debugged the application on Windows, but there it worked just fine.

To reproduce the problem on Linux, we created a simple console application. The code creates an HttpClient with a certificate authorization handler in a loop and makes a request to the partner's API. Only the first request succeeded; the others failed with a 401 error.
If we use the same HttpClient for every request, everything works fine.
If we create two identical HTTP handlers with corresponding HttpClients and send requests one by one, requests made by the first client succeed while requests made by the second one fail (the order does not matter). If we clear the SSL session cache using reflection before making requests with the other client, it also works fine. You can see that version of the code in the repository below.
With .NET Core 3.1, all requests succeed.

To investigate the problem ourselves, we tried debugging the .NET internal library code, but we did not find any obvious errors. The only clue was the SSL cache: clearing the cache between requests solves the problem, but we have no idea what is wrong with it.

We obviously can't share the production certificate with its private key or provide access to the partner's API, so I've created an alternative server and certificate. The server is written in Go for simplicity, and the certificate is issued by Let's Encrypt. We weren't able to reproduce the problem using a certificate issued by a self-signed CA, so we suspect that the problem is somewhere in certificate chain handling.

You can reproduce the problem using the client, server, and certificate provided in the repository below following the instructions:
https://github.com/SukharevAndrey/CertificateAuthBugReproduce

Configuration

  • Which version of .NET is the code running on?
    .NET 5.0.2
  • What OS and version, and what distro if applicable?
    Debian Buster
  • What is the architecture (x64, x86, ARM, ARM64)?
    x64

Regression?

Yes, the problem is not observable using .NET Core 3.1.

Other information

If you use the provided client and server, you will see that the first two requests succeed, the following two throw an exception, and the last one succeeds again:

System.AggregateException: One or more errors occurred. (An error occurred while sending the request.)
 ---> System.Net.Http.HttpRequestException: An error occurred while sending the request.
 ---> System.IO.IOException: The decryption operation failed, see inner exception.
 ---> Interop+OpenSsl+SslException: Decrypt failed with OpenSSL error - SSL_ERROR_SSL.
 ---> Interop+Crypto+OpenSslCryptographicException: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
   --- End of inner exception stack trace ---
   at Interop.OpenSsl.Decrypt(SafeSslHandle context, Byte[] outBuffer, Int32 offset, Int32 count, SslErrorCode& errorCode)
   at System.Net.Security.SslStreamPal.EncryptDecryptHelper(SafeDeleteContext securityContext, ReadOnlyMemory`1 input, Int32 offset, Int32 size, Boolean encrypt, Byte[]& output, Int32& resultSize)
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
   at System.Net.Http.HttpConnection.FillAsync(Boolean async)
   at System.Net.Http.HttpConnection.ReadNextResponseHeaderLineAsync(Boolean async, Boolean foldedHeadersAllowed)
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.SendAsyncCore(HttpRequestMessage request, HttpCompletionOption completionOption, Boolean async, Boolean emitTelemetryStartStop, CancellationToken cancellationToken)

At the same time, the Go server prints this error to standard output:

http: TLS handshake error from 127.0.0.1:56082: tls: failed to verify client's certificate: x509: certificate signed by unknown authority
dotnet-issue-labeler bot added the area-System.Net.Http and untriaged labels on Jan 28, 2021
@ghost

ghost commented Jan 28, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.


@wfurt
Member

wfurt commented Jan 30, 2021

Thanks for the repro case. I was able to reproduce it on Ubuntu 18 as well.

In the failing handshake, the client sends only the provided certificate. In the successful case, it sends the full chain. I'll take a look at why this is happening and why the behavior is not consistent, as it should be.

It should be sufficient IMHO to feed the intermediate CA to the server (e.g. give it the ca.crt).

Note that the server does not send the chain either, although it should. It works because of the hijacked validation callback. In the "normal" case, .NET would try to fetch the chain, but it seems that Go does not attempt this and simply fails.
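
The difference between the two handshakes can be illustrated with Go's crypto/x509 verifier, since the repro server is written in Go. This is a minimal sketch, not the .NET code path and not the repro server itself. It builds a hypothetical root, intermediate, and client certificate in memory (all names and lifetimes are made up) and verifies the client certificate roughly the way a Go TLS server does: the configured client CAs act as roots, and any extra certificates the client sent act as intermediates.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// issue creates a throwaway certificate named cn. With a nil parent it is
// self-signed. All names and lifetimes are hypothetical test values.
func issue(cn string, serial int64, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		BasicConstraintsValid: true,
		IsCA:                  isCA,
		KeyUsage:              x509.KeyUsageDigitalSignature,
	}
	if isCA {
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
	}
	signer, signerKey := tmpl, key
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// pool collects certificates into a CertPool.
func pool(certs ...*x509.Certificate) *x509.CertPool {
	p := x509.NewCertPool()
	for _, c := range certs {
		p.AddCert(c)
	}
	return p
}

// verifyClient checks leaf against the trusted roots, using only the
// intermediates the client supplied (nil means it sent the leaf alone).
func verifyClient(leaf *x509.Certificate, roots, sent *x509.CertPool) error {
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:         roots,
		Intermediates: sent,
		KeyUsages:     []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
	})
	return err
}

// isUnknownAuthority reports whether err is x509's "unknown authority" error.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}

// verifyScenarios returns the verification result for three cases: leaf only,
// leaf plus intermediate, and leaf only with the intermediate trusted.
func verifyScenarios() (leafOnly, withChain, intermediateTrusted, setupErr error) {
	root, rootKey, err := issue("Test Root CA", 1, true, nil, nil)
	if err != nil {
		return nil, nil, nil, err
	}
	inter, interKey, err := issue("Test Intermediate CA", 2, true, root, rootKey)
	if err != nil {
		return nil, nil, nil, err
	}
	leaf, _, err := issue("test-client", 3, false, inter, interKey)
	if err != nil {
		return nil, nil, nil, err
	}
	leafOnly = verifyClient(leaf, pool(root), nil)
	withChain = verifyClient(leaf, pool(root), pool(inter))
	intermediateTrusted = verifyClient(leaf, pool(inter), nil)
	return
}

func main() {
	leafOnly, withChain, trusted, err := verifyScenarios()
	if err != nil {
		panic(err)
	}
	fmt.Println("leaf only:", leafOnly)
	fmt.Println("leaf + intermediate:", withChain)
	fmt.Println("leaf only, intermediate trusted by server:", trusted)
}
```

The first case corresponds to the failing handshake (only the leaf is presented and the verifier cannot build a path to a trusted root), the second to the successful one, and the third to the suggestion of feeding the intermediate CA to the server.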

@SukharevAndrey
Author

@wfurt Unfortunately, we can't change the software on the partner's side to fix the problem that way.
Sure, their server may not be configured properly either, but it works well enough to accept requests from other clients, such as those written in .NET Core 3.1.
There is some sort of state corruption in .NET 5 that prevents the full chain from being sent in every request, so the consistency issue does need to be fixed.
Thank you for taking the time to investigate this problem.
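
To make "sending the full chain" concrete from the client side, here is a sketch in Go (chosen because the repro server is Go; it does not reproduce the .NET bug). It assumes a hypothetical in-memory root, intermediate, and client certificate, and a throwaway `httptest` server that requires client certificates and trusts only the root. The same client key is used twice: once presenting the leaf alone, once presenting leaf plus intermediate.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"io"
	"log"
	"math/big"
	"net/http"
	"net/http/httptest"
	"time"
)

// issue creates a throwaway certificate; a nil parent means self-signed.
// All names and lifetimes are hypothetical test values.
func issue(cn string, serial int64, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: cn},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		BasicConstraintsValid: true,
		IsCA:                  isCA,
		KeyUsage:              x509.KeyUsageDigitalSignature,
	}
	if isCA {
		tmpl.KeyUsage |= x509.KeyUsageCertSign
	} else {
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth}
	}
	signer, signerKey := tmpl, key
	if parent != nil {
		signer, signerKey = parent, parentKey
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// request starts a test HTTPS server that trusts only clientCAs for client
// authentication and sends one GET presenting the given certificate chain.
func request(clientCAs *x509.CertPool, chain [][]byte, key *ecdsa.PrivateKey) error {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	srv.Config.ErrorLog = log.New(io.Discard, "", 0) // silence handshake errors
	srv.TLS = &tls.Config{ClientAuth: tls.RequireAndVerifyClientCert, ClientCAs: clientCAs}
	srv.StartTLS()
	defer srv.Close()

	client := srv.Client() // trusts the test server's own certificate
	cert := &tls.Certificate{Certificate: chain, PrivateKey: key}
	tlsConf := client.Transport.(*http.Transport).TLSClientConfig
	// Always present the chain exactly as given, whatever CAs the server names.
	tlsConf.GetClientCertificate = func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
		return cert, nil
	}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

// handshakeScenarios returns the request error for a leaf-only chain and for
// a chain of leaf plus intermediate.
func handshakeScenarios() (leafOnly, fullChain, setupErr error) {
	root, rootKey, err := issue("Test Root CA", 1, true, nil, nil)
	if err != nil {
		return nil, nil, err
	}
	inter, interKey, err := issue("Test Intermediate CA", 2, true, root, rootKey)
	if err != nil {
		return nil, nil, err
	}
	leaf, leafKey, err := issue("test-client", 3, false, inter, interKey)
	if err != nil {
		return nil, nil, err
	}
	roots := x509.NewCertPool()
	roots.AddCert(root)
	leafOnly = request(roots, [][]byte{leaf.Raw}, leafKey)
	fullChain = request(roots, [][]byte{leaf.Raw, inter.Raw}, leafKey)
	return
}

func main() {
	leafOnly, fullChain, err := handshakeScenarios()
	if err != nil {
		panic(err)
	}
	fmt.Println("leaf only:", leafOnly)
	fmt.Println("full chain:", fullChain)
}
```

In this sketch the server is identical in both runs; only the chain the client presents changes. That mirrors the point above: a server that accepts the full chain can still reject the same certificate when the client sends it without the intermediate.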

@wfurt
Member

wfurt commented Jan 30, 2021

I think this is a regression caused by #38364, and I have a preliminary fix. I still need to work out test cases.

karelz added the bug and regression-from-last-release labels and removed the untriaged label on Feb 1, 2021
karelz added this to the 5.0.x milestone on Feb 1, 2021
karelz added the os-linux label on Feb 1, 2021
ghost added the in-pr label on Feb 2, 2021
@SukharevAndrey
Author

SukharevAndrey commented Feb 5, 2021

@wfurt I have also reproduced the bug on macOS 11.2. All 5 requests fail with this exception:

System.AggregateException: One or more errors occurred. (The SSL connection could not be established, see inner exception.)
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: misc. bad certificate
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](TIOAdapter adapter, Boolean receiveFirst, Byte[] reAuthenticationData, Boolean isApm)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Boolean async, Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Boolean async, Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.SendAsyncCore(HttpRequestMessage request, HttpCompletionOption completionOption, Boolean async, Boolean emitTelemetryStartStop, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---

.NET Core 3.1 works fine.

ghost removed the in-pr label on Feb 8, 2021
@wfurt
Member

wfurt commented Feb 9, 2021

Reopening for servicing.

wfurt reopened this on Feb 9, 2021
ghost added the in-pr label on Feb 9, 2021
@danmoseley
Member

@SukharevAndrey +1 thanks for providing such a good actionable repro case.

ghost removed the in-pr label on Feb 10, 2021
@wfurt
Member

wfurt commented Feb 10, 2021

The fix will come out in 5.0.4.

wfurt closed this as completed on Feb 10, 2021
@karelz
Member

karelz commented Feb 11, 2021

Fixed in 6.0 (master) in PR #47729 and in 5.0.4 in PR #48042.

ghost locked this as resolved and limited conversation to collaborators on Mar 13, 2021