Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use of certificates after adding to X509Store is non-deterministic #48017

Closed
wfurt opened this issue Feb 8, 2021 · 6 comments · Fixed by #48149
Closed

use of certificates after adding to X509Store is non-deterministic #48017

wfurt opened this issue Feb 8, 2021 · 6 comments · Fixed by #48149
Labels
area-System.Security untriaged New issue has not been triaged by the area owner

Comments

@wfurt
Copy link
Member

wfurt commented Feb 8, 2021

I did ran into this while working on #47729 and #46837.
related to #47680 and #47580 where users reported issues with certificate chain in SslStream.

Essentially I have private chain of certificates, I add CAs to to StoreName.CertificateAuthority and subsequent X509Chain.Build() may or may not see them. At this point I'm not sure if the issue is with the store or chain. When I add Sleep(5000) between store.Add() and chain.Build() I usually get expected result. That makes it very difficult to write reliable code. For the test I was working on I added retry loop but that is not great solution either IMHO.

This is essential for SslStream's ability to send correct chain from server or when client certificate authentication is used.

I have test in #48014 to demonstrate the behavior.

      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [STARTING]
      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [FAIL]
        Assert.Equal() Failure
        Expected: 3
        Actual:   1
        Stack Trace:
          /home/furt/github/wfurt-runtime/src/libraries/System.Security.Cryptography.X509Certificates/tests/X509StoreTests.cs(617,0): at System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd()
        Output:
          Test result after sleep is True and 3
      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [FINISHED] Time: 5.23547s

here is relevant fragment from #48014

          
            using (X509Store store = new X509Store(StoreName.CertificateAuthority, StoreLocation.CurrentUser))
            {
                // add chain certificate so we can construct chain since there is no way how to pass intermediates directly.
                store.Open(OpenFlags.ReadWrite);
                store.Add(intermediate.CloneIssuerCert());
                store.Add(root.CloneIssuerCert());
                store.Close();
            }

           using(....) 
           {
                X509Chain chain = new X509Chain();
                chain.ChainPolicy.VerificationFlags = X509VerificationFlags.AllFlags;
                chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
                chain.ChainPolicy.DisableCertificateDownloads = true;

                bool result = chain.Build(endEntity);
                int count = result ? chain.ChainElements.Count : -1;

                if (!result || count != 3)
                {
                    Thread.Sleep(5000);
                    bool result2 = chain.Build(endEntity);
                    _output.WriteLine("Test result after sleep is {0} and {1}", result2, chain.ChainElements?.Count);
                }
                CleanupCertificates(MethodBase.GetCurrentMethod().Name);

                Assert.True(result, "chain was not built");
                Assert.Equal(3, count);

one more note, that on macOS this seems to always fail. I don't know at this point if this is just timing e.g. needs more time or if the StoreName.CertificateAuthority is ignored or if it is something else. It would be great if this works consistent on all platforms.

@ghost
Copy link

ghost commented Feb 8, 2021

Tagging subscribers to this area: @bartonjs, @vcsjones, @krwq, @GrabYourPitchforks
See info in area-owners.md if you want to be subscribed.

Issue Details

I did ran into this while working on #47729 and #46837.
related to #47680 and #47580 where users reported issues with certificate chain in SslStream.

Essentially I have private chain of certificates, I add CAs to to StoreName.CertificateAuthority and subsequent X509Chain.Build() may or may not see them. At this point I'm not sure if the issue is with the store or chain. When I add Sleep(5000) between store.Add() and chain.Build() I usually get expected result. That makes it very difficult to write reliable code. For the test I was working on I added retry loop but that is not great solution either IMHO.

This is essential for SslStream's ability to send correct chain from server or when client certificate authentication is used.

I have test in #48014 to demonstrate the behavior.

      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [STARTING]
      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [FAIL]
        Assert.Equal() Failure
        Expected: 3
        Actual:   1
        Stack Trace:
          /home/furt/github/wfurt-runtime/src/libraries/System.Security.Cryptography.X509Certificates/tests/X509StoreTests.cs(617,0): at System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd()
        Output:
          Test result after sleep is True and 3
      System.Security.Cryptography.X509Certificates.Tests.X509StoreTests.CertificatesUsableAfterAdd [FINISHED] Time: 5.23547s

here is relevant fragment from #48014

          
            using (X509Store store = new X509Store(StoreName.CertificateAuthority, StoreLocation.CurrentUser))
            {
                // add chain certificate so we can construct chain since there is no way how to pass intermediates directly.
                store.Open(OpenFlags.ReadWrite);
                store.Add(intermediate.CloneIssuerCert());
                store.Add(root.CloneIssuerCert());
                store.Close();
            }

           using(....) 
           {
                X509Chain chain = new X509Chain();
                chain.ChainPolicy.VerificationFlags = X509VerificationFlags.AllFlags;
                chain.ChainPolicy.RevocationMode = X509RevocationMode.NoCheck;
                chain.ChainPolicy.DisableCertificateDownloads = true;

                bool result = chain.Build(endEntity);
                int count = result ? chain.ChainElements.Count : -1;

                if (!result || count != 3)
                {
                    Thread.Sleep(5000);
                    bool result2 = chain.Build(endEntity);
                    _output.WriteLine("Test result after sleep is {0} and {1}", result2, chain.ChainElements?.Count);
                }
                CleanupCertificates(MethodBase.GetCurrentMethod().Name);

                Assert.True(result, "chain was not built");
                Assert.Equal(3, count);

one more note, that on macOS this seems to always fail. I don't know at this point if this is just timing e.g. needs more time or if the StoreName.CertificateAuthority is ignored or if it is something else. It would be great if this works consistent on all platforms.

Author: wfurt
Assignees: -
Labels:

area-System.Security

Milestone: -

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 8, 2021
@bartonjs
Copy link
Member

bartonjs commented Feb 8, 2021

There's probably a better open issue for this somewhere, but for a quick thing:

In general, X509Store content changes very little, so we have a cache for the three stores used in chain building (My, CA, Root). If it has been less than 1 second since we last refreshed from disk, don't check. If it has been more than 1 second, but less than 30, see if the directory's LMT has changed to decide. After 30, just go ahead and reload from disk.

One proposed improvement to this is to make X509Store.Add (or Remove) know if the store is a cache-relevant store and tell the cache provider to flush the cache, but we haven't done that yet.

The general expectation is that the stores are what they should be. Tests, of course, mess with that model.

@wfurt
Copy link
Member Author

wfurt commented Feb 8, 2021

I think it is ok to have lag if the store is changed from behind. I feel explicit changes should be deterministic. In example from #47680, one can load pfx and try to use it. Adding chain to store is IMHO only one way right now how to send the client chain. With the current behavior it is not clear when it is safe to send request via HttpClient.

Another option is to add explicit API for client just like we did for the server.

cc: @stephentoub and @geoffkizer for any additional thoughts.

@bartonjs
Copy link
Member

bartonjs commented Feb 8, 2021

Another option is to add explicit API for client just like we did for the server.

I feel like there's a request for that floating around somewhere, from someone who wants to use client auth without messing with cert stores 😄.

@geoffkizer
Copy link
Contributor

I feel explicit changes should be deterministic.

Yep, I agree. I assume the proposed change from @bartonjs above (flush cache on store add/remove) would fix this, right? How hard is this to do?

Another option is to add explicit API for client just like we did for the server.

This seems like a good thing regardless. Is there an issue to track this? Is this something we would consider for 6?

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 11, 2021
@wfurt
Copy link
Member Author

wfurt commented Feb 11, 2021

After looking at the code, I start wondering what is special about the first second. We will do kernel call on every chain build ever after. It feels we could change it do check modification once in a second to save kernel calls. That would be only to check for changes behind our backs under ~/.dotnet/corefx/cryptography, right?

I put together small PR to make the Add/Remove more predictable e.g. reload explicitly.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 12, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Mar 14, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Security untriaged New issue has not been triaged by the area owner
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants