Signing operations fail after a token has been removed and reinserted (Chrome/Firefox) #1822
Comments
|
which version of PCSC-Lite are you using? |
|
I have tried it against pcsc-lite 1.8.14 and 1.8.24. Edit: I just did a quick test with pcsc-lite 1.8.25, and the issue still occurs. |
|
Can you share debug log from the current master or 0.19.0 version? In your log, I see some mixing of sessions from different processes (observe the first hexadecimal value on the log lines). |
|
I have created a new log using Ubuntu 19.04 and a master build of OpenSC; Chromium/NSS should be the only process using OpenSC in this one: Thanks |
|
What I find interesting is that the following parts are handled by two separate threads (notice the difference in the One of the threads asks for Then the NSS creates a new session and tries to log in, which fails already on pcsc level before even attempting to send APDU. According to the specs, the error This error comes from inside of |
|
I have included the opensc log for clarity. |
|
From what I remember from fiddling with NSS a few years back, it used to call The pcsc log does not contain timestamps, so it is hard to pinpoint when the error occurs. But several failed initializations are visible after reinsertion, which might (in the absence of synchronization) be hit by OpenSC's attempt to start the transaction. Do you use OpenSC directly or through some proxy (p11-kit-proxy?)? Can you try to get pcsc logs through syslog with timestamps so we can align the events? Some timestamps from the OpenSC log:
|
|
Yes, NSS does use C_WaitForSlotEvent, but unfortunately both Firefox and Chromium have removed the call to SECMOD_WaitForAnyTokenEvent, which in turn called C_WaitForSlotEvent. We are using opensc-pkcs11 directly as the smartcard driver for Chromium and Firefox. I will get a syslog for pcsc plus the corresponding OpenSC log tomorrow morning. |
|
Here are some pcsc logs with timestamps (along with the OpenSC ones). Thanks! |
|
pcsc-lite also has pcsc-spy, which could be helpful. From pcscd.log it looks like the removal of the reader /dev/bus/usb/003/004 is detected: Then on reinsertion the reader appears to be /dev/bus/usb/003/005 But then lines It is not clear why pcsc did not return SCARD_E_READER_UNAVAILABLE (0x80100017) for the status. |
|
It looks exactly like I said. Last APDU/transaction working: the appropriate event on pcsc layer, which is immediately followed by the card removal: After a few moments, we observe the failed locks: Between that, I think pcsc did not manage to initialize your YubiKey: I think @dengert pointed in the right direction, which points to this code: https://github.com/OpenSC/OpenSC/blob/master/src/libopensc/reader-pcsc.c#L416 It looks like I can reproduce your issue with Chromium, but simply checking for SCARD_E_INVALID_VALUE and setting SC_READER_CARD_CHANGED did not address the issue for me. I will try to have a better look tomorrow, since this change in NSS probably uncovered a bug in OpenSC that will probably affect more users. |
|
Thanks for the replies. I did have a look at reader-pcsc.c and tried the same thing; not being familiar with the library, I assumed I was on the wrong track. I will try to make some time and get some info from pcsc-spy tomorrow. |
|
Note that line https://github.com/OpenSC/OpenSC/blob/master/src/libopensc/reader-pcsc.c#L416 checks for only one return value and ignores any other. PCSC may have returned SCARD_E_READER_UNAVAILABLE (0x80100017), which is not checked for. pcsc-spy should show what is returned and from where it was called. Also note that removing a Yubico token removes both the reader and the card at the same time. Removing the reader has more consequences than just removing the card. |
|
After some fighting with pcsc-spy, I managed to get it to log what I needed. The important parts follow. Note that the order of the messages might differ depending on the timing, and the removal can be detected either in This is the first call after the YubiKey was reinserted: After that, all following status queries fail, because they use the wrong card handle. This simply says there are no more events/changes from this reader: The transaction using the wrong hCard handle: That said, we need to check for So far, I came up with the following patch, which seems to make this particular use case work for me (tested with Chromium): With the above fix, in my case the trace flow looks like this: This detected the invalid handle in the first place This informs us about This verifies that the problem really is in the handle, but handling of this return value was ignored. This reconnects the card so we can continue to the card/reader detection and so on. I would like to hear some comments on my analysis from @LudovicRousseau, whether this makes sense. |
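A minimal sketch of the reconnect idea described in this comment: when a status query fails because the card handle went stale, reconnect once, flag the card as changed, and retry. The PC/SC layer is simulated here so the snippet compiles on its own; `ex_reader`, `ex_status`, `ex_connect`, and `ex_refresh` are hypothetical names, and treating both `SCARD_E_INVALID_HANDLE` and `SCARD_E_INVALID_VALUE` as "stale handle" is an assumption based on this thread. This is not the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PC/SC return codes, redefined with an EX_ prefix so the sketch
 * does not need the pcsc-lite headers. */
#define EX_SCARD_S_SUCCESS        0x00000000u
#define EX_SCARD_E_INVALID_HANDLE 0x80100003u
#define EX_SCARD_E_INVALID_VALUE  0x80100011u

/* Hypothetical stand-in for a reader's private PC/SC state. */
struct ex_reader {
    uint32_t handle;      /* card handle the library currently holds */
    uint32_t live_handle; /* handle the simulated PC/SC daemon accepts */
    int reconnects;       /* how many times we had to reconnect */
};

/* Assumption: both codes indicate that the handle is no longer usable. */
static int ex_is_stale(uint32_t rv)
{
    return rv == EX_SCARD_E_INVALID_HANDLE || rv == EX_SCARD_E_INVALID_VALUE;
}

/* Simulated status query: fails when called with a stale handle. */
static uint32_t ex_status(const struct ex_reader *r)
{
    return r->handle == r->live_handle ? EX_SCARD_S_SUCCESS
                                       : EX_SCARD_E_INVALID_HANDLE;
}

/* Simulated connect: obtains a fresh handle for the reinserted token. */
static uint32_t ex_connect(struct ex_reader *r)
{
    r->handle = r->live_handle;
    r->reconnects++;
    return EX_SCARD_S_SUCCESS;
}

/* Query the status; on a stale handle, reconnect once and retry.
 * *card_changed is set to 1 when a reconnect happened, so the caller knows
 * it has to rerun card detection instead of trusting cached state. */
uint32_t ex_refresh(struct ex_reader *r, int *card_changed)
{
    assert(r != NULL && card_changed != NULL);
    *card_changed = 0;

    uint32_t rv = ex_status(r);
    if (!ex_is_stale(rv))
        return rv;

    rv = ex_connect(r);
    if (rv != EX_SCARD_S_SUCCESS)
        return rv;

    *card_changed = 1;
    return ex_status(r);
}
```

The retry is deliberately limited to one attempt, so a reader that stays unavailable surfaces as an error rather than a loop.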
|
And calling |
|
SCardStatus returns SCARD_E_INVALID_VALUE after the yubikey removal and reinsertion, unless the disconnect was called from some other application that would not go through the pcsc-spy and that would more closely follow the changes. |
|
After the call to only 2 return values are checked. Should anything other than success be treated as an error, and then processed as a card error, a reader error, or some other error? As you point out, timing issues can cause similar problems with any SCard call, so SCardBeginTransaction return codes may need similar tests. If the reader has been removed and the "reader" reinserted, there is no guarantee it is the same token; it may be the same type of token with the same type of built-in reader but different data. |
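The suggestion in this comment (treat anything other than success as an error, then sort it into card, reader, or other) could be sketched roughly as follows. The numeric values are the standard PC/SC codes from pcsc-lite's `pcsclite.h`, redefined under a hypothetical `EX_` prefix so the snippet compiles on its own. The `ex_failure` categories, and the grouping of `SCARD_E_INVALID_VALUE` with stale-handle errors, are assumptions drawn from this thread and not OpenSC's actual error mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror pcsc-lite's pcsclite.h; EX_ names are local to this sketch. */
#define EX_SCARD_S_SUCCESS            0x00000000u
#define EX_SCARD_E_INVALID_HANDLE     0x80100003u
#define EX_SCARD_E_UNKNOWN_READER     0x80100009u
#define EX_SCARD_E_NO_SMARTCARD       0x8010000Cu
#define EX_SCARD_E_INVALID_VALUE      0x80100011u
#define EX_SCARD_E_READER_UNAVAILABLE 0x80100017u
#define EX_SCARD_W_RESET_CARD         0x80100068u
#define EX_SCARD_W_REMOVED_CARD       0x80100069u

/* Hypothetical failure categories for deciding how much state to discard. */
enum ex_failure {
    EX_FAIL_NONE,   /* call succeeded */
    EX_FAIL_CARD,   /* card removed or reset: redo card detection */
    EX_FAIL_READER, /* reader gone: the slot itself is stale */
    EX_FAIL_HANDLE, /* handle stale: reconnect before doing anything else */
    EX_FAIL_OTHER   /* anything else is still an error, never success */
};

enum ex_failure ex_classify(uint32_t rv)
{
    switch (rv) {
    case EX_SCARD_S_SUCCESS:
        return EX_FAIL_NONE;
    case EX_SCARD_E_NO_SMARTCARD:
    case EX_SCARD_W_REMOVED_CARD:
    case EX_SCARD_W_RESET_CARD:
        return EX_FAIL_CARD;
    case EX_SCARD_E_READER_UNAVAILABLE:
    case EX_SCARD_E_UNKNOWN_READER:
        return EX_FAIL_READER;
    case EX_SCARD_E_INVALID_HANDLE:
    case EX_SCARD_E_INVALID_VALUE:
        return EX_FAIL_HANDLE;
    default:
        return EX_FAIL_OTHER;
    }
}

/* Short label for log messages. */
const char *ex_failure_name(enum ex_failure f)
{
    switch (f) {
    case EX_FAIL_NONE:   return "none";
    case EX_FAIL_CARD:   return "card";
    case EX_FAIL_READER: return "reader";
    case EX_FAIL_HANDLE: return "handle";
    case EX_FAIL_OTHER:  return "other";
    }
    assert(!"unknown ex_failure value");
    return "unknown";
}
```

The point of the `default` branch is that an unrecognized return code is never silently treated as success, which is the gap this comment describes.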
|
@dengert you are right. We might be getting a completely different reader, so it should probably be detected from the beginning again. But that again goes through all the layers of abstraction, and I am not sure I will be able to figure out today what needs to be done where to clean the reader structure. But I would be glad to test proposed solutions. In any case, I think this needs to be addressed before the 0.20.0 release, as it affects at least all the browsers and YubiKeys. |
|
If we don't find a fast solution, I'd rather not delay the new release, with its handful of security fixes, any further. This issue has been hidden in OpenSC for ages; so far it has not been bugging anyone. This kind of problem has occurred before 54f285d. Unfortunately, the error handling in reader-pcsc.c is a mess. Instead of reconnecting, try setting This will be checked in slot.c so that the reader gets removed in PKCS#11 |
|
It was not found before, because NSS/Firefox used |
|
Having had a better look after the weekend and reviewing the logs again, there is a call before it (and I think after the removal) that already detects the reader removal. This is C_GetSlotInfo(), which checks SCardGetStatusChange. But setting READER_REMOVED here will just get us to the removal of the reader later, and it is never added back again unless the calling application calls C_GetSlotList() again, which it does not, since the previous function returns CKR_OK. I think this needs to be changed to something like CKR_SLOT_ID_INVALID. But whatever I tried so far (changing return values of PKCS#11 calls to reflect that the reader/slot/token is no longer valid), I was not able to make NSS call C_GetSlotList again, which is the only place that would remove the readers and add them again, if I read it right. Even though the attempt to detect cards happens all the time from various places, the readers are not re-added. The comments in various places say that the slot list cannot change before PKCS#11 v2.20, so I am wondering how the applications can detect the reader removal https://github.com/OpenSC/OpenSC/blob/master/src/pkcs11/pkcs11-global.c#L453 |
|
Does NSS check the PKCS#11 version number?
How does p11-kit handle version numbers?
Is this a hotplug issue, i.e. can it detect whether the reader is the same as before and uses the same USB numbers and the same reader slot?
|
|
Getting back to the impact, I just checked on a RHEL 7 machine with NSS 3.44 and Firefox 68 (and OpenSC 0.19.0). It looks like it handles this workflow without any issue if I try to connect to the same port, but under the hood it is just a resumed session and I can observe very similar logs (
I will certainly look into this some more, but I am fine with proceeding with the next release without this fix for now. The security fixes should go out in time. |
When the application (NSS) does not use C_WaitForSlotEvent and just opportunistically tries to detect card and reader removals with C_GetSlotInfo() and C_GetSessionInfo(), we might get errors in various places: in the sc_lock() function, when we try to transfer other messages, or when we ask for the reader status. This is generally too late to call any disconnect functions because no PC/SC handles are valid anymore. The reader state from PCSC is looked up by name, so we can be pretty sure it is a very similar reader (with the same name as the old one), and I hope we can reuse the reader structure and just call pcsc_connect() on it, as we do with invalid handles. Otherwise we detect this issue in refresh_attributes() (called from C_GetSlotInfo()), where we can report the slot change in the expected manner. Fixes OpenSC#1822
|
The Test that I used:
The patch in #1842 solves this issue for me most of the time. Unfortunately, I also saw cases where OpenSC either removed the slot and reported it as empty (probably a failed initialization?) or, on an attempt to log in, reported the slot as already logged in (probably some leftover session). I will try to investigate a bit more, but if somebody else can test this, it would be great. |
|
So I have tried the patch #1842 and unfortunately for my test case (settings->certificates => remove token => reinsert token => settings->certificates) it fails. It removes the smartcard cert from the user certificates, but never adds them back in. |
|
Thank you for another debug log. This looks very similar to the ones that I saw, though I did not put too much effort into this code path. The slot change gets correctly detected in The I think the This is "disconnect" between what the functions Another issue that I am still hitting is that when I hit the error with In this particular case, the error is caught in which tries to recover from the error and in the end returns just OK (not logged in): But again, here we are pretty deep and late to invalidate the session handle. The driver layer already did what it needed, but the changes were not signaled to the pkcs11 layer. where it should probably close and invalidate all the existing sessions (which resets login status). Something like the new commit in #1842. I think we will have to answer a few questions before we decide how to solve this:
What comes to my mind is checking the serial number of the card to decide what needs to be flushed and to make reinsertion faster (the PIV initialization goes through reading all the certificates). But that usually involves going through the whole card detection and initialization cycle, so it would not save time. What I observed with some other drivers was the use of CPLC data: it is easily accessible and should uniquely identify the card if it is a GlobalPlatform-based card. Were there such attempts before? Would it make sense? |
|
I would say if it is pulled out and "reinserted", just start over. It is the safest thing to do. Unfortunately, NIST did not say anything about card serial numbers, so a PIV card is not required to have one. The PIV driver will use the CHUID, if present, to get a GUID, or use the FASC-N, both of which are unique, to derive a "serial number" for the PIV applet. I believe Windows does something similar and uses the serial for the containerID. Changing how the serial number is derived for a card could cause problems for existing cards. |
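To illustrate why "just start over" is the safe default, here is a small sketch of an identity check that only reports "same card" when both identifiers are present and byte-for-byte equal. `ex_card_id` and `ex_same_card` are hypothetical; how the identifier would actually be obtained (CHUID GUID, FASC-N, CPLC data) is outside the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_ID_MAX 32

/* Hypothetical card identifier, e.g. derived from a GUID or serial number. */
struct ex_card_id {
    unsigned char bytes[EX_ID_MAX];
    size_t len; /* 0 means the card offered no usable identifier */
};

/* Returns 1 only when both identifiers are present and identical.
 * Any doubt (missing, oversized, or differing identifier) returns 0,
 * meaning the caller should redo the whole detection from scratch. */
int ex_same_card(const struct ex_card_id *before,
                 const struct ex_card_id *after)
{
    assert(before != NULL && after != NULL);

    if (before->len == 0 || after->len == 0)
        return 0;
    if (before->len > EX_ID_MAX || after->len > EX_ID_MAX)
        return 0;
    if (before->len != after->len)
        return 0;
    return memcmp(before->bytes, after->bytes, before->len) == 0;
}
```

Even when the identifiers match, the token lost power while it was unplugged, so any login state held by existing sessions would still have to be reset.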
|
Getting back on track with this issue after some delay. There are a few more commits that I mentioned in the previous comment: master...Jakuje:yubikey-reinsertion At this moment, I will try to adjust the code to remove and reinsert the card in the existing reader structure (as it has the same name, we believe it will have the same properties). This will make OpenSC go through the whole card detection again, but it is the safest thing. |
|
@the-kernel Could you check whether #1875 helps you move on a bit? I think the last patch goes along the lines your debug log shows and should address the issue. Again, at least for me it now worked in Firefox with the same reproducing steps as above. |
|
@Jakuje I have given the new patch a test, and it is definitely a big improvement. On Firefox 66.0.3 it appears to be working great! I did a number of tests using different web services plus Firefox's certificate viewer, and all work well. On Chromium 78.0.3904.108, unfortunately, there are some issues: occasionally it did work as expected, but other times it caused the page load to hang, and in the case of the view certificates page (chrome://settings/certificates) the whole browser hung and I needed to kill it. I have attached a log where I:
Looking at the log, though, it does appear that the module is trying to unlock the card, but for whatever reason it returns success on piv_pin_cmd without ever requesting a new PIN (from the user): P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] card.c:523:sc_unlock: called Thanks for your help! |
|
What you describe sounds like something that should be fixed with 4bd8cda. But the part of the log you showed is just checking for the login status. Re-reading the commit, there is an obvious error: the lock is acquired at the start of the function but not released, and this is probably the reason why your application hangs. I also do not see explicit reset of the @frankmorgner you are too fast with the merging :) |
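The kind of bug described here (a lock taken at the start of a function and not released on some path) is commonly avoided in C with a single exit label. A minimal sketch with a simulated lock follows; `ex_card`, `ex_lock`, `ex_unlock`, and `ex_login_status` are hypothetical names and do not reproduce the code in 4bd8cda.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical card state with a counting lock. */
struct ex_card {
    int lock_depth; /* how many times the card is currently locked */
    int logged_in;  /* cached login state */
    int present;    /* 0 once the token has been removed */
};

static int ex_lock(struct ex_card *c)
{
    c->lock_depth++;
    return 0;
}

static void ex_unlock(struct ex_card *c)
{
    assert(c->lock_depth > 0);
    c->lock_depth--;
}

/* Returns 1 if logged in, 0 if not, -1 on error.
 * Every path after a successful ex_lock() leaves through `out`,
 * so the lock cannot leak on an early error return. */
int ex_login_status(struct ex_card *c)
{
    int rv;

    assert(c != NULL);
    if (ex_lock(c) != 0)
        return -1;

    if (!c->present) {
        c->logged_in = 0; /* a removed token cannot stay logged in */
        rv = -1;
        goto out;
    }
    rv = c->logged_in ? 1 : 0;

out:
    ex_unlock(c);
    return rv;
}
```

With a leaked lock, the next caller that needs the card would block indefinitely, which would match a hang like the one reported for Chromium.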
|
@Jakuje So I have finally had a chance to apply your fix, seems to be working great! Thanks for all your help. I will give it a thorough test just in case, but everything appears to be working well in chromium and firefox! |
|
Good to hear that. Thank you for the help with testing! And sorry it took too long. |
Problem Description
After performing an operation that causes the PIV token to be unlocked (in our case a Yubikey 4), removing the token and then reinserting it causes all subsequent operations to fail with:
There seems to be an issue in the interaction between OpenSC and NSS; the same scenario has been tested with CACKey as the middleware, which appears to work fine.
It is worth noting that if you access the Firefox Security Devices page (Preferences->Security Devices) while the token is removed, it updates the state. This leads me to think there is an issue with caching and stale slots not being handled correctly, possibly because both Chromium and Firefox no longer use the NSS SECMOD_WaitForAnyTokenEvent (which in turn calls C_WaitForSlotEvent), so the slots are not updated when the token is removed.
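As a rough illustration of the stale-slot problem: an application that polls instead of waiting for slot events needs some signal telling it to enumerate the slots again. The sketch below simulates that with a fake module; `ex_module`, `ex_slot_cache`, and the `ex_fake_*` functions are hypothetical stand-ins for C_GetSlotInfo and C_GetSlotList, the slot IDs are made up, and this is not how NSS or OpenSC behave today. Only the two CKR values are the standard PKCS#11 codes.

```c
#include <assert.h>
#include <stddef.h>

#define EX_CKR_OK              0x00000000UL
#define EX_CKR_SLOT_ID_INVALID 0x00000003UL
#define EX_MAX_SLOTS 8

/* Hypothetical module interface standing in for two PKCS#11 calls. */
struct ex_module {
    unsigned long (*get_slot_info)(unsigned long slot_id);
    size_t (*get_slot_list)(unsigned long *ids, size_t max);
};

/* Slot list as cached by the application. */
struct ex_slot_cache {
    unsigned long ids[EX_MAX_SLOTS];
    size_t count;
    unsigned refreshes;
};

/* Fake module: the token was reinserted, so the old slot ID 1 is stale
 * and the token now lives in slot ID 2. */
unsigned long ex_fake_slot_info(unsigned long id)
{
    return id == 2 ? EX_CKR_OK : EX_CKR_SLOT_ID_INVALID;
}

size_t ex_fake_slot_list(unsigned long *ids, size_t max)
{
    if (max < 1)
        return 0;
    ids[0] = 2;
    return 1;
}

/* Poll one cached slot. If the module reports anything but CKR_OK,
 * drop the cached list and enumerate again. Returns 1 when the list
 * was refreshed, 0 when the cached slot is still valid. */
int ex_poll_slot(struct ex_slot_cache *c, const struct ex_module *m,
                 size_t index)
{
    assert(c != NULL && m != NULL && index < c->count);

    if (m->get_slot_info(c->ids[index]) == EX_CKR_OK)
        return 0;

    c->count = m->get_slot_list(c->ids, EX_MAX_SLOTS);
    c->refreshes++;
    return 1;
}
```

If the slot info call keeps returning success for a removed reader, a polling application has no trigger for the refresh branch, and the cached list stays stale.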
NSS Bug Report
Note: "Please read about reporting bugs before opening an issue." The link is dead :(
Steps to reproduce
Logs
opensc-debug.log
Edit:
This has been tested against opensc-0.15 and a master branch build on Ubuntu 16.04 and 19.04 (with the latest versions of Chrome/Firefox).