Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signing operations fail after a token has been removed and reinserted (Chrome/Firefox) #1822

Closed
the-kernel opened this issue Oct 9, 2019 · 33 comments
Labels

Comments

@the-kernel
Copy link

the-kernel commented Oct 9, 2019

Problem Description

After performing an operation that causes the PIV token to be unlocked (in our case a Yubikey 4), removing the token and then reinserting it causes all subsequent operations to fail with:

pkcs15-pin.c:358:sc_pkcs15_verify_pin: sc_lock() failed: -1900 (Unknown error)
...
reader-pcsc.c:546:pcsc_lock: Yubico Yubikey 4 OTP+U2F+CCID 00 00:SCardBeginTransaction failed: 0x80100011
misc.c:61:sc_to_cryptoki_error_common: libopensc return value: -1900 (Unknown error)

There seems to be an issue in the interaction between opensc and nss, this has been tested with CACKey as the middleware which appears to work fine.

It is worth noting that if you access the Firefox Security Devices page (Preferences->Security Devices) while the token is removed it updates the state, which leads me to think there is an issue with caching, and stale slots not being handled correctly (possibly because both Chromium and Firefox no longer use the NSS SECMOD_WaitForAnyTokenEvent (which in turn calls C_WaitForSlotEvent), causing the slots not to be updated when the token is removed).

NSS Bug Report

Note: "Please read about reporting bugs before opening an issue." The link is dead :(

Steps to reproduce

  1. Unlock PIV card by performing some operation (i.e. going to settings -> manage certificates)
  2. Remove card.
  3. Reinsert card.
  4. Perform another operation that requires unlocking the card i.e. an x509 login.
  5. Login fails due to signing operation failure.

Logs

opensc-debug.log

Edit:
This has been tested against opensc-0.15 and master branch build on Ubuntu 16.04 and 19.04 (with the latest versions of Chrome/Firefox)

@frankmorgner
Copy link
Member

which version of PCSC-Lite are you using?

@the-kernel
Copy link
Author

the-kernel commented Oct 14, 2019

I have tried it against pcsc-lite 1.8.14, and 1.8.24
thanks

Edit: I just did a quick test with pcsc-lite 1.8.25, and the issue still occurs.

@Jakuje
Copy link
Member

Jakuje commented Oct 15, 2019

Can you share debug log from the current master or 0.19.0 version? In your log, I see some mixing of sessions from different processes (observe the first hexadecimal value on the log lines).

@the-kernel
Copy link
Author

I have created a new log using Ubuntu19.04 and a master build of opensc, chromium / nss should be the only process using OpenSC in this one:

opensc-debug.txt

Thanks

@Jakuje
Copy link
Member

Jakuje commented Oct 15, 2019

What I find interesting is that the following parts are handled by two separate threads (notice the difference in the T: value -- correctly, the process ID is still the same:

P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] misc.c:290:session_get_operation: called
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_KEY_TYPE = CKK_RSA
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_TOKEN = TRUE
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_PRIVATE = TRUE
P:3842; T:0x140089866123008 10:15:46.456 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_MODULUS = <size inquiry>
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_MODULUS = D9B818BA3E6955258FB212311FC6DF5727D2C0E3D117E465C06AE9FBEA6C196B
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] framework-pkcs15.c:3675:pkcs15_prkey_get_attribute: pkcs15_prkey_get_attribute() called
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:263:C_GetAttributeValue: Object 140091030797552: CKA_PRIVATE = TRUE
P:3842; T:0x140089853531904 10:15:46.522 [opensc-pkcs11] pkcs11-object.c:285:C_GetAttributeValue: C_GetAttributeValue(hSession=0x7f697c2055b0, hObject=0x7f697c207cf0) = CKR_OK

One of the threads asks for CKA_TOKEN, the other for CKA_MODULUS and both of them for CKA_PRIVATE of the same object using the same session. Hopefully, there is a synchronization on the NSS side.

Then the NSS creates a new session and tries to log in, which fails already on pcsc level before even attempting to send APDU. According to the specs, the error 0x80100011 is SCARD_E_INVALID_VALUE.

This error comes from inside of SCardBeginTransaction, which calls RFReaderInfoById. This fails since the reader reference is probably no longer valid. Running pcsc in debug mode might show also some useful information.

@the-kernel
Copy link
Author

opensc-debug.txt
pcscd.log

I have included the opensc log for clarity.

@Jakuje
Copy link
Member

Jakuje commented Oct 15, 2019

From what I remember few years back when I was fiddling with the NSS, that it used to call C_WaitForSlotEvent(), but I do not see any of these calls in your log.

The pcsc log does not contain timestamps so it is hard to capture the time when the error occurs. But there are visible several failed initialization after reinsertion, which might (in the lack of synchronization) be hit by the attempt to start the transaction by opensc.

Do you use the opensc directly or through some proxy (p11-kit-proxy?)? Can you try to get pcsc logs through syslog with timestamps so we can align the events?

Some timestamps from the OpenSC log:

  • 11:41:31.194 Start of the first process
  • 11:41:31.700 Card detection (and PIV initialization) ended (probably unrelated)
  • 11:41:47.277 Second process starts (this is what we are interested in)
  • 11:41:47.609 Second card detection (and PIV initialization) ended
  • 11:41:52.020 The last APDU that worked from first thread
  • 11:42:01.836 The second thread starts,
  • 11:42:01.841 The first failure pops up from the second thread. The removal happened around this time or the new insertion is not yet recognized? The pcsc handles are no longer valid?
  • 11:42:06.328 Third thread starts, seeing the same errors
  • 11:42:07.685 Fourth thread starts, seeing the same errors

@the-kernel
Copy link
Author

Yes NSS does use the C_WaitForSlotEvent, but unfortunately both Firefox and Chromium have removed the call to SECMOD_WaitForAnyTokenEvent which in turn called WaitForSlotEvent.

We are using opensc-pk11 directly as the smartcard driver for chromium, and firefox.

I will get a syslog for pcsc + corresponding opensc log tomorrow morning.

@the-kernel
Copy link
Author

Here are some pcsc logs with timestamps (along with the opensc ones)
opensc-debug.txt
pcscd.log

Thanks!

.

@dengert
Copy link
Member

dengert commented Oct 16, 2019

PCSC_lite also has a pcsc-spy, that could be helpful.

From pcscd.log it looks like the removal of the reader /dev/bus/usb/003/004 is detected:
1495: Oct 16 13:24:05 paradox-FUTRO-S720 pcscd: ccid_usb.c:859:WriteUSB() write failed (3/4): -4 LIBUSB_ERROR_NO_DEVICE

Then on reinsertion the reader appears to be /dev/bus/usb/003/005

But then lines
Oct 16 13:24:11 paradox-FUTRO-S720 pcscd: winscard_svc.c:361:ContextThread() Received command: STATUS from client 16
Oct 16 13:24:11 paradox-FUTRO-S720 pcscd: winscard_svc.c:630:ContextThread() STATUS rv=0x80100011 for client 16

It is not clear why pcsc did not return SCARD_E_READER_UNAVAILABLE 0x80100017 for the status.
Or why OpenSC did not define SCARD_E_INVALID_VALUE 0x80100011 or handle it correctly.

@Jakuje
Copy link
Member

Jakuje commented Oct 17, 2019

It looks like exactly I said. Last APDU/transaction working:

P:4594; T:0x140033548224256 13:24:01.932 [opensc-pkcs11] card.c:525:sc_unlock: called

the appropriate event on pcsc layer, which is immediately followed by the card removal:

Oct 16 13:24:01 paradox-FUTRO-S720 pcscd: winscard_svc.c:579:ContextThread() END_TRANSACTION rv=0x0 for client 15
Oct 16 13:24:05 paradox-FUTRO-S720 pcscd: hotplug_libudev.c:649:HPEstablishUSBNotifications() USB Device removed

After few moments, we observe the failed locks:

P:4556; T:0x140192701896448 13:24:11.387 [opensc-pkcs11] reader-pcsc.c:660:pcsc_lock: Yubico YubiKey OTP+FIDO+CCID 00 00:SCardBeginTransaction returned: 0x80100011
P:4556; T:0x140192701896448 13:24:11.387 [opensc-pkcs11] reader-pcsc.c:688:pcsc_lock: Yubico YubiKey OTP+FIDO+CCID 00 00:SCardBeginTransaction failed: 0x80100011

Between that, I think the pcsc did not manage to init your yubikey:

Oct 16 13:24:07 paradox-FUTRO-S720 pcscd: readerfactory.c:376:RFAddReader() Yubico YubiKey OTP+FIDO+CCID init failed.

I think @dengert pointed in the right direction, which points to this code:

https://github.com/OpenSC/OpenSC/blob/master/src/libopensc/reader-pcsc.c#L416

It looks like I can reproduce your issue with Chromium, but simple checking the SCARD_E_INVALID_VALUE and setting SC_READER_CARD_CHANGED did not address the issue for me. I will try to have a better look tomorrow since this change in NSS probably uncovered some bug in OpenSC and will be probably affect more users.

@the-kernel
Copy link
Author

Thanks for the replies, i did have a look at reader-pcsc.c and tried the same thing, not being familiar with the library i assumed i was on the wrong track.

I will try to make some time and get some info from pcsc_spy tomorrow.

@dengert
Copy link
Member

dengert commented Oct 17, 2019

Note that line https://github.com/OpenSC/OpenSC/blob/master/src/libopensc/reader-pcsc.c#L416 only checks for one return value. It ignores any other value.

PCSC may have returned SCARD_E_READER_UNAVAILABLE 0x80100017 which is not checked for.
OpenSC does not define SCARD_E_INVALID_VALUE 0x80100011.

The pcsc_spy should show what is returned and from where it was called.

Also note that removing a Yubico removes both the reader and card at the same time. Removing the reader has more consequences then just removing the card.

@Jakuje
Copy link
Member

Jakuje commented Oct 18, 2019

After some fighting with pcsc-spy, I managed to get the pcsc-spy to log what I needed. The important parts follow. Note, that the order of the messages might be different depending on the timing and the removal can be detected either in SCardGetStatusChange or in SCardBeginTransaction.

This is the first call after the yubikey was reinsterted:

    SCardGetStatusChange
     i hContext: 0x17658BFC
     i dwTimeout: 0x00000000 (0)
     i cReaders: 1
     i szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
     i  dwCurrentState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
     i  dwEventState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
     i  Atr length: 0x00000012 (18)
     i  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
     o szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
     o  dwCurrentState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
     o  dwEventState: SCARD_STATE_CHANGED, SCARD_STATE_PRESENT (0x00000022)
     o  Atr length: 0x00000012 (18)
     o  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
     => Command successful. (SCARD_S_SUCCESS [0x00000000])  [0.000730]

After that, all following status query fail, because it is using wrong card handle.

    SCardStatus
     i hCard: 0x17A77C50
     i pcchReaderLen 0x00000000 (0)
     i pcbAtrLen 0x00000021 (33)
     o cchReaderLen 0x00000000 (0)
     o mszReaderName NULL
     o dwState 0x00000000 (0)
     o dwProtocol 0x00000000 (0)
     o bAtrLen 0x00000000 (0)
     o bAtr 
     => Invalid value given. (SCARD_E_INVALID_VALUE [0x80100011])  [0.000339]

This simply says there are no more events/changes from this reader:

    SCardGetStatusChange
     i hContext: 0x17658BFC
     i dwTimeout: 0x00000000 (0)
     i cReaders: 1
     i szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
     i  dwCurrentState: SCARD_STATE_CHANGED, SCARD_STATE_PRESENT (0x00000022)
     i  dwEventState: SCARD_STATE_CHANGED, SCARD_STATE_PRESENT (0x00000022)
     i  Atr length: 0x00000012 (18)
     i  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
     o szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
     o  dwCurrentState: SCARD_STATE_CHANGED, SCARD_STATE_PRESENT (0x00000022)
     o  dwEventState: SCARD_STATE_PRESENT (0x00000020)
     o  Atr length: 0x00000012 (18)
     o  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
     => Command timeout. (SCARD_E_TIMEOUT [0x8010000A])  [0.000661]

The transaction using wrong hCard handle:

    SCardBeginTransaction
     i hCard: 0x17A77C50
     => Invalid value given. (SCARD_E_INVALID_VALUE [0x80100011])  [0.000129]

That said, we need to check for SCARD_E_INVALID_VALUE in at least two places: In the sc_lock() to be able to recover from failed attempt to start transaction and in refresh_attributes() to detect the reader change.

So far, I came up with the following patch, which seems to make this particular use case working for me (tested with chromium):

diff --git a/src/libopensc/reader-pcsc.c b/src/libopensc/reader-pcsc.c
index 18835d37..932ebe48 100644
--- a/src/libopensc/reader-pcsc.c
+++ b/src/libopensc/reader-pcsc.c
@@ -415,7 +415,8 @@ static int refresh_attributes(sc_reader_t *reader)
 				unsigned char atr[SC_MAX_ATR_SIZE];
 				rv = priv->gpriv->SCardStatus(priv->pcsc_card, NULL,
 						&readers_len, &cstate, &prot, atr, &atr_len);
-				if (rv == (LONG)SCARD_W_REMOVED_CARD)
+				if (rv == (LONG)SCARD_W_REMOVED_CARD ||
+					rv == (LONG)SCARD_E_INVALID_VALUE)
 					reader->flags |= SC_READER_CARD_CHANGED;
 			}
 		} else {
@@ -662,6 +663,7 @@ static int pcsc_lock(sc_reader_t *reader)
 	switch (rv) {
 		case SCARD_E_INVALID_HANDLE:
 		case SCARD_E_READER_UNAVAILABLE:
+		case SCARD_E_INVALID_VALUE:
 			r = pcsc_connect(reader);
 			if (r != SC_SUCCESS) {
 				sc_log(reader->ctx, "pcsc_connect failed (%d)",

With the above fix, in my case the trace flow looks like this:

            SCardBeginTransaction
             i hCard: 0x53808361
             => Invalid value given. (SCARD_E_INVALID_VALUE [0x80100011])  [0.000091]

This detedted the invalid handle in the first place

            SCardGetStatusChange
             i hContext: 0x76BDF00A
             i dwTimeout: 0x00000000 (0)
             i cReaders: 1
             i szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
             i  dwCurrentState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
             i  dwEventState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
             i  Atr length: 0x00000012 (18)
             i  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
             o szReader: Yubico YubiKey OTP+FIDO+CCID 00 00
             o  dwCurrentState: SCARD_STATE_PRESENT, SCARD_STATE_INUSE (0x00000120)
             o  dwEventState: SCARD_STATE_CHANGED, SCARD_STATE_PRESENT (0x00000022)
             o  Atr length: 0x00000012 (18)
             o  Atr: 3B F8 13 00 00 81 31 FE 15 59 75 62 69 6B 65 79 34 D4
             => Command successful. (SCARD_S_SUCCESS [0x00000000])  [0.000130]

This informs us about SCARD_STATE_CHANGED but the reader is still called the same way and we can not know that it is reader + card changed.

            SCardStatus
             i hCard: 0x53808361
             i pcchReaderLen 0x00000000 (0)
             i pcbAtrLen 0x00000021 (33)
             o cchReaderLen 0x00000000 (0)
             o mszReaderName NULL
             o dwState 0x00000000 (0)
             o dwProtocol 0x00000000 (0)
             o bAtrLen 0x00000000 (0)
             o bAtr 
             => Invalid value given. (SCARD_E_INVALID_VALUE [0x80100011])  [0.000074]

This verifies that really the problem is in the handle, but handling of this return value was ignored.

            SCardConnect
             i hContext: 0x76BDF00A
             i szReader Yubico YubiKey OTP+FIDO+CCID 00 00
             i dwShareMode: SCARD_SHARE_SHARED (0x00000002)
             i dwPreferredProtocols: 0x00000003 (T=0, T=1)
             i phCard 0x2346DCB6C8751700 (2541961717302499072)
             i pdwActiveProtocol 0x00023DE5 (146917)
             o phCard 0x49DF3F29 (1239367465)
             o dwActiveProtocol: T=1 (0x00000002)
             => Command successful. (SCARD_S_SUCCESS [0x00000000])  [0.027391]

This reconnects the card so we can continue to the card/reader detection and so on.

I would like to hear some comments to my analysis from @LudovicRousseau whether this makes sense.

@LudovicRousseau
Copy link
Member

SCardStatus() should returns SCARD_E_INVALID_VALUE only after SCardDisconnect() or SCardReconnect() has been called.

And calling SCardStatus() after SCardDisconnect() or SCardReconnect() is a bad idea. Do you find this sequence in the pcsc-spy trace?

@Jakuje
Copy link
Member

Jakuje commented Oct 18, 2019

SCardStatus returns SCARD_E_INVALID_VALUE after the yubikey removal and reinsertion, unless the disconnect was called from some other application that would not go through the pcsc-spy and that would more closely follow the changes.
I just recreated the "offending" trace and attaching it here:

logfile.txt

@dengert
Copy link
Member

dengert commented Oct 18, 2019

After the call to

rv = priv->gpriv->SCardStatus(priv->pcsc_card, NULL,
 						&readers_len, &cstate, &prot, atr, &atr_len);

only 2 return values are checked. Should anything other the Success be treated as a error. Then processed as card error or reader error or other error.

As you point out, timing issues can cause similar issues with any SCard call, So SCardBeginTransaction return codes may need similar tests. If the reader has been removed and the "reader" reinserted there is no no guaranty it is the same token, it may be the same type of token with the same type of builtin reader but different data.
So I question the code in pcsc_lock of calling r = pcsc_connect(reader); with the same "reader".

@Jakuje
Copy link
Member

Jakuje commented Oct 18, 2019

@dengert you are right. We might be getting completely different reader so it should be probably detected from the beginning again. But that goes again through all the layers of abstraction and I am not sure if I will be able to get to figure out today what needs to be done where to clean the reader structure. But I would be glad to test proposed solutions.

In eevery case, I think this needs to be addressed before 0.20.0 release as it is affecting at least all the browsers and yubikeys.

@frankmorgner
Copy link
Member

If we don't find a fast solution, I'd like not to delay the new release with a hand full of security fixes any further. This issue has been hidden in OpenSC for ages, so far it has not been bugging anyone.

This kind of problem has occurred before 54f285d. unfortunately the error handling in reader-pcsc.c is a mess. Instead of reconnecting, try setting

reader->flags |= SC_READER_REMOVED;

This will be checked in slot.c so that the reader gets removed in PKCS#11

@Jakuje
Copy link
Member

Jakuje commented Oct 18, 2019

It was not found before, because NSS/Firefox used C_WaitForSlotEvent, which changed probably quite recently. Thank you for the pointers. I am not sure if I will have a some time to investigate and try this during weekend, but I will do latest on Monday. It would be nice if we could fix the outstanding oss-fuzz issues for that release too. There are still few left after #1830.

@Jakuje
Copy link
Member

Jakuje commented Oct 21, 2019

Having some better look after the weekend and reviewing the logs again, there is the call before it (and I think after the removal), which already detect the reader removal. These is C_GetSlotInfo(), which checks the SCardGetStatusChange. But setting the READER_REMOVED here, will just get us to the removal of the reader later, but it is never added back again (unless the calling application would call again the C_GetSlotList() and it does not, since the previous function returns CKR_OK -- I think this needs to be changed to something like CKR_SLOT_ID_INVALID.

But whatever I tried so far (changing return values of PKCS11 calls to reflect that the reader/slot/token is no longer valid), I was not able to make NSS call the C_GetSlotList again, which is the only place that would remove the readers and add them again if I read right, even thought the attempt to detect cards happens all the time from various places, the readers are not re-added.

The comments in various places say that the slot list can not change before PKCS# 11 v2.20 so I am wondering how the applications can detect the reader removal

https://github.com/OpenSC/OpenSC/blob/master/src/pkcs11/pkcs11-global.c#L453

@dengert
Copy link
Member

dengert commented Oct 21, 2019 via email

@Jakuje
Copy link
Member

Jakuje commented Oct 21, 2019

Getting back to the impact, I just checked with RHEL7 machine having NSS 3.44 and Firefox 68 (and opensc 0.19.0). It looks like it handles this workflow without any issue if I try to connect to the same port, but under the hood, it is just resumed session and I can observe very similar logs (SCardBeginTransaction failed: 0x80100011). The connection is also broken the same way as in current version if I try different port to TLS server.

Does NSS check PKCS11version number?
How do you check for PKCS11 version number?

I will certainly have some more looks into this, but I am fine to progress with the next release without this fix now. The security things should go out in time.

Jakuje added a commit to Jakuje/OpenSC that referenced this issue Oct 31, 2019
When the application (NSS) does not use WaitForSlotEvent and just
opportunistically tries to detect card and reader removals with
C_GetSlotInfo() and C_GetSessionInfo(), we might get errors in
various plcaes, in the sc_lock() function, when we try to transfer
other messages or when we ask for the reader status.

This is generally too late to call any disconnect functions because no
PC/SC handles are valid anymore. The reader state from PCSC
is searched by name so we can be pretty sure it is very similar
reader (with same name as the old one) and I hope we can reuse the
reader structure and just call the pcsc_connect() on that as we do
with invalid handles.

Otherwise we detect this issue in the refresh_attributes() (called
from C_GetSlotInfo()), where we can report the slot change in the
expected manner.

Fixes OpenSC#1822
Jakuje added a commit to Jakuje/OpenSC that referenced this issue Oct 31, 2019
When the application (NSS) does not use WaitForSlotEvent and just
opportunistically tries to detect card and reader removals with
C_GetSlotInfo() and C_GetSessionInfo(), we might get errors in
various plcaes, in the sc_lock() function, when we try to transfer
other messages or when we ask for the reader status.

This is generally too late to call any disconnect functions because no
PC/SC handles are valid anymore. The reader state from PCSC
is searched by name so we can be pretty sure it is very similar
reader (with same name as the old one) and I hope we can reuse the
reader structure and just call the pcsc_connect() on that as we do
with invalid handles.

Otherwise we detect this issue in the refresh_attributes() (called
from C_GetSlotInfo()), where we can report the slot change in the
expected manner.

Fixes OpenSC#1822
@Jakuje
Copy link
Member

Jakuje commented Oct 31, 2019

The SCardGetStatusChange() function searches for readers by name so if the reader was removed and plugged back, I think it should be fine to reuse the reader structure and just do the reconnect with a new card initialization, rather than going through the whole circus of reader removal and re-adding. The hotplug of another reader is something different.

Test that I used:

  • Plug in yubikey with my cert in, signed with local CA
  • Start gnuTLS https server: ./gnutls-http-serv -r -d 4 with --x509cafile /path/to/rootCA.crt from doc/credetials directory
  • open chromium and authenticate to https://localhost:5556
  • remove and insert yubikey
  • refresh the page in chromium (does not support session resumption so it will ask again for the token).

The patch in #1842 solves this issue for me most of the times. But unfortunately, I saw the cases when either the opensc removed the slot and reported it as empty (probably failed initalization?) or on attempt to login reported the slot as logged in (probably some leftover session). I will try to investigate a bit more, but if somebody else can test this, it would be great.

@the-kernel
Copy link
Author

So I have tried the patch #1842 and unfortunately for my test case (settings->certificates => remove token => reinsert token => settings->certificates) it fails. It removes the smartcard cert from the user certificates, but never adds them back in.

opensc-debug.txt

@frankmorgner frankmorgner reopened this Nov 1, 2019
@Jakuje
Copy link
Member

Jakuje commented Nov 1, 2019

Thank you for another debug log. This looks very similar as the ones that I saw, though I did not put too much effort into this code path. The slot change gets correctly detected in C_GetSlotInfo(), with the indication Card changed, but immediately follows the card removed, which is followed by the closing of all sessions, but not to reconnecting to the card and then detecting the card again, which returns that no token is present:

reader-pcsc.c:333:refresh_attributes: Yubico Yubikey 4 OTP+U2F+CCID 00 00 check
reader-pcsc.c:353:refresh_attributes: returning with: 0 (Success)

The reader->flags = SC_READER_CARD_PRESENT bit gets lost somewhere along the lines here between the first return from refresh_attributes() and second call to this same function, where SCardGetStatusChange just times out as there is no new change recorded. This happens probably by the pcsc_disconnect() function, where now the function SCardDisconnect() fails with our error INVALID_VALUE (the card handle is no longer valid). This is called from the card_removed() as expected and that is probably fine.

I think the card_removed() should not touch the reader->flags (through the pcsc_disconnect()) when we already checked the status of the reader and we know that there is a (possibly different) card in the reader or we should not call the pcsc_disconnect() (since the card handle pcsc_card does not work).

This is "disconnect" between what the functions refresh_attributes() and pcsc_disconnect() expect. The first expect that between the calls, nobody will change its reader->flags, while the second does exactly that. The solution might be also to reset the priv->reader_state, but I am not sure if this will help to detect the change. This is something I am not sure how to address just now. But I might come up with some idea over the weekend :)

Another issue that I am still hitting is that when I hit the error with sc_lock(), it correctly reconnects the card, resets the internal PIV states, but not the state in pkcs11 layer. The C_GetSessionInfo returns the session is CKS_RO_PUBLIC_SESSION (this checks the real state of the login with card), but the following C_Login returns CKR_USER_ALREADY_LOGGED_IN without attempting to do the actual login. The slot->login_user variable is not reset in this particular case. This is because in this code path, the slots is actually not removed and they are trying to be reused with the existing card structures, as the card would be just reset by some other application, which is probably also wrong.

43051 P:7536; T:0x140345545385728 17:37:41.244 [opensc-pkcs11] pkcs11-session.c:279:C_Login: C_Login(0xfba5a550690, 1)
43052 P:7536; T:0x140345545385728 17:37:41.244 [opensc-pkcs11] pkcs11-session.c:301:C_Login: C_Login() slot->login_user 1

In this particular case, the error is caught in GetSessionInfo pkcs11 call:

pkcs11-session.c:206:C_GetSessionInfo: C_GetSessionInfo(hSession:0xfba5a550690)
pkcs11-session.c:214:C_GetSessionInfo: C_GetSessionInfo(slot:0x0)
pkcs15-pin.c:689:sc_pkcs15_get_pin_info: called
card.c:475:sc_lock: called
reader-pcsc.c:654:pcsc_lock: called
reader-pcsc.c:663:pcsc_lock: Yubico YubiKey OTP+FIDO+CCID 00 00:SCardBeginTransaction returned: 0x80100011

which tries to recover from the error and in the end returns just OK (not logged in):

pkcs15-pin.c:717:sc_pkcs15_get_pin_info: returning with: 0 (Success)
pkcs11-session.c:232:C_GetSessionInfo: C_GetSessionInfo(0xfba5a550690) = CKR_OK

But again, here we are pretty deep and late to invalidate the session handle. The driver layer already did what it needed, but the changes were not signalized to the pkcs11 layer. where it should probably close and invalidate all the existing sessions (which resets login status). Something like the new commit in #1842.

I think we will have to answer few questions before we will decide how to solve this:

  • After the yubikey removal and reinsertion, what all should be cleaned?
  • Can we reuse the old driver + card structures?
  • Should we clean just the cache as attempted in sc_lock()?
  • Should we clear all the card internal structure and card detection should start over as different card would be inserted?
  • Are there some other options?

What pops in my mind is checking serial number of the card to decide what needs to be flushed and make the reinserting faster (the PIV initialization goes through reading of all the certificates). But that usually involves going through the whole card detection and initialization cycle so it would not save time. What I observed with some other drivers was usage of CPLC data -- they are easily accessible and they should uniquely identify card, if it is Glbal Platform-based card. Were there such attempts before? Would it make sense?

@dengert
Copy link
Member

dengert commented Nov 1, 2019

I would say if it is pulled out and "reinserted" just start over. It is the safest thing to do.
A user may have more then one Yubico with different credentials and is doing a quick swap of the tokens which would then look like a reinsertion.

Unfortunately NIST did not say anything about card serial numbers, so a PIV card is not required to have one. The PIV driver will use the CHUI if present to get a GUID or use the FSAC-N both of which are unique to derive a "serial number" for the PIV applet.

I believe Windows does something similar and uses the serial for the containerID. Changing how the serial number is derived for a card could cause problems for existing cards

@Jakuje
Copy link
Member

Jakuje commented Nov 26, 2019

Getting back to the track to this issue after some delay. There are few more commits that I mentioned in the previous comment:

master...Jakuje:yubikey-reinsertion

At this moment, I will try to adjust the code to remove and reinsert card in the existing reader structure (as it has the same name we believe it will have the same properties). This will let the opensc to go through all the card detection over, but it is the safest thing.

@Jakuje Jakuje mentioned this issue Nov 27, 2019
3 tasks
@Jakuje
Copy link
Member

Jakuje commented Nov 27, 2019

@the-kernel Could you check #1875 whether it will help you to move on a bit? I think the last patch go along the lines your debug shows and should address the issue.

Again, at least for me it worked now in Firefox with the same reproducing steps as above.

@the-kernel
Copy link
Author

the-kernel commented Nov 28, 2019

@Jakuje I have given the new patch a test, and it is definitely a big improvement

On Firefox 66.0.3 it appears to be working great! I did a number of tests using different web services + firefox's certificate viewer, all work well.

On Chromium 78.0.3904.108 unfortunately there are some issues, occasionally it did work as expected, but other times it caused the page load to hang, and in the case of the view certificates page (chrome://settings/certificates) the whole browser hung and I needed to kill it.

I have attached a log where I:

  1. Went to chrome://settings/certificates, and unlocked the yubikey
  2. Removed then reinserted the key.
  3. Went to a local citrix server that requires certificate login, clicked login
  4. Issue Occured: Instead of the pin or cert dialog the page just sat at the loading spinner.
  5. Navigated to chrome://settings/certificates
  6. Issue Occured: The whole browser becomes unresponsive, had to issue a kill command.

Looking at the log though it does appear that the module is trying to unlock the card, but for what ever reason returns success on the piv_pin_cmd without ever requesting a new pin (from the user):

P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] card.c:523:sc_unlock: called
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] iso7816.c:123:iso7816_check_sw: PIN not verified (remaining tries: 3)
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] card-piv.c:3723:piv_pin_cmd: piv_pin_cmd tries_left=3, logged_in=0
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] card-piv.c:3724:piv_pin_cmd: returning with: 0 (Success)
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] sec.c:247:sc_pin_cmd: returning with: 0 (Success)
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] card.c:523:sc_unlock: called
P:9424; T:0x140211663091456 09:07:01.850 [opensc-pkcs11] reader-pcsc.c:709:pcsc_unlock: called

Thanks for your help!
opensc-debug.txt

@Jakuje
Copy link
Member

Jakuje commented Nov 28, 2019

What you describe sounds like something that should be fixed with 4bd8cda. But the part of the log you showed is just checking for the login status.

Re-reading the commit again, it shows an obvious error there -- the lock is acquired in the start of the function, but not released and this is probably the reason why your application hangs. I also do not see explicit reset of the login_user value so I will try with that. WIP diff if you will get to that earlier than me:

diff --git a/src/pkcs11/pkcs11-session.c b/src/pkcs11/pkcs11-session.c
index 9ab0d569..c6b271b7 100644
--- a/src/pkcs11/pkcs11-session.c
+++ b/src/pkcs11/pkcs11-session.c
@@ -218,9 +218,11 @@ CK_RV C_GetSessionInfo(CK_SESSION_HANDLE hSession,	/* the session's handle */
 
 	slot = session->slot;
 	logged_out = (slot_get_logged_in_state(slot) == SC_PIN_STATE_LOGGED_OUT);
-	if (logged_out && slot->login_user > -1) {
+	if (logged_out && slot->login_user >= 0) {
+		slot->login_user = -1;
 		sc_pkcs11_close_all_sessions(session->slot->id);
-		return CKR_SESSION_HANDLE_INVALID;
+		rv = CKR_SESSION_HANDLE_INVALID;
+		goto out;
 	}
 	if (slot->login_user == CKU_SO && !logged_out) {
 		pInfo->state = CKS_RW_SO_FUNCTIONS;

@frankmorgner you are too fast with the merging :)

@the-kernel
Copy link
Author

@Jakuje So I have finally had a chance to apply your fix, seems to be working great! Thanks for all your help. I will give it a thorough test just in case, but everything appears to be working well in chromium and firefox!

@Jakuje
Copy link
Member

Jakuje commented Dec 3, 2019

Good to hear that. Thank you for the help with testing! And sorry it took too long.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants