-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: multiple, persistent session support for pkcs11key #322
Comments
@jsha It's a bit problematic introducing a blocking backend to the synchronous Sign call. If it times out, the sign will just fail. We have been looking into supporting fully Asynchronous Signers (sketched out here), but this will require a bit more work: Generation of unique asynchronous callback tokens, a thread-safe map from these tokens to |
@jacobhaven: To be clear, this would not be a remote backend, but would be multiple threads of the same process, communicating with Go channels. I don't think there's a risk of timeouts there. There is a risk that the processing thread might have an error but fail to report that error on the response channel, but I think we can avoid that with careful programming and use of Out of curiosity, what is your use case for Asynchronous Signers? |
@jsha That seems reasonable then, as long as you can rely on your PKCS11 signer to return in a reasonable time. Asynchronous signer's would be largely useful as a primitive in communicating with external CAs (which have natively asynchronous APIs). |
Fixes cloudflare#329 Fixes cloudflare#325 Fixes cloudflare#322 Incorporates cloudflare#321 pkcs11key used to log in and log out for each Sign request. This created a performance problem, because logins are slow. It also created a concurrency problem. Logged in / logged out state is a property of a PKCS#11 application, not a session. So it was possible for two Sign operations on separate threads to log in at the same time, and for one Sign operation to finish and call Logout before the other Sign operation called SignInit. This would result in a CKR_OBJECT_HANDLE_INVALID error. To fix both of these issues, I made the session a persistent property of a PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single session throughout its lifetime, increasing signing performance. Also, some of the field names for PKCS11Key were incorrect, and there was no provision to select by token label, only slot label. In general, token label is more unique, and the better choice. This change incorporates the elements of cloudflare#321 fixing that. That is, it renames Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field TokenLabel. Note for people updating configs: In your configs, don't rename Token to TokenLabel. The value you currently have in Token is actually a SlotDescription, and should be named such. You should add a name field TokenLabel, whose value you can find with pkcs11-tool --list-slots. I also changed where the PKCS11Key Config object is defined, to make it more straightforward to share between different components, and added object label support to pkcs11uri, so private keys can be selected by label.
Fixes cloudflare#329 Fixes cloudflare#325 Fixes cloudflare#322 Incorporates cloudflare#321 pkcs11key used to log in and log out for each Sign request. This created a performance problem, because logins are slow. It also created a concurrency problem. Logged in / logged out state is a property of a PKCS#11 application, not a session. So it was possible for two Sign operations on separate threads to log in at the same time, and for one Sign operation to finish and call Logout before the other Sign operation called SignInit. This would result in a CKR_OBJECT_HANDLE_INVALID error. To fix both of these issues, I made the session a persistent property of a PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single session throughout its lifetime, increasing signing performance. Also, some of the field names for PKCS11Key were incorrect, and there was no provision to select by token label, only slot label. In general, token label is more unique, and the better choice. This change incorporates the elements of cloudflare#321 fixing that. That is, it renames Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field TokenLabel. Note for people updating configs: In your configs, don't rename Token to TokenLabel. The value you currently have in Token is actually a SlotDescription, and should be named such. You should add a name field TokenLabel, whose value you can find with pkcs11-tool --list-slots. I also changed where the PKCS11Key Config object is defined, to make it more straightforward to share between different components, and added object label support to pkcs11uri, so private keys can be selected by label.
Fixes cloudflare#329 Fixes cloudflare#325 Fixes cloudflare#322 Incorporates cloudflare#321 pkcs11key used to log in and log out for each Sign request. This created a performance problem, because logins are slow. It also created a concurrency problem. Logged in / logged out state is a property of a PKCS#11 application, not a session. So it was possible for two Sign operations on separate threads to log in at the same time, and for one Sign operation to finish and call Logout before the other Sign operation called SignInit. This would result in a CKR_OBJECT_HANDLE_INVALID error. To fix both of these issues, I made the session a persistent property of a PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single session throughout its lifetime, increasing signing performance. Also, some of the field names for PKCS11Key were incorrect, and there was no provision to select by token label, only slot label. In general, token label is more unique, and the better choice. This change incorporates the elements of cloudflare#321 fixing that. That is, it renames Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field TokenLabel. Note for people updating configs: In your configs, don't rename Token to TokenLabel. The value you currently have in Token is actually a SlotDescription, and should be named such. You should add a name field TokenLabel, whose value you can find with pkcs11-tool --list-slots. I also changed where the PKCS11Key Config object is defined, to make it more straightforward to share between different components, and added object label support to pkcs11uri, so private keys can be selected by label.
Fixes cloudflare/cfssl#329 Fixes cloudflare/cfssl#325 Fixes cloudflare/cfssl#322 Incorporates cloudflare/cfssl#321 pkcs11key used to log in and log out for each Sign request. This created a performance problem, because logins are slow. It also created a concurrency problem. Logged in / logged out state is a property of a PKCS#11 application, not a session. So it was possible for two Sign operations on separate threads to log in at the same time, and for one Sign operation to finish and call Logout before the other Sign operation called SignInit. This would result in a CKR_OBJECT_HANDLE_INVALID error. To fix both of these issues, I made the session a persistent property of a PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single session throughout its lifetime, increasing signing performance. Also, some of the field names for PKCS11Key were incorrect, and there was no provision to select by token label, only slot label. In general, token label is more unique, and the better choice. This change incorporates the elements of cloudflare/cfssl#321 fixing that. That is, it renames Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field TokenLabel. Note for people updating configs: In your configs, don't rename Token to TokenLabel. The value you currently have in Token is actually a SlotDescription, and should be named such. You should add a name field TokenLabel, whose value you can find with pkcs11-tool --list-slots. I also changed where the PKCS11Key Config object is defined, to make it more straightforward to share between different components, and added object label support to pkcs11uri, so private keys can be selected by label.
Right now, pkcs11key opens a session and logs in for every Sign operation. For best performance, it's better to keep some number of sessions alive long-term. Per discussion at letsencrypt/boulder#795, common practice is to have a thread pool with one session per thread.
I think, using channels and goroutines, we can achieve this without changing the Sign API to be asynchronous. We would have Sign send a message on a channel that is consumed by one of the goroutines holding an open session, then block on receiving a response on another channel. Possibly the
chan chan
pattern is useful for this: https://tleyden.github.io/blog/2013/11/23/understanding-chan-chans-in-go/.I'm planning to do this work along with @jmhodges for Let's Encrypt, but I wanted to file an issue here for tracking, and to make sure this sounds like a good plan to you.
The text was updated successfully, but these errors were encountered: