Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: multiple, persistent session support for pkcs11key #322

Closed
jsha opened this issue Sep 15, 2015 · 3 comments · Fixed by #330
Closed

Performance: multiple, persistent session support for pkcs11key #322

jsha opened this issue Sep 15, 2015 · 3 comments · Fixed by #330

Comments

@jsha
Copy link
Contributor

jsha commented Sep 15, 2015

Right now, pkcs11key opens a session and logs in for every Sign operation. For best performance, it's better to keep some number of sessions alive long-term. Per discussion at letsencrypt/boulder#795, common practice is to have a thread pool with one session per thread.

I think, using channels and goroutines, we can achieve this without changing the Sign API to be asynchronous. We would have Sign send a message on a channel that is consumed by one of the goroutines holding an open session, then block on receiving a response on another channel. Possibly the chan chan pattern is useful for this: https://tleyden.github.io/blog/2013/11/23/understanding-chan-chans-in-go/.

I'm planning to do this work along with @jmhodges for Let's Encrypt, but I wanted to file an issue here for tracking, and to make sure this sounds like a good plan to you.

@0xhaven
Copy link
Contributor

0xhaven commented Sep 15, 2015

@jsha It's a bit problematic introducing a blocking backend to the synchronous Sign call. If it times out, the sign will just fail.

We have been looking into supporting fully Asynchronous Signers (sketched out here), but this will require a bit more work: Generation of unique asynchronous callback tokens, a thread-safe map from these tokens to RetryError, and an API that returns these tokens and another that accepts them to retry the sign operation (calling RetryError#Retry())

@jsha
Copy link
Contributor Author

jsha commented Sep 15, 2015

@jacobhaven: To be clear, this would not be a remote backend, but would be multiple threads of the same process, communicating with Go channels. I don't think there's a risk of timeouts there. There is a risk that the processing thread might have an error but fail to report that error on the response channel, but I think we can avoid that with careful programming and use of defer.

Out of curiosity, what is your use case for Asynchronous Signers?

@0xhaven
Copy link
Contributor

0xhaven commented Sep 15, 2015

@jsha That seems reasonable then, as long as you can rely on your PKCS11 signer to return in a reasonable time.

Asynchronous signer's would be largely useful as a primitive in communicating with external CAs (which have natively asynchronous APIs).

jsha added a commit to jsha/cfssl that referenced this issue Sep 17, 2015
Fixes cloudflare#329
Fixes cloudflare#325
Fixes cloudflare#322
Incorporates cloudflare#321

pkcs11key used to log in and log out for each Sign request. This created a
performance problem, because logins are slow. It also created a concurrency
problem.

Logged in / logged out state is a property of a PKCS#11 application, not a
session. So it was possible for two Sign operations on separate threads to log
in at the same time, and for one Sign operation to finish and call Logout before
the other Sign operation called SignInit. This would result in a
CKR_OBJECT_HANDLE_INVALID error.

To fix both of these issues, I made the session a persistent property of a
PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single
session throughout its lifetime, increasing signing performance.

Also, some of the field names for PKCS11Key were incorrect, and there was no
provision to select by token label, only slot label. In general, token label is
more unique, and the better choice. This change incorporates the elements of
cloudflare#321 fixing that. That is, it renames
Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field
TokenLabel. Note for people updating configs: In your configs, don't rename
Token to TokenLabel. The value you currently have in Token is actually a
SlotDescription, and should be named such. You should add a name field
TokenLabel, whose value you can find with pkcs11-tool --list-slots.

I also changed where the PKCS11Key Config object is defined, to make it more
straightforward to share between different components, and added object label
support to pkcs11uri, so private keys can be selected by label.
jsha added a commit to jsha/cfssl that referenced this issue Sep 17, 2015
Fixes cloudflare#329
Fixes cloudflare#325
Fixes cloudflare#322
Incorporates cloudflare#321

pkcs11key used to log in and log out for each Sign request. This created a
performance problem, because logins are slow. It also created a concurrency
problem.

Logged in / logged out state is a property of a PKCS#11 application, not a
session. So it was possible for two Sign operations on separate threads to log
in at the same time, and for one Sign operation to finish and call Logout before
the other Sign operation called SignInit. This would result in a
CKR_OBJECT_HANDLE_INVALID error.

To fix both of these issues, I made the session a persistent property of a
PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single
session throughout its lifetime, increasing signing performance.

Also, some of the field names for PKCS11Key were incorrect, and there was no
provision to select by token label, only slot label. In general, token label is
more unique, and the better choice. This change incorporates the elements of
cloudflare#321 fixing that. That is, it renames
Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field
TokenLabel. Note for people updating configs: In your configs, don't rename
Token to TokenLabel. The value you currently have in Token is actually a
SlotDescription, and should be named such. You should add a name field
TokenLabel, whose value you can find with pkcs11-tool --list-slots.

I also changed where the PKCS11Key Config object is defined, to make it more
straightforward to share between different components, and added object label
support to pkcs11uri, so private keys can be selected by label.
jsha added a commit to jsha/cfssl that referenced this issue Sep 17, 2015
Fixes cloudflare#329
Fixes cloudflare#325
Fixes cloudflare#322
Incorporates cloudflare#321

pkcs11key used to log in and log out for each Sign request. This created a
performance problem, because logins are slow. It also created a concurrency
problem.

Logged in / logged out state is a property of a PKCS#11 application, not a
session. So it was possible for two Sign operations on separate threads to log
in at the same time, and for one Sign operation to finish and call Logout before
the other Sign operation called SignInit. This would result in a
CKR_OBJECT_HANDLE_INVALID error.

To fix both of these issues, I made the session a persistent property of a
PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single
session throughout its lifetime, increasing signing performance.

Also, some of the field names for PKCS11Key were incorrect, and there was no
provision to select by token label, only slot label. In general, token label is
more unique, and the better choice. This change incorporates the elements of
cloudflare#321 fixing that. That is, it renames
Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field
TokenLabel. Note for people updating configs: In your configs, don't rename
Token to TokenLabel. The value you currently have in Token is actually a
SlotDescription, and should be named such. You should add a name field
TokenLabel, whose value you can find with pkcs11-tool --list-slots.

I also changed where the PKCS11Key Config object is defined, to make it more
straightforward to share between different components, and added object label
support to pkcs11uri, so private keys can be selected by label.
@jsha jsha mentioned this issue Sep 17, 2015
jsha added a commit to letsencrypt/pkcs11key that referenced this issue Feb 22, 2016
Fixes cloudflare/cfssl#329
Fixes cloudflare/cfssl#325
Fixes cloudflare/cfssl#322
Incorporates cloudflare/cfssl#321

pkcs11key used to log in and log out for each Sign request. This created a
performance problem, because logins are slow. It also created a concurrency
problem.

Logged in / logged out state is a property of a PKCS#11 application, not a
session. So it was possible for two Sign operations on separate threads to log
in at the same time, and for one Sign operation to finish and call Logout before
the other Sign operation called SignInit. This would result in a
CKR_OBJECT_HANDLE_INVALID error.

To fix both of these issues, I made the session a persistent property of a
PKCS11Key, and protected it with a mutex. Now each PKCS11Key maintains a single
session throughout its lifetime, increasing signing performance.

Also, some of the field names for PKCS11Key were incorrect, and there was no
provision to select by token label, only slot label. In general, token label is
more unique, and the better choice. This change incorporates the elements of
cloudflare/cfssl#321 fixing that. That is, it renames
Token to SlotDescription, Label to PrivateKeyLabel, and adds a new field
TokenLabel. Note for people updating configs: In your configs, don't rename
Token to TokenLabel. The value you currently have in Token is actually a
SlotDescription, and should be named such. You should add a name field
TokenLabel, whose value you can find with pkcs11-tool --list-slots.

I also changed where the PKCS11Key Config object is defined, to make it more
straightforward to share between different components, and added object label
support to pkcs11uri, so private keys can be selected by label.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants