kcm: improve performance #1165
About the performance improvements: I used GSSAPI to acquire and establish security contexts with 200 services. With SSSD KCM this took 33 minutes as a starting point. After I addressed bottlenecks on our side, it took 2 minutes. After addressing the IO bottleneck with this patch, it took 15 seconds. For comparison, keyring took 2 seconds.
FWIW: this would be more interesting as a proof of concept if you showed it implemented for a daemon as well. Best would probably be kcmserver.py, while showing it for SSSD is probably better than nothing.
Apple has added twelve KCM opcodes in sequential order (per their latest Heimdal snapshot from opensource.apple.com). Any opcodes we add will need to start at some offset to avoid conflicts. Opcodes are unsigned 16-bit values; I suggest a base of 13000 ('M' for 'MIT' being the 13th letter).

It seems like FILE could easily support the notification path by returning a copy of the residual.
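To illustrate the opcode numbering suggested above, here is a minimal sketch. Only the base of 13000 and the unsigned 16-bit range come from the comment; the constant names, the assignment of base + 1 to a credential-list opcode, and the big-endian wire encoding are assumptions made for illustration.

```python
import struct

# Base suggested in the comment above; extension opcodes would start here.
KCM_OP_MIT_BASE = 13000
# Hypothetical assignment, for illustration only.
KCM_OP_GET_CRED_LIST = KCM_OP_MIT_BASE + 1


def encode_opcode(opcode: int) -> bytes:
    """Pack an opcode as an unsigned 16-bit value (big-endian assumed)."""
    if not 0 <= opcode <= 0xFFFF:
        raise ValueError("KCM opcodes must fit in 16 bits")
    return struct.pack(">H", opcode)
```

A base of 13000 leaves a wide gap above a small sequential range while staying well inside the 16-bit limit of 65535.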
The SSSD part is in https://github.com/pbrezina/sssd/tree/kcm.
I'd like to see the notification change split into a separate PR, since it might be difficult to follow the discussion of both changes at once.

For the iteration performance improvements: I'd like to see us follow Apple's lead and use KCM_OP_RETRIEVE for the retrieve operation. We can make the client pass the KRB5_GC_CACHED flag to prevent the Heimdal KCM from going out and getting tickets. I think that would provide a significant speedup to the 200-service scenario, with or without the iteration changes. But it's probably not enough by itself, due to scan_ccache().

It is possible that we might find that storing the entire ccache contents in memory within libkrb5 causes problems for large caches. At that point we might address the scan_ccache() problem and go back to slow one-cred-per-request iteration, or we might implement a more complicated buffered solution. This might look like:
where "token" is a continuation token, initially empty. In either of those scenarios, we'd be abandoning KCM_OP_GET_CRED_LIST. I think I'm okay with that because we could drop the client logic immediately; sssd could decide whether to drop the server-side logic or phase it out.
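As a rough illustration of how a buffered iteration driven by a continuation token could behave, here is a minimal sketch. Everything in it is hypothetical: the `FakeKcm` class, the `get_cred_page` method, the page size, and the token encoding are invented for the example and are not taken from any KCM implementation or from the proposal itself.

```python
from typing import Iterator, List, Tuple


class FakeKcm:
    """Hypothetical in-memory stand-in for a KCM daemon."""

    def __init__(self, creds: List[bytes], page_size: int) -> None:
        self.creds = list(creds)
        self.page_size = page_size
        self.requests = 0  # number of simulated IPC round trips

    def get_cred_page(self, token: bytes) -> Tuple[List[bytes], bytes]:
        """Return one page of creds and the token for the next page."""
        self.requests += 1
        start = int(token.decode()) if token else 0
        end = start + self.page_size
        # An empty token in the reply means there is nothing left.
        next_token = str(end).encode() if end < len(self.creds) else b""
        return self.creds[start:end], next_token


def iterate_creds(kcm: FakeKcm) -> Iterator[bytes]:
    """Yield every cred, starting with an empty token."""
    token = b""
    while True:
        page, token = kcm.get_cred_page(token)
        yield from page
        if not token:
            break
```

The client holds at most one page in memory at a time, and the number of round trips grows with the cache size divided by the page size rather than with the number of credentials.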
Done. Notifications PR: #1167
What is this operation supposed to do?
I'm fine with this. However, I don't know enough to fully understand your proposal. Could you please elaborate on it?
KCM_OP_RETRIEVE accepts a cache name, a 32-bit flags value, and a cred tag (the same inputs as KCM_OP_REMOVE_CRED). On success, it responds with a credential matching the tag. In essence, it matches the semantics of krb5_cc_retrieve_cred(). (Forget what I said about KRB5_GC_CACHED. Heimdal's KCM does look for that flag, but that's a botch; the flags should be from the KRB5_TC number space, not the KRB5_GC number space, and KRB5_GC_CACHED has the same value as KRB5_TC_MATCH_IS_SKEY. So we should just send the flags we got from the caller.) I'm not suggesting that anyone implement the |
Do I understand correctly that you suggest implementing KCM_OP_RETRIEVE in KCM and then using it in krb5 GSSAPI instead of iteration? And do you suggest dropping GET_CRED_LIST or keeping it?
I am suggesting implementing KCM_OP_RETRIEVE in KCM and seeing how that affects the performance of the 200-service test, with and without GET_CRED_LIST. I'm not suggesting a change to the higher-level code at this time.
Ah, I see that KCM code calls |
Establishing security context if the ticket is not yet available takes in my tests two iterations (
200 ticket tests result:
Using This tells me that we can not avoid making the iteration more performant. |
RETRIEVE should be able to perform a fast lookup within the KCM daemon, at least in principle. But perhaps that's not important. There is a seven-second performance gap between KCM+iteration and KEYRING for the 200-service scenario. If those seven seconds are spent mostly on IPC overhead not related to iteration (GET_PRINCIPAL, GET_KDC_OFFSET, etc.) then it won't matter how much retrievals are sped up (although as the value 200 increases, iteration cost should eventually dominate).

While investigating possible reasons for there being two non-retrieval iterations, I was reminded of krb5_cccol_have_content(), which stops at the first non-config cred it sees. Unfortunately, neither the old nor the new iteration design can make this operation fast for very large ccaches, because the old GET_CACHE_UUID_LIST is O(n). To eliminate this bottleneck we would have to introduce a new ccache operation and KCM opcode to check for the presence of non-config creds in a cache. (I don't know whether your test program is running into this operation; it depends on how the GSSAPI caller is written.)
7ed2916 to c92f417
I pushed some changes, primarily to fix potential memory issues.
RETRIEVE in KCM is fast, all operations are fast on their own. The problem is that you need to run iteration 5 times
A constant number of requests is fine. The iteration is indeed the problem. There might still be room to reduce the number of requests, e.g. we can probably avoid calling the same operations multiple times since they should yield the same result anyway. But this does not matter if we keep iterating over the cache by sending one request per credential.
Iteration in single process is fast. Iteration using IO is slow.
Yes, it hits the operation, see above. But since control creds are stored at the end in my scenario it iterates basically over every ticket. I think what we have with If you prefer to avoid this, I think we can move 1 (krb5_cccol_have_content), 2 (whatever iteration in scan_ccache does), 3 (retrieve) to KCM. It will however require to publish lots of internal API so we don't reimplement the logic there. The RETRIEVE already requires three functions - mcred (un)marshall and (modified) krb5_cc_retrieve_cred_match.
Thank you. Does this mean that you would like to continue to implement this |
It should stop at the first non-config credential. It's not looking for anything specific, just one real cred.
I think it's a significant improvement, and (as I said before) we can always remove the client code later if we make better improvements for large ccaches. The old iteration method is also O(n) in memory usage due to the cred UUID list, although the constant factor isn't as large.
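To make the "stop at the first non-config credential" behaviour concrete, here is a minimal sketch of an early-exit scan over a lazily fetched credential stream. The `Cred` type, the helper names, and the counting wrapper are hypothetical; the sketch also assumes that config entries are recognisable by the server realm `X-CACHECONF:`, which is how I understand MIT krb5 to mark them.

```python
from typing import Iterable, Iterator, List, NamedTuple

CONF_REALM = "X-CACHECONF:"  # assumed marker for config entries


class Cred(NamedTuple):
    server_realm: str
    server_name: str


def is_config(cred: Cred) -> bool:
    return cred.server_realm == CONF_REALM


def have_content(creds: Iterable[Cred]) -> bool:
    """True as soon as one non-config cred is seen; any() stops early."""
    return any(not is_config(c) for c in creds)


def counted(creds: Iterable[Cred], fetched: List[int]) -> Iterator[Cred]:
    """Yield creds one at a time, counting how many were fetched."""
    for cred in creds:
        fetched[0] += 1
        yield cred
```

With a lazy source, the scan only pays for the credentials in front of the first real ticket; with a source that must list the whole cache before yielding anything, the early exit saves nothing, which is the O(n) concern raised earlier in the thread.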
Thank you. Are there any changes you'd like me to make, or is this ready to be merged?
I added test coverage of the non-fallback iteration path. I think this can be merged, but perhaps @frozencemetery can have a look at my changes first.
Thanks, this looks good by me.
(I considered suggesting caching the fallback result, but I don't think that's worth doing.)
be33ff2 to f53609f
For large caches, one IPC operation per credential dominates the cost of iteration. Instead transfer the whole list of credentials to the client in one IPC operation.

Add optional support for the new opcode to the test KCM server to allow testing of the main and fallback code paths.

[ghudson@mit.edu: fixed memory leaks and potential memory errors; adjusted code style and comments; rewrote commit message; added kcmserver.py support and tests]

ticket: 8990 (new)
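The difference between the main and fallback paths described in this commit message can be sketched with a toy model that counts IPC round trips. The `FakeKcm` class, its method names, and the `KcmError` exception are hypothetical stand-ins and do not mirror the real client or kcmserver.py code.

```python
from typing import Dict, List


class KcmError(Exception):
    """Raised when the daemon rejects or does not know an opcode."""


class FakeKcm:
    """Hypothetical in-memory daemon that counts IPC round trips."""

    def __init__(self, creds: List[bytes], supports_cred_list: bool) -> None:
        self.creds: Dict[int, bytes] = dict(enumerate(creds))  # uuid -> cred
        self.supports_cred_list = supports_cred_list
        self.requests = 0

    def get_cred_list(self) -> List[bytes]:
        self.requests += 1
        if not self.supports_cred_list:
            raise KcmError("unsupported opcode")
        return list(self.creds.values())

    def get_cred_uuid_list(self) -> List[int]:
        self.requests += 1
        return list(self.creds)

    def get_cred_by_uuid(self, uuid: int) -> bytes:
        self.requests += 1
        return self.creds[uuid]


def list_creds(kcm: FakeKcm) -> List[bytes]:
    """Try the one-shot list; fall back to one request per cred."""
    try:
        return kcm.get_cred_list()
    except KcmError:
        return [kcm.get_cred_by_uuid(u) for u in kcm.get_cred_uuid_list()]
```

In this model a 200-credential cache costs 1 round trip on the main path and 202 on the fallback path (the failed attempt, the UUID list, and one request per credential).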
This is based on the following krbdev thread: https://mailman.mit.edu/pipermail/krbdev/2021-February/013424.html
kcm: use GET_CRED_LIST where available

This KCM operation returns all credentials of the selected
ccache. This change removes unnecessary IO operations that
are a bottleneck for large ccaches, mostly for GSSAPI and
applications that iterate over credentials often. The implementation
is backwards compatible: if the call is not available or fails, it
falls back to GET_CRED_UUID_LIST.

cc: add krb5_cc_notification_path
This adds a new API function, krb5_cc_notification_path. The purpose
of this call is to create a file that can be watched for events
using inotify or similar tools.

It is currently documented only in the header file and implemented only
for KCM. I will finish it once we agree on the API.

The KCM implementation introduces a new operation,
GET_CACHE_NOTIFICATION_PATH (ccache) -> (path).
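As a rough idea of how a consumer might use such a path without inotify, here is a minimal polling sketch. It assumes, purely for illustration, that the daemon updates the file's modification time when the cache changes; the PR text does not specify how changes are signalled, and the `poll_changed` helper is hypothetical.

```python
import os
from typing import Tuple


def poll_changed(path: str, last_mtime_ns: int) -> Tuple[bool, int]:
    """Return (changed, current_mtime_ns) for the watched file."""
    current = os.stat(path).st_mtime_ns
    return current != last_mtime_ns, current
```

A caller would remember the returned timestamp and pass it to the next call; an event-based watcher such as inotify would avoid the polling interval entirely.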