Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sssd-kcm cannot handle big tickets #4413

Closed
sssd-bot opened this issue May 2, 2020 · 0 comments
Closed

sssd-kcm cannot handle big tickets #4413

sssd-bot opened this issue May 2, 2020 · 0 comments
Assignees
Labels
Bugzilla Closed: Fixed Issue was closed as fixed. KCM

Comments

@sssd-bot
Copy link

sssd-bot commented May 2, 2020

Cloned from Pagure issue: https://pagure.io/SSSD/sssd/issue/3386


If there is a ticket stored in kcm with many service principals then it can easily reach hardcoded limit 2KiB.

a) the hardcoded limit does not make sense because It is configurable in sssd-secrets (man sssd-secrets -> max_payload_size)
b) limit is too low and therefore unusable in real environment. (it's 8 times lower then default limit in sssd-secrets

sh$ klist 
klist: Credentials cache I/O operation failed
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [sec_get_send] (0x2000): Retrieving ccache bd874c90-32c6-4e47-8b5a-d0935800d706-1000:80312
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_request_send] (0x0400): Sending TCURL request for http://localhost/kcm/persistent/1000/ccache/bd874c90-32c6-4e47-8b5a-d0935800d706-1000:80312, at socket /var/run/secrets.socket
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [schedule_fd_processing] (0x2000): timeout_ms: 0
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [handle_socket] (0x2000): Activity on curl socket 13 socket data (nil)
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [schedule_fd_processing] (0x2000): timeout_ms: 0
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [check_curl_timeouts] (0x4000): Still tracking 1 outstanding requests
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [schedule_fd_processing] (0x2000): timeout_ms: 5000
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [check_curl_timeouts] (0x4000): Still tracking 1 outstanding requests
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_write_data] (0x2000): ---> begin libcurl data
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_write_data] (0x2000): {
    "version": 1,
    "kdc_offset": 0,
    "principal": {
        "type": 0,
        "realm": "REDHAT.COM",
        "components": [
            "lslebodn"
        ]
    },
    "creds": [
        {
            "uuid": "25d3b632-ee4f-4a0c-ab72-952861bf68b9",
            "payload": "AAAAAQAAAAEAAAKKUkVESEFULkNPTQAAAAhsc2xlYm9kbgAAAAMAAAACAAAAClJFREhBVC5DT00AAAAESFRUUAAAACVqaXJhLTAxLmhvc3QucHJvZC5lbmcucmR1Mi5yZWRoYXQuY29tABIAAAAgPXMXnJjF+r50Vbtr0fkFgwe4DPDnVYsYkgLym1GXqt1Y6yzaWOtnsVjruXoAAAAAAEApAAAAAAAAAAAAAAAAAW9hggFrMIIBZ6ADAgEFoQwbClJFREhBVC5DT02iODA2oAMCAQOhLzAtGwRIVFRQGyVqaXJhLTAxLmhvc3QucHJvZC5lbmcucmR1Mi5yZWRoYXQuY29to4IBFjCCARKgAwIBEqEDAgEDooIBBASCAQCxSd/sQO0J8B6c+4LrRxXltt+fSLsGqFfvOjgyHREQpODAREfaqDL5AgsIC7kmKaTE42Tn9T27TSnuLFYD3yUP8Z2uicaGbFRCcbMPH5DR464RHq6YsyID2Z2yPBnsJf6p05T6VuzL+orM2/TvbRF14MDA45yXmKitrv5ALV9PERE8RNuTiGbvBWmAWqkOjuvpvpSXaCBoBvUlA11QslllON/rCPYBqrWllowkZv6dCBaDuUsx9SJKcdijNjWsttJwEanN92ymUbOthGHDBVKriG6Hygjpe2lev0FvUCTaBx71yO5FvwtP4GBnVcp3wTQ1O3reDNa3bi6Ey0nB66GKAAAAAA=="
        },
        {
            "uuid": "de7084a4-5153-4f76-9b68-82f41090c130",
            "payload": "ABAAAAQAAAAEAAAAKUkVESEFULkNPTQAAAAhsc2xlYm9kbgAAAAMAAAACAAAAClJFREhBVC5DT00AAAAESFRUUAAAACplcnJhdGEtd2ViLTAxLmhvc3QucHJvZC5lbmcuYm9zLnJlZGhhdC5jb20AAgAAACDiBq1YuOGzE2wkvy4MadkP+buzQYoNYWmpDLtkJ3CffFjrLNpY6y3sWOu5egAAAAAAQCkAAAAAAAAAAAAAAAABdGGCAXAwggFsoAMCAQWhDBsKUkVESEFULkNPTaI9MDugAwIBA6E0MDIbBEhUVFAbKmVycmF0YS13ZWItMDEuaG9zdC5wcm9kLmVuZy5ib3MucmVkaGF0LmNvbaOCARYwggESoAMCARKhAwIBBKKCAQQEggEATQXGnOJbTfEChzl1UBnA4UJJFWQAoRNFVPr7TCGZVxy+uCP606flsk6rLAlWKCgb9PFMycl+YYigHsrhppHUsXGgXLz/8QyAbxheHHhCwgaRVrNNGf2OvPUZwy/WiPdqZsVSvBcd/PCeS2kt39yFfwvwUrfxueVP6I3keViHTzivEOJI6m8dvmhM+h+Md4pAFP7g2+ZrUMuN2KI4iorsqio8XVm0lKAPuWfaqVTO2lkhGVni2L1eZj6KT6PSQeHbinbMDyI7+7p1lOKJERfZRbWcysN31+YcoSXUw473GZvxYzEjqeI7sEOuACxfIfhnhgHJj2WsvVuGr1/FPXUB7AAAAAA="
        },
        {
            "uuid": "a075810a-8211-45d1-bfc9-44eba01dad1d",
            "payload": "ACAAAAAAAACAEAAAAKUkVESEFULkNPTQAAAAhsc2xlYm9kbgAAAAAAAAADAAAADFgtQ0FDSEVDT05GOgAAABVrcmI1X2NjYWNoZV9jb25mX2RhdGEAAAAHcGFfdHlwZQAAABxrcmJ0Z3QvUkVESEFULkNPTUBSRURIQVQuQ09NAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABMgAAAAA="
        },
        {
            "uuid": "94033c35-6aa6-4af7-b81c-085e8a475aae",
            "payload": "AACAAAQAAAAEAAAAKUCkVESCEFULkNPTQAAAAhsc2xlYm9kbgAAAAIAAAACAAAAClJFREhBVC5DT00AAAAGa3JidGd0AAAAClJFREhBVC5DT00AEgAAACBiOdJZoKwcCxiDCeC9pe/cw6Wz/axuoxwQJqa8tsxbgVjrLNpY6yzaWOu5egAAAAAAQGEAAAAAAAAAAAAAAAABQGGCATwwggE4oAMCAQWhDBsKUkVESEFULkNPTaIfMB2gAwIBAqEWMBQbBmtyYnRndBsKUkVESEFULkNPTaOCAQAwgf2gAwIBEqEDAgECooHwBIHt5dW+TGBvQ8jrUfVprar8f2bkgwFJc/rmYs+YtrOSZeoq/82GNB34LsCoWUqt1XPDms85xSmj+C1amK7rwgxsvHddGA5ocln9RQFLwkMQuTRbLh5qjgrkJ8lfUkUPaeDXkHJgKpoDC+Vwjxlb+CtTjieeCE/QPe7ha+h7tRWqa2Tp7IpTbMuiQrymVyRwydSt9veP2bp0FXhwK/NI3q2lgWBM6cchRFMh92gORYJCOO89Td48os+wwalgwR33llQOlg+eJT6nUjir/FEYz5x43jwbsu5Ujyv702DiexleaXcvImEhd3/iBt73r0VrAAAAAA=="
        },
        {
            "uuid": "475e9bec-9dc7-44e8-bcc8-dd65ceeda61f",
            "payload": "AAAAAACAAAAEAAAAKUkVESCEFULkNPTQAAAAhsc2xlYm9kbgAAAAAAAAADAAAADFgtQ0FDSEVDT05GOgAAABVrcmI1X2NjYWNoZV9jb25mX2RhdGEAAAAKZmFzdF9hdmFpbAAAABxrcmJ0Z3QvUkVESEFULkNPTUBSRURIQVQuQ09NAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADeWVzAAAAAA=="
        },
        {
            "uuid": "6871fd43-1588-4673-8bdf-5054240af2b0",
            "payload": "AAAACAQAAAAEAAAAKCUkVESEFULkNPTQAAAAhsc2xlYm9kbgAAAAMAAAACAAAAClJFREhBVC5DT00AAAAEaW1hcAAAABRtYWlsLmNvcnAucmVkaGF0LmNvbQASAAAAIAKDi/YxzECI4zIgzcjswH6d8loUyZrZWPfPglzj5L+KWOss2ljrLOZY67l6AAAAAABAKQAAAAAAAAAAAAAAAAFeYYIBWjCCAVagAwIBBaEMGwpSRURIQVQuQ09NoicwJaADAgEDoR4wHBsEaW1hcBsUbWFpbC5jb3JwLnJlZGhhdC5jb22jggEWMIIBEqADAgESoQMCAQWiggEEBIIBABH+e46XiXyhudhNUUzRwqTRHjQ38dnNBhp+8vSUeVfIPI6TmneGIN1/SSv1dy9Fx0ezLiliBoupnWcDGcd9YlKvuiubO6b7OJtbx61RKtGjYIrGEs+16s3VbRzp8FrtSxLgP9FKyY0ZzBbWHMpJbchrYY9CqLWycNrzsw3pODioDgQKBMg3Xvko9fQfnzilhzI4DHLjTlUBhmtfpHnbyvN8yvdFurlQwyvSJorbejivKgPZo4CypgPoh2SLIV6UnAyoVzXox1zppPcACgzHcjAS24Qb3sUVGwOR9oEQkT9ALtSUZZJU0fttjN/0lIAJ1wc6/9vFiv+Fu1Q9N/PrzaYAAAAA"
        },
        {
            "uuid": "749f83e7-775d-47cf-89e9-2cd34ec71340",
            "payload": "AAAACAQAAAAEAAAACKUkVECSEFULkNPTQAAAAhsc2xlYm9kbgAAAAMAAAACAAAAClJFREhBVC5DT00AAAAESFRUUAAAAA9zYW1sLnJlZGhhdC5jb20AEgAAACBrQ2j3VL+lgWXgPgabgbY+jucrQmpCbSrKutfGiIaGaVjrLNpY6zQRWOu5egAAAAAAQCkAAAAAAAAAAAAAAAABWWGCAVUwggFRoAMCAQWhDBsKUkVESEFULkNPTaIiMCCgAwIBA6EZMBcbBEhUVFAbD3NhbWwucmVkaGF0LmNvbaOCARYwggESoAMCARKhAwIBBKKCAQQEggEALisqz2OQaRp1K497OOQDr50kJ4NGslyiQo1MfFBBB7FnOXXILaN3UkyN88mX+PxIcbMzVgTWf7BJBItZLMdi1t0WAkto4NiAGLoqfC0Hz3SlNrgD3jE9Al5f0kxVbq9IhuXV3ENZ05l/Ml8fPR3RKLXFa+LMLM0DQal3R8jQAmnGaSl/4iGNntpd1+XQNfUUcemPDWZjqJk0HSjYtx/xj3DStnHlP9pZTUBeCX25nvikZBq9F50dUebVIuwh7I7r9UQk8lPgqy0+bOCeYjlmEkowPJts91QHb1kxqV+wmVCMTSKQmtAGPsU6gz4tR9nb32zxthfrphyn1cA+YWocQwAAAAA="
        }
    ]
}
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_write_data] (0x2000): <--- end libcurl data
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_write_data] (0x0020): Failed to write data to buffer [105]: No buffer space available
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [handle_socket] (0x2000): Activity on curl socket 13 socket data 0x55a058100100
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [handle_curlmsg_done] (0x0400): Handled http://localhost/kcm/persistent/1000/ccache/bd874c90-32c6-4e47-8b5a-d0935800d706-1000:80312
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [handle_curlmsg_done] (0x0040): CURL operation failed [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [tcurl_request_done] (0x0400): TCURL request finished [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [schedule_fd_processing] (0x2000): timeout_ms: -1
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [check_curl_timeouts] (0x4000): Still tracking 0 outstanding requests
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [sec_get_done] (0x0040): GET HTTP request failed [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [sec_get_ccache_done] (0x0040): Cannot resolve key to ccache [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [ccdb_sec_getbyuuid_done] (0x0040): Cannot retrieve the ccache [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_ccdb_getbyuuid_done] (0x0040): Failed to get cache by UUID [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_op_get_cache_by_uuid_done] (0x0040): Cannot get ccahe by UUID [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_cmd_done] (0x0040): op receive function failed [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_cmd_request_done] (0x0040): KCM operation failed [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_reply_error] (0x0040): KCM operation returs failure [5]: Input/output error
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_failbuf_construct] (0x1000): Sent reply with error -1765328191
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_send] (0x2000): All data sent!
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_recv] (0x0010): Failed to receive data (104, Connection reset by peer), aborting client
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [client_close_fn] (0x2000): Terminated client [0x55a0580fd4c0][14]
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [kcm_recv] (0x4000): Client closed connection.
(Mon Apr 10 13:25:16 2017) [sssd[kcm]] [client_close_fn] (0x2000): Terminated client [0x55a0580f4030][12]

Comments


Comment from pbrezina at 2017-05-02 13:40:24

Hmm, probably there should not be any hardcoded limit in tcurl but pass it from the caller as parameter.


Comment from lslebodn at 2017-05-02 13:58:53

Agree, It is hardcoded on few places

src/responder/kcm/kcmsrv_ops.c:110:    state->reply = sss_iobuf_init_empty(state,
src/responder/kcm/kcmsrv_ops.c-111-                                        KCM_REPLY_MAX,
src/responder/kcm/kcmsrv_ops.c-112-                                        KCM_REPLY_MAX);
src/responder/kcm/kcmsrv_ops.c-113-    if (state->reply == NULL) {
src/responder/kcm/kcmsrv_ops.c-114-        ret = ENOMEM;

src/responder/kcm/kcmsrv_ops.c:153:    state->op_ctx->reply = sss_iobuf_init_empty(
src/responder/kcm/kcmsrv_ops.c-154-                                        state,
src/responder/kcm/kcmsrv_ops.c-155-                                        KCM_REPLY_MAX - 2*sizeof(uint32_t),
src/responder/kcm/kcmsrv_ops.c-156-                                        KCM_REPLY_MAX - 2*sizeof(uint32_t));
src/responder/kcm/kcmsrv_ops.c-157-    if (state->reply == NULL) {

src/util/tev_curl.c:655:    state->response = sss_iobuf_init_empty(state, TCURL_IOBUF_CHUNK, TCURL_IOBUF_MAX);
src/util/tev_curl.c-656-    if (state->response == NULL) {
src/util/tev_curl.c-657-        ret = ENOMEM;
src/util/tev_curl.c-658-        goto done;
src/util/tev_curl.c-659-    }

I use sssd-kcm with remporary workaround:

diff --git a/src/responder/kcm/kcmsrv_ops.c b/src/responder/kcm/kcmsrv_ops.c
index 2feaf51f2..7a78e9d6b 100644
--- a/src/responder/kcm/kcmsrv_ops.c
+++ b/src/responder/kcm/kcmsrv_ops.c
@@ -31,7 +31,7 @@
 #include "responder/kcm/kcmsrv_ops.h"
 #include "responder/kcm/kcmsrv_ccache.h"
 
-#define KCM_REPLY_MAX   2048
+#define KCM_REPLY_MAX 16384
 
 struct kcm_op_ctx {
     struct kcm_resp_ctx *kcm_data;
diff --git a/src/util/tev_curl.c b/src/util/tev_curl.c
index 8faf07c71..52c86adde 100644
--- a/src/util/tev_curl.c
+++ b/src/util/tev_curl.c
@@ -35,7 +35,7 @@
 #include "util/tev_curl.h"
 
 #define TCURL_IOBUF_CHUNK   1024
-#define TCURL_IOBUF_MAX     4096
+#define TCURL_IOBUF_MAX    16384
 
 static bool global_is_curl_initialized;

Comment from simo at 2017-05-02 23:41:06

The max buf should be much larger, a single ticket can be 65K in size, and you need to take in account encoding and such.
I would use something that is at least triple that size because some tickets may contain even more AD data technically.


Comment from jhrozek at 2017-05-04 15:56:07

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 1.15.3

Comment from jhrozek at 2017-05-04 15:58:48

Metadata Update from @jhrozek:


Comment from jhrozek at 2017-05-04 15:58:48

Metadata Update from @jhrozek:


Comment from jhrozek at 2017-05-04 15:58:48

Issue linked to Bugzilla: Bug 1448094


Comment from jhrozek at 2017-05-04 16:29:09

Metadata Update from @jhrozek:

  • Issue priority set to: critical

Comment from jhrozek at 2017-05-23 19:34:05

Metadata Update from @jhrozek:

  • Issue assigned to jhrozek

Comment from jhrozek at 2017-06-28 17:38:34

The first patchset is part of #225

But since the patches still need to be reviewed and RHEL is no longer blocked waiting for this bug, I think it's OK if upstream moves the ticket to 1.15.4


Comment from jhrozek at 2017-06-28 17:38:36

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 1.15.4 (was: SSSD 1.15.3)

Comment from jhrozek at 2017-07-24 21:21:15

The proper fix will really come in 1.15.4


Comment from jhrozek at 2017-10-19 21:05:21

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 1.16.1 (was: SSSD 1.15.4)

Comment from tibbs at 2017-11-15 20:46:46

I just found that this is trivial to hit in F27. The machine stays unusable for the user who overflows their ccache until I wipe the ccache manually (with curl -XDELETE) or reboot, it seems.

I tend to have a whole lot of service principals active at one time. Over 200 is not abnormal. This is doable using the kernel keyring (though a bug in keyutils will occasionally cause ssh segfaults) but of course that just leads to aggressive recycling of entries in the keyring.

It appears that the kernel defaults will allow me to store about 80 keys. I would suggest that the KCM by default at least have a similar limit so there's no functionality loss.

Edit: I rebuilt sssd with KCM_REPLY_MAX and TCURL_IOBUF_MAX raised to 2**18. The behavior is changed, but not really much better. Instead of klist: Credentials cache I/O operation failed I get klist: No credentials cache found and by poking at sssd_secrets with curl I can see that the credentials cache was indeed completely destroyed.

Edit 2: Adding a bunch of keys to my ccache slowly shows that after I get to some small number over 60 keys, the ccache is simply destroyed. It doesn't seem to matter how long I wait in between acquiring keys. The count is so far either 61 or 62; I guess I'm hitting a limit somewhere but I don't know what it is. The json data retrieved with curl just before this happened is 48544 bytes in size, so it's not hitting an obvious power of two limit and certainly not hitting the 256K limit I patched in.


Comment from lslebodn at 2017-11-16 08:52:42

Metadata Update from @lslebodn:

  • Issue tagged with: KCM

Comment from jhrozek at 2017-12-15 18:25:59

Metadata Update from @jhrozek:

  • Issue tagged with: postpone-to-1-16-2

Comment from jhrozek at 2018-01-08 17:26:31

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 1.16.2 (was: SSSD 1.16.1)

Comment from jhrozek at 2018-01-08 17:38:11

Metadata Update from @jhrozek:

  • Issue untagged with: postpone-to-1-16-2

Comment from jhrozek at 2018-03-29 20:19:39

Some more patches that improve the situation:
b09cd30
786c400
2f11cf2
bfc6d9d


Comment from jhrozek at 2018-06-05 15:06:23

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 2.0 (was: SSSD 1.16.2)

Comment from jhrozek at 2018-08-13 10:07:07

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 2.1 (was: SSSD 2.0)

Comment from jhrozek at 2018-12-04 15:31:28

PR: #705


Comment from jhrozek at 2018-12-04 15:31:51

Metadata Update from @jhrozek:

  • Issue tagged with: PR

Comment from jhrozek at 2019-02-22 15:50:12

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 2.2 (was: SSSD 2.1)

Comment from jhrozek at 2019-06-13 23:22:13

Metadata Update from @jhrozek:

  • Issue set to the milestone: SSSD 2.3 (was: SSSD 2.2)

Comment from jhrozek at 2019-08-07 20:59:43


Comment from jhrozek at 2019-08-07 20:59:43

Metadata Update from @jhrozek:

  • Issue close_status updated to: Fixed
  • Issue status updated to: Closed (was: Open)
@sssd-bot sssd-bot added Bugzilla Closed: Fixed Issue was closed as fixed. KCM labels May 2, 2020
@sssd-bot sssd-bot closed this as completed May 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bugzilla Closed: Fixed Issue was closed as fixed. KCM
Projects
None yet
Development

No branches or pull requests

2 participants