Cloned from Pagure issue: https://pagure.io/SSSD/sssd/issue/3386
If a ticket stored in KCM has many service principals, it can easily reach the hardcoded 2 KiB limit.
a) The hardcoded limit does not make sense, because the equivalent limit is configurable in sssd-secrets (man sssd-secrets -> max_payload_size).
b) The limit is too low and therefore unusable in a real environment (it is 8 times lower than the default limit in sssd-secrets).
Comments
Comment from pbrezina at 2017-05-02 13:40:24
Hmm, there probably should not be any hardcoded limit in tcurl at all; the limit should be passed in by the caller as a parameter.
Comment from lslebodn at 2017-05-02 13:58:53
Agreed, it is hardcoded in a few places.
I use sssd-kcm with a temporary workaround:
Comment from simo at 2017-05-02 23:41:06
The max buffer should be much larger; a single ticket can be 65K in size, and you need to take encoding and such into account.
I would use something that is at least triple that size, because some tickets may technically contain even more AD data.
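The sizing argument above can be sketched numerically. The 65 KiB single-ticket figure and the 3x headroom are simo's estimates from this comment; base64 is assumed here as the encoding that inflates the payload when tickets are stored as text:

```python
import base64
import math

TICKET_MAX = 65 * 1024  # simo's estimate: a single ticket can be ~65 KiB


def b64_size(raw_bytes: int) -> int:
    """Size after base64 encoding: every 3 input bytes become 4 output bytes."""
    return math.ceil(raw_bytes / 3) * 4


one_ticket_encoded = b64_size(TICKET_MAX)  # 88748 bytes, ~87 KiB
with_headroom = b64_size(3 * TICKET_MAX)   # 266240 bytes, ~260 KiB

# Sanity-check the formula against the real encoder.
assert one_ticket_encoded == len(base64.b64encode(b"\0" * TICKET_MAX))

print(one_ticket_encoded, with_headroom)  # prints: 88748 266240
```

So one encoded ticket already blows past the 2 KiB limit by a factor of ~40, and triple-ticket headroom lands on the order of 2**18 (262144) bytes, which is the value the buffers get raised to later in this thread.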
Comment from jhrozek at 2017-05-04 15:56:07
Metadata Update from @jhrozek:
Comment from jhrozek at 2017-05-04 15:58:48
Metadata Update from @jhrozek:
Issue linked to Bugzilla: Bug 1448094
Comment from jhrozek at 2017-05-04 16:29:09
Metadata Update from @jhrozek:
Comment from jhrozek at 2017-05-23 19:34:05
Metadata Update from @jhrozek:
Comment from jhrozek at 2017-06-28 17:38:34
The first patchset is part of #225
But since the patches still need to be reviewed and RHEL is no longer blocked waiting for this bug, I think it's OK if upstream moves the ticket to 1.15.4
Comment from jhrozek at 2017-06-28 17:38:36
Metadata Update from @jhrozek:
Comment from jhrozek at 2017-07-24 21:21:15
The proper fix will really come in 1.15.4
Comment from jhrozek at 2017-10-19 21:05:21
Metadata Update from @jhrozek:
Comment from tibbs at 2017-11-15 20:46:46
I just found that this is trivial to hit in F27. The machine stays unusable for the user who overflows their ccache until I wipe the ccache manually (with curl -XDELETE) or reboot, it seems.
I tend to have a whole lot of service principals active at one time. Over 200 is not abnormal. This is doable using the kernel keyring (though a bug in keyutils will occasionally cause ssh segfaults) but of course that just leads to aggressive recycling of entries in the keyring.
It appears that the kernel defaults will allow me to store about 80 keys. I would suggest that the KCM by default at least have a similar limit so there's no functionality loss.
Edit: I rebuilt sssd with KCM_REPLY_MAX and TCURL_IOBUF_MAX raised to 2**18. The behavior is changed, but not really much better. Instead of

klist: Credentials cache I/O operation failed

I get

klist: No credentials cache found

and by poking at sssd_secrets with curl I can see that the credentials cache was indeed completely destroyed.

Edit 2: Adding a bunch of keys to my ccache slowly shows that after I get to some small number over 60 keys, the ccache is simply destroyed. It doesn't seem to matter how long I wait in between acquiring keys. The count is so far either 61 or 62; I guess I'm hitting a limit somewhere, but I don't know what it is. The JSON data retrieved with curl just before this happened is 48544 bytes in size, so it's not hitting an obvious power-of-two limit and certainly not hitting the 256K limit I patched in.
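A quick back-of-the-envelope check of the numbers reported above (all figures come from this comment; the per-credential average is only indicative, since credentials vary in size):

```python
JSON_SIZE = 48544       # bytes of JSON fetched via curl just before the ccache was destroyed
N_CREDS = 61            # lower of the two observed counts at which destruction happened
PATCHED_LIMIT = 2**18   # the raised KCM_REPLY_MAX / TCURL_IOBUF_MAX value (262144)

avg_cred = JSON_SIZE // N_CREDS  # ~795 bytes per stored credential
print(avg_cred, JSON_SIZE < PATCHED_LIMIT)  # prints: 795 True
```

This confirms the observation in the comment: the payload at destruction time is well under the patched 256K limit, so whatever limit is being hit lives somewhere else.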
Comment from lslebodn at 2017-11-16 08:52:42
Metadata Update from @lslebodn:
Comment from jhrozek at 2017-12-15 18:25:59
Metadata Update from @jhrozek:
Comment from jhrozek at 2018-01-08 17:26:31
Metadata Update from @jhrozek:
Comment from jhrozek at 2018-01-08 17:38:11
Metadata Update from @jhrozek:
Comment from jhrozek at 2018-03-29 20:19:39
Some more patches that improve the situation:
b09cd30
786c400
2f11cf2
bfc6d9d
Comment from jhrozek at 2018-06-05 15:06:23
Metadata Update from @jhrozek:
Comment from jhrozek at 2018-08-13 10:07:07
Metadata Update from @jhrozek:
Comment from jhrozek at 2018-12-04 15:31:28
PR: #705
Comment from jhrozek at 2018-12-04 15:31:51
Metadata Update from @jhrozek:
Comment from jhrozek at 2019-02-22 15:50:12
Metadata Update from @jhrozek:
Comment from jhrozek at 2019-06-13 23:22:13
Metadata Update from @jhrozek:
Comment from jhrozek at 2019-08-07 20:59:43
Metadata Update from @jhrozek: