sss_nss_check_header() in src/sss_client/nss_mc_common.c lacks a memory
barrier. As a result, this loop:
/* retry barrier protected reading max 5 times then give up */
for (count = 5; count > 0; count--) {
    memcpy(&h, ctx->mmap_base, sizeof(struct sss_mc_header));
    if (MC_VALID_BARRIER(h.b1) && h.b1 == h.b2) {
        /* record is consistent so we can proceed */
        break;
    }
}
if (count == 0) {
    /* couldn't successfully read header we have to give up */
    return EIO;
}
is compiled to:
0x0000000000004d20 <+0>: mov 0x10(%rdi),%rdx
0x0000000000004d24 <+4>: mov %rbx,-0x28(%rsp)
0x0000000000004d29 <+9>: mov $0x5,%eax
0x0000000000004d2e <+14>: mov %rbp,-0x20(%rsp)
0x0000000000004d33 <+19>: mov %r12,-0x18(%rsp)
0x0000000000004d38 <+24>: mov %r13,-0x10(%rsp)
0x0000000000004d3d <+29>: mov %r14,-0x8(%rsp)
0x0000000000004d42 <+34>: mov (%rdx),%esi
0x0000000000004d44 <+36>: mov 0x28(%rdx),%ebp
0x0000000000004d47 <+39>: mov 0x20(%rdx),%ebx
0x0000000000004d4a <+42>: mov 0x30(%rdx),%r8d
0x0000000000004d4e <+46>: mov 0x8(%rdx),%r9d
0x0000000000004d52 <+50>: mov 0xc(%rdx),%r11d
0x0000000000004d56 <+54>: mov 0x10(%rdx),%r12d
0x0000000000004d5a <+58>: mov 0x1c(%rdx),%r13d
0x0000000000004d5e <+62>: mov %esi,%ecx
0x0000000000004d60 <+64>: mov 0x4(%rdx),%r10d
0x0000000000004d64 <+68>: mov 0x14(%rdx),%r14d
0x0000000000004d68 <+72>: and $0xff000000,%ecx
0x0000000000004d6e <+78>: cmp $0xf0000000,%ecx
0x0000000000004d74 <+84>: je 0x4da0 <sss_nss_check_header+128>
0x0000000000004d76 <+86>: sub $0x1,%eax
0x0000000000004d79 <+89>: jne 0x4d6e <sss_nss_check_header+78>
0x0000000000004d7b <+91>: mov $0x5,%eax
0x0000000000004d80 <+96>: mov -0x28(%rsp),%rbx
0x0000000000004d85 <+101>: mov -0x20(%rsp),%rbp
0x0000000000004d8a <+106>: mov -0x18(%rsp),%r12
0x0000000000004d8f <+111>: mov -0x10(%rsp),%r13
0x0000000000004d94 <+116>: mov -0x8(%rsp),%r14
0x0000000000004d99 <+121>: retq
0x0000000000004d9a <+122>: nopw 0x0(%rax,%rax,1)
0x0000000000004da0 <+128>: cmp %r8d,%esi
That is, the retry branch at <+89> jumps back to the cmp at <+78>, so the header is loaded only once and %ecx is never reloaded from memory.
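For illustration, here is a minimal sketch of how the retry loop could look with a full barrier after the copy. It is not the actual patch: struct demo_header, DEMO_VALID_BARRIER and demo_check_header() are hypothetical stand-ins, and only the mask values are taken from the and/cmp pair in the disassembly above. It assumes GCC or a compatible compiler, since __sync_synchronize() is a GCC builtin.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sss_mc_header; the real layout differs. */
struct demo_header {
    uint32_t b1;      /* first barrier value */
    uint32_t payload; /* placeholder for the real header fields */
    uint32_t b2;      /* second barrier value */
};

/* Mask and value mirror the and/cmp constants seen in the disassembly. */
#define DEMO_VALID_BARRIER(b) (((b) & 0xff000000u) == 0xf0000000u)

/* Copy the shared header, retrying up to five times before giving up. */
static int demo_check_header(const void *mmap_base, struct demo_header *out)
{
    struct demo_header h;
    int count;

    for (count = 5; count > 0; count--) {
        memcpy(&h, mmap_base, sizeof(h));
        /* Full barrier: also stops the compiler from hoisting the loads
         * out of the loop, so each retry re-reads the shared memory. */
        __sync_synchronize();
        if (DEMO_VALID_BARRIER(h.b1) && h.b1 == h.b2) {
            /* header is consistent so we can proceed */
            *out = h;
            return 0;
        }
    }
    /* couldn't read a consistent header */
    return EIO;
}
```

With the barrier in place the compiler has to assume the mapped memory may have changed between iterations, so it can no longer reduce the loop to a repeated compare on a stale register.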
In sss_nss_mc_get_record(), __sync_synchronize() is needed before the final
barrier check.
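A rough sketch of the read pattern I have in mind for a record, assuming a simplified layout with one barrier word on each side of the data. struct demo_rec, DEMO_VALID_BARRIER and demo_get_record() are made up for this example and do not match the real SSSD record format; the point is only where the barriers sit relative to the copy and the final check.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout; the real SSSD record differs. */
struct demo_rec {
    uint32_t b1;   /* barrier word in front of the data */
    char data[16]; /* placeholder payload */
    uint32_t b2;   /* barrier word behind the data */
};

#define DEMO_VALID_BARRIER(b) (((b) & 0xff000000u) == 0xf0000000u)

/* Copy one record out of shared memory; EAGAIN means "inconsistent, retry". */
static int demo_get_record(const struct demo_rec *shared, struct demo_rec *copy)
{
    uint32_t b1, b2;

    /* Read the first barrier word before touching the data. */
    b1 = *(const volatile uint32_t *)&shared->b1;
    __sync_synchronize();

    memcpy(copy->data, shared->data, sizeof(copy->data));

    /* Barrier before the final check, so the read of b2 cannot be
     * moved ahead of the data copy. */
    __sync_synchronize();
    b2 = *(const volatile uint32_t *)&shared->b2;

    if (!DEMO_VALID_BARRIER(b1) || b1 != b2) {
        return EAGAIN;
    }
    copy->b1 = b1;
    copy->b2 = b2;
    return 0;
}
```

Without the second barrier, nothing stops the load of b2 from being performed before the copy, in which case a matching b1/b2 pair would say nothing about the data that was actually copied.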
sss_mc_add_rec_to_chain() in src/responder/nss/nsssrv_mmap_cache.c contains
this comment:
/* changing a single uint32_t is atomic, so there is no
* need to use barriers in this case */
I think the comment is misleading: atomicity is not the only thing that
matters, because ordering can also be relevant. However, in this particular
case, the order in which the writes to the links in the hash chain happen does
not appear to matter.
A client can pick up a reference to a hash table slot which is about to be
invalidated, resulting in a lookup failure when the record is actually present
in the cache. This is probably not a problem for this application.
You need to deal with counter overflow in your barriers (the ABA problem). One
way to do this is to switch to a different file when it happens, and rename it
over the existing file. Concurrent readers will still have a consistent view.
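To make the file-switch idea concrete, here is a small sketch using plain POSIX calls. demo_replace_cache() and both path arguments are hypothetical; a real cache would also need to recreate the mapping in clients, which is not shown here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write a fresh cache generation to tmp_path and
 * atomically rename it over path. Returns 0 on success, -1 on error. */
static int demo_replace_cache(const char *path, const char *tmp_path,
                              const void *buf, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -1;
    }
    /* Make sure the new contents are on disk before publishing them. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }
    /* rename() replaces the target atomically; processes that still have
     * the old file open (or mapped) keep seeing the old contents. */
    if (rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

A reader that opened the old file before the rename keeps its view of the old generation until it reopens the path, which is why the barrier counters can start from scratch in the new file.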
It seems you use this scheme because you need wait-free writers, so that the
privileged process which updates the cache does not block on readers. Correct?
Otherwise, there are probably simpler approaches.
Cloned from Pagure issue: https://pagure.io/SSSD/sssd/issue/1694
https://bugzilla.redhat.com/show_bug.cgi?id=883429 (Fedora)
Comments
Comment from simo at 2012-12-05 21:42:53
Fields changed
blockedby: =>
blocking: =>
coverity: =>
design: =>
design_review: => 0
feature_milestone: =>
fedora_test_page: =>
milestone: NEEDS_TRIAGE => SSSD 1.9.3
owner: somebody => simo
status: new => assigned
testsupdated: => 0
Comment from simo at 2012-12-05 22:43:15
Fields changed
patch: 0 => 1
Comment from simo at 2012-12-05 22:49:22
A synchronization instruction right after the memcpy effectively fixes this problem.
This is the new code:
Patch sent to the list.
I have not changed anything for ABA because updates are so rare that it is effectively impossible for any meaningful client to run into this issue IMO.
Comment from jhrozek at 2012-12-05 23:27:59
resolution: => fixed
status: assigned => closed
Comment from simo at 2012-12-11 18:39:08
The fix was not complete; the Fedora bugzilla has been reopened as well.
resolution: fixed =>
status: closed => reopened
Comment from simo at 2012-12-11 21:05:41
Fields changed
milestone: SSSD 1.9.3 => SSSD 1.9.4
patch: 1 => 0
version: => 1.9.3
Comment from dpal at 2012-12-13 04:38:34
Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=886757
rhbz: [https://bugzilla.redhat.com/show_bug.cgi?id=883429 883429] => [https://bugzilla.redhat.com/show_bug.cgi?id=883429 883429], [https://bugzilla.redhat.com/show_bug.cgi?id=886757 886757]
Comment from jhrozek at 2012-12-13 20:45:58
resolution: => fixed
status: reopened => closed
Comment from jhrozek at 2017-02-24 15:05:57
Metadata Update from @jhrozek: