Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question] 'Strength' comparision of SeqCst fence and SeqCst load/stores #739

Open
pr0cf5 opened this issue Dec 3, 2022 · 0 comments
Open
Assignees
Labels
question Further information is requested

Comments

@pr0cf5
Copy link

pr0cf5 commented Dec 3, 2022

Opening

I'm trying to understand the reason why hazard pointers require exactly two SeqCst orderings. While thinking about this, I wanted to fully understand the guarantees of SeqCst atomic operations and fences. Intuitively, fences should provide a stronger guarantee due to its name(the term fence gives a very intimidating impression!) and below are my conclusions:

Low-Level Perspective

The difference of a SeqCst fence and SeqCst atomic.store in a compiler's point of view:

  • fence is lowered to a mfence instruction. The mfence instruction is said to be a serializing instruction. This means that all stores prior to the mfence instruction are committed to the main memory.
  • store is lowerd to a xchg rax, [rbx]-like instruction. This is interesting, because rax is discarded afterwards. For Release ordering, the store is not lowered to a xchg instruction but a regular store instruction. This implies that there is something 'special' about xchg, which I don't know yet.

High-Level Perspective

  • All stores with the SeqCst ordering contribute to the global ordering S. All loads with the SeqCst ordering see stores accordig to the global ordering S. The term see means the following: if a thread can see a store, it means that it can observe the effects of that store.

  • However, this does not mean that a SeqCst load can see the last store within S. It only guarantees that if it sees a certain store s within S, it can always see t such that t < s where '<' represents the ordering within S. In other words, threads see a 'prefix of S, which is at worst nothing (null sequence) and at best the entire sequence S. Let's consider the following example:

fn thread1_func(m: &Arc<AtomicUsize>, n: &Arc<AtomicUsize>) {
    m.store(1, Ordering::SeqCst); // [1]
    let val = n.load(Ordering::SeqCst);
    if val == 0 {
        read(x);
    }
}

fn thread2_func(m: &Arc<AtomicUsize>, n: &Arc<AtomicUsize>) {
    n.store(1, Ordering::SeqCst); // [2]
    let val = m.load(Ordering::SeqCst);
    if val == 0 {
        free(x);
    }
}

Let's assume that everything in thread2 was executed before thread1. Then, the global ordering S would be [2] --> [1]. However, the load of n in thread1 can yield 0, if it sees 'nothing' from S. Thus, this results in a UaF and crashes the program.

  • We can use fences to solve this problem.
fn thread1_func(m: &Arc<AtomicUsize>, n: &Arc<AtomicUsize>) {
    m.store(1, Ordering::SeqCst); // [1]
    fence(Ordering::SeqCst);
    let val = n.load(Ordering::SeqCst); // [3]
    if val == 0 {
        read(x);
    }
}

fn thread2_func(m: &Arc<AtomicUsize>, n: &Arc<AtomicUsize>) {
    n.store(1, Ordering::SeqCst); // [2]
    fence(Ordering::SeqCst);
    let val = m.load(Ordering::SeqCst); // [4]
    if val == 0 {
        free(x);
    }
}
  • I understood the semantics of a SeqCst fence as follows: For stores, nothing changes. (assuming all stores had the SeqCst ordering) For loads, an additional condition is imposed. If a SeqCst fence is present before a load, it must see a subsequence of the global ordering S that includes all of the stores before the fence.

  • Let's see how the problem is solved in this case: Let's consider both cases, where the S is [1] --> [2] and S is [2] --> [1].

  • S = [1] --> [2]: The load of n in thread1 ([3]) either sees [1] or [1]-->[2]. If it sees [1], the load of n will return 0, and read(x) will be performed. In the other case, read(x) will not be performed. The load of m in thread2 ([4]) must see [1]-->[2] due to the fence. Thus, the load of m will return 1 and free(x) will never be performed. Because read(x) is 'maybe' done and free(x) is never done, the pogram is safe!

  • S = [2] --> [1]: The load of n in thread1([3]) must see [2]-->[1] due to the fence. Thus, the load of n will return 1 and read(x) will not be performed. The load of m in thread2 ([4]) either sees [2] or [2]-->[1]. If it sees [2], the load of m will return 0 and free(x) will be performed. In the other case, free(x) will not be performed. In conclusion, read(x) will never be performed so safety is guaranteed regardless of whether free(x) is called. Safe!

  • So in conclusion, fence(SeqCst) imposes some condition for load(SeqCst) and limits how it 'sees' S.

Connecting the Dots

I think it is also important to think why the usage of mfence provides this high-level conditions that I described. Let's see the code before in pseudo-assembly.

thread1:
    mov rax, 1
    xchg rax, [rbx] // atomic store [1]
    mfence
    mov rax, [rcx] // atomic load [3]
thread2:
    mov rax, 1
    xchg rax, [rcx] // atomic store [2]
    mfence
    mov rax, [rbx] // atomic load [4]

The order between the two xchgs can be arbitrary. But, by mfence, the loads are done strictly after xchg commits to memory. Thus, because there exists an order between xchgs PLUS an order between xchg and load, the high-level properties described previously are obtained.

Conclusions and Thoughts

If any of my discussions are wrong, please tell me where my thoughts are wrong. These are rather unbased thoughts of mine that were created during the process of attempting to understand what fences are.

@pr0cf5 pr0cf5 added the question Further information is requested label Dec 3, 2022
@Lee-Janggun Lee-Janggun added the lecture question/discussion related to lectures label Dec 5, 2022
@Lee-Janggun Lee-Janggun removed the lecture question/discussion related to lectures label Dec 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants