

| Initially: $X = Y = 0$ |                  |
|------------------------|------------------|
| processor 1            | processor 2      |
| $X = 1$                | $Y = 1$          |
| $r_1 = Y$              | $r_2 = X$        |

Eventually: $r_1 = 0$, $r_2 = 0$

When the two threads run on different processors, the stores to $X$ and $Y$ in the first line may be delayed in the processors' store buffers. The subsequent loads in the second line then see the initial values of $X$ and $Y$ if the buffered stores have not yet been committed to memory.

(b)

**Fig. 1.** (a) A comparison of various memory models [6,9,14,15,24]. (b) An execution that is possible on TSO but not on SC.
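The execution in Fig. 1(b) can be checked by exhaustive enumeration. The following is a minimal sketch, not taken from the paper: it assumes a simplified operational view of TSO in which each processor has a FIFO store buffer with forwarding to its own loads, and it models SC by writing stores directly to memory. The names `PROGRAMS` and `outcomes` are hypothetical and chosen only for illustration.

```python
# Litmus test of Fig. 1(b): each processor stores to one variable,
# then loads the other variable into a register.
PROGRAMS = (
    (("store", "X", 1), ("load", "Y", "r1")),  # processor 1
    (("store", "Y", 1), ("load", "X", "r2")),  # processor 2
)


def outcomes(use_store_buffers):
    """Return every final (r1, r2) pair reachable by exhaustive search.

    With use_store_buffers=False, stores hit memory immediately (SC).
    With use_store_buffers=True, stores enter a per-processor FIFO buffer
    and are committed to memory at an arbitrary later step (simplified TSO).
    """
    results = set()
    # State: program counters, store buffers, memory, registers (all hashable).
    initial = ((0, 0), ((), ()), (("X", 0), ("Y", 0)), ())
    stack, seen = [initial], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem_items, reg_items = state
        mem, regs = dict(mem_items), dict(reg_items)
        finished = all(pcs[i] == len(PROGRAMS[i]) for i in range(2))
        if finished and not any(bufs):
            results.add((regs["r1"], regs["r2"]))
            continue
        for i in range(2):
            # Option 1: processor i executes its next instruction.
            if pcs[i] < len(PROGRAMS[i]):
                op, var, arg = PROGRAMS[i][pcs[i]]
                new_pcs = tuple(pc + 1 if j == i else pc for j, pc in enumerate(pcs))
                new_mem, new_regs, new_bufs = dict(mem), dict(regs), list(bufs)
                if op == "store":
                    if use_store_buffers:
                        new_bufs[i] = bufs[i] + ((var, arg),)  # delay the store
                    else:
                        new_mem[var] = arg  # store is visible immediately
                else:
                    # A load reads the newest matching entry in its own buffer,
                    # and falls back to memory if there is none.
                    pending = [v for (x, v) in bufs[i] if x == var]
                    new_regs[arg] = pending[-1] if pending else mem[var]
                stack.append((new_pcs, tuple(new_bufs),
                              tuple(sorted(new_mem.items())),
                              tuple(sorted(new_regs.items()))))
            # Option 2: processor i commits its oldest buffered store to memory.
            if bufs[i]:
                var, val = bufs[i][0]
                new_mem = dict(mem)
                new_mem[var] = val
                new_bufs = list(bufs)
                new_bufs[i] = bufs[i][1:]
                stack.append((pcs, tuple(new_bufs),
                              tuple(sorted(new_mem.items())), reg_items))
    return results


sc = outcomes(use_store_buffers=False)
tso = outcomes(use_store_buffers=True)
print("SC :", sorted(sc))
print("TSO:", sorted(tso))
```

Under these assumptions, the outcome $r_1 = 0$, $r_2 = 0$ appears only in the set computed with store buffers enabled, which matches the claim that the execution is possible on TSO but not on SC.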