Ramneet Singh Mrunmayi Bhalerao

IIT Delhi

May 2023

- Introduction
- Abstract Language
- Basic Proof System
- Multicopy Atomic Memory Models
- 5 Non-Multicopy Atomic Memory Models

Introduction

•0000

Introduction

Introduction

00000

- Verification of concurrent programs with shared resources is challenging due to combinatorial explosion
- Abstraction to the rescue!
- Everything the environment can do:  $\mathcal{R}$
- Everything you can do:  $\mathcal{G}$

$$\mathcal{R}, \mathcal{G} \vdash P\{ c \} Q$$

Compositional!

### Extension to Weak Memory Models

- Judgements using earlier techniques are valid under sequentially consistent semantics
  - Can be directly used for data-race free code executing on weak memory models
  - But, lots of code has data races! seqlock, ConcurrentLinkedQueue in java.util.concurrent ...

Multicopy Atomic Memory Models

- How do we extend them to weak memory models?
- What If: We could find a condition under which sequentially consistent rely-guarantee reasoning can be soundly preserved

$$(\vdash P\{c\}Q) \land ?? \implies \vdash P\{c_{WM}\}Q$$

- Benefits:
  - Reuse existing verification techniques
  - Deal with the complexity of weak memory separately as a side-condition

Introduction

- Relaxing the memory consistency guarantees provided by hardware enables optimisations
  - Store forwarding (will see later)
  - Write huffers
- (Part 1) Multicopy Atomic: One thread's stores become observable to all other threads at the same time.
  - x86-TSO. ARMv8. RISC-V
- (Part 2) Non-Multicopy Atomic: Each component has its own view of the global memory.
  - Older ARM versions, POWER, C11
- Challenge: Two types of interference now **Inter-Thread** + **Intra-Thread** (due to reordering)
- How will we deal with this? . . .

Abstract Language

#### Teaser

- We want a compositional approach through thread-local reasoning.
- Exploit the reordering semantics of Colvin and Smith: multicopy atomic memory models can be captured in terms of instruction reordering.
  - Combinatorial explosion? (n reorderable instructions in a thread  $\implies$  n! behaviours)
  - Introduce reordering interference freedom between  $(\frac{n(n-1)}{2})$  pairs of instructions (Stay tuned...)
- In non-multicopy atomic WMMs, there is no global shared state(!!)
  - Judgement for each thread is applicable to its view (depends on propagation of writes by hardware)
  - How do we know it holds in other threads' views?
  - Represent the semantics using reordering between different threads
  - No longer compositional? Hardest part of the talk global reordering interference freedom: use the rely abstraction to represent reorderings between threads

Introduction

#### Section 2

Abstract Language

# **Syntax**

- Individual (atomic) instructions  $\alpha$
- Commands (or programs)

$$c := \epsilon \mid \alpha \mid c_1; c_2 \mid c_1 \sqcap c_2 \mid c^* \mid c_1 \mid c_2$$

- Iteration, choice are non-deterministic
- Empty program  $\epsilon$  represents termination

# • Each atomic instruction $\alpha$ has a relation $beh(\alpha)$ (over pre- and post-states) specifying its behaviour

- Program execution is defined by a small-step semantics over commands
- Iteration, non-deterministic choice are dealt with at a higher level (see next slide)

$$\begin{array}{ccc} & c_1 \mapsto_{\alpha} c_1' \\ \hline c_1; c_2 \mapsto_{\alpha} c_1'; c_2 \\ \hline c_1 \Vdash_{\alpha} c_1' & c_2 \mapsto_{\alpha} c_2' \\ \hline c_1 \parallel c_2 \mapsto_{\alpha} c_1' \parallel c_2 & c_1 \parallel c_2 \mapsto_{\alpha} c_1 \parallel c_2' \\ \hline \end{array}$$

### Semantics: Configurations

- Configuration  $(c, \sigma)$  of a program
  - Command c to be executed
  - State  $\sigma$  (map from variables to values)
- Action Step: Performed by component, changes state

$$(c,\sigma) \xrightarrow{as} (c',\sigma') \iff \exists \alpha.c \mapsto_{\alpha} c' \land (\sigma,\sigma') \in beh(\alpha)$$

• Silent Step: Performed by component, doesn't change state

$$(c_1 \sqcap c_2, \sigma) \leadsto (c_1, \sigma) \quad (c_1 \sqcap c_2, \sigma) \leadsto (c_2, \sigma)$$
  
 $(c^*, \sigma) \leadsto (\epsilon, \sigma) \quad (c^*, \sigma) \leadsto (c; c^*, \sigma)$ 

- Program Step: Action Step or Silent Step
- Environment Step: Performed by environment, changes state.  $(c,\sigma) \xrightarrow{es} (c,\sigma').$

#### Section 3

Basic Proof System

#### Definitions

- Associate a verification condition  $vc(\alpha)$  with each instruction  $\alpha$ : Provides finer-grained control (just set to  $\top$  if not needed)
- Hoare triple

Abstract Language

$$P\{ \alpha \}Q \stackrel{\mathsf{def}}{=} P \subseteq \mathrm{vc}(\alpha) \cap \{ \sigma \mid \forall \sigma', \ (\sigma, \sigma') \in \mathrm{beh}(\alpha) \Longrightarrow \sigma' \in Q \}$$

- A rely-guarantee pair  $(\mathcal{R}, \mathcal{G})$  is well-formed if
  - $\bullet$   $\mathcal{R}$  is reflexive and transitive
  - G is reflexive
- Stability of predicate P under rely condition  $\mathcal{R}$

$$\operatorname{stable}_{\mathcal{R}}(P) \stackrel{\mathsf{def}}{=} P \subseteq \{ \sigma \in P \mid \forall \, \sigma', \, (\sigma, \sigma') \in \mathcal{R} \Longrightarrow \sigma' \in P \}$$

• Instruction  $\alpha$  satisfies guarantee condition  $\mathcal{G}$ 

$$\operatorname{sat}(\alpha, \mathcal{G}) \stackrel{\mathsf{def}}{=} \{ \sigma \mid \forall \sigma', \ (\sigma, \sigma') \in \operatorname{beh}(\alpha) \Longrightarrow (\sigma, \sigma') \in \mathcal{G} \}$$

Now introduce rely/guarantee judgements at three levels

# Instruction Level $(\vdash_a)$

$$\mathcal{R}, \mathcal{G} \vdash_{\mathsf{a}} P \{ \alpha \} Q \stackrel{\mathsf{def}}{=} \mathrm{stable}_{\mathcal{R}}(P) \wedge \mathrm{stable}_{\mathcal{R}}(Q) \wedge \mathrm{vc}(\alpha) \subseteq \mathrm{sat}(\alpha, \mathcal{G}) \wedge P \{ \alpha \} Q$$

 Interplay between environmental interference and pre-,post-conditions handled through stability

Consea

# Component Level $(\vdash_c)$

$$\operatorname{Atom} \frac{\mathcal{R},\mathcal{G}\vdash_{a}P\{\;\alpha\;\}Q}{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;\alpha\;\}Q}$$
 
$$\operatorname{Seq} \frac{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{1}\;\}M\quad\mathcal{R},\mathcal{G}\vdash_{c}M\{\;c_{2}\;\}Q}{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{1}\;;c_{2}\;\}Q}$$
 
$$\operatorname{Choice} \frac{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{1}\;\}Q\quad\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{2}\;\}Q}{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{1}\;\}Q\quad\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c_{2}\;\}Q}$$
 
$$\operatorname{Iteration} \frac{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c\;\}P\ \operatorname{stable}_{\mathcal{R}}(P)}{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c^{*}\;\}P}$$
 
$$\frac{\mathcal{R},\mathcal{G}\vdash_{c}P\{\;c\;\}Q\quad P'\subseteq P\quad\mathcal{R}'\subseteq\mathcal{R}\quad Q\subseteq Q'\quad\mathcal{G}\subseteq\mathcal{G}'}{\mathcal{R}',\mathcal{G}'\vdash_{c}P'\{\;c\;\}Q'}$$

 Global satisfiability needs component satisfiability + interference check

$$\mathsf{Comp} \frac{\mathcal{R}, \mathcal{G} \vdash_{c} P\{\ c\ \} Q \quad \mathrm{rif}(\mathcal{R}, \mathcal{G}, c)}{\mathcal{R}, \mathcal{G} \vdash P\{\ c\ \} Q}$$

Usual parallel rule

$$\mathsf{Par} \frac{ \mathcal{R}_1, \mathcal{G}_1 \vdash_c P_1 \Set{c_1}{Q_1} \mathcal{R}_2, \mathcal{G}_2 \vdash_c P_2 \Set{c_2}{Q_2} \mathcal{G}_2 \subseteq \mathcal{R}_1 \quad \mathcal{G}_1 \subseteq \mathcal{R}_2}{\mathcal{R}_1 \cap \mathcal{R}_2, \mathcal{G}_1 \cup \mathcal{G}_2 \vdash P_1 \land P_2 \Set{c_1}{|c_2|} \mathcal{Q}_1 \land \mathcal{Q}_2}$$

Multicopy Atomic Memory Models

#### Reordering Semantics: Basics

- Multicopy atomic memory models can be characterised using a reordering relation ← over pairs of instructions in a component
- $\bullet$   $\ \hookleftarrow$  is syntactically derivable based on the specific memory model. E.g., in ARMv8
  - Two instructions which don't access (read or write) a common variable can be reordered
  - Various types of memory barriers prevent reordering
- Forwarding is another complication
  - $\beta=x:=3; \alpha=y:=x.$  Can forward the value 3 to y, losing dependence between  $\alpha,\beta.$
  - $\bullet$  x := 3 ; y := x  $\Longrightarrow$  y := 3 ; x := 3
  - Denote  $\alpha$  with the value written in an earlier instruction forwarded to it as  $\alpha_{<\beta>}$ .
- Forwarding may continue arbitrarily and can span multiple instructions

### Reordering Semantics: Formal

- $\alpha_{<c>}$ : cumulative forwarding effects of the instructions in command c on  $\alpha$
- Ternary relation  $\gamma < c < \alpha$ : Reordering of instruction  $\alpha$  prior to command c, with cumulative forwarding effects producing  $\gamma$ .
- Definition by induction

$$\begin{split} &\alpha_{<\beta>} < \beta < \alpha \stackrel{\mathsf{def}}{=} \beta \hookleftarrow \alpha_{<\beta>} \\ &\alpha_{} < c_1; c_2 < \alpha \stackrel{\mathsf{def}}{=} \alpha_{} < c_1 < \alpha_{} \land \alpha_{} < c_2 < \alpha \end{split}$$

• Example:  $\alpha = (y := x), \beta = (x := 3), \gamma = (z := 5).$  $\alpha_{<\beta>} = (y := 3), \alpha_{<\gamma:\beta>} = (y := 3).$ 

$$y := 3 < x := 3 < y := x$$
 and  $y := 3 < z := 5$ ;  $x := 3 < y := x$ 

• Can execute an instruction which occurs later in the program if reordering and forwarding can bring it (in its new form  $\gamma$ ) to the beginning

Reorder 
$$\frac{c_2 \mapsto_{\alpha} \quad \gamma < c < \alpha}{c_1; c_2 \mapsto_{\gamma} c_1; c'_2}$$

#### Reordering Interference Freedom

- Insight: Any valid reordering will **preserve thread-local semantics**, thus may only invalidate reasoning when observed by the environment.
  - ullet Abstraction to the rescue again! Observed by environment  $\Longrightarrow \mathcal{G}$ violated, or  ${\mathcal R}$  not strong enough
- Three Levels: Instructions, Commands, Program

• Two instructions are reordering interference free: Reasoning over them in their original order is sufficient to include reordered behaviour.

$$\begin{split} \operatorname{rif}_{\mathsf{a}}(\mathcal{R},\mathcal{G},\beta,\alpha) &\stackrel{\mathsf{def}}{=} \forall \, P, \, Q, \, M. \, \, \mathcal{R}, \mathcal{G} \vdash_{\mathsf{a}} P\{ \, \beta \, \} M \, \wedge \, \, \mathcal{R}, \mathcal{G} \vdash_{\mathsf{a}} M\{ \, \alpha \, \} Q \\ &\Longrightarrow \exists M'. \, \, \mathcal{R}, \, \mathcal{G} \vdash_{\mathsf{a}} P\{ \, \alpha_{<\beta>} \, \} M' \, \wedge \, \, \mathcal{R}, \, \mathcal{G} \vdash_{\mathsf{a}} M'\{ \, \beta \, \} Q \end{split}$$

• Command c is reordering interference free from  $\alpha$  under  $\mathcal{R}, \mathcal{G}$  if the reordering of  $\alpha$  over each instruction of c is reordering interference free, including those variants produced by forwarding.

$$\operatorname{rif}_{c}(\mathcal{R}, \mathcal{G}, \beta, \alpha) \stackrel{\mathsf{def}}{=} \operatorname{rif}_{a}(\mathcal{R}, \mathcal{G}, \beta, \alpha)$$
$$\operatorname{rif}_{c}(\mathcal{R}, \mathcal{G}, c_{1}; c_{2}, \alpha) \stackrel{\mathsf{def}}{=} \operatorname{rif}_{c}(\mathcal{R}, \mathcal{G}, c_{1}, \alpha_{< c_{2}>}) \wedge \operatorname{rif}_{c}(\mathcal{R}, \mathcal{G}, c_{2}, \alpha)$$

• Program c is reordering interference free if and only if all possible **reorderings** of its instructions over the respective prefixes are reordering interference free.

$$\operatorname{rif}(\mathcal{R},\mathcal{G},c) \stackrel{\mathsf{def}}{=} \forall \alpha, r, c'. \ c \mapsto_{\alpha_{< r>}} c' \Longrightarrow \operatorname{rif}_{c}(\mathcal{R},\mathcal{G},r,\alpha) \wedge \operatorname{rif}(\mathcal{R},\mathcal{G},c')$$

- Observe: Checking  $rif(\mathcal{R}, \mathcal{G}, c)$  amounts to
  - Checking  $\operatorname{rif}_{a}(\mathcal{R},\mathcal{G},\beta,\alpha)$  for all pairs of instructions  $\beta,\alpha$  that can reorder in c
  - Including those pairs for which  $\alpha$  is a new instruction generated through forwarding

## Gameplan

Abstract Language

- Compute all pairs of reorderable instructions  $(\beta, \alpha)$ .
- Demonstrate reordering interference freedom for as many of these pairs as possible (using rif<sub>a</sub>( $\mathcal{R}, \mathcal{G}, \beta, \alpha$ )).

Multicopy Atomic Memory Models

0000000

- If rif<sub>a</sub> cannot be shown for some pairs
  - introduce memory barriers to prevent their reordering or
  - modify the verification problem such that their reordering can be considered benign
- Verify the component in isolation, using standard rely/guarantee reasoning with an assumed sequentially consistent memory model.

For a thread with *n* reorderable instructions.

n! Possible Behaviours  $\longrightarrow n(n-1)/2 \operatorname{rif}_a$  checks

Thanks for staying tuned: )

#### Section 5

Non-Multicopy Atomic Memory Models

 There is no shared state that all components agree on throughout execution, invalidating a core assumption of standard rely/guarantee reasoning.

### Write History Semantics: Representation

- Each component is associated with a unique identifier.
- Shared memory state is represented as a list of variable writes  $\langle w_1, w_2, w_3, \dots \rangle$ , with metadata to indicate which components have performed and observed particular writes.

Multicopy Atomic Memory Models

- The order of events in this write history provides an *overall order* to the system's events, with those later in the list being the most recent.
- Each  $w_i = (x \mapsto v)_{rds}^{wr}$  where
  - x is a variable
  - v is a value
  - writer $((x \mapsto v)_{rds}^{wr}) = wr$  is the writer component's identifier
  - readers $((x \mapsto v)_{rds}^{wr}) = rds$  is the set of component identifiers that have observed the write
  - $\operatorname{var}((x \mapsto v)_{rds}^{wr}) = x$

#### Write History Semantics: Manipulation

- Divide instructions into two types: global and local. Global instructions  $\alpha$  are:
  - Store  $(x := v)_i$ , Load  $[x = v]_i$ , Memory barrier fence, Skip instruction (corresponding to some internal step)
- Behaviour of these instructions is formalised as (for skip it's just id):

$$beh((x := v)_i) = \{ (h \circ h', h \circ (x \mapsto v)^i_{\{i\}} \circ h') \mid \\ \forall w \in h'. \text{ writer}(w) \neq i \land (\text{var}(w) = x \Longrightarrow i \notin \text{readers}(w)) \\ beh([x = v]_i) = \{ (h \circ (x \mapsto v)^j_r \circ h', h \circ (x \mapsto v)^j_r \circ h') \mid \}$$

 $\forall w \in h'$ .  $(\text{var}(w) = x \Longrightarrow i \notin \text{readers}(w))$ 

beh(fence<sub>i</sub>) = {  $(h, h) \mid \forall w \in h$ .  $(i \in \text{readers}(w) \Longrightarrow \forall y, y \in \text{readers}(w)$ 

Propagations of writes are modelled as environment effects and can

take place at any point during the execution. 
$$prp = \{ (h \circ (x \mapsto v)_r^j \circ h', h \circ (x \mapsto v)_{r \cup \{i\}}^j \circ h') \mid \\ i \not\in r \land \forall w \in h. (var(w) = x \Longrightarrow i \in readers(w)) \}$$

#### More Notation

- New constructor in the language: comp(i, m, c) indicating a component with
  - identifier i
  - local state m
  - command c
- Assume a local behaviour relation lbeh such that  $(m, \alpha', m') \in lbeh(\alpha)$  if executing  $\alpha$ 
  - changes the local state from m to m'
  - ullet corresponds to the global instruction lpha'

$$\operatorname{comp}(i,m,c) \mapsto_{\alpha'_{\cdot}} \operatorname{comp}(i,m',c') \iff c \mapsto_{\alpha} c' \wedge (m,\alpha',m') \in \operatorname{lbeh}(\alpha)$$

- Go from local semantics/reasoning to global semantics/reasoning using comp and lbeh.
  - Constraint: systems are constructed as the parallel composition of a series of comp commands.
  - Trivial support for local state (e.g., hardware registers).

# Meaning of Judgement

• If there is no global state, what does  $\mathcal{R}, \mathcal{G} \vdash P\{c\}Q$  (for a component i with command c) mean?

• For a set of components I, write history h, for all variables x,

 $\operatorname{view}_I(h,x) = v$  iff

$$h = h' \circ (x \mapsto v)_r^w \circ h'' \wedge I \subseteq r \wedge \forall w_i \in h''$$
.  $var(w_i) = x \Longrightarrow I \not\subseteq readers(w_i)$ 

- ullet For all executions of c
  - If
- ullet the execution operates on a write history h such that  $\mathrm{view}_i(h) \in P$
- ullet all propagations to i modify  $\mathrm{view}_i$  in accordance with  ${\mathcal R}$
- Then i will
  - modify  $view_i$  in accordance with G
  - ullet given termination, end with a write history h such that  $\mathrm{view}_i(h) \in \mathcal{Q}$
- This state mapping allows for rely/guarantee judgements over individual components to be trivially lifted from a standard memory model to their respective views of a write history.

- Parallel composition is complicated: Need to relate differing components views.
- If the execution of an instruction  $\alpha$  by some component i satisfies its guarantee specification  $G_i$  in state h.

$$\text{view}_i(h) \in \text{sat}(\alpha, \mathcal{G}_i)$$

• Then the effects of propagating  $\alpha$ 's writes to some other component j will satisfy its rely specification  $\mathcal{R}_i$  in its view,

$$\operatorname{view}_{j}(h) \in \operatorname{sat}(\alpha, \mathcal{R}_{j})$$

 Insight: It is possible to relate the views of two components by only considering the difference in their observed writes, i.e., the writes one component has observed but the other has not.

### Travelling Between Components



- Aim to demonstrate rely/guarantee compatibility when propagating an instruction  $\alpha$  from component i to component i
  - Given: component *i* executes  $\alpha$  such that  $\text{view}_i(h) \in \text{sat}(\alpha, \mathcal{G}_i)$ .
  - Step 1: Show that  $\alpha$  can be executed in the shared view, i.e.,  $\text{view}_{\{i,i\}}(h) \in \text{sat}(\alpha, \mathcal{G}_i).$
  - Step 2: Show that  $\alpha$  can be executed in component j's view, i.e.,  $view_i(h) \in sat(\alpha, \mathcal{G}_i).$

### Travelling to Shared View Through Non i, j Writes

- Prove step 1 by induction on length of  $\Delta_{i,i}(h)$ .
  - Base case trivial  $(\text{view}_{\{i,j\}}(h) = \text{view}_i(h))$ .
  - Induction step:
    - The write cannot be from j since j hasn't observed it.
    - If the write is from i, then the reordering of  $\alpha$  before it has been covered in multicopy reordering interference freedom.
    - If the write is from some  $k \neq i, j$ , then we do what follows.
- Have some relation  $\mathcal{E}$  intended to capture the possible writes i may have observed ahead of i

$$\operatorname{rif}_{nmca}(\mathcal{E}, \alpha, \mathcal{G}_i) = wp(\mathcal{E}, \operatorname{sat}(\alpha, \mathcal{G}_i)) \subseteq \operatorname{sat}(\alpha, \mathcal{G}_i)$$

• Proving this for  $\mathcal{E} = \mathcal{R}_i \cap \mathcal{R}_i \cap id_{\alpha}$  is sufficient.

### Endgame

Abstract Language

 Define a compatibility relation by universally quantifying over all writes

compat
$$(G_i, \mathcal{R}_i, \mathcal{R}_j) \stackrel{\text{def}}{=} \forall x, v.$$
  
 $wp(\mathcal{R}_i \cap \mathcal{R}_j \cap id_x, \text{sat}(x := v, G_i)) \subseteq \text{sat}(x := v, \mathcal{R}_j)$ 

 Modify the rules for parallel composition (note that we need separate relies and guarantees for each component because demonstrating compat requires pairwise checking)

$$\mathsf{Comp'} \frac{\mathcal{R}, \mathcal{G} \vdash_{c} P\{\ c\ \} Q \quad \mathrm{rif}(\mathcal{R}, \mathcal{G}, c)}{[i \mapsto \mathcal{R}], [i \mapsto \mathcal{G}] \vdash P\{\ \mathrm{comp}(i, m, c)\ \} Q}$$

$$\mathsf{Par'} \frac{\mathcal{R}_1, \mathcal{G}_1 \vdash P_1 \{ \ c_1 \ \} Q_1 \quad \mathcal{R}_2, \mathcal{G}_2 \vdash P_2 \{ \ c_2 \ \} Q_2 \quad \mathrm{disjoint}(\mathcal{R}_1, \mathcal{R}_2)}{\forall i \in \mathrm{dom}(\mathcal{R}_1). \ \forall j \in \mathrm{dom}(\mathcal{R}_2). \ \mathrm{compat}(\mathcal{G}_1(i), \mathcal{R}_1(i), \mathcal{R}_2(j))}{\forall i \in \mathrm{dom}(\mathcal{R}_2). \ \forall j \in \mathrm{dom}(\mathcal{R}_1). \ \mathrm{compat}(\mathcal{G}_2(i), \mathcal{R}_2(i), \mathcal{R}_1(j))}}$$

$$\mathcal{R}_1 \uplus \mathcal{R}_2, \mathcal{G}_1 \uplus \mathcal{G}_2 \vdash P_1 \land P_2 \{ \ c_1 \ || \ c_2 \ \} Q_1 \land Q_2$$