

# Far Atomic Memory Operations in gem5

Víctor Soria Pardos<sup>1</sup>, Adrià Armejach<sup>1,2</sup>, Tiago Mück<sup>3</sup>, Darío Suárez Gracia<sup>4</sup>, José A. Joao<sup>3</sup>, Alejandro Rico<sup>5</sup>, Miquel Moretó<sup>2,1</sup>

victor.soria@bsc.es https://vsoriap.github.io/

1 Barcelona Supercomputing Center, Spain

2 Universitat Politècnica de Catalunya, Spain

3 Arm, Texas, USA

4 Universidad de Zaragoza, Spain

5 AMD, Texas, USA

- Introduction
- Far AMO in AMBA 5 CHI
- gem5 updates and tuning options
- Next steps



## **Introduction - Atomic Memory Operation (AMO)**

- AMOs: enable programmers to develop shared memory parallel applications
  - Play a central role in:
    - Synchronization primitives
    - Lock-Free data structures
    - Updates to shared data

```
Atomic {
 var++;
}
```

```
ldadd x0, x1, [@var]
// x0:     increment
// x1:     stores the old value
// [@var]: address
```


#### Different Operations:

- SWP: Swap
- CAS: Compare-and-Swap
- LDADD: Fetch-and-Add
- LDEOR: Fetch-and-XOR
- LDCLR: Fetch-and-Clear (AND with the complement of the operand)
- LDSET: Fetch-and-OR
- LDUMIN: Fetch-and-Unsigned-Minimum
- LDSMAX: Fetch-and-Signed-Maximum
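The operations listed above share one shape: read the old value, compute a new value from the old value and an operand, write it back, and return the old value. The sketch below models that on a plain dictionary. It is a minimal illustration, not gem5 code: `AMO_OPS`, `amo`, `cas`, and the 64-bit width are assumptions made for the example, and LDCLR is modeled as a bit clear (AND with the complemented operand), following its Arm ISA definition.

```python
MASK64 = (1 << 64) - 1  # assumed operand width for this sketch


def to_signed(value):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


# Hypothetical table: mnemonic -> function computing new = f(old, operand).
AMO_OPS = {
    "SWP":    lambda old, operand: operand,
    "LDADD":  lambda old, operand: (old + operand) & MASK64,
    "LDEOR":  lambda old, operand: old ^ operand,
    "LDCLR":  lambda old, operand: old & ~operand & MASK64,
    "LDSET":  lambda old, operand: old | operand,
    "LDUMIN": lambda old, operand: min(old, operand),
    "LDSMAX": lambda old, operand: max(old, operand, key=to_signed),
}


def amo(memory, addr, op, operand):
    """Apply one AMO to memory[addr] and return the old value."""
    old = memory[addr]
    memory[addr] = AMO_OPS[op](old, operand & MASK64)
    return old


def cas(memory, addr, expected, new):
    """Compare-and-Swap: write `new` only if the old value equals `expected`."""
    old = memory[addr]
    if old == expected:
        memory[addr] = new & MASK64
    return old
```

In real hardware the read-modify-write is indivisible; this single-threaded model only shows what each operation computes.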

#### **Near and Far AMOs**

- Two fundamental AMO execution mechanisms exist:
  - Near AMOs: execute the operation in the private levels of cache
  - Far AMOs: execute the operation in the directory

## Introduction - gem5's CHI

- CHI (Coherent Hub Interface) is Arm's latest NoC architecture
- Targets high performance and scalability
- MESI and MOESI cache models
- Fully parameterized Ruby implementation available since gem5-21.0
  - Near AMOs are partially supported
  - Far AMOs are not supported

### **Specs that support Near and Far AMOs**

- Arm's recent AMBA 5 CHI specification supports both near and far AMOs



## **Introduction - Atomic Memory Operations in Ruby**

- Current gem5 implementation of Ruby treats AMOs as stores
- AMOs are executed in the Sequencer







### Introduction - AMBA 5 CHI Far AMOs

- Far AMOs delegate the operation and its data to the directory
  - RN (Request Node): core, L1, and L2
  - HN (Home Node): directory (optionally L3)
- A far AMO comprises 3 mandatory messages:
  - Request (AtomicStore)
  - Ack (CompDBIDResp)
  - Operand (WriteData)






## **Introduction - Atomic Memory Operations in Ruby**

- Near AMOs are executed in the L1 Controller
- We model a variable AMO latency



### **Static AMO Policies**

- For each cache state, AMOs can be executed either Near or Far
- Graviton 3 (Arm Neoverse architecture) implements the All Near and Unique Near static policies
- We include Present Near, our policy that outperforms All Near and Unique Near

In the table, N = Near and F = Far. The columns are the CHI cache-line states: UC (Unique Clean), UD (Unique Dirty), SC (Shared Clean), SD (Shared Dirty), and I (Invalid).

| Policy Name  | UC | UD | SC | SD | I |
|--------------|----|----|----|----|---|
| All Near     | N  | N  | N  | N  | N |
| Unique Near  | N  | N  | F  | F  | F |
| Present Near | N  | N  | N  | N  | F |
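The policy table amounts to a lookup from (policy, cache-line state) to a placement decision. A minimal sketch of that lookup follows; `STATIC_POLICIES` and `amo_placement` are hypothetical names for illustration and do not correspond to gem5 configuration parameters.

```python
STATES = ("UC", "UD", "SC", "SD", "I")

# One row per policy, transcribed from the table: "N" = near, "F" = far.
STATIC_POLICIES = {
    "All Near":     dict(zip(STATES, "NNNNN")),
    "Unique Near":  dict(zip(STATES, "NNFFF")),
    "Present Near": dict(zip(STATES, "NNNNF")),
}


def amo_placement(policy, state):
    """Return "N" or "F" for an AMO that finds its line in `state`."""
    return STATIC_POLICIES[policy][state]
```

Under this encoding, Present Near differs from All Near only when the line is absent (state I), and from Unique Near only for the shared states SC and SD.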




### **Next Steps**

- Accepted paper at ISCA'23:
  - DynAMO: Improving Parallelism Through Dynamic Placement of Atomic Memory Operations
  - Session 1B: CPU Microarchitecture
- Far AMOs available in gem5:
  - We are preparing a patch for the main branch
  - To appear in gem5 23.1 at the end of the year




#### **Thanks**