

# **Update on Ara**

27/10/2021

Matteo Perotti

Matheus Cavalcante

Nils Wistoff

Professor Luca Benini, Integrated Systems Laboratory, ETH Zürich



#### **Summary**

- SW optimization
  - Ideal dispatcher
  - Bug fixes
- HW optimization
  - Timing

#### Ideal dispatcher

- Performance depends on:
  - CVA6
  - Ara
  - (Scalar memory accesses)
  - (Vector memory accesses)
- What limits performance?
- Decouple the problem:
  - Make CVA6 an "ideal" dispatcher
- Now the performance depends only on Ara
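The decoupling argument above can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not a model of the real CVA6 or Ara: the function `runtime_cycles` and its parameters are hypothetical, every instruction is assumed to cost the same, and the queue between dispatcher and vector unit is assumed to be unbounded.

```python
def runtime_cycles(n_instr, issue_cycles, exec_cycles):
    """Cycles to push n_instr vector instructions through a
    dispatcher -> vector-unit pipeline (toy model, hypothetical names).

    issue_cycles: cycles the scalar core needs to dispatch one instruction
    exec_cycles:  cycles the vector unit needs to execute one instruction
    """
    issue_done = 0  # time at which the latest instruction was dispatched
    exec_done = 0   # time at which the vector unit becomes free again
    for _ in range(n_instr):
        issue_done += issue_cycles
        # Execution starts once the instruction has been dispatched
        # and the vector unit has finished the previous one.
        start = max(issue_done, exec_done)
        exec_done = start + exec_cycles
    return exec_done


# Made-up numbers, for illustration only.
slow_dispatcher = runtime_cycles(100, issue_cycles=6, exec_cycles=4)
ideal_dispatcher = runtime_cycles(100, issue_cycles=0, exec_cycles=4)
print(slow_dispatcher, ideal_dispatcher)
```

With `issue_cycles=0` the dispatcher never stalls the vector unit, so the runtime in this model is determined by `exec_cycles` alone. That is the situation the "ideal" dispatcher is meant to create.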



# **Ideal dispatcher**





Figure: fmatmul performance (matrices of size #elements × #elements)
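The plotted metric did not survive extraction. Assuming performance is reported as floating-point operations per cycle, a matrix multiplication of two n × n matrices performs n³ multiply-add pairs, i.e. 2n³ FLOPs. The helpers below are a hypothetical sketch of how such a figure could be derived from a measured cycle count; the function names and the peak throughput parameter are assumptions, not values taken from the slides.

```python
def matmul_flops(n):
    """FLOPs of an n x n by n x n matrix multiplication (n^3 multiply-adds)."""
    return 2 * n**3


def flop_per_cycle(n, cycles):
    """Achieved throughput for a run that took `cycles` clock cycles."""
    return matmul_flops(n) / cycles


def utilization(n, cycles, peak_flop_per_cycle):
    """Fraction of an assumed peak throughput that was reached."""
    return flop_per_cycle(n, cycles) / peak_flop_per_cycle


# Made-up numbers, for illustration only: a 16 x 16 problem in 1024 cycles
# on a machine assumed to peak at 16 FLOP/cycle.
print(flop_per_cycle(16, 1024), utilization(16, 1024, 16))
```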














# Ideal dispatcher - fconv3d






#### **Bug fixes**

- Sequencer (work in progress):
  - Make it independent of ready signals
  - Use counters instead
  - Critical for both frequency and IPC
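The slides do not say how the counters are used. One common counter-based alternative to sampling a downstream ready signal is credit-based flow control: the issuer keeps its own count of in-flight items and compares it against the known capacity of the consumer. The behavioral sketch below is hypothetical (the class `CreditCounter` and its methods are illustrative names) and is not claimed to be the Ara sequencer design.

```python
class CreditCounter:
    """Tracks free slots of a downstream queue with a local counter,
    so the issue decision does not depend on a ready signal."""

    def __init__(self, depth):
        self.depth = depth    # capacity of the downstream queue
        self.in_flight = 0    # items issued but not yet retired

    def can_issue(self):
        # Local comparison only; no combinational path to the consumer.
        return self.in_flight < self.depth

    def issue(self):
        if not self.can_issue():
            raise RuntimeError("no credit available")
        self.in_flight += 1

    def retire(self):
        if self.in_flight == 0:
            raise RuntimeError("nothing in flight")
        self.in_flight -= 1


credits = CreditCounter(depth=2)
credits.issue()
credits.issue()
print(credits.can_issue())  # queue is full in this model
credits.retire()
print(credits.can_issue())  # one slot has been freed
```

In hardware, the appeal of this kind of scheme is that the issue decision is taken from a registered counter rather than from a long ready path. This matches the frequency concern in the list above, while the counter depth bounds how many instructions can be in flight, which affects IPC.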