

# **Update on Ara**

23/02/2022

Matteo Perotti
Matheus Cavalcante
Nils Wistoff
Gianmarco Ottavi

Professor Luca Benini Integrated Systems Laboratory ETH Zürich

## **Summary**

- T-Head
  - First programs
- New Backend trials
  - Halve the caches
  - Power Breakdown

- Ara projects
  - Toward RVV 1.0



- Native toolchain does not compile Vector code
- ✓ Install external T-Head V toolchain
  - Issues with shared libraries and OS
- ✓ Compile the programs statically
- Vector code traps with an illegal instruction!



The scalar code runs with no issues...




# The vector code traps!

```
0000000000010e5c <fmatmul_vec_4x4_slice_init>:
   10e5c:  1141      addi     sp,sp,-16
   10e5e:  e422      sd       s0,8(sp)
   10e60:  0800      addi     s0,sp,16
   10e62:  5e003057  vmv.v.i  v0,0
   10e66:  5e003257  vmv.v.i  v4,0
   10e6a:  5e003457  vmv.v.i  v8,0
   10e6e:  5e003657  vmv.v.i  v12,0
   10e72:  0001      nop
   10e74:  6422      ld       s0,8(sp)
   10e76:  0141      addi     sp,sp,16
   10e78:  8082      ret
```

```
Calculating a (4 x 4) x (4 x 4) matrix multiplication...
Initializing matrices...
Calculating fmatmul...
Illegal instruction
root@RVBoards:compiled on fenga9#
```
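As a cross-check, the four 32-bit words from the listing (`5e003057`, `5e003257`, `5e003457`, `5e003657`) can be decoded by hand. The sketch below assumes the standard RISC-V OP-V field layout (opcode `0x57`, `funct3 = 011` for the immediate form, `funct6 = 010111` for the move/merge group). The helper names `decode_opv` and `is_vmv_v_i` are illustrative only and do not come from any toolchain.

```python
# Minimal sketch: split an OP-V instruction word into its fields.
# Assumption: standard OP-V layout (funct6 | vm | vs2 | imm5 | funct3 | vd | opcode).

def decode_opv(word: int) -> dict:
    """Return the raw bit fields of a 32-bit OP-V instruction word."""
    return {
        "opcode": word & 0x7F,          # bits 6:0
        "vd":     (word >> 7) & 0x1F,   # bits 11:7, destination vector register
        "funct3": (word >> 12) & 0x7,   # bits 14:12, operand category
        "imm5":   (word >> 15) & 0x1F,  # bits 19:15, immediate in the OPIVI form
        "vs2":    (word >> 20) & 0x1F,  # bits 24:20
        "vm":     (word >> 25) & 0x1,   # bit 25, 1 = unmasked
        "funct6": (word >> 26) & 0x3F,  # bits 31:26
    }


def is_vmv_v_i(word: int) -> bool:
    """True if the word matches the vmv.v.i pattern (unmasked, vs2 = 0)."""
    f = decode_opv(word)
    return (
        f["opcode"] == 0x57
        and f["funct3"] == 0b011
        and f["funct6"] == 0b010111
        and f["vm"] == 1
        and f["vs2"] == 0
    )


# Instruction words copied from the disassembly on the slide.
WORDS = [0x5E003057, 0x5E003257, 0x5E003457, 0x5E003657]

for w in WORDS:
    f = decode_opv(w)
    print(f"{w:08x}: vmv.v.i v{f['vd']},{f['imm5']}  (match={is_vmv_v_i(w)})")
```

All four words match the `vmv.v.i` pattern with destinations `v0`, `v4`, `v8`, and `v12`, so the encodings look well-formed. That would be consistent with a trap caused by a disabled vector unit and not by a malformed instruction.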

Should the vector engine be enabled through the **MSTATUS CSR**?

Non-standard VS position!

Figure 16.1: Machine mode processor status register (MSTATUS)
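The value that would have to be written to enable the vector unit can be sketched as a simple bit-field update. The bit positions below are assumptions not stated on the slides: `VS` at bits 10:9 as in ratified RVV 1.0, and bits 24:23 as in the RVV 0.7.1 draft. The C906 position should be verified against Figure 16.1 of the manual. The function names are illustrative.

```python
# Minimal sketch: compute a status value with the 2-bit VS field updated.
# Assumption: VS uses the usual Off/Initial/Clean/Dirty encoding.
VS_OFF, VS_INITIAL, VS_CLEAN, VS_DIRTY = 0, 1, 2, 3

# Assumed position of the low bit of the VS field (verify against the manual).
VS_SHIFT_RVV_1_0 = 9      # VS at status[10:9]
VS_SHIFT_RVV_0_7_1 = 23   # VS at status[24:23]


def set_vs(status: int, state: int, shift: int) -> int:
    """Return `status` with the 2-bit VS field at `shift` set to `state`."""
    mask = 0b11 << shift
    return (status & ~mask) | ((state & 0b11) << shift)


def get_vs(status: int, shift: int) -> int:
    """Extract the 2-bit VS field at `shift` from `status`."""
    return (status >> shift) & 0b11


# Example: mark the vector state as Initial, starting from a cleared register.
enabled = set_vs(0, VS_INITIAL, VS_SHIFT_RVV_0_7_1)
print(hex(enabled))
```

Software that sets the field at the RVV 1.0 position would leave the bits at the other position untouched, which is one way the vector unit could remain off even when the OS believes it is enabled.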




Try to enable the Vector extension...



MSTATUS (or SSTATUS)


- Native toolchain does not compile Vector code
- ✓ Install external T-Head V toolchain
  - Issues with shared libraries and OS
- ✓ Compile the programs statically
- Vector code traps with an illegal instruction!
- Modify SSTATUS with a Kernel Module?

# **T-Head NN Library**

- T-Head NN optimized vector library for C906
- https://github.com/T-head-Semi/csi-nn2
- Try library functions on C906 and on Ara
  - Ara: some instructions are unsupported, and the library targets RVV 0.7.1
  - C906: should run natively, giving a baseline!

### **New Backend Trials**

- L1 scalar caches are huge
  - Not critical, since computation is mostly on Ara
- Halve I\$ and D\$ line widths?
  - fmatmul utilization stays within ±1%
  - Smaller chip, lower power consumption, relaxed PnR timing?



### **Further**

- Add power breakdown and more PnR data to paper
- Submit to ASAP conference
- Vectorize Embench
- Try NN lib on T-Head C906
- Yun testing