

#### An Overview of CMMC (RISC-V Track)

Yingwei Zheng Bingzhen Wang Yifan Wu Wenqian Yan





#### Team members

- 郑英炜 @dtcxzyw
  - ISCAS PLCT Lab Intern, LLVM Committer (Transform/RISC-V backend)
  - Contribution: Infrastructure/Transform
- 王炳臻 @infiWang
  - ISCAS PLCT Lab Intern, luajit rv64 port contributor
  - Contribution: RISC-V backend
- 邬一帆 @GhostFrankWu
  - CTFer, COMPASS CTF Team Leader
  - Contribution: Regression testing/Fuzzing
- 严文谦 @YanWQ-monad
  - CTFer
  - Contribution: Transform/ARM-v7 backend/CI
- Instructor: 刘烨庞
  - SE
  - CS323 Compilers



#### Overview

• CMMC (C Minus Minus Compiler)





### Highlights — Tiny 'SMT' solver

- Constraint solver
  - context-sensitive icmp: (a < b)? (a < b):  $(a >= b) \rightarrow a < b$
  - transitive closure:  $a < b \&\& b < c \rightarrow a < c$
- Integer range analysis
  - signed range [min, max]
    - x is non-negative, srem x, 16  $\rightarrow$  urem x, 16  $\rightarrow$  and x, 15
    - x in [min1, max1], y in [min2, max2], max1 < min2  $\rightarrow$  x < y is true
    - n is non-negative,  $0 \le x \le n \rightarrow x \le n$
  - known bits [known zeros, known ones]
    - (lshr x, 31) in [0, 1]
    - x is negative -> (ashr x, y) is negative



# Highlights – Template code generator

InstInfo

```
FLW:
Format: "flw $Rd:FPR[Def], $Imm:Imm12[Metadata]($Rs1:BaseLike[Use]) # $Alignment:Align[Metadata]"
Flag: [Load]
```

InstSelection

ScheduleModel

```
IntegerArithmeticLateB:
    [ROL, ROLW, ROR, RORI, RORIW, RORW, CLZ, CLZW, CTZ, CTZW, ORC_B]
IntegerArithmeticEarlyB: [CPOP, CPOPW]
IntegerArithmeticLateAB: [REV8]
```

```
RISCVInstInfoFLW() = default:
void print(std::ostream& out, const MIRInst& inst, bool printComment) const override {
   out << "flw " << ::cmmc::mir::RISCV::OperandDumper{ inst.getOperand(0) } << ", "
       << ::cmmc::mir::RISCV::OperandDumper{ inst.getOperand(1) } << "(</pre>
       << ::cmmc::mir::RISCV::OperandDumper{ inst.getOperand(2) } << ")";</pre>
   if(printComment)
       out << " # " << ::cmmc::mir::RISCV::OperandDumper{ inst.getOperand(3) };</pre>
[[nodiscard]] bool verify(const MIRInst& inst, const CodeGenContext& ctx) const override {
   return inst.checkOperandCount(4) && mir::checkISASpecificOperands(inst, ctx, 4) &&
       ::cmmc::mir::RISCV::isOperandFPR(inst.getOperand(0)) && ::cmmc::mir::RISCV::isOperandImm12(inst.getOperand(1)) &&
       ::cmmc::mir::RISCV::isOperandBaseLike(inst.getOperand(2)) && ::cmmc::mir::RISCV::isOperandAlign(inst.getOperand(3))
[[nodiscard]] uint32_t getOperandNum() const noexcept override {
[[nodiscard]] OperandFlag getOperandFlag(uint32_t idx) const noexcept override {
       case 0:
           return OperandFlagDef;
       case 1:
           return OperandFlagMetadata;
       case 2:
           return OperandFlagUse;
           return OperandFlagMetadata;
           reportUnreachable(CMMC_LOCATION());
[[nodiscard]] InstFlag getInstFlag() const noexcept override {
   return InstFlagNone | InstFlagLoad;
[[nodiscard]] std::string_view getUniqueName() const noexcept override {
   return "RISCV.FLW":
```



# Highlights – uArch-aware codegen

| Optimization                             | Related feat/comp   | Improvement |
|------------------------------------------|---------------------|-------------|
| Top-down postRA schedule/Remove hazards  | InOrder/Pipeline    | 2%          |
| Latency-aware mul/sdiv/srem by constants | Pipeline            | 1%          |
| Merge sw/lw                              | LD/ST               | 5-10%       |
| Short forward branch opt                 | Pipeline            | 5-10%       |
| Identical code folding                   | I-TLB, I-Cache      | 1%          |
| Tail duplication                         | BPU                 | 5-20%       |
| Fused address generation                 | RISC-V Zba Ext      | 5%          |
| Remove *W insts/sext.w                   | RISC-V C Ext        | 1%          |
| Pettis-Hansen block layout               | BPU, I-Cache, I-TLB | 5-20%       |
| Common base opt                          | Address generation  | 5%          |



## Highlights – Compiler testing

- Functional SysY2023
- Unit testing IntegerRangeAnalysis
- Fuzzing csmith: 1M testcases/day on a 40C machine
- Regression testing 1k+ testcases (self-written/LLVM)
- Differential testing GCC/LLVM
- Formal verification Alive2 <a href="https://alive2.llvm.org/ce/z/uwy3Fg">https://alive2.llvm.org/ce/z/uwy3Fg</a>

```
define i1 @src(i32 noundef %n, i32 noundef %x) {
                                                   define i1 @tgt(i32 noundef %n, i32 noundef %x) {
 %cmp = icmp sge i32 noundef %n, 0
  br i1 %cmp, label %then, label %else
                                                     %cmp = icmp sge i32 noundef %n, 0
                                                     br i1 %cmp, label %then, label %else
%else:
  ret i1 0
                                                 /%else:
                                                     ret i1 0
%then:
  %cmp1 = icmp sge i32 noundef %x, noundef %n
 %cmp2 = icmp slt i32 noundef %x, 0
                                                     %cmp1 = icmp ule i32 noundef %n, noundef %x
                                                     ret i1 %cmp1
  %or = or i1 %cmp1, %cmp2
  ret i1 %or
                                                   Transformation seems to be correct!
```





# Highlights – Continues perf monitoring



testcases

- https://dtcxzyw.github.io/cmmc/riscv/testcases
- https://dtcxzyw.github.io/cmmc/riscv/summary



| 03_sort1                  | 0.350654  | 0.357526  | 0.01   | 1.96%        |
|---------------------------|-----------|-----------|--------|--------------|
| transpose0                | 5.328811  | 5.413624  | 0.08   | 1.59%        |
| transpose2                | 13.136108 | 13.322474 | 0.19   | 1.42%        |
| layernorm1                | 3.173718  | 3.17622   | 0.00   | 0.08%        |
| layernorm3                | 3.179776  | 3.177559  | -0.00  | -0.07%       |
| layernorm2                | 3.187427  | 3.177461  | -0.01  | -0.31%       |
| median2                   | 17.790588 | 16.421455 | -1.37  | -7.70%       |
| median1                   | 0.002375  | 0.001807  | -0.00  | -23.92%      |
| shuffle1                  | 17.953774 | 13.099605 | -4.85  | -27.04%      |
| median0                   | 4.860975  | 3.17395   | -1.69  | -34.71%      |
| stencil0                  | 0.007983  | 0.005143  | -0.00  | -35.58%      |
| sl1                       | 8.0485    | 5.117927  | -2.93  | -36.41%      |
| brainfuck-mandelbrot-nerf | 60.535726 | 38.179598 | -22.36 | -36.93%      |
| shuffle0                  | 9.211274  | 5.778827  | -3.43  | -37.26%      |
| brainfuck-pi-nerf         | 3.55297   | 2.095907  | -1.46  | -41.01%      |
| sI2                       | 2.918234  | 1.682829  | -1.24  | -42.33%      |
| stencil1                  | 0.016707  | 0.009621  | -0.01  | -42.41%      |
| derich1                   | 0.024812  | 0.013055  | -0.01  | -47.38%      |
| sl3                       | 1.445692  | 0.74161   | -0.70  | -48.70%      |
| derich3                   | 0.024561  | 0.012576  | -0.01  | -48.80%<br>O |



### Summary

- Timeline: 2022/9/1 2023/8/21
- Codebase: hand-written ~57kLOC, generated ~49kLOC
- Source available on GitHub: <a href="https://github.com/dtcxzyw/cmmc/">https://github.com/dtcxzyw/cmmc/</a>
- Some related LLVM patches:
  - O D150286 [RISCV] Fold (select setcc, setcc, setcc) into and/or instructions (Ilvm.org)
  - <u>© D150862 [RISCV][CodeGenPrepare] Select the optimal base offset for GEPs with large offset (llvm.org)</u>
  - © D155412 [ConstraintElim] Add facts implied by MinMaxIntrinsic (Ilvm.org)
  - © D156390 [SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb (Ilvm.org)