#### Bachelor Thesis

### Benchmark of RISC-V in BTOR2

# Jan Krister Möller

Examiner: Dr. Mathias Fleury

University of Freiburg
Faculty of Engineering
Department of Computer Science
Chair of Computer Architecture

August 20, 2025

#### Writing Period

 $24.\,06.\,2025 - 24.\,09.\,2025$ 

#### Examiner

Dr. Mathias Fleury

# Declaration

| I hereby declare that I am the sole author and o    | composer of my thesis and that no     |
|-----------------------------------------------------|---------------------------------------|
| other sources or learning aids, other than those li | sted, have been used. Furthermore,    |
| I declare that I have acknowledged the work of oth  | ners by providing detailed references |
| of said work.                                       |                                       |
| I hereby also declare that my Thesis has not been   | n prepared for another examination    |
| or assignment, either wholly or excerpts thereof.   |                                       |
|                                                     |                                       |
|                                                     |                                       |
|                                                     |                                       |
|                                                     |                                       |
| Place, Date Sig                                     | nature                                |

# Declaration on Usage of generative AI

| I hereby declare that, with the approval of my examiner, I have employed the         |
|--------------------------------------------------------------------------------------|
| generative AI tool "GitHub Copilot" during the preparation of this thesis solely for |
| spellchecking and enhancing the formality of my written expression. Furthermore,     |
| I expressly confirm that this tool was not used to generate data or as a source of   |
| factual information or content for this thesis.                                      |
|                                                                                      |
|                                                                                      |
|                                                                                      |
|                                                                                      |
| Place, Date Signature                                                                |

# Abstract

foo bar [1] [2] [3]

## Contents

| 1 | Mot  | tivation | I                                      | 1  |
|---|------|----------|----------------------------------------|----|
| 2 | RIS  | C-V      |                                        | 3  |
|   | 2.1  | Overv    | iew                                    | 3  |
|   | 2.2  | The R    | RV64I ISA                              | 4  |
|   | 2.3  | Simul    | ation of RISC-V                        | 6  |
|   |      | 2.3.1    | Saving the State of a RISC-V Processor | 6  |
| 3 | вт   | OR2      |                                        | 9  |
|   | 3.1  | Model    | Checking                               | 9  |
|   | 3.2  | The E    | BTOR2 Language                         | 9  |
|   | 3.3  | The E    | BTOR2 Witness                          | 10 |
| 4 | Trai | nsformi  | ing RISC-V to BTOR2                    | 11 |
|   | 4.1  | The C    | Concept                                | 11 |
|   | 4.2  | Encod    | ling                                   | 12 |
|   |      | 4.2.1    | Constants                              | 13 |
|   |      | 4.2.2    | State Representation                   | 14 |
|   |      | 4.2.3    | Initialization                         | 16 |
|   |      | 4.2.4    | Fetching                               | 16 |
|   |      | 4.2.5    | Computing Values                       | 16 |
|   |      | 4.2.6    | Command Detection                      | 16 |
|   |      | 4.2.7    | Next-State Logic                       | 16 |

|   |                | 4.2.8          | Constraints                                       | 16       |
|---|----------------|----------------|---------------------------------------------------|----------|
|   | 4.3            | Testin         | g for Correctness                                 | 16       |
|   |                | 4.3.1          | State Fuzzer                                      | 16       |
|   |                | 4.3.2          | Automated Logging                                 | 16       |
|   | 4.4            | Functi         | ional vs Relational Next-State Logic              | 16       |
|   |                |                |                                                   |          |
| _ | _              |                |                                                   |          |
| 5 | Ben            | chmark         | KS .                                              | 17       |
| 5 | <b>Ben</b> 5.1 |                | Add in Functional and Relational Next-State-Logic | 17<br>17 |
| 5 |                | Multiz         |                                                   |          |
| 5 | 5.1<br>5.2     | Multiz<br>Memo | Add in Functional and Relational Next-State-Logic | 17<br>17 |
| 5 | 5.1<br>5.2     | Multiz<br>Memo | Add in Functional and Relational Next-State-Logic | 17<br>17 |

# List of Figures

| 1 | RV64I encoding formats                                | 5  |
|---|-------------------------------------------------------|----|
| 2 | Construction of .state files                          | 7  |
| 3 | Constants for transforming RISC-V to BTOR2            | 13 |
| 4 | State representation for transforming RISC-V to BTOR2 | 14 |

# List of Tables

| 1 | RV64I Instruction Subset |  |  |  |  |  |  |  |  |  |  |  |  | 6 |  |
|---|--------------------------|--|--|--|--|--|--|--|--|--|--|--|--|---|--|
|   |                          |  |  |  |  |  |  |  |  |  |  |  |  |   |  |

# List of Algorithms

# 1 Motivation

This is a template for an undergraduate or master's thesis. The first sections are concerned with the template itself. If this is your first thesis, consider reading.

### 2 RISC-V

As the first foundation for my benchmarks and, consequently, this thesis, I will discuss RISC-V and its operational principles.

#### 2.1 Overview

RISC-V is an open-source instruction set architecture first published in May 2011 by A. Waterman et al. [4]. As indicated by its name, it is based on the RISC design philosophy. (TODO: Explain RISC (compare wiki)) Since 2015, the development of RISC-V has been coordinated by the RISC-V International Association, a non-profit corporation based in Switzerland since 2020 [5]. Its objectives include providing an *open* ISA that is freely available to all, a *real* ISA suitable for native hardware implementation, and an ISA divided into a *small* base integer ISA usable independently, for example in educational contexts, with optional standard extensions to support general-purpose software development [1, Chapter 1].

Currently, RISC-V comprises four base ISAs: RV32I, RV64I, RV32E, and RV64E, which can be extended with one or more of the 47 ratified extension ISAs [1, Preface].

(EXTEND: Additional content may be required here) (TODO: Mention little endian?)

For the purposes of this work, I will focus on a subset of the RV64I ISA.

#### 2.2 The RV64I ISA

RV64I is not overly complex, but its structure is essential for understanding the subsequent work presented in this thesis. Therefore, I will explain all elements relevant to my research.

RV64I features 32 64-bit registers, labeled x0-x31, where x0 is hardwired to zero across all bits. Registers x1-x31 are general-purpose and may be interpreted by various instructions as collections of booleans, two's complement signed binary integers, or unsigned integers. Additionally, there is a register called pc, which serves as the program counter and holds the address of the current instruction [1, Chapters 4.1, 2.1].

In RV64I, memory addresses are 64 bits in size. As the memory model is defined to be single-byte addressable, the address space of RV64I encompasses 2<sup>64</sup> bytes [1, Chapter 1.4].

Like nearly all standard ISAs of RISC-V, RV64I employs a standard instruction encoding length of 32 bits, or one *word*. Only the compressed extension C introduces instructions with a length of 16 bits [1, Chapter 1.5], which is not relevant for this discussion. All RV64I instructions are encoded in one of the six formats illustrated in Figure 1.

The design of these formats results in the following features:

- Due to RISC-V's little-endian nature, the *opcode*, which encodes the general instruction, is always read first. Further specification of the instruction via funct3 and funct7 is consistently located at the same positions.
- If utilized by the instruction, the destination register rd and the source registers rs1 and rs2 are always found in the same locations, simplifying decoding.

| 31    |           | 25 24      |        | 20 :  | 19 |      | 15   | 14 | 12    | 11 |     |       | 7    | 6 |        | 0 | ,      |
|-------|-----------|------------|--------|-------|----|------|------|----|-------|----|-----|-------|------|---|--------|---|--------|
| L     | funct7    |            | rs2    |       | Ċ  | rs1  |      | fu | inct3 |    |     | rd    |      |   | opcode |   | R-Type |
| 31    |           |            |        | 20 :  | 19 |      | 15   | 14 | 12    | 11 |     |       | 7    | 6 |        | 0 |        |
| L     | imm       | [11:0]     |        |       | Ċ  | rs1  |      | fu | inct3 |    |     | rd    |      |   | opcode |   | I-Type |
| 31    |           | 25 24      |        | 20 :  | 19 |      | 15   | 14 | 12    | 11 |     |       | 7    | 6 |        | 0 | -      |
| L     | imm[11:5] |            | rs2    |       | Ċ  | rs1  |      | fu | inct3 |    | imi | m[4:0 | ]    |   | opcode |   | S-Type |
| 31 30 | )         | 25 24      |        | 20    | 19 |      | 15   | 14 | 12    | 11 |     | 8     | 7    | 6 |        | 0 | _      |
| [12]  | imm[10:5] | <u>. L</u> | rs2    |       |    | rs1  |      | fı | ınct3 | i  | mm  | 4:1]  | [11] |   | opcode |   | В-Туре |
| 31    |           |            |        |       |    |      |      |    | 12    | 11 |     |       | 7    | 6 |        | 0 | ,      |
| L     |           |            | imm[31 | L:12] | Ċ  |      |      |    |       |    |     | rd    |      |   | opcode |   | U-Type |
| 31 30 | )         |            | 2      | 1 20  | 19 |      |      |    | 12    | 11 |     |       | 7    | 6 |        | 0 | -      |
| [20]  | imm       | [10:1]     |        | 111   | Ċ  | imm[ | 19:1 | 2] |       |    |     | rd    |      |   | opcode |   | J-Type |

Figure 1: RV64I encoding formats, used in [1, Chapter 2.3]

• The highest bit of the immediate value *imm* is always bit 31, making it straightforward to sign-extend the immediate value.

Note that each immediate subfield is labeled with its bit position within the immediate value. Immediate values are always sign-extended to 31 bits, and in the case of U-, B-, and J-type formats, the missing lower bits are filled with zeros.

The instructions relevant to my work are listed in Table 1

I have divided the instructions in Table 1 into nine groups based on their operations. LUI and AUIPC move a high immediate into rd; JA\* instructions are unconditional jumps, and B\* instructions are conditional jumps. L\* instructions load sign-extended values from memory, either as Byte, Halfword, Word, or Doubleword lengths. Conversely, S\* instructions write values of the specified length to memory. (TODO: arithmetic) Note that the suffix U denotes operations where values are processed as unsigned.

I left out FENCE, ECALL and EBREAK instructions as without I/O interaction or an environment like an OS or a debugger, these are not needed.

| INSTR | TYPE | INSTR | TYPE | INSTR | TYPE | INSTR | TYPE |
|-------|------|-------|------|-------|------|-------|------|
| LUI   | U    | LW    | Ι    | XORI  | I    | SLT   | Ι    |
| AUIPC | U    | LD    | I    | ORI   | I    | SLTU  | I    |
| JAL   | J    | LBU   | I    | ANDI  | I    | XOR   | I    |
| JALR  | I    | LHU   | I    | SLLI  | I    | OR    | I    |
| BEQ   | В    | LWU   | I    | SRLI  | I    | AND   | I    |
| BNE   | В    | SB    | S    | SRAI  | I    | SLL   | I    |
| BLT   | В    | SH    | S    | ADDIW | Ι    | SRL   | I    |
| BGE   | В    | SW    | S    | SLLIW | I    | SRA   | I    |
| BLTU  | В    | SD    | S    | SRLIW | I    | ADDW  | Ι    |
| BGEU  | В    | ADDI  | Ι    | SRAIW | I    | SLLW  | I    |
| LB    | I    | SLTI  | I    | ADD   | I    | SRLW  | I    |
| LH    | I    | SLTIU | I    | SUB   | I    | SRAW  | I    |
|       |      |       |      |       |      |       |      |

Table 1: Subset of RV64I instructions (TODO: Maybe rework, not happy yet)

#### 2.3 Simulation of RISC-V

(TODO: This may be better placed in Chapter 4, but the state file is relevant here.)

#### 2.3.1 Saving the State of a RISC-V Processor

To preserve the current state of a RISC-V processor, both the registers and memory must be stored. For this purpose, I have devised the format shown in Figure 2. The minimal file consists only of the two designators "REGISTERS:" and "MEMORY:", the current pc, and one empty line.

```
1 REGISTERS:
```

- 2 PC: current pc in hex
- 3 x(0-31): value of register in hex

4

- 5 MEMORY:
- 6 (address in hex): byte, halfword, word or doubleword in hex

Figure 2: Construction of .state files

3 BTOR2

The second foundation of my benchmarks is BTOR2, a word-level model checking

format published by A. Niemetz et al. [2].

3.1 Model Checking

(TODO: Write something about model checking...)

3.2 The BTOR2 Language

Generally in BTOR2, every line represents either a sort or a node, where normally the

line number acts as an identifier. A sort behaves similar to a type as with it, either

the length of a bitvector or the size of an array of bitvectors is defined. Nodes on the

other hand represent a value of a defined sort and come as constants, operations or

constraints. These values can later on be referenced by the node identifier, so the

line number. The syntax of BTOR2 can be found at [2, figure 1] and corresponding

operators in [2, table 1]

Key features of BTOR2 include its ability to operate sequentially, which makes the

implementation of a RISC-V structure highly convenient. The main feature is the

state operator, which defines a node that is sequentially updated. With an init

node, this state can be assigned an initial value, and with next, the sequentially next

9

state can be defined. Finally, constraints can be used to specify endpoints for a model. These endpoints may indicate that something unintended has occurred or that the intended information has been found. In either case, the resulting model is provided as a witness.

#### 3.3 The BTOR2 Witness

After receiving a witness, it must be interpreted. On the second line of a witness, the constraint that was triggered is specified. Subsequently, for each sequential iteration, the witness first presents—marked with #x, where x is the iteration number—a representation of all states in the current iteration. Second, marked with @x, all inputs for the iteration are listed.

(TODO: Maybe a bit more, its a bit bare bones)

### 4 Transforming RISC-V to BTOR2

This chapter addresses the main problem of the thesis: transforming RISC-V code into the BTOR2 format for benchmarking purposes. My primary reference for this endeavor is F. Schrögendorfer's master's thesis, "Bounded Model Checking in Lockless Programs"[3], in which he describes, among other topics, an encoding concept for a minimal machine in a multiprocessor context [3, Chapter 2] and two approaches to next-state logic: a functional [3, Chapter 6] and a relational [3, Chapter 7] approach. I will focus on the relational approach; a discussion of both approaches can be found in Section 4.4.

#### 4.1 The Concept

To successfully execute a RISC-V instruction, three fundamental steps must occur in sequence:

- Fetch the current instruction from memory
- Identify the instruction
- Execute the instruction

Due to the fixed instruction length of RISC-V, as mentioned in Section 2.2, fetching the current instruction is straightforward. Ultimately, we want a node that retrieves a word from memory at the location specified by pc.

For basic identification, the opcode must be extracted and checked. Depending on the opcode, further distinctions between instructions require extracting and checking funct3 and, if necessary, funct7. Ultimately, we want a node for each instruction, which holds a boolean value indicating whether this instruction was fetched.

To execute the instruction, we need to extract the values of the immediate imm and, if used, the registers rs1 and rs2. All instructions only modify rd, pc, or memory. Therefore, the next-state logic can be generalized for these three cases.

Memory is only modified when a store instruction is identified. As all store instructions share the same type, computing the memory address is consistent across them. The final step is overwriting the memory at this address.

For the pc, except for jump commands, it always increments to point to the next instruction. The two unconditional jumps, JAL and JALR, must be handled separately. For branch instructions, after determining whether the relevant condition for the instruction holds, we can generalize, as all branch instructions execute the same operation from this point onward.

With rd, generalization across instructions is not feasible. However, we can generalize across all possible registers by adding a check in each register's update function to determine whether the register in question is rd.

### 4.2 Encoding

For better visualisation in the BTOR2 code I will mark all sort-ids in grey, all node-ids in red and all non-id numbers blue. As described in the BTOR2 syntax [2, Figure 1], each line can get an accompanying symbol. Sadly those cant be used as an alias to the line numbers, but for increased clarity, in the following figures I will use them as such aliases.

#### 4.2.1 Constants

First off, I added the sorts needed and some general purpose useful constants into the BTOR2 model as seen in Figure 3.

```
1
    sort
             bitvec
                      1
                            Bool
 2
                      16
                            AS
    sort
             bitvec
 3
                      8
                            В
             bitvec
    sort
 4
             bitvec
                      16
                            Н
    sort
 5
    sort
             bitvec
                      32
 6
    sort
             bitvec
                      64
                            D
 7
                      2 3
    sort
             array
                            Mem
8
             6
                            empty_reg
    zero
9
    constd
             5
                      31
                            register\_bitmask
10
    constd
            5
                      7
                            shift\_rd
11
    constd
             5
                      15
                            shift_rs1
12
    constd
             5
                      20
                            shift_rs2
                      12
                            shift\_funct3
13
    constd
             5
14
            5
                      25
                            shift\_funct7
    constd
15
             5
                            bit_picker
    one
16
             1
                            true
    one
17
    zero
                            false
```

Figure 3: Constants for encoding

Of note is the Representation of the memory as an array of addressable memory cells of each 1byte. Obviously, the set address space of 16bit is magnitudes away of the expected address space of 64bit, but representing a 64bit addressable memory with its resulting  $2^{64}B \approx 18Exabyte$  is not implementable. Therefore, as I needed a feasible amount of memory space, I artificially chose a 16bit address space as a soft minimum. With 65kB and therefore programs with possibly > 10000 instructions I deemed this memory sufficient for most use cases. Despite this, the encoding is implemented in such a way that the address space can be altered with. (TODO: benchmark auswirkungen von memory size)

#### 4.2.2 State Representation

The next logical step is defining a representation of a RISC-V state. Tis is straightforward as shown in Figure 4

```
n + 0
         state
                 D
                        x0
                                      n + 18
                                               state
                                                        D
                                                                x18
 n + 1
                 D
                                                        \mathbb{D}
         state
                        x1
                                      n + 19
                                               state
                                                                x19
 n + 2
         state
                 D
                        x2
                                      n + 20
                                               state
                                                        D
                                                                x20
 n + 3
                                                        D
         state
                 D
                        x3
                                      n + 21
                                               state
                                                                x21
 n + 4
         state
                 D
                                      n + 22
                                               state
                                                        \mathbb{D}
                                                                x22
                        x_4
 n + 5
         state
                 D
                        x5
                                      n + 23
                                               state
                                                        D
                                                                x23
 n + 6
         state
                        x6
                                      n + 24
                                               state
                                                                x24
 n + 7
         state
                 D
                        x7
                                      n + 25
                                               state
                                                        \mathbb{D}
                                                                x25
 n + 8
                 D
                                      n + 26
                                                        D
         state
                        x8
                                               state
                                                                x26
 n + 9
                        x9
                                      n + 27
                                                                x27
         state
                 D
                                               state
                                                        D
n + 10
         state
                 D
                        x10
                                      n + 28
                                               state
                                                        D
                                                                x28
n + 11
         state
                 D
                        x11
                                      n + 29
                                               state
                                                        D
                                                                x29
n + 12
                                      n + 30
                                                        D
         state
                        x12
                                               state
                                                                x30
n + 13
         state
                 D
                        x13
                                      n + 31
                                               state
                                                        D
                                                                x31
n + 14
         state
                 D
                        x14
                                      n + 32
                                               state
                                                        AS
                                                                pc
n + 15
         state
                 D
                        x15
n + 16
         state
                 D
                        x16
                                       n + i
                                               state
                                                        Mem
                                                                memory
n + 17
         state
                 D
                        x17
```

Figure 4: State representation for encoding

| 4.2.4 Fetching                                |
|-----------------------------------------------|
| 4.2.5 Computing Values                        |
| Opcode                                        |
| funct3 & funct7                               |
| Registers                                     |
| Immediate                                     |
| 4.2.6 Command Detection                       |
| 4.2.7 Next-State Logic                        |
| rd                                            |
| pc                                            |
| Memory                                        |
| 4.2.8 Constraints                             |
| 4.3 Testing for Correctness                   |
| 4.3.1 State Fuzzer                            |
| 4.3.2 Automated Logging                       |
| 4.4 Functional vs Relational Next-State Logic |

4.2.3 Initialization

### 5 Benchmarks

- 5.1 MultiAdd in Functional and Relational Next-State-Logic
- 5.2 Memory Operations
- 5.3 Results

### Bibliography

- [1] The RISC-V Instruction Set Manual Volume I: Unprivileged ISA, 2025, version 20250508. [Online]. Available: https://lf-riscv.atlassian.net/wiki/spaces/HOME/pages/16154769/RISC-V+Technical+Specifications
- [2] A. Niemetz, M. Preiner, C. Wolf, and A. Biere, "Btor2, BtorMC and Boolector 3.0," in *Computer Aided Verification*, H. Chockler and G. Weissenbacher, Eds. Cham: Springer International Publishing, 2018, pp. 587–595.
- [3] F. Schrögendorfer, "Bounded Model Checking of Lockless Programs," Master's thesis, Johannes Kepler University Linz, August 2021. [Online]. Available: https://epub.jku.at/obvulihs/download/pdf/6579523
- [4] A. Waterman, Y. Lee, D. A. Patterson, and K. Asanović, "The risc-v instruction set manual, volume i: Base user-level isa," UC Berkeley, Tech. Rep. UCB/EECS-2011-62, May 2011. [Online]. Available: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-62.html
- [5] "History of RISC-V," https://riscv.org/about/, accessed: 15.08.2025.