# Linux-capable RISC-V CPU for IOb-SoC

## Pedro Nuno de Melo Antunes pedronmantunes@tecnico.ulisboa.pt

## Instituto Superior Técnico, Lisboa, Portugal

#### November 2022

#### Abstract

The recent appearance of the RISC-V ISA opened many exciting possibilities for building processor-based systems without the need to license the base architecture from providers like ARM. Running applications on bare metal RISC-V machines is a good starting point, but an OS is required to ease the developers' efforts for more complex applications. Linux has been around for over three decades and is a well-polished OS. The problem is that open-source SoC platforms that run Linux and simultaneously are modular and configurable do not exist. This work aims to create an SoC capable of executing a Linux OS. The author bases the work on IOb-SoC, a modular and configurable open-source SoC platform that only runs bare-metal applications. The author achieves this thesis goal by changing the IOb-SoC CPU and adding three hardware peripherals. Additionally, the author develops software that improves the IOb-SoC platform, complements the hardware components created and allows the execution of a complete OS in the new SoC. Throughout this work, the thesis might refer to the SoC developed as IOb-SoC-Linux. The IOb-SoC-Linux uses less than 10% of the FPGA resources on the supported development boards. Moreover, the Linux OS boots in five seconds in the Kintex Ultrascale and seven seconds in the Cyclone V.

Keywords: RISC-V, Linux, Systems on-Chip (SoC), Verilog, IOb-SoC

#### 1. Introduction

Motivation and state-of-the-art... Include relevant references [1].

#### 2. Background

Place text here...

#### 2.1. Sub-section...

A generic CFD design problem can be formally described as

$$
\begin{aligned}
\text{Minimize} \quad & Y(\alpha, \mathbf{q}(\alpha)) \\
\text{w.r.t.} \quad & \alpha, \\
\text{subject to} \quad & \mathcal{R}(\alpha, \mathbf{q}(\alpha)) = 0 \\
& C(\alpha, \mathbf{q}(\alpha)) = 0,
\end{aligned}
\tag{1}
$$

where $Y$ is the cost function, $\alpha$ is the vector of design variables, and $\mathbf{q}$ is the flow solution, which is typically a function of the design variables. $C = 0$ represents additional constraints that may or may not involve the flow solution. The flow governing equations, expressed in the form $\mathcal{R} = 0$, also appear as a constraint, since the solution $\mathbf{q}$ must always obey the flow physics.

#### 2.2. Sub-section...

More text...

#### 3. Implementation

Place text here...

#### 3.1. Sub-section...

More text...

#### 3.2. Sub-section...

More text...

#### 4. Project Results

This section analyses the results obtained from the hardware and software developed in this thesis project. The author successfully executed the minimal Linux OS on real hardware using the developed System on a Chip. All results that involve communication with the FPGA board or the SoC testbench were obtained by running the developed *Console* program. The hardware components comprising the SoC differ depending on the software needs.

The objective of this thesis project was to run an Operating System on the *IOb-SoC-Linux*. Table 1 presents the time it takes to build the complete OS with the command "make build-OS". The "real" time is the wall-clock time that elapses from the moment the user executes the command until it finishes. The "user" time is the time the CPU spends executing operations in user space. The "user" time is larger than the "real" time because it adds up the time spent on each CPU core, and part of the compilation of the RootFS and the kernel runs in parallel on two cores.

```
real 4m29,570s
user 8m12,039s
sys 0m56,887s
```

Table 1: Time it takes to build the OS.
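The relation between the values in Table 1 can be checked with a few lines of Python. This is a minimal sketch and not part of the project's build flow: the helper name `to_seconds` is hypothetical, and the input strings are copied from Table 1 (the decimal comma presumably comes from the shell locale).

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` stamp such as '4m29,570s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[,.](\d+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# Values copied from Table 1.
real = to_seconds("4m29,570s")
user = to_seconds("8m12,039s")
sys_time = to_seconds("0m56,887s")

# A ratio above 1 means user-space work ran on more than one core at once.
user_to_real = user / real

print(f"real={real:.3f}s user={user:.3f}s sys={sys_time:.3f}s")
print(f"user/real = {user_to_real:.2f}")
```

The ratio comes out at roughly 1.8, which is consistent with part of the build running on two cores.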

The OS is too big to run from the FPGA internal memory. The *OpenSBI* bootloader is 90896 bytes, the device tree blob is 1669 bytes, the Linux kernel is 4426152 bytes, and the root file system is 1142733 bytes. The memory must therefore have at least 8 MB ($2^{23}$ bytes) to store all this software. However, the Linux kernel needs a larger memory in which it can store virtual memory pages and execute the different application processes. The device tree source declares that the system has 512 MB of available memory. Consequently, the author had to implement the *IOb-SoC-Linux* on the FPGA with access to external memory, since the internal memory could never be as large as 512 MB.
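The 8 MB figure follows from adding the four image sizes and rounding up to the next power of two. The sketch below reproduces that arithmetic. It assumes the images are simply stored back to back, ignoring any alignment or load-address gaps, and the dictionary keys are illustrative labels only.

```python
# Image sizes in bytes, as reported in the text.
image_sizes = {
    "OpenSBI": 90_896,
    "device tree blob": 1_669,
    "Linux kernel": 4_426_152,
    "root file system": 1_142_733,
}

total_bytes = sum(image_sizes.values())

# Smallest exponent n such that 2**n bytes can hold all images.
# For total_bytes > 1, (total_bytes - 1).bit_length() equals ceil(log2(total_bytes)).
exponent = (total_bytes - 1).bit_length()
memory_bytes = 1 << exponent

print(f"total = {total_bytes} bytes")
print(f"smallest power-of-two memory = 2**{exponent} bytes "
      f"= {memory_bytes // 2**20} MiB")
```

The total is a little over 5.6 million bytes, so a 4 MB memory ($2^{22}$ bytes) is too small and $2^{23}$ bytes is the first power of two that fits.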

Figures 1 and 2 show the start of the OS simulation with *Verilator*.


Figure 1: iob-UART16550 and iob-PLIC properties

Figure 1 shows the initialization of the Console program. Furthermore, it shows the instantiation of the iob-UART16550 and the iob-PLIC. The iob-UART16550 and the PLIC core have an initial block that prints their properties. The synthesis tools do not synthesise the initial block to real hardware, but the simulator executes it. Figure 2 shows the iob-bootloader and the start of the OpenSBI bootloader. The iob-bootloader in figure 2 does not transfer the software to the memory because the author executed the simulation considering that the software was already in the memory.

Figure 3 shows the end of the *OpenSBI* bootloader and the start of the Linux kernel. The first line printed by the Linux kernel identifies the kernel version being executed, who built it, and which toolchain was used to compile it.

```
IOb-Bootloader: connected!
IOb-Bootloader: DDR in use
IOb-Bootloader: program to run from DDR
IOb-Bootloader: Restart CPU to run user program...
```

Figure 2: IOb-SoC bootloader and OpenSBI firmware.

Figure 3: Start of the Linux kernel boot with *Verilator*.

While figure 3 shows the start of the Linux kernel, figure 4 shows the end of the Linux kernel booting process and the execution of the "init" script. The "init" script is the first program the OS executes after the Linux kernel mounts the RootFS and finishes booting. Between the output shown in figure 3 and that in figure 4, the kernel prints multiple messages that show its boot progress. The last message of the Linux kernel boot process is "Run /init as init process". After that message, the SoC executes the "init" program.

Figure 5 shows the developed minimal OS running on an FPGA. The reader can see that the author has suppressed the shell warning. The initial part of the figure shows the final stage of the Linux kernel booting. After booting, the author tested the "ls /" command, which listed the files and directories in the system's root. Lastly, the author executed the "cat init" command so that the OS printed the contents of the "init" script to the terminal.

Figure 4: End of Linux kernel boot with Verilator.

Figure 5: Linux kernel boot in the FPGA.

The time the Linux kernel takes to boot on real hardware (figure 5) is almost double the time it takes to boot in simulation (figure 4). The difference arises because the memory module used in the simulation does not have any latency, whereas on real hardware the L2 cache must wait before receiving the data burst whenever it fetches data from memory. On the Cyclone V FPGA board, the Linux kernel takes 7.01 seconds to boot. The author expected the boot to take longer, since the system clock frequency used with the Cyclone V is 50 MHz, while the Kintex Ultrascale was able to run at 100 MHz. The *OpenSBI* bootloader and the device tree blob had to be recompiled with the system frequency set to 50 MHz to run on the Cyclone V.
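The expectation of a longer boot on the Cyclone V can be made concrete with a rough scaling estimate. The sketch below assumes that the boot needs the same number of clock cycles on both boards, which the text does not verify, and it uses the rounded five-second Kintex Ultrascale boot time quoted in the abstract. The variable names are illustrative.

```python
# Figures taken from the text; the Kintex boot time is the rounded
# "five seconds" from the abstract, so the result is only indicative.
kintex_boot_s = 5.0
kintex_clk_hz = 100e6
cyclone_clk_hz = 50e6
cyclone_boot_measured_s = 7.01

# Assumption: the same number of clock cycles is needed on both boards.
boot_cycles = kintex_boot_s * kintex_clk_hz
cyclone_boot_expected_s = boot_cycles / cyclone_clk_hz

print(f"expected at 50 MHz: {cyclone_boot_expected_s:.2f} s")
print(f"measured at 50 MHz: {cyclone_boot_measured_s:.2f} s")
```

Under this assumption, halving the clock would double the boot time to about ten seconds, so the measured 7.01 seconds is indeed shorter than a pure clock-scaling estimate suggests.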

Tables 2 and 3 show the resources used by the *IOb-SoC-Linux* on the different FPGAs. The resource utilization of the *IOb-SoC-Linux* is not much higher than that of the *IOb-SoC*, so the FPGAs still have enough resources to implement hardware accelerators.

| Resource    | Used    | FPGA usage (%) |
|-------------|---------|----------------|
| ALM         | 11,227  | 10             |
| DSP         | 8       | 3              |
| FF          | 13,725  | 2              |
| BRAM blocks | 234     | 19             |
| BRAM bits   | 755,424 | 9              |

Table 2: Cyclone V GT

| Resource  | Used   | FPGA usage (%) |
|-----------|--------|----------------|
| LUTs      | 23,126 | 9.54           |
| Registers | 24,505 | 5.05           |
| DSPs      | 10     | 0.52           |
| BRAM      | 39.5   | 6.58           |

Table 3: Kintex Ultrascale

#### 5. Conclusions

Conclusions, future work and some final remarks...

### Acknowledgements

The author would like to thank his friends and professors who helped and accompanied him through his studies. Furthermore, above all, the author is thankful for his family that has been in his life since day 0, giving advice and guiding him, leading him to where he is today.

#### References

[1] J. Nocedal and S. J. Wright. *Numerical optimization*. Springer, 1999.