## FPGA-based RISC Microprocessor and Compiler (Rev. 1.00)

PRCO304 - Final Stage Computing Project

Ben Lancaster 10424877 March 16, 2018

## **Revision History**

Table 1: Document revisions.

| Date       | Version | Changes                  |
|------------|---------|--------------------------|
| 11/03/2018 | 1       | Initial section outline. |

# **Abstract**

ben

# **Table of Contents**

- List of Figures
- List of Tables
- 1 Introduction
  - 1.1 Background
    - 1.1.1 Current Implementations
  - 1.2 Project Overview
    - 1.2.1 Core Deliverables
    - 1.2.2 Extended Deliverables
  - 1.3 Legal and Ethical Considerations
    - 1.3.1 Privacy
    - 1.3.2 Fit for Purpose
    - 1.3.3 Third-party Libraries
    - 1.3.4 Generated Code
- 2 Project Management
  - 2.1 Time Management
  - 2.2 Version Control
  - 2.3 Method of Approach
  - 2.4 Requirements
  - 2.5 Resources and Dependencies
- 3 PRCO304 Processor Design
  - 3.1 High Level Design
  - 3.2 Registers
  - 3.3 Pipeline Architecture
  - 3.4 Testing and Verification
- 4 PRCO304 Compiler
  - 4.1 Introduction
    - 4.1.1 Text Grammar
  - 4.2 Architecture
  - 4.3 Implementation
    - 4.3.1 Text Parser
    - 4.3.2 AST Generation
    - 4.3.3 Optimisation
    - 4.3.4 Code Generation
    - 4.3.5 Assembling
  - 4.4 Testing and Verification
- 5 Conclusion
  - 5.1 Project Post-mortem
- 6 Appendices
  - 6.1 Appendix A. PRCO304 Core Reference Guide
  - 6.2 Appendix B. PRCO304 Compiler Reference Guide
    - 6.2.1 CLI Arguments
  - 6.3 Appendix C. Project Initiation Document

# **List of Figures**

| Figure | Caption                                        |
|--------|------------------------------------------------|
| 3.1    | test                                           |
| 4.1    | Foo                                            |
| 6.1    | UML sequence diagram for the PRCO304 compiler. |

# **List of Tables**

| Table | Caption             |
|-------|---------------------|
| 1     | Document revisions. |

# **Introduction**

Modern computing and electronics equipment, such as function generators, oscilloscopes, and spectrum analysers, uses FPGAs to implement compute-intensive logic. These FPGAs are often accompanied by a small, low-cost microprocessor that supervises the system and provides interfaces to external peripherals.

The aim of this project is to move this side-microprocessor into the FPGA itself, saving on bill-of-materials (BOM) cost, PCB space, and power, all of which contribute to higher development and product costs. These savings must be weighed against the possibility that the product needs a larger FPGA to accommodate the embedded microprocessor. The project will produce a small soft-core CPU design and a compiler. Although there is no direct client for this project, I believe it will produce an attractive product for designers of FPGA-based products who wish to employ an embedded processor solution.

## 1.1 Background

### 1.1.1 Current Implementations


## 1.2 Project Overview

This project aims to provide an efficient, cost-saving alternative for board and hardware product designers who currently use side-microprocessors. It does so by designing, implementing, and demonstrating a small, portable FPGA processor core that can be used in place of the side-microprocessor.

The processor core will implement its own processor and instruction set architecture, so a compiler and assembler will also be provided so that software can easily be executed on the processor.

### 1.2.1 Core Deliverables

These core (C) deliverables are the base requirement for the project to be released in a functional and worthwhile state.

- C1. To improve my knowledge and experience of FPGA development, processor architecture, compilers, and embedded systems engineering.
- C2. To build a working and operational soft-core processor core capable of performing simple tasks.
- C3. Implementation of the soft-core processor design on real hardware.
- C4. To provide product designers with an affordable alternative to a side-microprocessor in their FPGA-based products.
- C5. To provide a technical processor reference guide and specification for the embedded core.

### 1.2.2 Extended Deliverables

These extended (E) deliverables may not be achievable in the time frame specified in section 2.1, as they may require extra time to design and implement, require more experience or skill, or require resources that are currently unattainable.

- E1. To provide embedded products with a convenient solution for in-field updating.
- E2. To provide easy interfacing between the FPGA design and the embedded core.
- E3. GCC/LLVM/8CC compiler backend for C programming.
- E4. Wishbone interface for easier modularity and inter-module communication.
- E5. Multi-core design with Wishbone (2).
- E6. Configurable build options (register/bus widths, optimisations/pipelining, user/privileged mode to support modern operating systems).
- E7. Memory management modules to provide protected and virtual memory lookup tables.

## 1.3 Legal and Ethical Considerations

### 1.3.1 Privacy

The PRCO304 processor will be able to read and write all data passing through it and control all connected peripherals (such as UARTs, SDRAMs, and SD cards). The processor does not track or store usage behaviour, instructions and their frequency, memory contents, timing statistics, or any other usage metric.

### 1.3.2 Fit for Purpose

The PRCO304 processor is not designed to run general-purpose operating systems, such as Linux, or embedded RTOSs. All memory devices attached to the FPGA are fully accessible to the processor core and to any instructions or programs running on it. Operating systems or secure applications that store private or sensitive information are therefore not protected by modern processor features such as privilege modes and virtual memory. Because the processor lacks the components commonly required by modern operating systems, such as a memory management unit (MMU) and privilege modes, such systems should not be run on it.

The PRCO304 processor is not designed to run in high-reliability or safety-critical environments that require established safety standards, such as the UK Defence Standard 00-56 (?) and IEC 61508 (?).

The PRCO304 processor, by design, should be used as a replacement for a simple micro-controller accompanying a main processing module.

### 1.3.3 Third-party Libraries

This project uses only one external library, for the processor core's universal asynchronous receiver-transmitter (UART) module, and that library does not depend on any other libraries. This allows me to guarantee that the project rights are secure and that application behaviour is well-defined and predictable (no exploits introduced or injected from external libraries). The UART module features a large first-in-first-out (FIFO) buffer for temporary storage of incoming and outgoing messages. This FIFO is internal to the FPGA design and so is protected from external viewing or modification by probing the board on which the core is running.

The compiler sub-project does not use any external library dependencies, does not record telemetry or usage statistics, and does not require an internet connection to run.

### 1.3.4 Generated Code

The code generated by the compiler is **not guaranteed** to:

- **Produce constant time executable code for expressions**. For example, the compiler output for an *if* statement may vary depending on its condition expression, which may have been optimised out, constant-folded, or left unoptimised. This also applies to user code that aims to create reliable and accurate time delay loops: although the processor does not perform optimisations such as instruction caching or branch prediction, memory accesses and ALU operations may vary in duration, resulting in unreliable instruction times.
- **Produce code for secure environments**. The compiler will not randomise, obfuscate, or split up and scatter its output code. Output machine code will be in a predictable layout (global variables in low memory, instruction memory in middle memory, and stack memory in high memory), making the binary easy to reverse-engineer and modify.

# **Project Management**

## 2.1 Time Management

## 2.2 Version Control

Version control will be utilised to improve work-flow, reference and review code changes, and protect the project from data loss and corruption. GitHub, a git hosting provider, will be utilised to host all project files, including documentation and design files.

The repository can be found here: https://github.com/bendl/prco304.

## 2.3 Method of Approach

Development of the **core** and the **compiler** will be done in separate stages of the project (see section 2.1), with the two deliverables split into two sub-projects. Both sub-projects will employ the **Agile development process**, using sprints to split tasks into sub-tasks and scrums to discuss progress, features, and changes. This approach allows tasks to be revisited so that their implementation can be tweaked and iterated on, which will be key when incrementally adding features to both sub-projects, for example, extending the core's ALU module to support conditional branching, or adding new instructions to the core's decoder module.

## 2.4 Requirements

## 2.5 Resources and Dependencies

For the first half of the development cycle, the core can be developed and verified using **Verilator**, a Verilog simulator and test suite, and **iSim**, a VHDL and Verilog simulator.

The second half of development will require deploying and debugging on real hardware, which requires an FPGA development kit. To better emulate customer products, the development kit should feature common components such as LEDs, GPIO, a USB interface, flash-based storage and memory, and optionally an analogue audio output port. The project targets the low-to-mid range of FPGA devices, specifically the popular, affordable, yet feature-rich Spartan-6 and Artix-7 FPGAs. I gained experience with Xilinx FPGAs during my placement and so will be targeting them for this project to reduce risk and development time.

The following FPGA development kits are suitable for this project:

1. MiniSpartan6+ - Scarab Hardware - \$79 (already owned) (?). The MiniSpartan6+ features a Spartan-6 XC6SLX9 FPGA, 8 LEDs, 2 digital and analogue headers, FT2232 FTDI USB to JTAG, 64Mb SPI flash memory, 32MB SDRAM, an audio output jack, and a MicroSD socket.

2. Arty Artix-7 FPGA Development Board - Digilent - \$100 (?). The Arty development board features a larger Artix-35T FPGA with over 20x the number of logic cells and block memory compared to the LX9 in the MiniSpartan6+. The board components include 256MB DDR3 RAM, 16MBx4 SPI flash memory, USB-JTAG, 8 LEDs (4 of which are RGB), 4 switches, 4 buttons, and multiple Pmod connectors.

The greater number of IO options and larger FPGA make the Arty board better suited to emulating real customer products.

The project will require a computer or laptop to develop the core and compiler on and continuous integration systems to perform testing on the incremental builds. For the project demo, an oscilloscope (already owned) or digital logic analyser may be required to demonstrate some of the core's features.

# **PRCO304 Processor Design**

## 3.1 High Level Design

The PRCO304 processor is a modular design with separate logic blocks for the ALU, registers, RAM, and its peripherals.



Figure 3.1: test

## 3.2 Registers

## 3.3 Pipeline Architecture

## 3.4 Testing and Verification

# **PRCO304 Compiler**

## 4.1 Introduction

The PRCO304 compiler is a command-line software tool that converts a high-level text grammar (a programming language) into executable machine code for the PRCO304 processor.

The compiler is invoked with a parameter for the input code file and optional parameters defining the optimisation level, target architecture, verbosity, output file name, and include directory paths. The full command line parameter list can be found in section 6.2.1, CLI Arguments.

### 4.1.1 Text Grammar

The input to the compiler is a generic programming language similar to C.

```
def main() {
    int a = 0;
}
```

The grammar is defined below in Backus-Naur Form:

```
<word>     ::= [a-zA-Z]+[0-9]*
<string>   ::= """ <word> """
<number>   ::= [0-9]+
<top>      ::= <func_def>|<decl>|<extern>
<func_def> ::= <prot> <body>
<body>     ::= "{" <primary> "}"
<decl>     ::= <word> "=" <expr>
<control>  ::= <if>|<for>|<while>
<if>       ::= "if" "(" <expr> ")" <body>
<for>      ::= "for" "(" <expr> <expr> ")" <body>
<expr>     ::= <assign>|<binop>|<number>|<string>|"("|")"
<assign>   ::= <word> "=" <expr>
<binop>    ::= "+"|"-"|"*"|"/" <expr>
```

It should be noted that the grammar and compiler do not have any terminals for defining datatypes, such as "short" and "int". This is because only one datatype is supported by both the compiler and the processor. Supporting differently sized datatypes would add considerable complexity, for example, calculating how many 16-bit words to allocate on the stack for local parameters and accessing them through offsets, and is out of scope.
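The lexical rules in the grammar can be illustrated with a small tokeniser. This is a minimal sketch in Python rather than the compiler's own source: the `<word>` and `<number>` patterns come from the grammar above, while the token names, the punctuation set (including `;`), and the `tokenize` function are assumptions made for illustration. String literals are left out for brevity.

```python
import re

# Patterns for NUMBER and WORD follow the <number> and <word> rules in the
# grammar; the token names and the punctuation set are assumptions.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("WORD",   r"[a-zA-Z]+[0-9]*"),
    ("PUNCT",  r"[{}()=+\-*/;]"),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Split source text into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A parser for the non-terminal rules would then consume this token list one production at a time.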

## 4.2 Architecture

## 4.3 Implementation

### 4.3.1 Text Parser

### 4.3.2 AST Generation

### 4.3.3 Optimisation

The PRCO304 compiler can optionally perform simple optimisations, such as unreachable code elimination and constant folding. The optimisations can be controlled by specifying the -0n parameter to the CLI, where n is the level of optimisation.

The techniques used by the optimiser are primitive; the optimiser is not given AST information in SSA form. Because of this limitation, only basic optimisations can be identified.

Constant folding is performed by the optimiser to reduce expressions that can be identified as constant. For example, the expression in Figure 4.1 can be identified as constant and reduced at compile time.

```
int a = 1 + (2 + 3) * 4;
```

Figure 4.1: An expression suitable for constant folding.
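Constant folding of the expression in Figure 4.1 can be sketched as a recursive walk over the expression tree. The example below is written in Python for brevity and is not the compiler's implementation: the tuple-based AST encoding and the `fold` function are hypothetical. Division and 16-bit overflow handling are deliberately left out.

```python
import operator

# Hypothetical AST encoding for this sketch: integers are literals, strings
# are variable names, and (op, left, right) tuples are binary operations.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(node):
    """Return node with every all-constant subexpression replaced by its value."""
    if not isinstance(node, tuple):
        return node                      # literal or variable: nothing to fold
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int) and op in OPS:
        return OPS[op](left, right)      # both sides known at compile time
    return (op, left, right)             # keep the operation, with folded children

# 1 + (2 + 3) * 4 from Figure 4.1
expr = ("+", 1, ("*", ("+", 2, 3), 4))
print(fold(expr))                        # 21
```

An expression that mixes constants and variables is only partially reduced: `fold(("+", "a", ("*", 2, 3)))` returns `("+", "a", 6)`.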

### 4.3.4 Code Generation

The compiler's Code Generation stage converts the optimised AST into an intermediate list of struct prco\_op\_struct. It does this by iterating over each struct ast\_item in the AST and checking whether the item requires code generation. For example, a struct ast\_item with type AST\_FUNC is one that requires code generation. The item is then passed to the cg\_func\_prco function, where the code generation takes place. For this type, the stack frame header is generated first, followed by the body of the function. At the end of the function's body, the stack frame end code generation routine is run.

This code generation stage is called intermediate because the absolute addresses of JMP instructions have not yet been calculated. These addresses are calculated in the following stage, Assembling. In addition, the locations (and offsets) of functions may need to be rearranged.
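The traversal described above can be sketched as follows. This is an illustrative Python model, not the compiler's code: the dictionaries stand in for struct ast\_item, the tuples stand in for struct prco\_op\_struct, and the opcode names (`FRAME_BEGIN`, `FRAME_END`, `RET`) and the `cg_func` and `codegen` helpers are hypothetical. Jump targets are kept symbolic, which is what makes the output intermediate.

```python
# Hypothetical stand-ins for struct ast_item and struct prco_op_struct; the
# opcode names and frame layout are invented for illustration only.
AST_FUNC = "AST_FUNC"
AST_DECL = "AST_DECL"

def cg_func(item):
    """Emit frame header, body, then frame end for one function item."""
    ops = [("LABEL", item["name"]), ("FRAME_BEGIN", None)]
    ops += item["body"]                  # body ops assumed already lowered
    ops += [("FRAME_END", None), ("RET", None)]
    return ops

def codegen(ast):
    """Walk the AST and emit intermediate ops for items that need code."""
    ops = []
    for item in ast:
        if item["type"] == AST_FUNC:     # only some item types emit code
            ops += cg_func(item)
    return ops

ast = [
    {"type": AST_DECL, "name": "g"},
    {"type": AST_FUNC, "name": "main", "body": [("JMP", "helper")]},
]
ops = codegen(ast)
```

Here `ops` still contains `("JMP", "helper")` with a symbolic target; no absolute address is known at this point.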

### 4.3.5 Assembling

The final stage of the compiler is the assembling stage. This stage takes the list of struct prco\_op\_struct and outputs a list of machine code instructions. The assembler accomplishes this by calculating the offsets and addresses of functions, branching instructions, and global variables. It may also rearrange function locations so that the main function is placed first in the output.
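A common way to resolve such addresses is a two-pass scheme: the first pass assigns each function a start address, and the second pass replaces symbolic targets with those addresses. The Python sketch below illustrates the idea under stated assumptions and is not taken from the PRCO304 source: the `assemble` function, the `(opcode, operand)` tuples, and the one-word-per-instruction size are all hypothetical.

```python
def assemble(functions):
    """Resolve symbolic jump targets to absolute addresses.

    functions maps a function name to its list of (opcode, operand) pairs.
    Every op is assumed to occupy one word, which is an assumption of this
    sketch rather than a statement about the PRCO304 encoding.
    """
    # Place main first so execution starts at address 0 (sorted is stable,
    # so the remaining functions keep their original order).
    order = sorted(functions, key=lambda name: name != "main")

    # Pass 1: record the address at which each function starts.
    addresses, pc = {}, 0
    for name in order:
        addresses[name] = pc
        pc += len(functions[name])

    # Pass 2: emit ops, replacing symbolic JMP targets with addresses.
    program = []
    for name in order:
        for opcode, operand in functions[name]:
            if opcode == "JMP":
                operand = addresses[operand]
            program.append((opcode, operand))
    return program

funcs = {
    "helper": [("ADD", None), ("RET", None)],
    "main":   [("JMP", "helper"), ("HALT", None)],
}
program = assemble(funcs)
```

With this input, `main` is moved to address 0, `helper` starts at address 2, and the jump becomes `("JMP", 2)`.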

## 4.4 Testing and Verification

# **Conclusion**

## 5.1 Project Post-mortem

# **Appendices**

## 6.1 Appendix A. PRCO304 Core Reference Guide

## 6.2 Appendix B. PRCO304 Compiler Reference Guide

### 6.2.1 CLI Arguments



Figure 6.1: UML sequence diagram for the PRCO304 compiler.

## 6.3 Appendix C. Project Initiation Document