# NUCLEI

# **Copyright Notice**

Copyright © 2018-2021 Nuclei System Technology. All rights reserved.

Nuclei™ is a trademark owned by Nuclei System Technology. All other trademarks used herein are the property of their respective owners.

The product described herein is subject to continuous development and improvement; information herein is given by Nuclei in good faith but without warranties.

This document is intended only to assist the reader in the use of the product. Nuclei System Technology shall not be liable for any loss or damage arising from the use of any information in this document, or any incorrect use of the product.

# **Contact Information**

If you have any problems with the information contained herein, or any suggestions, please contact Nuclei System Technology by email at support@nucleisys.com, or visit the "Nuclei User Center" website (http://user.nucleisys.com) for support or online discussion.



# **Revision History**

| Rev   | Revision Date | Revised Section | Revised Content                                       |
|-------|---------------|-----------------|-------------------------------------------------------|
| 1.0.0 | 2020/4/1      | N/A             | 1. First version as the full English                  |
| 1.1.0 | 2020/7/20     | 5.5             | 1. Add JTAG_VPI Description                           |
| 1.2.0 | 2020/8/28     | 1, 5            | 1. Update the RTL generation flow                     |
| 1.3.0 | 2020/12/21    | 3, 8            | 1. Add hbirdkit, mcu200t, ddr200t FPGA BOARD support  |
| 1.4.0 | 2021/8/2      | 3, 8            | 1. Add KU060 FPGA BOARD support                       |

# **Table of Contents**

| Section | Title                                          |
|---------|------------------------------------------------|
|         | Copyright Notice                               |
|         | Contact Information                            |
|         | Revision History                               |
|         | Table of Contents                              |
|         | List of Tables                                 |
|         | List of Figures                                |
| 1.      | Introduction of Release Package                |
| 1.1.    | Release Package                                |
| 1.2.    | Files in Package                               |
| 1.3.    | Naming Rule of Core                            |
| 1.4.    | Module Hierarchy of Core                       |
| 2.      | Top Level Integration                          |
| 2.1.    | Clocks                                         |
| 2.2.    | Interfaces                                     |
| 2.3.    | Memory Map                                     |
| 3.      | SoC, FPGA, SDK and IDE for Evaluation          |
| 3.1.    | Nuclei Evaluation SoC (Hummingbird SoC)        |
| 3.2.    | FPGA Evaluation Board and JTAG Debugger        |
| 3.3.    | Software Development Kit (SDK)                 |
| 3.4.    | Integrated Development Environment (IDE)       |
| 4.      | Configure to Generate RTL                      |
| 4.1.    | Use nuclei\_gen Tool to Generate RTL Codes     |
| 4.2.    | Check and Compile the Verilog RTL              |
| 5.      | Simulation with Simple Assembly Testcase       |
| 5.1.    | Overview of Self-Check Testcase                |
| 5.2.    | Testbench to Initialize Self-Check Testcase    |
| 5.3.    | Introduction of Testbench                      |
| 5.4.    | Steps to Run Simulation                        |
| 5.5.    | Introduction of JTAG\_VPI                      |
| 6.      | Simulation with Comprehensive C Program        |
| 7.      | Logic Synthesis                                |
| 7.1.    | Logic Synthesis for Verilog RTL                |
| 7.2.    | Notes for Attentions                           |
| 8.      | FPGA Prototyping                               |
| 8.1.    | Files in FPGA Project                          |
| 8.2.    | Generate Bitstream (MCS Format)                |
| 8.3.    | Program Bitstream (MCS Format) into FPGA       |



# **List of Tables**

- Table 1-1 Release Packages



# **List of Figures**

- Figure 1-1 Module Hierarchy of N607 Core
- Figure 3-2 Nuclei Evaluation SoC (Hummingbird SoC)
- Figure 4-3 The user interface of core\_gen tool (picture from N203 IPlib just as an example)
- Figure 4-4 PMP Configuration sub-menu (picture from N203 IPlib just as an example)
- Figure 4-5 PMP Entry Number Menu (picture from N203 IPlib just as an example)
- Figure 4-6 ECLIC Base Address Configuration (picture from N203 IPlib just as an example)
- Figure 4-7 Input the ECLIC Base Address Value (picture from N203 IPlib just as an example)
- Figure 4-8 Input the ECLIC IRQ Number Value (picture from N203 IPlib just as an example)
- Figure 4-9 The Configuration Error (picture from N203 IPlib just as an example)
- Figure 5-10 The code segment of add.S test
- Figure 5-11 The content of rv32ui-p-addi.dump file
- Figure 5-12 The content of rv32ui-p-addi.verilog file
- Figure 5-13 Print the PASS or FAIL in Testbench
- Figure 5-14 JTAG VPI Connection Diagram

# 1. Introduction of Release Package

# 1.1. Release Package

Each Nuclei processor is released as a package. All Nuclei processors (N/NX/UX-200/300/600/900 series) follow the same flow as the n600 flow described in this integration guide.

Taking n600 as the example, the release package is shown in Table 1-1.

**Table 1-1 Release Packages** 

| Package Name        | Description                                                                                                                             |
|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
| n600_rls_pkg.tar.gz | Including the Verilog RTL source codes, Core generation tool, Evaluation SoC, Simulation Environment, Logic Synthesis and FPGA project. |

The release package of the N600 Series Core can be licensed from Nuclei. After obtaining the release package, the user can decompress it with the following command.

```
tar -xzvf n600_rls_pkg.tar.gz
```

# 1.2. Files in Package

The files in the package are introduced as below.

```
n600_rls_pkg
    |----env.csh        // csh environment script
    |----env.sh         // bash environment script
    |----libc.so.6      // The libc dependency file
    |----n600.iplib     // The IP Library for Core RTL generation
    |----private.pem    // Private Key for nuclei_gen (need to contact Nuclei)
    |----nuclei_gen     // Core RTL configure and generation tool
                        // Please refer to Chapter 4 for more details

    // After using 'nuclei_gen' to configure and generate the Core RTL, a new
    // directory 'n600' will be generated. If 'nuclei_gen' is used again to
    // reconfigure and generate a new version of the Core RTL, the previous
    // directory 'n600' is moved to 'n600.bak' and 'n600' is updated to the
    // newly generated one.
    |----n600
        |----design         // Directory for RTL
            |----core       // Directory for Core
            |----soc        // Directory for bus-fab, subsystem,
                            // memory and peripherals in SoC
        |----riscv-tests    // Directory for testcases
        |----tb             // Directory for Verilog TestBench
        |----vsim           // Directory for Simulation
                            // Please see more details from Chapter 5.
            |----bin        // Directory for functional scripts
            |----Makefile   // Makefile for simulation
            |----run        // Directory to run
        |----fpga           // Directory for FPGA project
                            // Please see more details from Chapter 8.
        |----syn            // Directory for Synthesis project
                            // Please see more details from Chapter 7.
```

#### Note:

- The above "n600\_" is just a general prefix. A specific core uses its own prefix; for example, N607 uses the prefix "n607\_".

# 1.3. Naming Rule of Core

The source code of each N600 Series Core uses its own prefix for files and modules. For example, for the N607 Core the files and modules have the prefix "n607\_". The same naming rule applies to other Cores such as N605, N608, etc.
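The naming rule can be illustrated with a small sketch. The helper names `core_prefix` and `module_name` are hypothetical and are not part of the release package; the sketch only assumes that the prefix is the lower-cased core name followed by an underscore, as in the N607 example above.

```python
def core_prefix(core_name: str) -> str:
    """Return the file/module prefix for a core, e.g. "N607" -> "n607_"."""
    return core_name.strip().lower() + "_"


def module_name(core_name: str, suffix: str) -> str:
    """Build a module or file name by prepending the core prefix."""
    return core_prefix(core_name) + suffix


if __name__ == "__main__":
    # Top module of the N607 Core as named in Section 1.4.
    print(module_name("N607", "core_wrapper"))
```

Such a helper may be handy in integration scripts that need to address the same module across different cores.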

# 1.4. Module Hierarchy of Core

Taking N607 as the example, as depicted in Figure 1-1, the key points are:

- n607\_core\_wrapper is the top module of the Core, which includes several key sub-modules:
  - n607\_core: The Core part.
  - n607\_rst\_ctrl: The module that synchronizes the external asynchronous reset signal into a synchronized reset in the "asynchronously assert and synchronously de-assert" style.
  - n607\_dbg\_top: The module to handle the debug functionalities.
- n607\_ucore is under the Core hierarchy and is the main part of the Core.
- Besides n607\_ucore, there are several other sub-modules:
  - n607\_clic\_top: The private interrupt controller.
  - n607\_tmr\_top: The private timer unit.
  - n607\_clk\_ctrl: The clock control module.



Figure 1-1 Module Hierarchy of N607 Core

# 2. Top Level Integration

# 2.1. Clocks

Clocks to the Core are the baseline of the top level integration.

For the details of the N600 Series Cores' clocks, please refer to Section "Clock Domains" of the document <Nuclei\_N600\_Databook.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).

# 2.2. Interfaces

The interfaces of the Core need to be carefully checked during the top level integration.

For the details of the N600 Series Cores' interfaces, please refer to Chapter "Core Interfaces" of the document <Nuclei\_N600\_Databook.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).

# 2.3. Memory Map

The N600 Series Core has several interfaces and private peripherals. Most of their address spaces are configurable, so the SoC integrator can define the memory map according to the SoC requirements.

For the details of the N600 Series Cores' address spaces, please refer to Section "Address Spaces of Interfaces and Private Peripherals" of the document <Nuclei\_N600\_Databook.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).

# 3. SoC, FPGA, SDK and IDE for Evaluation

# 3.1. Nuclei Evaluation SoC (Hummingbird SoC)

To make it easy for users to evaluate the Nuclei Processor Core, a prototype SoC (called Hummingbird SoC) is provided for evaluation purposes. As depicted in Figure 3-2, this prototype SoC includes:

- Processor Core, it can be Nuclei N class, NX class or UX class Processor Core.
- On-Chip SRAMs for instruction and data.
- The SoC buses.
- The basic peripherals, such as UART, GPIO, SPI, I2C, etc.

With this prototype SoC, the user can run simulations, map it onto the FPGA board, and run real embedded application examples.

For the details of the Nuclei Evaluation SoC (Hummingbird SoC), please refer to the document <Nuclei\_Eval\_SoC\_Intro.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).



Figure 3-2 Nuclei Evaluation SoC (Hummingbird SoC)

# 3.2. FPGA Evaluation Board and JTAG Debugger

Nuclei has customized four types of FPGA evaluation boards: the Hummingbird Evaluation Kit (hbirdkit for short, 100T), the MCU200T Evaluation Kit, the DDR200T Evaluation Kit and the KU060 Evaluation Kit (please go to the Nuclei website for details of these boards: https://www.nucleisys.com/developboard.php).

The FPGA boards can be used as the SoC prototype board directly:

- If the FPGA has been pre-programmed with the "Nuclei Evaluation SoC", the board can work as an SoC prototype directly. Since the buttons and extension ports on the board are named in line with the SoC GPIO pin names, embedded software engineers can use the board directly without any FPGA hardware knowledge.
- For how to generate the FPGA bitstream (MCS) with the pre-built FPGA project, please refer to Chapter 8.

Nuclei has also customized a debugger hardware (called Hummingbird Debugger Kit), which can be used to debug the RISC-V core in an FPGA prototype or in a real chip.

For a detailed introduction of the "Hummingbird Evaluation Kit" and the "Hummingbird Debugger Kit", please refer to the document <Nuclei\_FPGA\_DebugKit\_Intro.pdf>, which can be downloaded from the "Development Boards" page of the Nuclei website (http://www.nucleisys.com/developboard.php).

# 3.3. Software Development Kit (SDK)

Nuclei has created the "Nuclei Software Development Kit (Nuclei-SDK)", an open software platform that facilitates software development for systems based on Nuclei Processor Cores. For more details about Nuclei-SDK, please see its online documentation at http://doc.nucleisys.com/nuclei\_sdk.

Based on the "Nuclei Evaluation SoC" and the demo software projects from Nuclei-SDK, the user can quickly become familiar with software development for Nuclei Processor Cores.

# 3.4. Integrated Development Environment (IDE)

SES (Segger Embedded Studio) is a professional IDE (Integrated Development Environment) that supports the standard GCC toolchain and provides debugging with the Segger J-Link. It also supports debugging with the Hummingbird Debugger Kit.

Nuclei processor cores are fully supported by Embedded Studio and the J-Link.

For a quick-start introduction of SES for Nuclei Processor Cores, please refer to the document <Nuclei\_SES\_IDE\_QuickStart.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).

# 4. Configure to Generate RTL

# 4.1. Use nuclei\_gen Tool to Generate RTL Codes

Since the Nuclei N600 Series Core is fully configurable, Nuclei developed a tool called *nuclei\_gen*. With it, the user can configure the Core on site according to their own requirements and then generate the RTL code.

Under the n600\_rls\_pkg directory, there are files as below:

- nuclei\_gen: The Core RTL configuration and generation tool.
- private.pem: The private key to use nuclei\_gen; contact Nuclei to get this.
- n600.iplib: The IP library for Core RTL generation.
- env.sh: bash environment script.
- env.csh: csh environment script.
- libc.so.6: The libc dependency file.

#### Note:

- Don't change the files "private.pem" and "libc.so.6", otherwise there might be errors when generating the RTL code.

Before starting the nuclei\_gen tool, several environment variables need to be set by sourcing the environment script:

- bash environment: source env.sh
- csh environment: source env.csh

The above script will set the following environment variables:

- PROJ\_SRC\_ROOT: The directory of n600\_rls\_pkg.
- PROJ\_NAME: The Core's name.
- PROJ\_GEN\_ROOT: RTL source code directory, by default it is n600\_rls\_pkg/n600. To generate the RTL code into another directory, the user can change this variable.
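As a quick sanity check before launching the tool, the three variables listed above can be verified from a script. This is only a sketch: the function name `missing_env_vars` is hypothetical, and it assumes the environment script exports the variables so that child processes can see them.

```python
import os

# Variable names taken from the list above; the check itself is illustrative.
REQUIRED_VARS = ("PROJ_SRC_ROOT", "PROJ_NAME", "PROJ_GEN_ROOT")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set (source env.sh or env.csh first):", ", ".join(missing))
    else:
        print("Environment variables are set; ./nuclei_gen can be started.")
```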

After the environment is set correctly, the user can execute "./nuclei\_gen" to launch the nuclei\_gen tool. The pop-up window is shown in Figure 4-3. The configurable options shown in the window are also explained in the document <Nuclei\_N600\_Databook.pdf>, which is available from the "Nuclei User Center" website (http://user.nucleisys.com).

```
ISA --->
Privilege --->
Exception --->
E-extension --->
PMP --->
Mulitiply --->
Regfile Reset --->
Debug --->
FIO Interface --->
ECLIC --->
Area Reduction --->
Performance Boost --->
Timing Boost --->
NICE Extension --->
```

Figure 4-3 The user interface of core\_gen tool (picture from N203 IPlib just as an example)

In the above figure, the special string after each option is explained as below:

- "--->" indicates that there is a sub-menu for this option. The user can press the "Space" or "Enter" key to enter the sub-menu.
- Inside a sub-menu, the user can press the "<" key to return to the upper menu.

For example, if the user enters the "PMP" sub-menu, it is shown as in Figure 4-4.

```
(Top) > PMP

N203 Core Configuration

*** PMP can be configured only when N203_CFG_HAS_UMODE is configured ***

[*] N203_CFG_HAS_PMP

Pmp Entry Number (PMP_ENTRY_NUM_IS_16) --->
```

Figure 4-4 PMP Configuration sub-menu (picture from N203 IPlib just as an example)

In the above figure, the special characters shown with the options are explained as below:

- [\*] indicates that this option has been chosen by the user. Pressing the "Space" key deselects this option.
- [ ] indicates that this option has not been chosen by the user. Pressing the "Space" key selects this option.
- -\*- indicates that this option is fixed, i.e., not configurable.
- The value in () indicates the value of this configuration. If (NEW) is shown, the value is the default; once the user configures a different value, (NEW) disappears.

Continuing the above example, if the user enters the "Pmp Entry Number" sub-menu, it is shown as in Figure 4-5. In this sub-menu, use the "Space" key to choose the desired option.

```
(Top) > PMP > N203 CFG HAS PMP > Pmp Entry Number

N203 Core Configuration

(X) PMP ENTRY NUM IS 16
```

Figure 4-5 PMP Entry Number Menu (picture from N203 IPlib just as an example)

Some options require a value to be entered, for example the ECLIC Base Address shown in Figure 4-6. On this option, press the "Enter" or "Space" key and the configuration input window will be shown, as in Figure 4-7.



Figure 4-6 ECLIC Base Address Configuration (picture from N203 IPlib just as an example)



Figure 4-7 Input the ECLIC Base Address Value (picture from N203 IPlib just as an example)

Some options require a value that is subject to constraints. For example, as shown in Figure 4-8, the interrupt number is constrained to the range 1 to 1005. If the entered value is out of this range, it will be reported as "Error", as shown in Figure 4-9.



Figure 4-8 Input the ECLIC IRQ Number Value (picture from N203 IPlib just as an example)



Figure 4-9 The Configuration Error (picture from N203 IPlib just as an example)

After finishing the configuration, press the letter "q", then save and exit. After exiting, the nuclei\_gen tool will start to generate the RTL code. The generation takes several minutes. The generated code is placed under the directory pointed to by the environment variable \$PROJ\_GEN\_ROOT.

#### Note:

- The generated code under \$PROJ\_GEN\_ROOT includes both the Core's code and the SoC's code. If the user only needs the Core's code, just check the code under the "core" directory.
- A file ".config" is generated under the current directory. When the nuclei\_gen tool is opened next time, it will directly use the configuration from ".config". If this ".config" file is deleted, the nuclei\_gen tool will use its built-in default configuration.

# 4.2. Check and Compile the Verilog RTL

If the user wants to check or compile the generated RTL code, the steps are detailed as below (taking N607 as the example).

```
// Note: Before operation, it is required to install the "RISC-V GNU Toolchain". The toolchain can be downloaded from Nuclei website (https://www.nucleisys.com/download.php).
// After the "RISC-V GNU Toolchain" package is downloaded and decompressed, there will be a "bin" directory under the GCC folder. The user needs to add this "bin" path to the Linux $PATH environment variable.
```

```
// Step 1: Use nuclei_gen to configure and generate the RTL code. Use the following commands:
cd n607_rls_pkg
source env.sh
./nuclei_gen
    // The detailed way to configure and generate the code is described in Section 4.1.
    // The Core's RTL code is generated under n607_rls_pkg/n607

// Step 2: Compile the RTL, use the following commands:
cd n607_rls_pkg/n607/vsim
    // This path is relative to the directory that contains n607_rls_pkg
make install
make compile
    // Compile the RTL

// Step 3: Check the RTL codes, use the following commands:
make verilog
    // This command will open all of the Verilog codes, including the Testbench
    // and Verilog source codes (for entire SoC and Core)
make verilog_core
    // This command will open only the Core's Verilog RTL codes
```

# 5. Simulation with Simple Assembly Testcase

# 5.1. Overview of Self-Check Testcase

A "Self-Check Testcase" is an assembly Testcase that can check by itself whether it has "passed" or "failed". The Self-Check Testcases are under the following directory.

```
n600_rls_pkg
    |----n600
        |----riscv-tests
            |----isa_origs    // The directory for the source codes of
                              // Self-Check Testcases.
```

The "Self-Check Testcase" sets an "expected value" at each check-point. If the "real result" does not match the expected value, the test jumps to the label TEST\_FAIL; otherwise it continues to run until it reaches the final label TEST\_PASS.

For example, as shown in Figure 5-10, the Testcase (source code under isa\_origs/rv64ui/add.S) tests the "add" instruction by adding two operands (e.g., 0x00000003 and 0x00000007) and setting the expected value (e.g., 0x0000000a). It then uses a "compare" instruction to check whether the "real result" is as expected; if they do not match, the test jumps to TEST\_FAIL.

At the label TEST\_PASS, the test sets general register X3 to 1, while at the label TEST\_FAIL it sets X3 to a value other than 1. Hence, the testbench can monitor the final X3 value to check whether the Testcase passed or failed.

Figure 5-10 The code segment of add.S test
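The pass/fail convention described above can be modelled in a few lines. This is a behavioural sketch only, not the testcase source: the function names are hypothetical, and the value 0 used for the failing case is an arbitrary stand-in for "not 1".

```python
MASK32 = 0xFFFFFFFF


def run_add_check(a: int, b: int, expected: int) -> int:
    """Model one check-point and return the final X3 value (1 means pass)."""
    real = (a + b) & MASK32      # result the "add" instruction would produce
    if real != expected:         # mismatch: the test jumps to TEST_FAIL
        return 0                 # arbitrary value other than 1
    return 1                     # the test reaches TEST_PASS


def verdict(x3: int) -> str:
    """Testbench-side decision based on the final X3 value."""
    return "PASS" if x3 == 1 else "FAIL"


if __name__ == "__main__":
    # Operands and expected value from the example in the text.
    print(verdict(run_add_check(0x00000003, 0x00000007, 0x0000000A)))
```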

# 5.2. Testbench to Initialize Self-Check Testcase

To simulate the Self-Check Testcase in the Verilog Testbench, the Testcase needs to be converted into a binary file in a format that the Verilog Testbench can use for initialization.

After the "make install" command described in Section 4.2, the binary file (.verilog file) for each Testcase is generated under the riscv-tests/isa/generated directory.

The content of the disassembly code (e.g., rv32ui-p-addi.dump) is shown in Figure 5-11.

```
file format elf32-littleriscv

Disassembly of section .text.init:

80000000 <_start>:
80000000:  a081        j      80000040 <reset_vector>
80000002:  0001        nop

80000004 <trap_vector>:
80000004:  34202f73    csrr   t5,mcause
80000008:  4fa1        li     t6,8
8000000a:  03ff0663    beq    t5,t6,80000036 <write_tohost>
8000000e:  4fa5        li     t6,9
80000010:  03ff0363    beq    t5,t6,80000036 <write_tohost>
80000014:  4fad        li     t6,11
80000016:  03ff0063    beq    t5,t6,80000036 <write_tohost>
8000001a:  80000f17    auipc  t5,0x80000
8000001e:  fe6f0f13    addi   t5,t5,-26 # 0 <_start-0x80000000>
80000022:  000f0363    beqz   t5,80000028 <trap_vector+0x24>
80000026:  8f02        jr     t5
80000028:  34202f73    csrr   t5,mcause
8000002c:  000f5363    bgez   t5,80000032 <handle_exception>
80000030:  a009        j      80000032 <handle_exception>

80000032 <handle_exception>:
80000032:  5391e193    ori    gp,gp,1337

80000036 <write_tohost>:
80000036:  00001f17    auipc  t5,0x1
8000003a:  fc3f2523    sw     gp,-54(t5) # 80001000 <tohost>
8000003e:  bfe5        j      80000036 <write_tohost>

80000040 <reset_vector>:
80000040:  f1402573    csrr   a0,mhartid
80000044:  e101        bnez   a0,80000044 <reset_vector+0x4>
80000046:  4181        li     gp,0
80000048:  00000297    auipc  t0,0x0
8000004c:  fbc28293    addi   t0,t0,-68 # 80000004 <trap_vector>
80000050:  30529073    csrw   mtvec,t0
80000054:  80000297    auipc  t0,0x80000
80000058:  fac28293    addi   t0,t0,-84 # 0 <_start-0x80000000>
8000005c:  00028e63    beqz   t0,80000078 <reset_vector+0x38>
...
```

Figure 5-11 The content of rv32ui-p-addi.dump file

The content of the binary code (e.g., rv32ui-p-addi.verilog) is shown in Figure 5-12. It can be read by the Verilog \$readmemh system task in the Verilog Testbench.

Figure 5-12 The content of rv32ui-p-addi.verilog file
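To inspect such a file outside the simulator, the way \$readmemh interprets it can be mimicked with a short script. The sketch below assumes the plain \$readmemh text conventions ("@" followed by a hex address, whitespace-separated hex data words, "//" line comments); it does not handle block comments, underscores or x/z digits, and the function name `parse_verilog_hex` is hypothetical.

```python
def parse_verilog_hex(text: str) -> dict:
    """Parse a $readmemh-style image into {address: value}.

    "@<hex>" tokens set the current address; every other token is one
    hex data word stored at the current address, which then advances by 1.
    """
    memory = {}
    address = 0
    for line in text.splitlines():
        line = line.split("//", 1)[0]        # drop line comments
        for token in line.split():
            if token.startswith("@"):
                address = int(token[1:], 16)
            else:
                memory[address] = int(token, 16)
                address += 1
    return memory


if __name__ == "__main__":
    # Illustrative content only, not copied from a generated file.
    sample = "@00000010\nAA BB // two data words\n@20\n01\n"
    for addr, value in sorted(parse_verilog_hex(sample).items()):
        print(f"{addr:08x}: {value:02x}")
```

The width of each data word depends on how the file was generated and on the memory it is loaded into, so the addresses printed here are word indices rather than guaranteed byte addresses.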

# 5.3. Introduction of Testbench

The Verilog Testbench source codes are under the "tb" directory as below.

```
n600
|----tb
|----tb_*.v //Verilog TestBench source codes
```

The functionality of Testbench is briefly introduced as below:

- Instantiate the DUT.
- Generate the clock and reset.
- Recognize the Testcase name from the run options, use the \$readmemh system task to read the .verilog file (e.g., rv32ui-p-addi.verilog), and initialize the instruction memory in the SoC.
- At the end of the simulation, check the value of X3. If the X3 value is 1, the test has passed and "PASS" is printed on the terminal; otherwise it has failed and "FAIL" is printed, as shown in Figure 5-13.

Note:

- The user can also integrate these tb\_\*.v files into their own SoC environment, so that the Testcases can serve as sanity Testcases in the user's SoC.
- However, the above Testcases are very basic tests and cannot guarantee full coverage. If users have modified the Core's RTL code, they should not assume that functional correctness can be verified by running the above Testcases.

```
@(pc_write_to_host_cnt == 32'd8)
    // A series of $display(...) calls prints the "Test Result Summary" banner,
    // reporting cycle_count, valid_ir_cycle, pc_write_to_host_cycle and x3.
    // (The format strings are not recoverable from the original figure.)
if (x3 == 1) begin
    // $display(...) calls print the "TEST_PASS" banner in ASCII art.
end
else begin
    // $display(...) calls print the "TEST_FAIL" banner in ASCII art.
end
```

Figure 5-13 Print the PASS or FAIL in Testbench

# 5.4. Steps to Run Simulation

The steps to run simulation are as below:

```
// Step 1: Generate the tests and Testbench (run under the vsim directory, see Section 4.2).
make install
             // Use this command to generate the tests and Testbench.
// Step 2: Compile RTL.
make compile
             // Compile the Verilog source codes.
// Step 3: To run one single testcase, use the following commands.
make run_test TESTNAME=rv32ui-p-add
             // This command will run the simulation for one Testcase "rv32ui-p-add"
             // from riscv-tests/isa/generated directory.
make wave TESTNAME=rv32ui-p-add
             // This command will check the generated waveform.
// Step 4: To run the regression, use the following commands.
make regress_run
             // This command will run the regression for all the tests from
             // riscv-tests/isa/generated directory.
make regress_collect
             // This command will collect the simulation results of the regression.
             // It prints a summary with one line per testcase. A line is marked
             // "PASS" if the Testcase passed, otherwise "FAIL".
```
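When the regression covers many tests, the summary can be condensed further with a small script. The sketch below assumes only what is stated above, namely that each summary line contains the word "PASS" or "FAIL"; the exact line layout and the sample lines are hypothetical.

```python
def summarize(lines):
    """Count PASS lines and collect the FAIL lines of a regression summary."""
    passed, failed = 0, []
    for line in lines:
        words = line.split()
        if "FAIL" in words:
            failed.append(line.strip())
        elif "PASS" in words:
            passed += 1
    return passed, failed


if __name__ == "__main__":
    # Hypothetical summary lines; the real layout may differ.
    sample = ["rv32ui-p-add PASS", "rv32ui-p-addi FAIL", "rv32ui-p-and PASS"]
    passed, failed = summarize(sample)
    print(f"{passed} passed, {len(failed)} failed")
    for line in failed:
        print("  ", line)
```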

# 5.5. Introduction of JTAG\_VPI

The JTAG\_VPI source codes are under the "tb" directory as below.

```
n600
|----tb
|----jtag_vpi //jtag vpi source codes
```

The JTAG\_VPI module is used to test the DEBUG module in simulation. It allows GDB debugging to be simulated without an FPGA environment, and the waveform can also be dumped.

Below is the connection diagram:



Figure 5-14 JTAG VPI Connection Diagram

#### Here is a demo flow for reference:

The JTAG connection and debug session may take a fairly long time. To avoid the simulation ending before the JTAG connection is made, it is better to run a longer testcase or one with a 'while' loop inside.

Go to the n607/vsim directory (here using n607 as reference) and run:

```
make clean
make install
make run_test TESTNAME=$PWD/testcase JTAGVPI=1 JTAGPORT=6666
```

Then you can get outputs as below in **the 1st terminal**:

```
ILM 0x00: 0cc0006f
ILM 0x01: 00000000
ILM 0x02: 00000000
ILM 0x03: 00000000
ILM 0x04: 00000000
ILM 0x05: 00000000
ILM 0x06: 00000000
ILM 0x07: 00000000
ILM 0x16: 00000000
ILM 0x20: 00000000
SEED = 20191205175654
FORCE DELAY= 0
*Verdi3* Loading libsscore_vcs201606.so
*Verdi3* : FSDB_GATE is set.
*Verdi3* : FSDB_RTL is set.
*Verdi3* : Enable Parallel Dumping.
FSDB Dumper for VCS, Release Verdi3_L-2016.06-1, Linux x86_64/64bit, 07/10/2016
(C) 1996 - 2016 by Synopsys, Inc.
*Verdi3* : Create FSDB file 'tb_top.fsdb'
*Verdi3* : Begin traversing the scope (tb_top), layer (0).
*Verdi3* : Enable +mda dumping.
*Verdi3* : End of traversing.
TESTCASE= /home/xiongtao/gen/n307 p1/n307 cct/vsim/coremark/coremark
FORCE IRQ=
FORCE RESP ERR=
WFI FORCE IRQ=
init done
JTAG debug module with VPI interface enabled
JTAG VPI Listening on port 6666

#### Note:

'+jtagvpi' and '+jtag\_port=JTAGPORT' are added to the options when compiling for simulation, so 'JTAGVPI=1' and 'JTAGPORT=xxx' need to be specified in the *make* command. JTAGPORT is the port used to connect with *openocd*, and the same port must also be specified in *openocd\_jtagvpi.cfg*. If this port is already used by another process, JTAG will choose another port automatically.
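Because a fallback to another port would no longer match the port written in the configuration file, it can help to check beforehand that the intended port is free. The following is a minimal sketch using the standard socket library; the function name `port_is_free` is hypothetical, and the check assumes the simulation listens on the local host.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    port = 6666  # the JTAGPORT value used in the example
    state = "free" if port_is_free(port) else "already in use"
    print(f"Port {port} is {state}")
```

The result is only a snapshot: another process may still take the port between this check and the start of the simulation.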

During simulation, JTAG waits for a connection from *openocd* after reset. Before that point, the ports for both JTAG\_VPI and GDB must be set in n607/tb/jtag\_vpi/openocd\_jtagvpi.cfg as below (n607 is used as the reference here):

```
# openocd_jtagvpi.cfg
source [find interface/jtag_vpi.cfg]
#jtag_vpi_set_port $::env(JTAG_VPI_PORT)
# Port of the JTAG_VPI server; must match JTAGPORT in the make command
jtag_vpi_set_port 6666
#jtag_vpi_set_port 34448
set _CHIPNAME riscv
jtag newtap $_CHIPNAME cpu -irlen 5
set _TARGETNAME $_CHIPNAME.cpu
target create $_TARGETNAME riscv -chain-position $_TARGETNAME
$_TARGETNAME configure -work-area-phys 0x80000000 -work-area-size 10000 -work-area-backup 1
$_TARGETNAME configure -work-area-phys 0x90000000 -work-area-size 10000 -work-area-backup 1
riscv set_reset_timeout_sec 3000
riscv set_command_timeout_sec 3000
tcl_port disabled
telnet_port disabled
# Port on which openocd listens for GDB
gdb_port 3333
if {[ info exists pulse_srst]} {
  ftdi_set_signal nSRST 0
  ftdi_set_signal nSRST z
}
halt
echo "Ready for Remote Connections"
```
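If the simulation is started with a different JTAGPORT, the active `jtag_vpi_set_port` line in the configuration file has to be updated to match. The sketch below shows one way to do that from the shell; `set_vpi_port` is a hypothetical helper name, and the demo works on a throwaway file rather than on the real openocd\_jtagvpi.cfg.

```shell
#!/usr/bin/env bash
# Sketch: rewrite the active jtag_vpi_set_port line of an OpenOCD cfg file.
# set_vpi_port is a hypothetical helper; lines starting with '#' are left alone.
# Usage: set_vpi_port CFG_FILE PORT
set_vpi_port() {
    local cfg=$1 port=$2 tmp
    tmp=$(mktemp) || return 1
    # Edit into a temporary file, then copy back so the cfg keeps its permissions.
    sed "s/^jtag_vpi_set_port .*/jtag_vpi_set_port $port/" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}

# Demo on a throwaway file holding two of the port lines from the listing.
demo_cfg=$(mktemp)
printf '%s\n' '#jtag_vpi_set_port $::env(JTAG_VPI_PORT)' 'jtag_vpi_set_port 6666' > "$demo_cfg"
set_vpi_port "$demo_cfg" 6700
cat "$demo_cfg"
rm -f "$demo_cfg"
```

Only the uncommented line changes, so the commented-out alternatives in the file stay available for reference.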

#### Open the 2<sup>nd</sup> new Terminal:

openocd -f path-to/openocd\_jtagvpi.cfg

(openocd can be downloaded from <a href="https://www.nucleisys.com/download.php">https://www.nucleisys.com/download.php</a>)

openocd then prints output like the following and waits for the GDB connection:

```
Nuclei OpenOCD, 64-bit Open On-Chip Debugger 0.10.0+dev-00012-g9c34cc5a3-dirty (2019-12-04-07:09)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info: only one transport option; autoselect 'jtag'
Info: Set server port to 5555
Info: Set server port to 6666
Info: Set server port to 6666
Info: Connection to 127.0.0.1: 6666 succeed
Info: This adapter doesn't support configurable speed
Info: JTAG tap: riscv.cpu tap/device found: 0x13070a6d (mfg: 0x536 (Nuclei System Technology Co.,Ltd.), part: 0x3070, ver: 0x1)
Info: datacount=4 progbufsize=2
Info: Examined RISC-V core; found 1 harts
Info: hart 0: XLEN=32, misa=0x4010912d
Info: Target has dm timeouten bit, set RESETHALTREQ for verbose debugging
Info: Listening on port 3333 for gdb connections
Ready for Remote Connections
Info: telnet server disabled
Info: telnet server disabled
```
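Since the JTAG connection is slow in simulation, it can take a while before openocd reports that it is listening for GDB. A small polling helper can wait for that point before GDB is started. This is only a sketch, assuming bash (/dev/tcp); `wait_for_port` is a hypothetical name, and the probe connection it opens may show up in the openocd log as a short-lived client.

```shell
#!/usr/bin/env bash
# Sketch: poll until a TCP port on localhost accepts connections,
# e.g. the GDB port 3333 configured in openocd_jtagvpi.cfg.
# Assumes bash (/dev/tcp); wait_for_port is a hypothetical helper name.
# Usage: wait_for_port PORT [TRIES] [DELAY_SECONDS]
wait_for_port() {
    local port=$1 tries=${2:-60} delay=${3:-5}
    local i=0
    while [ "$i" -lt "$tries" ]; do
        # The connection is opened in a subshell and closed again immediately.
        if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example use (not executed here):
#   wait_for_port 3333 && riscv-nuclei-elf-gdb testcase
```

The defaults of 60 tries and 5 seconds are arbitrary; a long-running simulation may need larger values.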

#### Open the 3<sup>rd</sup> new Terminal:

riscv-nuclei-elf-gdb testcase

Then connect GDB to openocd: (gdb) target remote :3333

Then start debugging:



For example, use 'info reg' to check the GPRs, 'x 0xaddr' to check a memory value, etc.

```
(gdb) info reg
ra             0x80004a86       0x80004a86 <soc_init+42>
sp             0x9000ffd0       0x9000ffd0
...
pc             0x80004844       0x80004844 <measure_cpu_freq+96>
```
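The manual GDB steps above can also be collected in a command file and passed to GDB with its `-x` option. The sketch below only writes such a file; `write_gdb_script` is a hypothetical helper, and the address 0x90000000 is just an example value for the 'x' command.

```shell
#!/usr/bin/env bash
# Sketch: write a GDB command file that repeats the manual steps
# (connect to openocd, dump the registers, examine one memory word).
# write_gdb_script is a hypothetical helper name.
# Usage: write_gdb_script OUT_FILE GDB_PORT ADDRESS
write_gdb_script() {
    local out=$1 port=$2 addr=$3
    {
        echo "target remote :$port"
        echo "info reg"
        echo "x $addr"
    } > "$out"
}

script=$(mktemp)
write_gdb_script "$script" 3333 0x90000000
cat "$script"
# With the simulation and openocd running, the file could then be used as:
#   riscv-nuclei-elf-gdb -x "$script" testcase
```

GDB runs the commands in the file and then stays at its prompt, so debugging can continue interactively.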

# 6. Simulation with Comprehensive C Program

To run simulation with a comprehensive C program, the "Nuclei-SDK" is needed. For more details about Nuclei-SDK, please see its online documentation at <a href="http://doc.nucleisys.com/nuclei\_sdk">http://doc.nucleisys.com/nuclei\_sdk</a>.

Taking "dhrystone" from Nuclei-SDK as an example, the following steps make it run in the simulation environment.

Note: The N607 Core is used as the example here, so CORE=n607 is used.

# 7. Logic Synthesis

# 7.1. Logic Synthesis for Verilog RTL

The release package includes an example synthesis project. The steps to run it are as below (taking N607 as an example).

```
// Step 1: Enter n607/syn
cd n607_rls_pkg/n607/syn

// Step 2: Modify the Makefile under the n607/syn directory to set up the library path, design path, frequency, etc.

// Step 3: Generate the synthesis scripts
make install

// Step 4: Run synthesis
make syn

// The generated synthesis results and reports will be under the directory
// syn_<CORE>_config_<freq>_<lib>/reports
```

# 7.2. Notes for Attention

The example synthesis project above is for reference only. For more precise results, the following is suggested:

- It is strongly recommended to use the "Flatten" synthesis mode to flatten the hierarchy during synthesis optimization, to achieve better timing and area results.
- The "clock gating module" in the Core source files needs to be replaced with the real "clock gating cell" from the ASIC process library in use.
  - Taking N607 as an example, the "clock gating module" is the module "n607\_clkgate", which can be found under the "n607\_rls\_pkg/n607/design/core" directory.
- jtag\_TMS is used as a clock when switching between 4-wire and 2-wire JTAG modes. It is recommended to set the frequency of jtag\_TMS to half that of jtag\_TCK.
- A generated clock, u\_n607\_core\_wrapper/u\_n607\_dbg\_top/u\_n607\_dbg\_2jtag/u\_n607\_dbg\_apu/tck\_s, which is a divide-by-3 clock of jtag\_TCK with a duty cycle of 1/3, should be created. Following is an example.

create\_generated\_clock -name tck\_div3 -add -edges {1 3 7} \

-source jtag\_TCK -master\_clock jtag\_TCK [get\_pins \

u\_n607\_core\_wrapper/u\_n607\_dbg\_top/u\_n607\_dbg\_2jtag/u\_n607\_dbg\_apu/tck\_s\_reg/Q]
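The -edges {1 3 7} list in the example can be sanity-checked with a little arithmetic. Source clock edges are numbered from 1 at the first rising edge, so consecutive edge numbers are half a source period apart. The sketch below (with a hypothetical helper name, `describe_edges`) derives the division ratio and duty cycle from the three edge numbers; it assumes the differences between them are even, as they are here.

```shell
#!/usr/bin/env bash
# Sketch: derive division ratio and duty cycle from -edges {RISE FALL NEXT_RISE}.
# Differences between edge numbers are counted in half source-clock periods.
# describe_edges is a hypothetical helper; it assumes even differences.
describe_edges() {
    local rise=$1 fall=$2 next_rise=$3
    local period_halves=$((next_rise - rise))   # generated period, in half source periods
    local high_halves=$((fall - rise))          # generated high time, in half source periods
    # Reduce the duty-cycle fraction with Euclid's algorithm.
    local a=$high_halves b=$period_halves t
    while [ "$b" -ne 0 ]; do
        t=$((a % b)); a=$b; b=$t
    done
    echo "divide_by=$((period_halves / 2)) high_source_cycles=$((high_halves / 2)) duty=$((high_halves / a))/$((period_halves / a))"
}

describe_edges 1 3 7
# prints: divide_by=3 high_source_cycles=1 duty=1/3
```

The result matches the description of tck\_s: one generated period spans three jtag\_TCK periods, and the clock is high for one of them.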

# 8. FPGA Prototyping

# 8.1. Files in FPGA Project

The files in the FPGA project are introduced as below.

```
N600_rls_pkg
      |----fpga                              // Directory for the FPGA project
           |----boards                       // Directory for the FPGA boards
                |----share                   // Directory for MCU200T, DDR200T and WUHAN200T Common Kit
                     |----xdc                // Directory for the .xdc constraint files
                     |----script             // Directory for TCL script
                     |----src                // Directory for Verilog code
                          |----system.v      // The top file of FPGA Verilog
                |----mcu200t                 // Directory for MCU200T Evaluation Kit
                     |----nuclei-master.xdc  // The main .xdc constraint files
                     |----xdc                // Directory for the .xdc constraint files
                     |----script             // Directory for TCL script
                     |----src                // Directory for Verilog code
                          |----system.v      // The top file of FPGA Verilog
                |----ddr200t                 // Directory for DDR200T Evaluation Kit
                     |----nuclei-master.xdc  // The main .xdc constraint files
                     |----xdc                // Directory for the .xdc constraint files
                     |----script             // Directory for TCL script
                     |----src                // Directory for Verilog code
                          |----system.v      // The top file of FPGA Verilog
                |----ku060                   // Directory for KU060 Evaluation Kit
                     |----nuclei-master.xdc  // The main .xdc constraint files
                     |----xdc                // Directory for the .xdc constraint files
                     |----script             // Directory for TCL script
                     |----src                // Directory for Verilog code
                          |----system.v      // The top file of FPGA Verilog
                |----Makefile                // Makefile
           |----Makefile                     // Top Makefile
           |----common.mk                    // Common.mk

Note: N200/N300 series packages contain the hbirdkit, mcu200t and ddr200t FPGA board directories;
600 series packages contain only the mcu200t, ddr200t and ku060 FPGA board directories.
```

There are several key notes on the FPGA project:

- The FPGA project uses the Makefile to add a macro "FPGA\_SOURCE" to the Core's defines.v file. This makes sure the FPGA project uses the FPGA version of the RTL (with the FPGA\_SOURCE macro included).
- The top-level file "system.v" instantiates the SoC top-level module (n600\_soc\_top). Besides that, only the Xilinx I/O pads are instantiated.
- In the top-level file "system.v", the Xilinx MMCM (a kind of PLL used to generate clocks) is instantiated. The FPGA project uses the MMCM output clock as the SoC main system clock, and directly uses the external input clock from the FPGA board (hbirdkit, mcu200t, ddr200t, ku060) as the real-time clock (32.768 kHz).
- The JTAG pads of the SoC are constrained by nuclei-master.xdc, which maps them to the pins of the MCU\_JTAG connector on the FPGA board (hbirdkit, mcu200t, ddr200t, ku060).

# 8.2. Generate Bitstream (MCS format)

Section 3.1 introduced the Nuclei Evaluation SoC. The SoC can be generated as an FPGA bitstream and programmed into the FPGA board (hbirdkit, mcu200t, ddr200t, ku060), so that the FPGA board works as a prototype board.

The steps to generate the bitstream for the FPGA board are as below (taking N607 as an example):

```
// Step 1: Generate the FPGA version RTL codes.
cd n607_rls_pkg/n607/fpga
             // Enter the n607_rls_pkg/n607/fpga directory

// Run "make install" for the target board. CORE and FPGA_NAME specify the Core name
// and the Board name; both are optional because the Makefile has default values.
// The command generates the RTL codes and Vivado scripts into the "install" directory.

// Hummingbird Evaluation Kit (hbird):
make install CORE=n607 FPGA_NAME=hbirdkit
// MCU200T Evaluation Kit (mcu200t):
make install CORE=n607 FPGA_NAME=mcu200t
// DDR200T Evaluation Kit (ddr200t):
make install CORE=n607 FPGA_NAME=ddr200t
// KU060 Evaluation Kit (ku060):
make install CORE=n607 FPGA_NAME=ku060

// Note: N200/N300 series packages contain the hbirdkit, mcu200t and ddr200t FPGA board
// directories; 600 series packages contain only the mcu200t, ddr200t and ku060 FPGA
// board directories.

// Step 2: Generate the Bitstream (MCS format).
make mcs
             // The generated MCS format Bitstream will be under
             // n607_rls_pkg/n607/fpga/gen/$FPGA_NAME/obj/system.mcs
```
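The board names and output path from the steps above can be captured in two small helpers, for example when scripting builds for several boards. This is a sketch only: `boards_for_series` and `mcs_path` are hypothetical names, the series keys n200, n300 and n600 are assumed labels for the package series in the note, and the loop prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: FPGA_NAME values per package series and the resulting MCS path.
# Both helper names and the series keys are hypothetical; the board lists
# and the path layout are taken from the steps and note above.
boards_for_series() {
    case "$1" in
        n200|n300) echo "hbirdkit mcu200t ddr200t" ;;
        n600)      echo "mcu200t ddr200t ku060" ;;
        *)         echo "unknown series: $1" >&2; return 1 ;;
    esac
}

# Usage: mcs_path CORE FPGA_NAME
mcs_path() {
    echo "${1}_rls_pkg/$1/fpga/gen/$2/obj/system.mcs"
}

# Print (not run) the build commands and expected output for each 600 series board.
for board in $(boards_for_series n600); do
    echo "make install CORE=n607 FPGA_NAME=$board; make mcs  ->  $(mcs_path n607 "$board")"
done
```

Whether `make mcs` also needs CORE and FPGA\_NAME on its command line depends on the Makefile defaults, so check the Makefile before relying on the printed commands.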



# 8.3. Program Bitstream (MCS format) into FPGA

For how to program the bitstream (MCS format) into the FPGA board (Hummingbird Evaluation Kit), please refer to the document <Nuclei\_FPGA\_DebugKit\_Intro.pdf>, which can be downloaded from the "Development Boards" page of the Nuclei website (http://www.nucleisys.com/developboard.php).