

# SiFive U54 User Guide v19.08p2p0

© SiFive, Inc.

# SiFive U54 User Guide

#### **Proprietary Notice**

Copyright © 2018–2019, SiFive Inc. All rights reserved.

Information in this document is provided "as is," with all faults.

SiFive expressly disclaims all warranties, representations, and conditions of any kind, whether express or implied, including, but not limited to, the implied warranties or conditions of merchantability, fitness for a particular purpose and non-infringement.

SiFive does not assume any liability rising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation indirect, incidental, special, exemplary, or consequential damages.

SiFive reserves the right to make changes without further notice to any products herein.

### **Release Information**

| Version    | Date               | Changes                                                                                                                                                                                                                         |
|------------|--------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| v19.08p2p0 | December 06, 2019  | Fixed erratum in which the TDO pin may remain driven after reset                                                                                                                                                                |
| v19.08p1p0 | November 08, 2019  | <ul> <li>Fixed erratum in which Debug.SBCS had incorrect reset value for SBACCESS</li> <li>Fixed typos and other minor documentation errors</li> </ul>                                                                          |
| v19.08p0   | September 17, 2019 | <ul> <li>The Debug Module memory region is no longer accessible in M-mode</li> <li>Addition of the CDISCARD instruction for invalidating data cache lines without writeback</li> </ul>                                          |
| v19.05p2   | August 26, 2019    | Fix for errata on 5-series cores with L1 data caches or L2 caches in which CFLUSH.D.L1 followed by a load that is nack'd could cause core lockup                                                                                |
| v19.05p1   | July 22, 2019      | <ul> <li>SiFive Insight is enabled</li> <li>Enable debugger reads of Debug Module registers when periphery is in reset</li> <li>Fix errata to get illegal instruction exception executing DRET outside of debug mode</li> </ul> |
| v19.05     | June 09, 2019      | <ul> <li>Updated deliverables description</li> <li>Updated Simulation Testbench chapter to<br/>describe new Testbench and the Verilator Test-<br/>bench target</li> </ul>                                                       |
| v19.02     | February 28, 2019  | <ul> <li>Changed the date based release numbering system</li> <li>Added the Verilog defines and Test Mode signals table</li> </ul>                                                                                              |
| v1p0       | July 24, 2018      | U54 Core Complex Product User Guide                                                                                                                                                                                             |

# **Contents**

| 1 | Int | rodu  | iction                       | 4  |
|---|-----|-------|------------------------------|----|
|   | 1.1 | Abou  | it this Document             | 4  |
|   | 1.2 | Abou  | ıt this Release              | 4  |
| 2 | De  | liver | ables                        | 5  |
|   | 2.1 | Folde | er Structure                 | 5  |
| 3 | Ме  | mor   | ies                          | 7  |
|   | 3.1 | RAM   | Instances                    | 7  |
| 4 | U5  | 4 Int | erfaces                      | 9  |
|   | 4.1 | Cloc  | ∢ & Reset                    | 9  |
|   | 4   | .1.1  | Real Time Clock (rtc_toggle) | 9  |
|   | 4   | .1.2  | Peripheral Clock (clock)     | 10 |
|   | 4.2 | Ports | S                            | 10 |
|   | 4   | .2.1  | Front Port                   | 10 |
|   | 4   | .2.2  | Memory Port                  | 11 |
|   | 4   | .2.3  | Peripheral Port              | 11 |
|   | 4   | .2.4  | System Port                  | 11 |
|   | 4.3 | Mana  | agement Signals              | 11 |
|   | 4   | .3.1  | Halt                         | 11 |
|   | 4.4 | U54   | Interrupt Interfaces         | 12 |
|   | 4   | .4.1  | External Global Interrupts   | 12 |
|   | 4.5 | Debu  | ıg Output Signals            | 12 |
|   | 4.6 | JTAC  | S Debug Interface Pinout     | 12 |
| 5 | U5  | 4 Er  | ror Handling                 | 14 |
|   | 5.1 | 3/5/7 | Series Error Handling        | 14 |

| 6 | Til | leLin  | k to AXI4 Bridge (TL2AXI) | 15 |
|---|-----|--------|---------------------------|----|
|   | 6.1 | Introd | duction                   | 15 |
|   | 6.2 | Com    | pliance                   | 15 |
|   | 6.3 | Block  | Diagram                   | 16 |
|   | 6.4 | TL2A   | XI Interface              | 17 |
|   | 6.5 | Func   | tional Description        | 18 |
|   |     | 6.5.1  | AxADDR                    | 18 |
|   |     | 6.5.2  | AxCACHE                   | 18 |
|   |     | 6.5.3  | AXI IDs                   | 19 |
|   |     | 6.5.4  | AxLOCK                    | 19 |
|   |     | 6.5.5  | AxPROT                    | 19 |
|   |     | 6.5.6  | AxREGION                  | 19 |
|   |     | 6.5.7  | AxQOS                     | 19 |
|   |     | 6.5.8  | AxUSER                    | 19 |
|   |     | 6.5.9  | Bursts                    | 19 |
|   |     | 6.5.10 | Ordering                  | 19 |
|   |     | 6.5.11 | xRESP                     | 19 |
| 7 | A   | XI4 to | TileLink Bridge (AXI2TL)  | 21 |
| - | 7.1 |        | duction                   |    |
|   | 7.2 |        | pliance                   |    |
|   | 7.3 |        | c Diagram                 |    |
|   | 7.4 |        | TL Interface              |    |
|   | 7.5 |        | tional Description        |    |
|   |     | 7.5.1  | AxADDR                    |    |
|   |     | 7.5.2  | AxCACHE                   |    |
|   |     | 7.5.3  | AXI IDs                   |    |
|   |     | 7.5.4  | AxLOCK                    |    |
|   |     | 7.5.5  | AxPROT                    |    |
|   |     | 7.5.6  | AxREGION                  |    |
|   |     | 7.5.7  | AxQOS                     |    |
|   |     | 7.5.8  | AxUSER                    |    |
|   |     | 7.5.9  | Bursts                    |    |

|    | 7    | .5.10  | Priorities                                              | 26 |
|----|------|--------|---------------------------------------------------------|----|
| 8  | Do   | hua I  | nterface                                                | 27 |
| 0  |      | •      |                                                         |    |
|    | 8.1  |        | TAPC State Machine                                      |    |
|    | 8.2  |        | ing JTAG Logic                                          |    |
|    | 8.3  |        | Clocking                                                |    |
|    | 8.4  |        | Standard Instructions                                   |    |
|    | 8.5  | JTAG   | Debug Commands                                          | 29 |
|    | 8.6  | Using  | Debug Outputs                                           | 29 |
| 9  | lm   | plemo  | entation                                                | 30 |
|    | 9.1  | Top Le | evel                                                    | 30 |
|    | 9.2  | Clocki | ng                                                      | 30 |
|    | 9    | .2.1   | Clocking Guidelines                                     | 31 |
|    | 9.3  |        | ng                                                      |    |
|    | 9.4  |        | evel Simulation                                         |    |
|    | •    |        |                                                         |    |
| 10 | S    | imula  | tion Testbench                                          | 33 |
|    | 10.1 | Inclu  | ded Test Bench                                          | 33 |
|    | 1    | 0.1.1  | Executing the Testbench                                 | 33 |
|    | 10.2 | Testb  | ench Output                                             |    |
|    | 1    | 0.2.1  | Testbench Output - Trace                                |    |
|    | 1    | 0.2.2  | Testbench Output - Waves                                |    |
|    | 1    | 0.2.3  | Adding Tests To The Included Testbench                  |    |
|    | 10.3 |        | e Insight                                               |    |
|    |      | 0.3.1  | Viewing SiFive Insight Signals                          |    |
|    |      |        |                                                         |    |
|    | 1    | 0.3.2  | Enabling SiFive Insight Outside of the SiFive Testbench | 30 |

# Introduction

### 1.1 About this Document

This document describes the U54 v19.08p2p0 production deliverables. To learn more about the functionality of the U54 please read the U54 Manual.

## 1.2 About this Release

This is the Production release of the U54 v19.08p2p0.

# **Deliverables**

This chapter describes the contents of the deliverables for the U54 v19.08p2p0.

### 2.1 Folder Structure

The folder structure of the delivery is as follows:

#### arty\_a7\_100t-sifive

Contains the designs FPGA bitstream.

#### freedom-e-sdk

Software SDK for the design including Freedom Metal BSP and applications.

#### bsp

Freedom Metal BSP for the RTL testbench and, where applicable, the FPGA bitstream. Note that Freedom Metal BSPs also include a design's Device Tree file (DTS).

#### freedom-devicetree-tools

Tools used to generate Freedom Metal BSPs from DTS files. Can be used to re-generate BSPs in the case of a hand-edited DTS file.

#### freedom-metal

Source code for the Freedom Metal library.

#### scripts

Helper scripts used by the main makefile.

#### software

One folder for each included Freedom Metal application. Each application includes a pre-built hex file in its *release* directory which can be run directly on the testbench without needed to re-compile from source. Note: In some cases there will not be a pre-built hex file for every application. This will be the case when a particular application is not expected to run correctly on the selected core configuration. Possible reasons for this are:

- Executable image does not fit in configured boot memory
- Application data does not fit in configured data memory
- Multicore application excluded for single-core configuration

#### Makefile

Top-level SDK makefile which can be used to re-build all included examples from source. Readme.md; Readme file describing how to use the SDK's top-level makefile.

#### info

Files which describe the design.

#### mems.conf

Configuration file which describes the memory instances of the design.

#### modules to be retimed.txt

Contains the list of modules which need to be retimed.

#### sifive\_insight.yml

Contains a .yml description of the SiFive Insight signals included in the design. See Section 10.3 for more details on SiFive Insight.

#### rtl

The U54 RTL.

#### memories

A single verilog file containing all memories in the design.

#### testbench

Includes all the modules in the synthesizable testbench, the test driver, and extracted simulation constructs (assertions) that are bound to locations in the DUT.

#### design

The U54 itself. Includes the top-level module U5\_CoreIPSubsystem and all submodules.

#### sifive insight

Contains all the System Verilog files defining and binding the SiFive Insight signals to modules in the design.

#### .F files

Manifest files for the associated folder. A complete list of files to be synthesized as part of the design can be found in design.F.

#### Makefile

Used to execute the test bench described in Chapter 10.

# **Memories**

This chapter describes the memories used in the U54 design.

### 3.1 RAM Instances

The Core Complex RAM instances consist of synchronous single-ported SRAMs. These are contained within wrapper modules that expose a standardized generic interface for the U54. For each of the modules specified in Table 1, the implementer is required to provide a module definition that instantiates the SRAM macros and connects the macro-specific pins to the interface described in Table 2.

Behavioral models of the RAMS are provided as part of the deliverable in the file: verilog/memories/CoreIPSubsystem\*.

The U54 RAM instances are delivered as is and are not configurable. It is, however, possible to construct the memory instances from multiple smaller instances.

| Module Name         | Depth | Address Width $(N_{addr})$ | Data<br>Width<br>( $N_{data}$ ) | Write Mask<br>Granularity<br>( $N_{part}$ ) | Description          |
|---------------------|-------|----------------------------|---------------------------------|---------------------------------------------|----------------------|
| cc_banks_0_ext      | 1024  | 10                         | 72                              | 72                                          | Banked<br>Store      |
| cc_dir_ext          | 256   | 8                          | 176                             | 22                                          | Directory<br>RAM     |
| data_arrays_0_0_ext | 512   | 9                          | 156                             | 39                                          | ICache<br>Data Array |
| data_arrays_0_ext   | 512   | 9                          | 288                             | 72                                          | DCache<br>Data Array |
| I2_tlb_ram_ext      | 128   | 7                          | 47                              |                                             |                      |
| tag_array_0_ext     | 64    | 6                          | 88                              | 22                                          |                      |
| tag_array_ext       | 64    | 6                          | 112                             | 28                                          | DCache<br>Tag Array  |

**Table 1:** SRAM Modules and Configuration

| Name      | Direction | Width                | Description                                                                                                 |  |
|-----------|-----------|----------------------|-------------------------------------------------------------------------------------------------------------|--|
| RW0_clk   | Input     | Input 1 Memory clock |                                                                                                             |  |
| RW0_en    | Input     | 1                    | Active-high signal indicating that the memory is being access. This may be used for clock gating.           |  |
| RW0_addr  | Input     | $N_{addr}$           | Address of access.                                                                                          |  |
| RW0_rdata | Output    | $N_{data}$           | Read data                                                                                                   |  |
| RW0_wmode | Input     | 1                    | Active-high signal indicating that the access is a write operation.                                         |  |
| RW0_wdata | Input     | $N_{data}$           | Write data.                                                                                                 |  |
| RW0_wmask | Input     | $N_{data}/N_{part}$  | Active-high write mask. Each bit controls whether or not the corresponding $N_{part}$ -bit subword is writ- |  |
|           |           |                      | ten. This is present only in memories that require mask write functionality.                                |  |

Table 2: SRAM Signals

# **U54 Interfaces**

This chapter describes the primary interfaces to the U54.

### 4.1 Clock & Reset

The core\_clock\_0, clock, rtc\_toggle, reset, and reset\_vector\_0 inputs are described in Table 3.

The relationship between the clock input frequencies are as follows:

clock > 2 \* rtc\_toggle and core\_clock\_0 >= clock

| Name           | Direction | Width | Description                                                                                                             |
|----------------|-----------|-------|-------------------------------------------------------------------------------------------------------------------------|
| core_clock_0   | Input     | 1     | The core pipeline and cache clock                                                                                       |
| clock          | Input     | 1     | Clock input to the PLIC and the external ports. Has a $1/m$ frequency relationship with core_clock_0 where m $\geq 1$ . |
| rtc_toggle     | Input     | 1     | The Real Time Clock input. Must run at strictly less than half the rate of clock.                                       |
| reset          | Input     | 1     | Synchronous reset signal. Active high. Must be asserted for 16 cycles of clock and synchronously de-asserted.           |
| reset_vector_0 | Input     | 64    | Reset Vector Address. Implementations MUST set this signal to a valid address.                                          |

Table 3: Clock and Reset Interfaces

#### 4.1.1 Real Time Clock (rtc\_toggle)

As defined in the RISC-V privileged specification, RISC-V implementations must expose a real-time counter via the mtime register. In the U54 the rtc\_toggle input is used as the real-time counter. rtc\_toggle must run at strictly less than half the frequency of clock. Furthermore, for

RISC-V compliance, the frequency of rtc\_toggle must remain constant, and software must be made aware of this frequency.

#### 4.1.2 Peripheral Clock (clock)

The peripheral clock is used to decouple the frequency of the core from that of some of the on core complex peripherals. clock has a  $\frac{1}{m}$  frequency relationship with core\_clock\_0 where m is any positive integer. Additionally, these clocks must be phase-aligned.

The peripherals connected to clock are described in Section 9.2.

#### 4.2 Ports

This section will describe all of the Ports in the U54.

| Name               | Base<br>Address | Тор         | Protocol | Description                                   |
|--------------------|-----------------|-------------|----------|-----------------------------------------------|
| front_port_axi4_0  |                 |             | AXI4     | 64-bit data width. Syn-<br>chronous to clock  |
| mem_port_axi4_0    | 0x8000_0000     | 0x8001_FFFF | AXI4     | 128-bit data width. Syn-<br>chronous to clock |
| periph_port_axi4_0 | 0x2000_0000     | 0x2000_1FFF | AXI4     | 64-bit data width. Syn-<br>chronous to clock  |
| sys_port_axi4_0    | 0x4000_0000     | 0x4000_1FFF | AXI4     | 64-bit data width. Syn-<br>chronous to clock  |

Table 4: U54 Platform Bus Interfaces

#### 4.2.1 Front Port

The U54 has one master bus interface called the Front Port. This port can be used by external masters to read and write into any memory-mapped device in the system. Note that an external master using a Front Port can trigger an I-Cache to reconfigure itself by using the procedure described in the U54 Manual.

Reads and writes to a Front Port interface can also pass through to the Memory, Peripheral, and System bus interfaces if a transaction falls within their address space. Note that transactions through a Front Port do not pass through the Physical Memory Protection (PMP) unit.

The U54 Front Port interface is fully compliant with the *AMBA 3 AXI4-Lite Protocol*, *Version 1.0* specification, and Front Port transactions pass through a(n) AXI4 to TileLink Bridge bridge (AXI2TL).

The Front Port is described in Table 4. The AXI2TL bridge is described in Chapter 7.

#### 4.2.2 Memory Port

The U54 has one Memory Port, which is typically used on the back side of a cache to access DDR memory. Errors that propagate to the processor via a Memory Port have the effects described in Chapter 5. The Memory Port is described in Table 4.

The U54 Memory Port pass through a TileLink to AXI4 Bridge (TL2AXI) which is described in Chapter 6.

#### 4.2.3 Peripheral Port

The U54 has one Peripheral Port, which is typically used to access peripheral devices in offcore-complex address space. Errors that propagate to the processor via the Peripheral Port have the effects described in Chapter 5. The Peripheral Port is described in Table 4.

The U54 Peripheral Port pass through a TileLink to AXI4 Bridge (TL2AXI) which is described in Chapter 6.

#### 4.2.4 System Port

The U54 has one System Port, which is typically used to access higher-bandwidth peripheral devices in off-core-complex address space. Errors that propagate to the processor via the Peripheral Port have the effects described in Chapter 5. The System Port is described in Table 4.

The U54 System Port pass through a TileLink to AXI4 Bridge (TL2AXI) which is described in Chapter 6.

### 4.3 Management Signals

Signals which are used to manage the U54 are described in Table 5.

| Name                      | Direction | Width | Description                            |  |  |
|---------------------------|-----------|-------|----------------------------------------|--|--|
| halt_from_tile_0 Output 1 |           | 1     | Indicates that an unrecoverable memory |  |  |
|                           |           |       | error was reported to the tile         |  |  |

**Table 5:** Management Signals

#### 4.3.1 Halt

halt\_from\_tile asserts active high when an unrecoverable error is reported to a tile. A bus error is an example of an unrecoverable error. The halt\_from\_tile signal remains asserted until a core reset.

### 4.4 U54 Interrupt Interfaces

This chapter describes all of the interrupt signals in the U54.

#### 4.4.1 External Global Interrupts

Global interrupts are interrupts which are connected to the PLIC from external interrupt sources. Please see the U54 Manual for a detailed description of global interrupts.

| Name              | Direction | Width | Description                                     |
|-------------------|-----------|-------|-------------------------------------------------|
| global_interrupts | Input     | 127   | External interrupts from off-chip or peripheral |
|                   |           |       | sources. These are level-based interrupt sig-   |
|                   |           |       | nals connected to the PLIC and must be syn-     |
|                   |           |       | chronous with clock.                            |

**Table 6:** External Global Interrupt Interface

### 4.5 Debug Output Signals

Signals which are outputs from the debug module are shown in Table 7.

| Name           | Direction | Width | Description                                                                                                                                                                                                                                                                      |
|----------------|-----------|-------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| debug_ndreset  | Output    | 1     | This signal is a reset signal driven by the debug logic of the chip. It can be used to reset parts of the SoC or the entire chip. It should NOT be wired into logic which feeds back into the debug_systemjtag_reset signal for this block. This signal may be left unconnected. |
| debug_dmactive | Output    | 1     | This signal, 0 at reset, indicates that debug logic is active. This may be used to prevent power gating of debug logic, etc. It may be left unconnected.                                                                                                                         |

Table 7: External Debug Logic Control Pins

### 4.6 JTAG Debug Interface Pinout

SiFive uses the industry-standard JTAG interface which includes the four standard signals, TCK, TMS, TDI, and TDO. A test logic reset signal must also be driven on the debug\_systemjtag\_reset input. This reset is synchronized internally to the design. The test logic reset must be pulsed before the core reset is deasserted.

| Name                        | Direction | Width | Description                   |
|-----------------------------|-----------|-------|-------------------------------|
| debug_systemjtag_TCK        | Input     | 1     | JTAG Test Clock               |
| debug_systemjtag_TMS        | Input     | 1     | JTAG Test Mode Select         |
| debug_systemjtag_TDI        | Input     | 1     | JTAG Test Data Input          |
| debug_systemjtag_TDO_data   | Output    | 1     | JTAG Test Data Output         |
| debug_systemjtag_TDO_driven | Output    | 1     | JTAG Test Data Output Enable  |
| debug_systemjtag_reset      | Input     | 1     | Active-high Reset             |
| debug_systemjtag_mfr_id     | Input     | 11    | The SoC Manufacturer ID       |
|                             |           |       | which will be reported by the |
|                             |           |       | JTAG IDCODE instruction.      |

 Table 8:
 SiFive standard JTAG interface for off-chip external TAPC

# **U54 Error Handling**

This chapter describes how the U54 handles errors from its memories and interfaces.

Errors can be introduced to the core via ECC errors or error responses returned on the various port interfaces. For port interfaces that are not natively TileLink, their error responses are translated into TileLink errors as described in the respective bridge chapters. The core's behavior is purely determined by the type of TileLink or ECC error that it receives. The behavior is also dependent on the type of core.

### 5.1 3/5/7 Series Error Handling

For 3-, 5-, or 7-series cores, on the various interfaces to the U54:

- TileLink denied/corrupt on I\$ refills result in precise exceptions
- Correctable I\$/ITIM ECC errors inform BEU
- Uncorrectable ITIM ECC errors inform BEU and raise halt signal
- TL denied on D\$ refill prevents refill from occurring; BEU is informed
- TL corrupt on D\$ refill performs the refill, but preserves the error in the ECC bits; BEU is informed
- TL denied/corrupt on MMIO informs BEU
- D\$/DTIM correctable errors inform BEU
- D\$ uncorrectable errors (which only happen on dirty lines) cause a writeback with the TL corrupt bit set; BEU is informed; halt signal is raised
- · DTIM uncorrectable errors inform BEU and raise halt signal

# TileLink to AXI4 Bridge (TL2AXI)

#### **6.1** Introduction

SiFive's TileLink to AXI4 Bridge (TL2AXI) can be used to connect SiFive Core Complex IP to AMBA AXI4 based systems. SiFive Core Complex IP natively uses TileLink for all system communication external to the Core Complex. The TL2AXI bridge translates TileLink transactions to AMBA AXI4.

## 6.2 Compliance

- The SiFive TL2AXI is fully compliant with AMBA AXI4 and this document should be read in conjunction with the AMBA AXI4 Protocol Specification.
- The SiFive TL2AXI is fully compatible with SiFive Core Complex IP. Some properties of the TL2AXI are specific to a given Core Complex implementation. This document should be read in conjunction with the Core Complex IP Manual.

# 6.3 Block Diagram



Figure 1: TL2AXI Block Diagram

## 6.4 TL2AXI Interface

| Name    | Direction | Width | Description                                                |
|---------|-----------|-------|------------------------------------------------------------|
| AWREADY | In        | 1     | Indicates that the slave is ready to accept an address and |
|         |           |       | control signals.                                           |
| AWVALID | Out       | 1     | Indicates that valid write address and control information |
|         |           |       | are available.                                             |
| AWID    | Out       | [3:0] | Identification tag for the write address group of signals. |
| AWADDR  | Out       | [M:0] | The write address of the first transfer in a write burst   |
|         |           |       | transaction where M is the minimum width necessary for     |
|         |           |       | the address range assigned to TL2AXI in a given Core       |
|         |           |       | Complex implementation.                                    |
| AWLEN   | Out       | [7:0] | Indicates the number of transfers in a burst.              |
| AWSIZE  | Out       | [2:0] | Indicates the size of each transfer in a burst.            |
| AWBURST | Out       | [1:0] | Indicates the burst type.                                  |
| AWLOCK  | Out       | 1     | Indicates the lock type used in an atomic transfer.        |
| AWCACHE | Out       | [3:0] | Indicates the cache attributes of a transaction.           |
| AWPROT  | Out       | [2:0] | This signal indicates protection level and transaction     |
|         |           |       | access type (data or instruction).                         |
| AWQOS   | Out       | [3:0] | Quality of Service signaling.                              |

Table 9: AXI Write Address Channel

| Name   | Direction | Width | Description                                                  |
|--------|-----------|-------|--------------------------------------------------------------|
| WREADY | In        | 1     | Indicates that a slave can accept the write data.            |
| WVALID | Out       | 1     | Indicates that valid write data and strobes are available.   |
| WDATA  | Out       | [N:0] | Write data where N is dependent on a given Core Complex      |
|        |           |       | implementation.                                              |
| WSTRB  | Out       | [X:0] | Write Strobe to indicate which byte in a byte lane to update |
|        |           |       | in memory.                                                   |
| WLAST  | Out       | 1     | Indicates that the slave can accept write data.              |

Table 10: AXI Write Data Channel

| Name   | Direction | Width | Description                                                                                                                    |
|--------|-----------|-------|--------------------------------------------------------------------------------------------------------------------------------|
| BREADY | Out       | 1     | Indicates that the master can accept the response informa-                                                                     |
|        |           |       | tion.                                                                                                                          |
| BVALID | In        | 1     | Indicates that a valid write response is available.                                                                            |
| BID    | In        | [3:0] | Identification tag of the write response which matches the AWID tag of the write transaction to which the slave is responding. |
| BRESP  | In        | [1:0] | Indicates the status of the write transaction.                                                                                 |

 Table 11:
 AXI Write Response Channel

| Name    | Direction | Width | Description                                                    |
|---------|-----------|-------|----------------------------------------------------------------|
| ARREADY | In        | 1     | Indicates that the slave is ready to accept an address and     |
|         |           |       | control signals.                                               |
| ARVALID | Out       | 1     | Indicates that valid read address and control information      |
|         |           |       | are available.                                                 |
| ARID    | Out       | [3:0] | Identification tag for the read address group of signals.      |
| ARADDR  | Out       | [M:0] | The read address of the first transfer in a write burst trans- |
|         |           |       | action where M is the minimum width necessary for the          |
|         |           |       | address range assigned to TL2AXI in a given Core Com-          |
|         |           |       | plex implementation.                                           |
| ARLEN   | Out       | [7:0] | Indicates the number of transfers in a burst.                  |
| ARSIZE  | Out       | [2:0] | Indicates the size of each transfer in a burst.                |
| ARBURST | Out       | [1:0] | Indicates the burst type.                                      |
| ARLOCK  | Out       | 1     | Indicates the lock type used in an atomic transfer.            |
| ARCACHE | Out       | [3:0] | Indicates the cache attributes of a transaction.               |
| ARPROT  | Out       | [2:0] | This signal indicates protection level and transaction         |
|         |           |       | access type (data or instruction).                             |
| ARQOS   | Out       | [3:0] | Quality of Service signaling.                                  |
| RREADY  | Out       | 1     | Indicates that a slave can accept the read data.               |
| RVALID  | In        | 1     | Indicates that valid read data and strobes are available.      |

Table 12: AXI Read Address Channel

| Name  | Direction | Width | Description                                                |
|-------|-----------|-------|------------------------------------------------------------|
| RID   | In        | [3:0] | ID tag of the read data transfer which must match the ARID |
|       |           |       | value of the read transaction.                             |
| RDATA | In        | [N:0] | Read data where N is dependent on a given Core Complex     |
|       |           |       | implementation.                                            |
| RRESP | In        | [1:0] | Indicates the status of the read transfer.                 |
| RLAST | In        | 1     | Indicates the last transfer in a read burst.               |

Table 13: AXI Read Data Channel

## 6.5 Functional Description

This section will describe the functional behavior of the TL2AXI in more detail.

#### 6.5.1 AXADDR

• For a given Core Complex implementation, the width of ARADDR and AWADDR is the minimum width necessary for the address range of the TileLink bus it is connected to.

#### 6.5.2 AxCACHE

AWCACHE and ARCACHE signals are tied to 0.

#### 6.5.3 AXI IDs

- The TL2AXI uses an AXI-ID width of 4.
- The TL2AXI will issue multiple outstanding request per AXI-ID.

#### 6.5.4 AxLOCK

AWLOCK and ARLOCK signals are tied to 0.

#### 6.5.5 AXPROT

AWPROT and ARPROT are tied to 0x01: privileged secure data.

#### 6.5.6 AXREGION

TL2AXI does not generate AWREGION and ARREGION signals.

#### 6.5.7 AxQOS

AWQOS and ARQOS are tied to 0.

#### **6.5.8 AXUSER**

TL2AXI does not generate AWUSER and ARUSER signals.

#### **6.5.9** Bursts

- All transactions are of type BURST INCR and aligned to the transfer size.
- · Multi-beat narrow bursts are never issued.

#### 6.5.10 Ordering

- In a multi-master system, TL2AXI can interleave read and write requests from different masters.
- In the single master case the observable ordering of reads and writes are preserved.
  - In Example: if a processor issues 4 writes followed by 4 reads, the 4 writes will use the same AXI-ID sequentially without waiting for the write response. The reads will not be issued until ALL write responses are collected. Then the 4 reads are issued sequentially on the same AXI-ID without waiting for the read response.

#### 6.5.11 xRESP

Interleaved read responses are supported.

• When BRESP or RRESP indicate a transfer error, the signal is translated into the TileLink response **d\_error**.

# **AXI4 to TileLink Bridge (AXI2TL)**

#### 7.1 Introduction

SiFive's AXI4 to TileLink Bridge (AXI2TL) can be used to connect SiFive Core Complex IP to AMBA AXI4 based systems. SiFive Core Complex IP natively uses TileLink for all system communication external to the Core Complex. The AXI2TL bridge translates AMBA AXI4 transactions to TileLink.

## 7.2 Compliance

- The SiFive AXI2TL is fully compliant with AMBA AXI4 and this document should be read in conjunction with the AMBA AXI4 Protocol Specification.
- The SiFive AXI2TL is fully compatible with SiFive Core Complex IP. Some properties of the AXI2TL are specific to a given Core Complex implementation. This document should be read in conjunction with the Core Complex IP Manual.

# 7.3 Block Diagram



Figure 2: AXI2TL Block Diagram

## 7.4 AXI2TL Interface

| Name    | Direction | Width | Description                                                |
|---------|-----------|-------|------------------------------------------------------------|
| AWREADY | Out       | 1     | Indicates that the slave is ready to accept an address and |
|         |           |       | control signals.                                           |
| AWVALID | In        | 1     | Indicates that valid write address and control information |
|         |           |       | are available.                                             |
| AWID    | In        | [7:0] | Identification tag for the write address group of signals. |
| AWADDR  | In        | [M:0] | The write address of the first transfer in a write burst   |
|         |           |       | transaction where M is the minimum width necessary for     |
|         |           |       | the address range assigned to AXI2TL in a given Core       |
|         |           |       | Complex implementation.                                    |
| AWLEN   | In        | [7:0] | Indicates the number of transfers in a burst.              |
| AWSIZE  | In        | [2:0] | Indicates the size of each transfer in a burst.            |
| AWBURST | In        | [1:0] | Indicates the burst type.                                  |
| AWLOCK  | In        | 1     | Indicates the lock type used in an atomic transfer.        |
| AWCACHE | In        | [3:0] | Indicates the cache attributes of a transaction.           |
| AWPROT  | In        | [2:0] | This signal indicates protection level and transaction     |
|         |           |       | access type (data or instruction).                         |
| AWQOS   | In        | [3:0] | Quality of Service signaling.                              |

Table 14: AXI Write Address Channel

| Name   | Direction | Width | Description                                                  |
|--------|-----------|-------|--------------------------------------------------------------|
| WREADY | Out       | 1     | Indicates that a slave can accept the write data.            |
| WVALID | In        | 1     | Indicates that valid write data and strobes are available.   |
| WDATA  | In        | [N:0] | Write data where N is dependent on a given Core Complex      |
|        |           |       | implementation.                                              |
| WSTRB  | In        | [X:0] | Write Strobe to indicate which byte in a byte lane to update |
|        |           |       | in memory.                                                   |
| WLAST  | In        | 1     | Indicates that the slave can accept write data.              |

Table 15: AXI Write Data Channel

| Name   | Direction | Width | Description                                                |
|--------|-----------|-------|------------------------------------------------------------|
| BREADY | In        | 1     | Indicates that the master can accept the response informa- |
|        |           |       | tion.                                                      |
| BVALID | Out       | 1     | Indicates that a valid write response is available.        |
| BID    | Out       | [7:0] | Identification tag of the write response which matches the |
|        |           |       | AWID tag of the write transaction to which the slave is    |
|        |           |       | responding.                                                |
| BRESP  | Out       | [1:0] | Indicates the status of the write transaction.             |

**Table 16:** AXI Write Response Channel

| Name    | Direction | Width | Description                                                    |
|---------|-----------|-------|----------------------------------------------------------------|
| ARREADY | Out       | 1     | Indicates that the slave is ready to accept an address and     |
|         |           |       | control signals.                                               |
| ARVALID | In        | 1     | Indicates that valid read address and control information      |
|         |           |       | are available.                                                 |
| ARID    | In        | [7:0] | Identification tag for the read address group of signals.      |
| ARADDR  | In        | [M:0] | The read address of the first transfer in a write burst trans- |
|         |           |       | action where M is the minimum width necessary for the          |
|         |           |       | address range assigned to AXI2TL in a given Core Com-          |
|         |           |       | plex implementation.                                           |
| ARLEN   | In        | [7:0] | Indicates the number of transfers in a burst.                  |
| ARSIZE  | In        | [2:0] | Indicates the size of each transfer in a burst.                |
| ARBURST | In        | [1:0] | Indicates the burst type.                                      |
| ARLOCK  | In        | 1     | Indicates the lock type used in an atomic transfer.            |
| ARCACHE | In        | [3:0] | Indicates the cache attributes of a transaction.               |
| ARPROT  | In        | [2:0] | This signal indicates protection level and transaction         |
|         |           |       | access type (data or instruction).                             |
| ARQOS   | In        | [3:0] | Quality of Service signaling.                                  |
| RREADY  | In        | 1     | Indicates that a slave can accept the read data.               |
| RVALID  | Out       | 1     | Indicates that valid read data and strobes are available.      |

Table 17: AXI Read Address Channel

| Name  | Direction | Width | Description                                                |
|-------|-----------|-------|------------------------------------------------------------|
| RID   | Out       | [3:0] | ID tag of the read data transfer which must match the ARID |
|       |           |       | value of the read transaction.                             |
| RDATA | Out       | [N:0] | Read data where N is dependent on a given Core Complex     |
|       |           |       | implementation.                                            |
| RRESP | Out       | [1:0] | Indicates the status of the read transfer.                 |
| RLAST | Out       | 1     | Indicates the last transfer in a read burst.               |

Table 18: AXI Read Data Channel

## 7.5 Functional Description

This section will describe the functional behavior of the AXI2TL in more detail.

#### **7.5.1 AXADDR**

- For a given Core Complex implementation, the width of AxADDR in the AXI2TL bridge is the entire addressable address space including ITIM and DTIM. Transactions to off core complex addresses will pass through to the Core Complex's bus master interfaces.
- Writes and Reads to illegal addresses result in SLVERR in the BRESP and RRESP signals respectively.

#### 7.5.2 AXCACHE

• AWCACHE and ARCACHE signals are ignored.

#### 7.5.3 AXI IDs

- Supports an AXI-ID width of 8.
- Requests with the lowest 2 bits equal are handled in first-in first-out (FIFO).
  - Per FIFO, at most there can be 4 outstanding transactions.

#### **7.5.4 AXLOCK**

AWLOCK and ARLOCK signals are ignored.

#### **7.5.5 AXPROT**

AWPROT and ARPROT signals are ignored.

#### 7.5.6 AXREGION

AWREGION and ARREGION are not implemented.

#### 7.5.7 **AxQOS**

AWQOS and ARQOS signals are ignored.

#### **7.5.8 AXUSER**

• AWUSER and ARUSER signals are not implemented.

#### **7.5.9** Bursts

- · All legal AXI bursts are supported.
- Narrow bursts, and bursts that are not aligned to the transfer size, do not result in improved throughput.
- In the case of unaligned or narrow bursts, read responses may be interleaved with responses to other reads.
- Unaligned AXI burst reads will unconditionally round down ARADDR to the nearest multiple
  of ARSIZE. The AXI Read Data Channel does not have an equivalent to WSTRB, therefore
  it is possible for an unintended read to take place in this scenario.

### 7.5.10 Priorities

• If a Read and a Write are presented at the same time, a round-robin priority scheme is implemented to arbitrate between them.

# **Debug Interface**



Figure 3: Debug Transport Module and Debug Module for HW Debug.

The SiFive U54 includes the JTAG debug transport module (DTM) described in *The RISC-V Debug Specification, Version 0.13*. This enables a single external industry-standard 1149.1 JTAG interface to test and debug the system. The JTAG interface can be directly connected off-

chip in a single-chip microcontroller or can be an embedded JTAG controller for a RISC-V Core IP designed to be included in a larger SoC.

The DTM and debug module are depicted in Figure 3.

On-chip JTAG connections must be driven (no pullups), with a normal 2-state driver for TDO under the expectation that on-chip mux logic will be used to select between alternate on-chip JTAG controllers' TDO outputs. TDO logic changes on the falling edge of TCK.

### 8.1 JTAG TAPC State Machine

The JTAG controller includes the standard TAPC state machine shown in Figure 4. The state machine is clocked with TCK. All transitions are labelled with the value on TMS, except for the arc showing asynchronous reset when TRST=0.



Figure 4: JTAG TAPC state machine.

## 8.2 Resetting JTAG Logic

The JTAG logic must be asynchronously reset by asserting jtag\_reset before coreReset is deasserted.

Asserting jtag\_reset resets both the JTAG DTM and debug module test logic. Because parts of the debug logic require synchronous reset, the jtag\_reset signal is synchronized inside the U54.

During operation, the JTAG DTM logic can also be reset without jtag\_reset by issuing 5 jtag\_TCK clock ticks with jtag\_TMS asserted. This action resets only the JTAG DTM, not the debug module.

### 8.3 JTAG Clocking

The JTAG logic always operates in its own clock domain clocked by jtag\_TCK. The JTAG logic is fully static and has no minimum clock frequency. The maximum jtag\_TCK frequency is part-specific.

#### 8.4 JTAG Standard Instructions

The JTAG DTM implements the BYPASS and IDCODE instructions.

The Manufacturer ID field of IDCODE is provided by the RISC-V Core IP integrator, on the jtag\_mfr\_id input.

### 8.5 JTAG Debug Commands

The JTAG DEBUG instruction gives access to the SiFive debug module by connecting the debug scan register between jtag\_TDI and jtag\_TD0.

The debug scan register includes a 2-bit opcode field, a 7-bit debug module address field, and a 32-bit data field.

to allow various memory-mapped read/write operations to be specified with a single scan of the debug scan register.

These are described in *The RISC-V Debug Specification*, *Version 0.13*.

### 8.6 Using Debug Outputs

The debug module logic in SiFive Systems drives two output signals: ndreset and dmactive. These signals can be used in integration. It is suggested that the ndreset signal contribute to the system reset. It must be synchronized before it contributes back to the RISC-V Core IP's overall reset signal. This signal must not contribute to the jtag\_reset signal. The dmactive signal can be used, for example, to prevent clock or power gating of the debug module logic while debugging is in progress.

# **Implementation**

This chapter will describe the steps necessary to synthesize U54.

### 9.1 Top Level

The top level verilog module is defined in the file: verilog/design/U5\_CoreIPSubsystem.v

All top level interfaces are described in the the U54 Chapter 4.

## 9.2 Clocking

The U54 has the following main clock inputs: core\_clock\_X, clock, rtc\_toggle. core\_clock is used to clock the processor and Level 1 memory system. clock is used to clock the bus, PLIC and debug interfaces. rtc\_toggle is exposed via the architectually defined real time counter exposed in the mtime CSR.

The relationship between the clock input frequencies are as follows:

$$core\_clock \ge clock > (2 \times rtc\_toggle)$$



Figure 5: U54 Clock Diagram.

#### 9.2.1 Clocking Guidelines

Depending on the maximum frequency of a given implementation, it may be advantageous to have clock run a lower frequency to core\_clock. As depicted in Figure 5, clock is used by the bus interfaces, PLIC, and debug interfaces. It is common at higher frequencies for the busses, or a PLIC with a large number of interrupts and/or interrupt privilege levels, to become the critical path. In order to achieve higher frequencies, it is therefore recommended that the frequency of clock be half of core\_clock. As a result of this, the PLIC will have slightly increased latency generating interrupts, and the maximum bandwidth available in the bus interfaces will be reduced. In most circumstances, these are acceptable tradeoffs. If targeting lower frequencies, it is OK to have clock and core\_clock be of equal frequencies.

rtc\_toggle is used by software for time keeping and therefore not necessary to have a high frequency. Generally, rtc\_toggle is connected to either a Real Time Clock (e.g. 32.768 kHz) or to the base clock input frequency of the platform. The only restrictions on rtc\_toggle are that it must be strictly less than half the frequency of clock and that it must remain constant. Furthermore, for RISC-V compliance, software must be made aware of this frequency via a header file or similar.

### 9.3 Retiming

A list of all modules (if any) which require retiming is included in the design deliverables. The list can be found in the *info/retiming\_modules.txt* file.

### 9.4 Gate Level Simulation

To avoid X propagation, all state such as flip-flops and SRAMs must be initialized at the beginning of simulation, as well as after each power-up event if conducting power-aware simulation. Additionally, ensure that the SRAM models never produce X values by modifying them as necessary.

For Synopsys Design Compiler and IC Compiler, the command all\_registers -output\_pins can be used to enumerate all state elements in the design. The UCLI command force -deposit <inst> <value> can be used in VCS simulation to force the output pins to 0 or 1. VCS also requires a PLI table file to enable wn capability (debug access) on these instances.

Note that if the standard cell logic library contains flip-flops with inverted outputs (i.e., Q and QN), those pins must be initialized to opposite values.

# **Simulation Testbench**

### 10.1 Included Test Bench

The U54 includes an example test simulation environment which is designed to work with Synopsys VCS version K-2015.09-SP2 or higher, Cadence Xcelium version 17.05 or higher, and Verilator version 4.014 or higher. Several tests are included as part of the testbench to verify functionality and can be run using the included Makefile. The testbench tests and their sources are located in the included Freedom E SDK environment, specifically the freedom-e-sdk/software folder. The Freedom E SDK environment contains all of the necessary source code, Makefiles, and utilities necessary to recompile the included tests as well as the ability to easily extend the testbench with additional tests. Freedom E SDK requires a suitable toolchain which can be downlaoded from: https://www.sifive.com/boards

For more information on the Freedom E SDK environment, please read the readme file located in the *freedom-e-sdk* directory.

#### 10.1.1 Executing the Testbench

GNU Make is used to build the RTL into a simulator and run the included binary test files. The Make targets are described below:

- · clean Cleans the build
- · all Runs all tests on all simulators
- · all-verilator Runs all tests using verilator
- all-vcs Runs all tests using vcs
- all-xrun Runs all tests using xcelium
- all-waves Runs all tests while dumping waveforms in VPD format from all simulators
- all-verilator-waves Runs all tests using verilator while dumping wafeforms in VPD format
- all-vcs-waves Runs all tests using vcs while dumping wafeforms in VPD format
- all-xrun-waves Runs all tests using xcelium while dumping wafeforms in VPD format

- test.out Runs the test called test
- test.vpd Runs the test called test and produces a waveform in VPD format

#### Executing the command:

```
make all-waves
```

will run all of the tests in the tests folder and produce the resulting <test\_name>.out and <test\_name>.vpd files which can then be analyzed for detailed execution information.

### 10.2 Testbench Output

The testbench is capable of producing two types of output files: .out and .vpd. .out files contain a trace of all instructions executed by the processor. .vpd files contain VPD waveforms of the design which can be viewed with a waveform viewer such as Synopsys DVE.

#### 10.2.1 Testbench Output - Trace

The test bench will produce output files with the filename <test\_name>.out. The output files contain a cycle-by-cycle dump of a core's write-back stage. An example output is provided below:

#### Format:

```
core id: cycle [valid] pc=[address] Written[register=value][valid] Read[regis-
ter=value] Read[register=value]
```

#### Example:

C0: 483 [1] pc=[00000002138] W[r 3=000000007fff7fff][1] R[r 1=0000000007fffffff] R[r 2=ffffffffff8000]

C0: 484 [1] pc=[0000000213c] W[r29=000000007fff8000][1] R[r31=fffffff80007ffe] R[r31=000000000000000005]

The first [1] at cycle 483, core 0, shows that there's a valid instruction at PC 0x2138 in the writeback stage. The second [1] tells us that the register file is writing r3 with the corresponding value 0x7fff7fff. When the add instruction was in the decode stage, the pipeline had read r1 and r2 with the corresponding values next to it. Similarly at cycle 484, there's a valid instruction at PC 0x213c in the writeback stage. At cycle 485, there isn't a valid instruction in the writeback stage, perhaps, because of a instruction cache miss at PC 0x2140.

#### 10.2.2 Testbench Output - Waves

When running the included testbench with the all-waves or <test\_name>.vpd targets, the testbench will create VPD formatted waveforms which can be viewed with a waveform viewer such as Synopsys DVE. The waveform along with the trace log can be helpful when debugging tests run on the testbench.

#### 10.2.3 Adding Tests To The Included Testbench

The simplest way to add new tests to the testbench is to start from one of the included tests. The return-pass test contains an empty main function which returns 0 (pass value) and is suitable for starting a new test. The example below demonstrates how to build a new test starting from return-pass.

- In the *freedom-e-sdk/software* directory, make a copy of the *return-pass* folder and name the copied folder to the name of your test. For this example we will use \$TEST.
- In the \$TEST directory, edit the makefile variable *PROGRAM* to match the name \$TEST.
- In the \$TEST directory, change the filename of return-pass.c to \$TEST.c
- Use the Freedom E SDK makefile to build the test targeting the *RTL* BSP and the *release* Configuration. The name of your BSP can be found in the *freedom-e-sdk/bsp* directory.

make TARGET=deliverable-name-rtl PROGRAM=\$TEST CONFIGURATION=release software

• In the base directory of the deliverable, it is now possible to run the new test using the testbench makefile.

make \$TEST.out

For more information on using Freedom E SDK and its environment, please read the readme file located in the *freedom-e-sdk* directory.

### 10.3 SiFive Insight

U54 is enabled with SiFive Insight technology which provides deep visibility into the design while at the same time being easily accessible. SiFive Insight is a verilog module that contains a curated list of signals chosen by the designers and presented in an intuitive hierarchy with descriptive names. SiFive Insight is primarily meant to be used during simulation waveform debugging and allows for a deep understanding of what is happening inside the SiFive deliverable without knowing details of the design.

Note that some signals in SiFive Insight are pseudo-signals which represent several signals with logic applied to them in order to present a more useful higher-level function. For example, the Instruction Commit signal in a design may be the logical combination of several signals. SiFive Insight also manages the grouping of signals to improve readability. An example of this would be how SiFive Insight presents the mstatus CSR. SiFive Insight presents mstatus such that each field in the CSR is grouped together, improving readability directly from the waveform viewer.

A complete list of SiFive Insight signals, along with descriptions, can be found in the info/sifive insight.yml file located in the deliverables.

#### 10.3.1 Viewing SiFive Insight Signals

Follow the instructions in Section 10.1.1 to execute the testbench and generate the resulting VPD wave files. This can be done with either the make all-waves or make <test\_name>.vpd Make targets.

Once the test completes, open the VPD wave file with a waveform viewer such as Synopsys DVE. The SiFive Insight module can be found under the Verilog module hierarchy TestDriver/testHarness/system/SiFive\_Insight or by simply searching for SiFive\_Insight.

From here it is possible to add the SiFive Insight signals to the waveform viewer.

#### 10.3.2 Enabling SiFive Insight Outside of the SiFive Testbench

To enable SiFive Insight in testbenches other than the one provided with the SiFive deliverables, simply include the SiFive Insight Verilog files with the testbench and design under test during the compilation of the simulator. The SiFive Insight Verilog files are located in the verilog/sifive\_insight/directory. Waveform dumps will automatically include the SiFive Insight signals.