
LETC Architecture Overview

John Jekel edited this page Mar 14, 2024 · 9 revisions

Overview

LETC is the Little Engine That Could (Run Linux).

We are targeting a Xilinx Zynq-7000 series FPGA, specifically the one on the Digilent Cora Z7.

The star of the show is LETC Core. LETC Core is an in-order RISC-V RV32IMAZicsrZifencei implementation with a 6-stage pipeline. It has L1 caches and acts as the sole AXI manager in the LETC SoC. In addition to its AXI port, it has two interrupt inputs, one for timer interrupts and one for external interrupts.

LETC Core connects to the LETC Matrix, which performs physical address decoding and routes each request to one of two places: soft-logic peripherals and memories, or peripherals and memories in the PS (the Zynq processing system). Requests to the PS travel over either the 64-bit AXI ACP port or the 32-bit general-purpose port.
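The decoding step described above can be sketched as a small behavioral model. This is only an illustration: the region names, base addresses, and sizes below are hypothetical placeholders, not the real LETC address map, and the real Matrix is soft logic rather than Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    name: str  # hypothetical routing target
    base: int  # first physical address in the region
    size: int  # region length in bytes


# HYPOTHETICAL address map, for illustration only.
ADDRESS_MAP = (
    Region("pl_memory", 0x0000_0000, 0x0001_0000),      # soft-logic memory
    Region("pl_peripherals", 0x1000_0000, 0x0000_1000),  # soft-logic peripherals
    Region("ps_gp_port", 0x4000_0000, 0x4000_0000),      # forwarded to the PS
)


def decode(paddr: int) -> Optional[Region]:
    """Return the region that contains physical address paddr, or None if unmapped."""
    for region in ADDRESS_MAP:
        if region.base <= paddr < region.base + region.size:
            return region
    return None
```

An unmapped address returns `None` here; what the real Matrix does in that case (for example, returning an AXI decode error) is not specified on this page.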

All LETC soft logic operates in a single clock domain for ease of implementation. It uses only positive-edge-triggered flops and a negative-logic asynchronous reset signal. Both the clock and the reset are software controllable from the ARM core, and are managed by the Conductor project to facilitate memory initialization, LETC boot configuration, and debugging.

After FPGA initialization, LETC is held in reset while Conductor, running on the ARM core, loads a firmware / test program / kernel image into memory from either the SD card (using one of the Zynq-7000's SDIO peripherals) or from the host over UART0. In general, all host <-> ARM communication after FPGA configuration is done over UART0.

Once LETC has been configured and its memory initialized, it is released from reset. Software running on LETC generally uses UART1 for bidirectional communication with the host.
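The ordering constraint in this boot flow (load the image while LETC is in reset, then release reset) can be captured in a toy model. Every name here (`LetcSocModel`, `load_image`, `release_reset`, `conductor_boot`) is hypothetical and is not taken from the Conductor code base; the sketch ignores where the image comes from (SD card or UART0).

```python
class LetcSocModel:
    """Toy model of the reset/boot handshake; all names are hypothetical."""

    def __init__(self, mem_size: int) -> None:
        self.rst_n = 0                     # active-low reset: 0 means held in reset
        self.memory = bytearray(mem_size)  # memory that Conductor initializes
        self.log = []                      # ordered record of boot steps

    def load_image(self, image: bytes, offset: int = 0) -> None:
        """Copy an image into memory; only allowed while LETC is in reset."""
        if self.rst_n != 0:
            raise RuntimeError("image must be loaded while LETC is held in reset")
        if offset < 0 or offset + len(image) > len(self.memory):
            raise ValueError("image does not fit in memory")
        self.memory[offset:offset + len(image)] = image
        self.log.append("image loaded")

    def release_reset(self) -> None:
        """Deassert the active-low reset so LETC starts executing."""
        self.rst_n = 1
        self.log.append("reset released")


def conductor_boot(soc: LetcSocModel, image: bytes, offset: int = 0) -> None:
    """Boot sequence as described above: initialize memory, then release reset."""
    soc.load_image(image, offset)
    soc.release_reset()
```

Rejecting a load after reset has been released is a choice made for this sketch to make the ordering explicit; the page does not say whether Conductor may write memory while LETC is running.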

LETC may have memory-mapped access to Cora Z7 HIDs including buttons and LEDs. During development, several pins on the board may be used for debugging purposes along with a logic analyzer.

Top Level Block Diagram

TODO stick with just using the single general-purpose GP port to begin with, then decide on either ACP or HP ports.

TODO perhaps add a side-channel for Conductor to tell letcsbi it wants something? Either a simple AXI subordinate, hijacking FCLK_RESET1_N, or EMIO with the GPIO PS peripheral.

[Figure: top-level block diagram (letc_fc-v0 6 0 drawio)]

Requirements/Goals

  • LETC Core

    • 250 MHz performance target with an average CPI of 1.5 (assuming cache and TLB hits) THIS WILL LIKELY CHANGE
    • RV32IMAZicsrZifencei, but with leniency to implement certain CSRs and instructions in M-mode code.
      • We should strive to at least do multiplication in hardware. See the EEI for what we finally end up doing
    • Single hart, very basic branch prediction (static even)
    • L1 instruction and data caches, x KiB each
    • ITLB and DTLB, x entries each, (shared?) hardware page table walker
    • AXI4 manager, single port
  • LETC Matrix

    • todo
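As a back-of-the-envelope reading of the LETC Core performance target listed above, the clock frequency and average CPI together imply a sustained instruction rate. The helper name is hypothetical, and the figure only holds under the same assumption as the target itself (cache and TLB hits).

```python
def instructions_per_second(clock_hz: float, cpi: float) -> float:
    """Sustained instruction rate for a given clock frequency and average CPI."""
    return clock_hz / cpi


rate = instructions_per_second(250e6, 1.5)
print(f"{rate / 1e6:.1f} million instructions per second")  # prints 166.7
```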

Emulated/Skipped RISC-V Features

These features will not be natively implemented in hardware. Rather they will be emulated by m-mode software running on LETC.

  • Division instructions that are part of the RISC-V M extension.
  • Potentially certain variants of the high multiply instructions, depending on how picky Vivado's DSP inference is.
  • Several CSRs, including: TBD
  • Potentially the CLINT? It is only needed for the m-mode kernel for timer interrupts. For the s-mode kernel we could just do something non-standard with CSRs
  • Potentially certain instructions in the A extension if feasible. Ideally we could emulate all of them; more likely, we can get away with just handling lr and sc.
  • TODO others?
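To illustrate what emulating the division instructions involves, here is a minimal reference model of the RV32 M-extension semantics that an M-mode trap handler would have to reproduce. It is a Python sketch, not the actual M-mode code (which would not be written in Python), and the function names are chosen for illustration. Operands and results are 32-bit register values held as unsigned integers. The special cases follow the RISC-V specification: division by zero does not trap, and signed overflow returns the dividend.

```python
MASK32 = 0xFFFF_FFFF


def to_signed(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x8000_0000 else x


def divu(a: int, b: int) -> int:
    """DIVU: unsigned quotient; division by zero yields all ones."""
    a &= MASK32
    b &= MASK32
    return MASK32 if b == 0 else a // b


def remu(a: int, b: int) -> int:
    """REMU: unsigned remainder; division by zero yields the dividend."""
    a &= MASK32
    b &= MASK32
    return a if b == 0 else a % b


def div(a: int, b: int) -> int:
    """DIV: signed quotient rounded toward zero; x / 0 yields -1."""
    sa, sb = to_signed(a), to_signed(b)
    if sb == 0:
        return MASK32
    q = abs(sa) // abs(sb)  # divide magnitudes so rounding is toward zero
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK32       # masking also maps the -2**31 / -1 overflow to -2**31


def rem(a: int, b: int) -> int:
    """REM: signed remainder with the sign of the dividend; x % 0 yields x."""
    sa, sb = to_signed(a), to_signed(b)
    if sb == 0:
        return sa & MASK32
    r = abs(sa) % abs(sb)
    if sa < 0:
        r = -r
    return r & MASK32
```

For example, `div(0xFFFFFFF9, 2)` (that is, -7 / 2) returns `0xFFFFFFFD` (-3), where Python's own `//` operator would round down to -4.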

These features will not be natively implemented in hardware, nor will they be emulated, as they are only relevant to m-mode software (which we have complete control over) and not s-mode Linux.

  • PMP registers / regions
  • Standard debug registers
  • TODO others?

Division of Memory Access Responsibility

  • Virtual addresses are NEVER a concept outside of the Core. The core only ever requests read or write accesses to PHYSICAL addresses.
  • L1 caching is NEVER a concept outside of the Core. The requests that the core makes to the matrix do not indicate whether it is interacting with a cache inside.
  • The core should have a way to make uncached accesses to MMIO memory that bypass the L1 cache.
  • The matrix should NEVER cache accesses to MMIO devices in an L2 sense. Memories, since they have no side effects, may employ an L2 cache (ex. the one in the PL that is backing the DRAM controller)
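The split above can be sketched as a toy model of the core side. All names (`MatrixRequest`, `CoreModel`, the MMIO window) are hypothetical, and address translation is left out because it happens inside the core before this point. The request that reaches the matrix carries only a physical address and a direction, with no hint about caching, and MMIO loads always bypass the L1.

```python
from dataclasses import dataclass
from typing import Callable

# HYPOTHETICAL MMIO window, for illustration only.
MMIO_BASE = 0x1000_0000
MMIO_SIZE = 0x1000_0000


@dataclass(frozen=True)
class MatrixRequest:
    paddr: int      # physical address only; virtual addresses never leave the core
    is_write: bool  # note: no field says whether the core cached the access


def is_mmio(paddr: int) -> bool:
    """True if paddr falls in the (hypothetical) MMIO window."""
    return MMIO_BASE <= paddr < MMIO_BASE + MMIO_SIZE


class CoreModel:
    def __init__(self) -> None:
        self.l1 = {}    # toy L1: physical address -> value
        self.sent = []  # every request that reached the matrix

    def load(self, paddr: int, matrix_read: Callable[[int], int]) -> int:
        """Load from paddr, using the L1 for memory and bypassing it for MMIO."""
        cacheable = not is_mmio(paddr)
        if cacheable and paddr in self.l1:
            return self.l1[paddr]  # L1 hit: the matrix sees nothing
        self.sent.append(MatrixRequest(paddr, is_write=False))
        value = matrix_read(paddr)
        if cacheable:
            self.l1[paddr] = value  # only side-effect-free memory is cached
        return value
```

In this model a repeated load from memory reaches the matrix once, while a repeated load from an MMIO address reaches it every time, which is what a device with read side effects needs.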