

# What is FABulous offering?

- Fully integrated open-source FPGA framework with good quality of results (area & performance)
- Entirely open and free, including commercial use (we integrated many other projects)
- Supports custom cells (if provided)
- Supports partial reconfiguration
- Designed for ease of use while providing full control as needed
- Versatile
  - Different flows (OpenLane ↔ Cadence; Yosys/nextpnr ↔ VPR)
  - Easy to customize, including the integration of own IP

## Basic concepts



- Basic tiles have the same height, but a type-specific width (for logic tiles, DSPs, etc.)
- Adjacent tiles can be fused for more complex blocks (see the DSP example)

### Let's build a small eFPGA: Fabric Definition

Floorplan (figure): I/O pins on the left, then a register-file (REG) column, a DSP column built from DSP_top/DSP_bot pairs, two LUT columns, and the CPU I/O on the right, with terminal (term) tiles along the top and bottom edges.

The same fabric as a spreadsheet:

|   | A           | B       | C       | D      | E      | F      |
|---|-------------|---------|---------|--------|--------|--------|
| 1 | FabricBegin |         |         |        |        |        |
| 2 | NULL        | N_term  | N_term  | N_term | N_term | NULL   |
| 3 | W_IO        | RegFile | DSP_top | LUT4AB | LUT4AB | CPU_IO |
| 4 | W_IO        | RegFile | DSP_bot | LUT4AB | LUT4AB | CPU_IO |
| 5 | W_IO        | RegFile | DSP_top | LUT4AB | LUT4AB | CPU_IO |
| 6 | W_IO        | RegFile | DSP_bot | LUT4AB | LUT4AB | CPU_IO |
| 7 | NULL        | S_term  | S_term  | S_term | S_term | NULL   |
| 8 | FabricEnd   |         |         |        |        |        |

- 4 x register file, 2 x DSPs, 8 x LUT-tiles (CLB), I/Os left and right
- A fabric is modelled as a spreadsheet (tiles are references to tile descriptors)
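
Because the fabric is just a grid of tile names, simple tooling can read it directly. The following is a minimal sketch, assuming the spreadsheet is exported as CSV with `FabricBegin`/`FabricEnd` marker rows; the function name `parse_fabric` is hypothetical and this is not FABulous's own parser.

```python
import csv
import io
from collections import Counter

# The fabric spreadsheet from the slide, written out as CSV (assumed layout).
FABRIC_CSV = """\
FabricBegin
NULL,N_term,N_term,N_term,N_term,NULL
W_IO,RegFile,DSP_top,LUT4AB,LUT4AB,CPU_IO
W_IO,RegFile,DSP_bot,LUT4AB,LUT4AB,CPU_IO
W_IO,RegFile,DSP_top,LUT4AB,LUT4AB,CPU_IO
W_IO,RegFile,DSP_bot,LUT4AB,LUT4AB,CPU_IO
NULL,S_term,S_term,S_term,S_term,NULL
FabricEnd
"""


def parse_fabric(text):
    """Return the tile grid (list of rows) found between the two markers."""
    grid = []
    inside = False
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # skip blank rows
        first = row[0].strip()
        if first == "FabricBegin":
            inside = True
        elif first == "FabricEnd":
            inside = False
        elif inside:
            grid.append([cell.strip() for cell in row])
    return grid


grid = parse_fabric(FABRIC_CSV)
# NULL marks an empty grid position, so it is not counted as a tile.
counts = Counter(tile for row in grid for tile in row if tile != "NULL")
print(counts["RegFile"], counts["LUT4AB"], counts["DSP_top"])  # 4 8 2
```

The counts match the bullet above: four register-file tiles, eight LUT tiles, and two DSP_top/DSP_bot pairs.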

### Let's build a small eFPGA: Tile Definition



- Wires
- Primitives (basic elements)
- Switch matrix

Wires:

| TILE       | LUT4AB |          |          |             |       |
|------------|--------|----------|----------|-------------|-------|
| #direction | source | X-offset | Y-offset | destination | wires |
| EAST       | E1BEG  | 1        | 0        | E1END       | 6     |
| WEST       | W4BEG  | -4       | 0        | W4END       | 3     |
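
Each wire row says where a bundle of wires starts, how far it travels, and where it ends. A minimal sketch of that reading, assuming x grows towards the east and that individual wires are named by appending an index to the port name (as in `N1BEG0` in the switch matrix); the `Wire` class and both helper functions are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wire:
    direction: str
    source: str
    x_offset: int
    y_offset: int
    destination: str
    wires: int


# The two rows shown for the LUT4AB tile.
LUT4AB_WIRES = [
    Wire("EAST", "E1BEG", 1, 0, "E1END", 6),
    Wire("WEST", "W4BEG", -4, 0, "W4END", 3),
]


def destination_tile(x, y, wire):
    """Tile coordinate reached by a wire that starts in tile (x, y)."""
    return (x + wire.x_offset, y + wire.y_offset)


def expand(x, y, wire):
    """One (source port, destination tile, destination port) entry per wire."""
    target = destination_tile(x, y, wire)
    return [
        (f"{wire.source}{i}", target, f"{wire.destination}{i}")
        for i in range(wire.wires)
    ]


for w in LUT4AB_WIRES:
    print(w.source, "->", destination_tile(5, 2, w), w.destination)
```

From tile (5, 2), `E1BEG` ends one tile to the east and `W4BEG` ends four tiles to the west.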

#### Switch matrix adjacency

| LUT4AB | N1END0 | N1END1 |
|--------|--------|--------|
| N1BEG0 | 1      | 0      |
| N1BEG1 | 0      | 0      |
| N1BEG2 | 0      | 1      |
| N1BEG3 | 1      | 0      |
| N2BEG0 | 0      | 0      |
| N2BEG1 | 0      | 0      |
| N2BEG2 | 0      | 1      |
| N2BEG3 | 0      | 0      |
| N2BEG4 | 0      | 1      |
| N2BEG5 | 0      | 0      |
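
The adjacency table can be turned into per-output multiplexer input lists. A minimal sketch, assuming rows are switch-matrix outputs, columns are candidate inputs, and a 1 marks a programmable connection; the binary-encoded select width in `config_bits` is also an assumption, and only the excerpt above is encoded:

```python
# Excerpt of the LUT4AB adjacency table (two input columns only).
COLUMNS = ["N1END0", "N1END1"]
ROWS = {
    "N1BEG0": [1, 0],
    "N1BEG1": [0, 0],
    "N1BEG2": [0, 1],
    "N1BEG3": [1, 0],
    "N2BEG0": [0, 0],
    "N2BEG1": [0, 0],
    "N2BEG2": [0, 1],
    "N2BEG3": [0, 0],
    "N2BEG4": [0, 1],
    "N2BEG5": [0, 0],
}


def mux_inputs(rows, columns):
    """Map each output to the list of inputs it can select from."""
    return {
        out: [col for col, bit in zip(columns, bits) if bit]
        for out, bits in rows.items()
    }


def config_bits(n_inputs):
    """Select bits for a binary-encoded mux: ceil(log2(n)), 0 for n <= 1."""
    return max(n_inputs - 1, 0).bit_length()


muxes = mux_inputs(ROWS, COLUMNS)
fanout = {col: sum(bits[i] for bits in ROWS.values()) for i, col in enumerate(COLUMNS)}
print(muxes["N1BEG0"], fanout)
```

In this two-column excerpt no output has more than one input, so no select bits would be needed; a full matrix has many more columns per row.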



# The FABulous eFPGA Ecosystem

- FABulous eFPGA generator
  - ASIC RTL and constraints generation
  - Generating models for nextpnr/VPR flows
  - FPGA emulation
- Virtex-II, Lattice clones (patent-free!)
- See our FPGA 2021 paper "FABulous: An Embedded FPGA Framework"



# eFPGA - Customization and Integration

- Case study in the 130 nm SkyWater process (open PDK), ~10 mm²
- 864 x LUT-4
- 6 x DSPs (8x8 multiplier and 20-bit accumulator)
- 12 register-file slices
  - 4 bits wide, 32 entries deep
  - 1 write, 2 read ports
- 6 x 16kb RAM blocks (using OpenRAM)
  - 1 read, 1 write port
  - Programmable aspect ratios for read and write ports
- Partial reconfiguration
- Internal self-configuration port
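
To make the register-file numbers concrete, here is a behavioural sketch of one slice (4 bits wide, 32 entries, 1 write and 2 read ports). The class name, the address wrap-around, and the value masking are assumptions for illustration; clocking and any registered read outputs are not modelled.

```python
class RegFileSlice:
    """Behavioural model of one register-file slice (illustrative only)."""

    WIDTH = 4   # bits per entry
    DEPTH = 32  # number of entries

    def __init__(self):
        self.mem = [0] * self.DEPTH

    def write(self, addr, data):
        """Single write port: store the low WIDTH bits of data."""
        self.mem[addr % self.DEPTH] = data & ((1 << self.WIDTH) - 1)

    def read(self, addr_a, addr_b):
        """Two read ports: return both entries in one call."""
        return self.mem[addr_a % self.DEPTH], self.mem[addr_b % self.DEPTH]


rf = RegFileSlice()
rf.write(3, 0b1010)
rf.write(7, 0x1F)  # only the low 4 bits are kept
print(rf.read(3, 7))

# Storage across the 12 slices of the case study.
total_bits = 12 * RegFileSlice.WIDTH * RegFileSlice.DEPTH
print(total_bits)
```

Twelve such slices hold 12 x 4 x 32 = 1536 bits in total.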



## Implementing User Circuits (Verilog-to-Bitstream)

```
module demo(
    clock,
    a,
    b,
    c,
    out
);
    input  clock;
    input  a;
    input  b;
    input  c;
    reg    d;
    output out;

    // ASSIGN STATEMENTS
    always @(posedge clock)
    begin
        d <= a & b;
    end
    assign out = c | d;
endmodule
```

Verilog-to-bitstream using VPR or Yosys/nextpnr.

The result is a FASM file that we translate into the bitstream.
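
Conceptually, this last step looks up each enabled FASM feature and sets the configuration bits it controls. The sketch below only illustrates that idea: the feature names, the `FEATURE_BITS` table, and the flat bit layout are all hypothetical and do not describe FABulous's actual bitstream format.

```python
# Hypothetical feature-to-bit map; a real map would be generated per fabric.
FEATURE_BITS = {
    "X1Y1.N1END0.N1BEG0": [0],
    "X1Y1.N1END1.N1BEG2": [3],
    "X1Y1.A.INIT[0]": [8],
    "X1Y1.A.INIT[1]": [9],
}

FASM_TEXT = """\
# routing
X1Y1.N1END0.N1BEG0
# LUT contents
X1Y1.A.INIT[1]
"""


def fasm_to_bits(fasm_text, feature_bits, n_bits):
    """Set one bit per position listed for every feature named in the FASM."""
    bits = bytearray((n_bits + 7) // 8)
    for line in fasm_text.splitlines():
        feature = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not feature:
            continue
        for pos in feature_bits[feature]:  # unknown features raise KeyError
            bits[pos // 8] |= 1 << (pos % 8)
    return bytes(bits)


bitstream = fasm_to_bits(FASM_TEXT, FEATURE_BITS, n_bits=16)
print(bitstream.hex())
```

Raising on an unknown feature is a deliberate choice in this sketch: silently skipping it would produce a bitstream that does not match the placed-and-routed design.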

# FABulous versus OpenFPGA (on Sky130)



- New optimizations gave us a further 21.7% in density on the same netlist!
- Shuttle3: we plan an open-everything FPGA with better density than Shuttle2 (Cadence)

# FABulous Contributors

#### People:

Nguyen Dao nguyen.dao@manchester.ac.uk

Jing Li jing.li@manchester.ac.uk

Khoa Pham khoa.pham@manchester.ac.uk

Bardia Babaei bardia.babaei@postgrad.manchester.ac.uk

King Chung king.chung@student.manchester.ac.uk

Tuan La tuan.la@manchester.ac.uk

Andrew Attwood a.j.attwood@ljmu.ac.uk

Bea Healy tabitha.healy@student.manchester.ac.uk

Dirk Koch dirk.koch@manchester.ac.uk

#### See our projects under:

https://github.com/FPGA-Research-Manchester

This work is kindly supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant EP/R024642/1



The University of Manchester



