#### Tutorial 101: Benchmarking and Architecture Exploration



# OpenFPGA Tutorial

Tutorial 101: Benchmarking and Architecture Exploration

ORGANIZERS





# Objective

**Evaluate** the impact of **different FPGA architecture choices** on the given set of benchmarks **using OpenFPGA**.

# **Architecture Exploration Flow**



# Given set of Benchmarks

1. **ch\_intrinsics**: Memory Init
2. **diffeq1**: Arithmetic Unit
3. **diffeq2**: Arithmetic Unit
4. **sha**: Cryptography Unit

More benchmarks are available at `openfpga_flow/benchmarks/`.

# **Candidate Architectures**

1. Homogeneous FPGA architecture
2. 300 tracks per routing channel
3. All wires are length-4
4. All architectures have a full crossbar in the CLB



# Fracturable Logic Element (FLE)

**Arch 1: 6-input LUT** (1× 6-input LUT)

Figure: LUT6 FLE\_0 [n1\_lut6]

**Arch 2: Fracturable LUT** (1× 6-input or 2× 5-input LUT)
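
To see why one 6-input LUT can also serve as two 5-input LUTs, note that any 6-input function splits into two 5-input halves selected by the sixth input. The Python sketch below only illustrates that idea on truth tables; the helper names `lut_eval` and `split_lut6` are hypothetical, and the code does not model the actual FLE circuit used by OpenFPGA.

```python
from itertools import product

def lut_eval(truth_table, inputs):
    """Look up the output bit; inputs[0] is the least significant address bit."""
    address = sum(bit << i for i, bit in enumerate(inputs))
    return truth_table[address]

def split_lut6(truth_table):
    """Split a 64-entry LUT6 table into two 32-entry LUT5 tables (x5 = 0 and x5 = 1)."""
    assert len(truth_table) == 64
    return truth_table[:32], truth_table[32:]

# Example 6-input function: parity of all six inputs.
lut6 = [bin(addr).count("1") % 2 for addr in range(64)]
low, high = split_lut6(lut6)

# A mux on x5 over the two LUT5 halves must reproduce the original LUT6.
for bits in product((0, 1), repeat=6):
    half = high if bits[5] else low
    assert lut_eval(half, bits[:5]) == lut_eval(lut6, bits)
```

In the fractured mode the two halves are configured independently, so one FLE can host two smaller functions instead of one large one.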



## **Binder**



Press Ctrl+Shift+` to open a terminal.

# How to use OpenFPGA

## **Load OpenFPGA Environment**

```
> source openfpga.sh
OPENFPGA_PATH=/opt/openfpga
shopt
```

## **Create OpenFPGA Task**

```
# create-task <new_task_dir_name> <template_name>
> create-task lab1 template_tasks/frac-lut-arch-explore_template
Creating task lab1
Template project template_tasks/frac-lut-arch-explore_template
```

## **Run OpenFPGA Task**

```
# run-task <task_dir_name>
> run-task lab1
```

**Hint:** use **Tab** to auto-complete.

4/18/23, 2:54 PM openfpga-tutorial-101

# Content of the task directory

Any directory with a config/task.conf file is an OpenFPGA task directory.

```
> tree lab1 -L 2
lab1/
├── config
│   └── task.conf
├── k6_frac_N10_tileable.xml
├── k6_N10_tileable.xml
└── vtr_benchmark_template_script.openfpga
```

# **OpenFPGA-Shell Commands**

(Similar to the TCL script file of any EDA tool)

```
vpr ${VPR_ARCH_FILE} ${VPR_TESTBENCH_BLIF} \
    --route_chan_width ${VPR_ROUTE_CHAN_WIDTH} \
    --constant_net_method route
```
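
The `${...}` names in the template script are placeholders that get filled in for each job. As a rough illustration of that substitution step (not OpenFPGA's own implementation), the Python sketch below uses the standard `string.Template` class; the file names in `values` are hypothetical stand-ins for what the flow would derive from task.conf.

```python
from string import Template

# Command line as it appears in the task's .openfpga template script.
template = Template(
    "vpr ${VPR_ARCH_FILE} ${VPR_TESTBENCH_BLIF} "
    "--route_chan_width ${VPR_ROUTE_CHAN_WIDTH} "
    "--constant_net_method route"
)

# Hypothetical values for one job; the real flow derives them from task.conf.
values = {
    "VPR_ARCH_FILE": "k6_frac_N10_tileable.xml",
    "VPR_TESTBENCH_BLIF": "memset.blif",
    "VPR_ROUTE_CHAN_WIDTH": "300",
}

command = template.substitute(values)
print(command)
```

`substitute` raises `KeyError` if a placeholder has no value, which is a convenient way to catch a misspelled variable name early.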

# Configuration File Content

#### **General Section**

```
[GENERAL]
run_engine=openfpga_shell # default
power_tech_file = ${PATH:OPENFPGA_PATH}/openfpga_flow/tech/PTM_45nm/45nm.xml
power_analysis = false
spice_output=false
verilog_output=true
timeout_each_job = 20*60
fpga_flow=yosys_vpr # yosys_vpr or vpr_blif
```

#### **OpenFPGA Shell Section**

```
[OpenFPGA_SHELL]
openfpga_shell_template=${PATH:TASK_DIR}/vtr_benchmark_template_script.openfpga
openfpga_arch_file=${PATH:OPENFPGA_PATH}/openfpga_flow/openfpga_arch/k6_frac_N10…
openfpga_sim_setting_file=${PATH:OPENFPGA_PATH}/openfpga_flow/openfpga_simulatio…
vpr_route_chan_width=300
```

- **\${PATH:TASK\_DIR}**: Points to the root directory of the task
- **\${PATH:OPENFPGA\_PATH}**: Points to the root directory of the OpenFPGA repository
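
A minimal sketch of how such `${PATH:...}` placeholders could be expanded is shown below. It is an illustration only, assuming a plain text substitution; the `resolve` helper and the task location `/home/user/lab1` are hypothetical, while `/opt/openfpga` is the value printed when the environment is loaded.

```python
import re

# Hypothetical locations; in a real run they depend on the installation and the task.
path_vars = {
    "OPENFPGA_PATH": "/opt/openfpga",
    "TASK_DIR": "/home/user/lab1",
}

def resolve(value, variables):
    """Replace every ${PATH:NAME} placeholder with variables[NAME]."""
    return re.sub(r"\$\{PATH:(\w+)\}", lambda m: variables[m.group(1)], value)

raw = "${PATH:TASK_DIR}/vtr_benchmark_template_script.openfpga"
resolved = resolve(raw, path_vars)
print(resolved)
```

Keeping paths relative to these two roots is what makes a task directory portable between machines.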

## VPR Architecture Section (2 architectures)

```
[ARCHITECTURES]
arch0=${PATH:TASK_DIR}/k6_N10_tileable.xml
arch1=${PATH:TASK_DIR}/k6_frac_N10_tileable.xml
```

- **\${PATH:VPR\_ARCH\_PATH}**: Points to the VPR architecture files in the OpenFPGA repository

## Benchmark Section (4 benchmarks)

```
[BENCHMARKS]
bench1=${PATH:BENCH_PATH}/vtr_benchmark/ch_intrinsics.v
bench2=${PATH:BENCH_PATH}/vtr_benchmark/diffeq1.v
bench3=${PATH:BENCH_PATH}/vtr_benchmark/diffeq2.v
bench4=${PATH:BENCH_PATH}/vtr_benchmark/sha.v
```

- **\${PATH:BENCH\_PATH}**: Points to the benchmarks in the OpenFPGA repository

## Synthesis Parameters

```
[SYNTHESIS_PARAM]
# Yosys script parameters
bench_read_verilog_options_common = -nolatches
bench_yosys_common=${PATH:OPENFPGA_PATH}/openfpga_flow/misc/ys_tmpl_yosys_vpr_flow
# Benchmark top-module name
bench1_top = memset
bench2_top = diffeq_paj_convert
bench3_top = diffeq_f_systemC
bench4_top = sha1
```

Run job name format: `<arch_num>_<top_module>_`
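
Every architecture is run against every benchmark, so this task produces 2 × 4 = 8 jobs. The Python sketch below enumerates the resulting job names; the `job_names` helper is hypothetical, and the two-digit architecture index is an assumption read off the names that appear in the results table (for example `00_memset_`).

```python
# Architectures and benchmark top modules in the order given in task.conf.
architectures = ["k6_N10_tileable", "k6_frac_N10_tileable"]
top_modules = ["memset", "diffeq_paj_convert", "diffeq_f_systemC", "sha1"]

def job_names(num_archs, tops):
    """Build <arch_num>_<top_module>_ names with a two-digit architecture index."""
    return [f"{arch:02d}_{top}_" for arch in range(num_archs) for top in tops]

names = job_names(len(architectures), top_modules)
print(len(names))   # 8
print(names[0])     # 00_memset_
```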

# Script Parameters Section

```
[SCRIPT_PARAM_]
# empty
```

# Post-execution Result Extraction

```
[DEFAULT_PARSE_RESULT_VPR]
01_lut6_use = "lut6 : ([0-9]+)", int
02_lut5_use = "lut5 : ([0-9]+)", int
```
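
Each entry pairs a result name with a regular expression and a type; the captured group becomes a column in the result file. The Python sketch below shows the idea on a made-up log excerpt. It is not OpenFPGA's parser: the `extract` helper is hypothetical, and taking the first match is an assumption.

```python
import re

# Patterns copied from the [DEFAULT_PARSE_RESULT_VPR] section.
patterns = {
    "01_lut6_use": (r"lut6 : ([0-9]+)", int),
    "02_lut5_use": (r"lut5 : ([0-9]+)", int),
}

# Hypothetical excerpt of a VPR log; real logs contain much more text.
log_text = """
  lut6 : 46
  lut5 : 224
"""

def extract(log, rules):
    """Return {name: converted first capture group} for every rule that matches."""
    results = {}
    for name, (pattern, convert) in rules.items():
        match = re.search(pattern, log)
        if match:
            results[name] = convert(match.group(1))
    return results

metrics = extract(log_text, patterns)
print(metrics)
```

A rule that never matches simply leaves its entry out, which is one plausible reason for an empty cell in a results table.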

# Architecture XML File Difference

Added in the fracturable architecture: `<mode name="n2_lut5">...</mode>`

# Run Directory (after task execution)

```
lab1/
├── config
│   └── task.conf
├── k6_frac_N10_tileable.xml
├── k6_N10_tileable.xml
├── vtr_benchmark_template_script.openfpga
├── latest
└── run001
    ├── k6_frac_N10_tileable       <<< arch1
    │   ├── memset                 <<< bench1
    │   ├── diffeq_paj_convert     <<< bench2
    │   ├── diffeq_f_systemC       <<< bench3
    │   └── sha1                   <<< bench4
    ├── k6_N10_tileable            <<< arch0
    │   ├── memset                 <<< bench1
    │   ├── diffeq_paj_convert     <<< bench2
    │   ├── diffeq_f_systemC       <<< bench3
    │   └── sha1                   <<< bench4
    └── task_result.csv            <<< result file
```

# Task Completed



Results are located at lab1/latest/task\_results.csv

# **Analyze Results**

| name                   | 01_lut6_use | 02_lut5_use | clb_blocks | total_wire_length |
|------------------------|-------------|-------------|------------|-------------------|
| 00_memset_             | 270         |             | 31         | 2210              |
| 01_memset_             | 46          | 224         | 26         | 2203              |
| 00_diffeq_paj_convert_ | 3540        |             | 368        | 43628             |
| 01_diffeq_paj_convert_ | 1316        | 2224        | 275        | 38642             |
| 00_diffeq_f_systemC_   | 3392        |             | 354        | 37096             |
| 01_diffeq_f_systemC_   | 1215        | 2177        | 260        | 33857             |
| 00_sha1_               | 1616        |             | 168        | 16027             |
| 01_sha1_               | 885         | 731         | 154        | 15182             |
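
One way to read the table is to compare the two architectures benchmark by benchmark. The Python sketch below computes the relative reduction in CLB count and wire length when moving from arch 00 (k6\_N10) to arch 01 (k6\_frac\_N10). The numbers are copied from the table; the `reduction` helper is a hypothetical name.

```python
# (clb_blocks, total_wire_length) per benchmark, copied from the results table.
# First tuple is arch 00 (k6_N10), second tuple is arch 01 (k6_frac_N10).
results = {
    "memset": ((31, 2210), (26, 2203)),
    "diffeq_paj_convert": ((368, 43628), (275, 38642)),
    "diffeq_f_systemC": ((354, 37096), (260, 33857)),
    "sha1": ((168, 16027), (154, 15182)),
}

def reduction(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before

for name, (base, frac) in results.items():
    clb = reduction(base[0], frac[0])
    wire = reduction(base[1], frac[1])
    print(f"{name}: CLBs -{clb:.1f}%, wire length -{wire:.1f}%")
```

For these four benchmarks the fracturable architecture uses fewer CLBs and less wire in every case, with the CLB saving ranging from roughly 8% to 27%.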

# **Exercise**

1. Consider more VTR benchmarks for the performance comparison: stereovision3, blob\_merge, and bgm.
2. Extend the evaluation metrics to identify the final grid size of the FPGA for each benchmark (*Hint*: look for the **"FPGA sized to"** sentence in the *openfpgashell.log* file).

Answer: `03_grid_size = "FPGA sized to(.*) x", str`
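
The pattern in the answer can be tried out on its own before adding it to task.conf. The Python sketch below applies it to a made-up log line. The exact wording of the real log message may differ, so treat `log_line` as an assumption.

```python
import re

# Pattern from the answer above; "str" means the capture is kept as text.
pattern = r"FPGA sized to(.*) x"

# Hypothetical log line; the exact wording in a real log may differ.
log_line = "FPGA sized to 12 x 12 (auto)"

match = re.search(pattern, log_line)
grid_width = match.group(1).strip() if match else None
print(grid_width)  # 12
```

The capture keeps only the text between "FPGA sized to" and " x", which is the first grid dimension. The surrounding whitespace is stripped here for readability.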