



# **ASIP Lab Report**

Marc Donald Tala, Sébastien Thill

#### CES – Chair for Embedded Systems



### **Starting Point**



- Basic Processor
  - Benchmarks done with framework
  - Can already play the audio file on the FPGA

| Cycles  | Critical Path [ns] | Power [W] | Area [Slice LUTs] | FPGA Freq [MHz] |
|---------|--------------------|-----------|-------------------|-----------------|
| 444 239 | 15.418             | 1.489     | 5577              | 100             |

# **Optimizations**



- Area
- Performance

## **Area: Reduce complexity**



- Identifying components
  - Unused instructions
  - Unused hardware units
- Reducing the amount of available resources

## Area: $32 \rightarrow 16$ registers



- Default 32 registers
  - Next step 16 registers

- Bad tool support
  - Requires a lot of manual work
  - Time constraint
    - 3 Weeks \* 5h = 15h available for project

- Worth the risk?

### **Area: Multiplication**



- Used exactly once (in the assembly file)
  - `SomeVariable = 2 * sizeof(...);`
  - Replaced with an addition
- Removed the hardware resource (MUL0) and its instructions
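A minimal sketch of this rewrite, assuming a placeholder type for the elided `sizeof(...)` operand: `example_state_t` and both function names are hypothetical and not taken from the lab code. An optimizing compiler would normally fold the constant at compile time; the sketch only illustrates the source-level replacement of a multiplication by an addition.

```c
#include <stddef.h>

/* Hypothetical stand-in for the elided sizeof(...) operand. */
typedef struct {
    short valprev;
    char index;
} example_state_t;

/* Original form: the factor 2 may be emitted as a multiply instruction. */
static size_t size_with_mul(void) {
    return 2 * sizeof(example_state_t);
}

/* Rewritten form: x + x only needs the adder, so no multiplier is required. */
static size_t size_with_add(void) {
    size_t s = sizeof(example_state_t);
    return s + s;
}
```

Both functions return the same value, so the single use of the multiplier can be dropped without changing program behavior.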

| Cycles         | Critical Path [ns] | Power [W]    | Area [Slice LUTs] | FPGA Freq [MHz] |
|----------------|--------------------|--------------|-------------------|-----------------|
| <i>444 239</i> | <i>15.418</i>      | <i>1.489</i> | <i>5577</i>       | <i>100</i>      |
| 444 239        | 15.698             | 1.486        | 5281              | 100             |

#### **Area: Division**



- Used a few times, but only for printing!
  - Removed printing from the application
- Removed the hardware resource (DIV0) and its instructions

| Cycles         | Critical Path [ns] | Power [W]    | Area [Slice LUTs] | FPGA Freq [MHz] |
|----------------|--------------------|--------------|-------------------|-----------------|
| <i>444 239</i> | <i>15.418</i>      | <i>1.489</i> | <i>5577</i>       | <i>100</i>      |
| 436 917        | 16.171             | 1.477        | 4858              | 100             |

### **Area: Remaining OPs**



- Exhaustive search of the assembly file for all instructions actually used
- Removed the instructions that are never used
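Such a search can be automated. The sketch below collects the distinct mnemonics that occur in an assembly listing; every ISA instruction missing from the resulting set is a candidate for removal. The assumed syntax (optional `label:` prefix, `.` for directives, `;` or `#` for comments) and all names (`op_set_t`, `collect_ops`, and so on) are illustrative assumptions, not the tooling used in the lab.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS 128
#define MAX_LEN 16

/* Set of distinct mnemonics found so far. */
typedef struct {
    char names[MAX_OPS][MAX_LEN];
    size_t count;
} op_set_t;

/* Return 1 if the mnemonic is already in the set, 0 otherwise. */
static int op_set_contains(const op_set_t *set, const char *name) {
    for (size_t i = 0; i < set->count; i++)
        if (strcmp(set->names[i], name) == 0) return 1;
    return 0;
}

/* Add a token of the given length unless it is already present. */
static void op_set_add(op_set_t *set, const char *tok, size_t len) {
    char name[MAX_LEN];
    if (len == 0 || len >= MAX_LEN || set->count >= MAX_OPS) return;
    memcpy(name, tok, len);
    name[len] = '\0';
    if (!op_set_contains(set, name))
        strcpy(set->names[set->count++], name);
}

/* Scan the text line by line and record the mnemonic of each instruction
 * line. Labels ("name:") are skipped; directives and comments are ignored. */
static void collect_ops(const char *text, op_set_t *set) {
    const char *p = text;
    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        if (eol == NULL) eol = p + strlen(p);

        for (;;) {
            while (p < eol && isspace((unsigned char)*p)) p++;  /* skip blanks */
            const char *tok = p;
            while (p < eol && !isspace((unsigned char)*p)) p++; /* end of token */
            size_t len = (size_t)(p - tok);
            if (len > 0 && tok[len - 1] == ':') continue;       /* label: try next token */
            if (len > 0 && tok[0] != '.' && tok[0] != ';' && tok[0] != '#')
                op_set_add(set, tok, len);
            break;
        }
        p = (*eol == '\n') ? eol + 1 : eol;
    }
}
```

Calling `collect_ops` on the compiler-generated assembly and comparing the set against the instruction list of the processor description gives the list of removable instructions.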

| Cycles         | Critical Path [ns] | Power [W]    | Area [Slice LUTs] | FPGA Freq [MHz] |
|----------------|--------------------|--------------|-------------------|-----------------|
| <i>444 239</i> | <i>15.418</i>      | <i>1.489</i> | <i>5577</i>       | <i>100</i>      |
| 436 917        | 15.580             | 1.465        | 4815              | 100             |

#### **Area: Summary**



- Removed MUL0, DIV0 and unused instructions
  - Slightly reduced cycle count (1.6%)
  - Critical path nearly unchanged (15.418 ns → 15.580 ns), frequency constant
  - Slightly reduced power consumption (1.6%)
  - Noticeable area reduction (13.7%)

| Cycles         | Critical Path [ns] | Power [W]    | Area [Slice LUTs] | FPGA Freq [MHz] |
|----------------|--------------------|--------------|-------------------|-----------------|
| <i>444 239</i> | <i>15.418</i>      | <i>1.489</i> | <i>5577</i>       | <i>100</i>      |
| 436 917        | 15.580             | 1.465        | 4815              | 100             |
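The relative changes can be checked directly from the baseline and final values in the table. Each value is taken relative to the baseline:

$$
\begin{aligned}
\text{Cycles:} &\quad \frac{444\,239 - 436\,917}{444\,239} = \frac{7\,322}{444\,239} \approx 1.6\,\% \text{ fewer} \\
\text{Power:} &\quad \frac{1.489 - 1.465}{1.489} = \frac{0.024}{1.489} \approx 1.6\,\% \text{ lower} \\
\text{Area:} &\quad \frac{5577 - 4815}{5577} = \frac{762}{5577} \approx 13.7\,\% \text{ smaller} \\
\text{Critical path:} &\quad \frac{15.580 - 15.418}{15.418} = \frac{0.162}{15.418} \approx 1.1\,\% \text{ longer}
\end{aligned}
$$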

#### Performance: Customization of Instructions



- Identifying candidate blocks of instructions
  - Branch statements

Shift-and-add operations

```
vpdiff += step>>1;
vpdiff += step>>2;
```

Prediction operation

```
if ( sign ) valpred -= vpdiff;
else valpred += vpdiff;
```
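The identifiers `step`, `vpdiff`, `valpred` and `sign` match those of the widely used IMA ADPCM reference decoder, so the sketch below assumes that algorithm to show where the candidate blocks sit within one decode step. The function name `decode_step`, the `delta` parameter and the omitted step-size table update are simplifications for illustration and are not taken from the lab code.

```c
#include <stdint.h>

/* One sample update, assuming IMA ADPCM semantics (an assumption; the
 * step-size table lookup and index update are omitted for brevity).
 * delta: 4-bit code, bit 3 = sign, bits 2..0 = magnitude. */
static int32_t decode_step(int32_t valpred, int32_t step, uint8_t delta) {
    int sign = delta & 8;
    int32_t vpdiff = step >> 3;

    /* Shift-and-add block: each set magnitude bit adds a scaled step. */
    if (delta & 4) vpdiff += step;
    if (delta & 2) vpdiff += step >> 1;
    if (delta & 1) vpdiff += step >> 2;

    /* Prediction block: one data-dependent branch per sample. */
    if (sign) valpred -= vpdiff;
    else      valpred += vpdiff;

    /* Clamp to the 16-bit output range. */
    if (valpred > 32767)       valpred = 32767;
    else if (valpred < -32768) valpred = -32768;
    return valpred;
}
```

On a baseline processor, each conditional here typically costs a compare and a branch per sample, which is why these blocks are natural candidates for fused custom instructions.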

#### Performance: Defining New Instructions in the CPU



- New instruction type
  - 4 registers

- New instruction fields
  - Calculate (for the prediction)
  - Clamp (clamps the output value)
  - Shift (for the shift-and-add operation)
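The slides do not give the exact semantics of the three instructions, so the following is only a functional model under assumptions: the operand order, the function names `op_shift`, `op_calculate` and `op_clamp`, and the mapping onto one destination plus up to three source registers are hypothetical choices that fit a four-register instruction type.

```c
#include <stdint.h>

/* Assumed "shift": fused shift-and-add, rd = acc + (src >> sh).
 * src is assumed non-negative and sh smaller than 32. */
static int32_t op_shift(int32_t acc, int32_t src, unsigned sh) {
    return acc + (src >> sh);
}

/* Assumed "calculate": prediction update selected by the sign flag,
 * rd = sign ? valpred - vpdiff : valpred + vpdiff. */
static int32_t op_calculate(int32_t valpred, int32_t vpdiff, int32_t sign) {
    return sign ? valpred - vpdiff : valpred + vpdiff;
}

/* Assumed "clamp": saturate to the signed 16-bit range. */
static int32_t op_clamp(int32_t v) {
    if (v > 32767)  return 32767;
    if (v < -32768) return -32768;
    return v;
}
```

With such instructions, the prediction update collapses to `valpred = op_clamp(op_calculate(valpred, vpdiff, sign));`, and each `vpdiff += step >> n;` becomes a single `op_shift(vpdiff, step, n)`, so the selection happens inside the datapath instead of through branches.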

### **Performance: Summary**



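- Added the clamp, calculate and shift instructions
  - Reduced cycle count by 18.6% (559 177 → 455 395)
  - Shorter critical path (15.418 ns → 8.328 ns)
  - Slightly higher power consumption (1.489 W → 1.507 W, about 1.2%)
  - Larger area (5577 → 6215 slice LUTs, about 11.4%)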

| Cycles         | Critical Path [ns] | Power [W]    | Area [Slice LUTs] | FPGA Freq [MHz] |
|----------------|--------------------|--------------|-------------------|-----------------|
| <i>559 177</i> | <i>15.418</i>      | <i>1.489</i> | <i>5577</i>       | <i>100</i>      |
| 455 395        | 8.328              | 1.507        | 6215              | 100             |
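The relative changes follow from the two rows of the table. Each value is taken relative to the first (baseline) row:

$$
\begin{aligned}
\text{Cycles:} &\quad \frac{559\,177 - 455\,395}{559\,177} = \frac{103\,782}{559\,177} \approx 18.6\,\% \text{ fewer} \\
\text{Critical path:} &\quad \frac{15.418 - 8.328}{15.418} = \frac{7.090}{15.418} \approx 46.0\,\% \text{ shorter} \\
\text{Power:} &\quad \frac{1.507 - 1.489}{1.489} = \frac{0.018}{1.489} \approx 1.2\,\% \text{ higher} \\
\text{Area:} &\quad \frac{6215 - 5577}{5577} = \frac{638}{5577} \approx 11.4\,\% \text{ larger}
\end{aligned}
$$

Measured the other way round, relative to the optimized design, the baseline needs $103\,782 / 455\,395 \approx 22.8\,\%$ more cycles.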

### Feedback (1)



- (1) Is the workload required for the course ... appropriate or inappropriate? What do you think? Why do you think so?
  - Required information is scattered across the lab manual
  - Tedious tasks (e.g. adding the bgeu op to dlxsim)
  - No copy and paste in ASIPMeister

### Feedback (2)



- (2) How is the course structured (very good to very poor)? What do you think? Why do you think so?
  - No session-specific tutorials
  - Outdated manual and outdated information on the session sheets

### Feedback (3)



- (3) Which difficulties did you face in this lab? Please share your feedback.
  - Instructions often unclear
  - ASIPMeister
  - Missing or outdated information