

Prof. Dr. Armin Biere, Dr. Mathias Fleury, Freiburg, 22/12/2022

# Computer Architecture Exercise Sheet 4 (version 2022-4)

### Exercise 1

Using our imaginary bfloat8 (1 sign bit, 3 fraction bits, 4 exponent bits, exponent bias 7), calculate:

$$1.010_2 \cdot 2^4 - 1.111_2 \cdot 2^{-5}$$
$$1.010_2 \cdot 2^4 - (-1.111_2 \cdot 2^{-5})$$

What is the point of using the GRS bits (guard, round, sticky) in terms of hardware?

(The URL from the lecture to test more examples: https://cca.informatik.uni-freiburg.de/teaching/grs-bits/rechner.html)
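To check the operands before doing the subtraction by hand, a small decoder can help. The following is only a sketch: the bit layout sign | exponent | fraction is an assumption (the exercise does not fix the field order), the names `decode_normal` and `encode_normal` are hypothetical, and special encodings (zero, subnormals, infinities, NaN) are deliberately ignored. It computes exact values only and does not perform the GRS rounding the exercise asks about.

```python
from fractions import Fraction

BIAS = 7        # exponent bias given in the exercise
EXP_BITS = 4    # exponent bits given in the exercise
FRAC_BITS = 3   # fraction bits given in the exercise


def encode_normal(sign: int, fraction: int, unbiased_exponent: int) -> int:
    """Pack a normalized number, ASSUMING the layout sign|exponent|fraction."""
    biased = unbiased_exponent + BIAS
    return (sign << (EXP_BITS + FRAC_BITS)) | (biased << FRAC_BITS) | fraction


def decode_normal(bits: int) -> Fraction:
    """Return the exact value of a normalized bfloat8 bit pattern."""
    sign = (bits >> (EXP_BITS + FRAC_BITS)) & 0x1
    biased = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = bits & ((1 << FRAC_BITS) - 1)
    # Normalized numbers carry an implicit leading 1: significand = 1.fff
    significand = Fraction((1 << FRAC_BITS) | fraction, 1 << FRAC_BITS)
    value = significand * Fraction(2) ** (biased - BIAS)
    return -value if sign else value


a = encode_normal(0, 0b010, 4)    # 1.010_2 * 2^4
b = encode_normal(0, 0b111, -5)   # 1.111_2 * 2^-5
print(decode_normal(a))                      # 20
print(decode_normal(b))                      # 15/256
print(decode_normal(a) - decode_normal(b))   # 5105/256 (exact, before rounding)
```

The exact difference is not representable with 3 fraction bits, which is where the rounding step of the exercise comes in.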

### Exercise 2

Consider a machine that does not have a pipeline. The instruction processing phases IF, ID, EX, MEM and WB require execution times of 50 ns, 50 ns, 60 ns, 50 ns and 50 ns, respectively. Furthermore, assume that adding a pipeline to this machine would create an overhead of 5 ns per phase. What *speed-up* is achieved by using this pipeline?
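One common way to model this question is sketched below. It rests on two assumptions that the exercise does not state explicitly: the pipelined clock period is set by the slowest phase plus the per-phase overhead, and the instruction stream is long enough (and free of stalls) that the time to fill the pipeline can be neglected. The function name `pipeline_speedup` is hypothetical.

```python
def pipeline_speedup(stage_times_ns, overhead_ns):
    """Steady-state speed-up of a pipelined over a non-pipelined machine."""
    # Without a pipeline, every instruction passes through all phases in turn.
    unpipelined_ns = sum(stage_times_ns)
    # With a pipeline, one instruction completes per clock cycle; the cycle
    # must fit the slowest phase plus the pipeline overhead (assumption).
    cycle_ns = max(stage_times_ns) + overhead_ns
    return unpipelined_ns / cycle_ns


stages = [50, 50, 60, 50, 50]  # IF, ID, EX, MEM, WB
print(pipeline_speedup(stages, 5))  # 260 / 65 = 4.0
```

For a short program of n instructions the fill time matters, and the ratio would instead compare n times the unpipelined time against (number of phases + n - 1) cycles.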

### Exercise 3

We extend our pipelined RISC-V processor with the popcnt instruction, which counts the number of bits that are set to one. It takes two arguments: the destination register and the source register holding the value whose bits are counted.

- a) What does popcnt return for the number $4094_{10}$?
- b) What is the maximum value of the result?
- c) Convert the following program to RISC-V assembly:

```
a += _mm_popcnt(x[0]);
a += _mm_popcnt(x[1]);
a += _mm_popcnt(x[2]);
a += _mm_popcnt(x[3]);
```

assuming that x is an array of 32-bit words.

- d) Describe what each pipeline stage does during the execution of the program (assume that every stage has a running time of 200 ps).
- e) To save registers, you put all the results of popcnt in the *same* register. However, your processor has a performance bug: a popcnt can only be executed once its destination register has been written. This is called a *false data dependency*. How long does the program take to execute?
- f) Now you decide to add a `mv <dest>,zero` before each popcnt instruction. Our RISC-V processor recognizes this as a special operation during decoding and is now able to see that there is no data dependency, because the destination register is initialized to 0. How long does the program take to execute?

(based on https://stackoverflow.com/questions/25078285/replacing-a-32-bit-loop-counter-with-62; the false data dependency was a real bug that has since been fixed)
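As a reference model for checking the results of popcnt by hand, the behaviour of the instruction on 32-bit words can be sketched in software. This is only an illustration of what the instruction computes, not of how the hardware implements it; the names `popcount32` and `popcount_sum` are hypothetical, and the input is assumed to be truncated to 32 bits as in part c).

```python
def popcount32(value: int) -> int:
    """Count the bits set to one in the low 32 bits of value."""
    value &= 0xFFFFFFFF  # treat the input as a 32-bit word
    count = 0
    while value:
        value &= value - 1  # clears the lowest set bit
        count += 1
    return count


def popcount_sum(x, a=0):
    """Software model of the four-line program from part c)."""
    for word in x[:4]:
        a += popcount32(word)
    return a


print(popcount32(0b1011))                 # 3
print(popcount_sum([0b1, 0b11, 0, 0b101]))  # 1 + 2 + 0 + 2 = 5
```

Calling `popcount32` on the values from parts a) and b) gives a quick way to confirm a hand-computed answer.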

### Exercise 4

Write an implementation of a multiplexer for 2, 3, and 4 inputs using binary AND and OR gates and unary NOT gates.
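A candidate circuit can be checked exhaustively by modelling the gates as functions on the bits 0 and 1. The sketch below shows one possible construction, not the only one: the larger multiplexers are built from 2-input multiplexers, and for the 3-input case it is an assumption that the unused select combination (s1 = 1, s0 = 1) may also return d2. All function names are hypothetical.

```python
def AND(a, b):
    return a & b


def OR(a, b):
    return a | b


def NOT(a):
    return 1 - a


def mux2(d0, d1, s):
    """Select d0 when s == 0 and d1 when s == 1."""
    return OR(AND(d0, NOT(s)), AND(d1, s))


def mux4(d0, d1, d2, d3, s1, s0):
    """Select d[2*s1 + s0] using a tree of 2-input multiplexers."""
    return mux2(mux2(d0, d1, s0), mux2(d2, d3, s0), s1)


def mux3(d0, d1, d2, s1, s0):
    """Like mux4, but s1 == 1 selects d2 regardless of s0 (assumption)."""
    return mux2(mux2(d0, d1, s0), d2, s1)


print(mux2(0, 1, 1))              # 1
print(mux4(0, 0, 1, 0, 1, 0))     # 1 (selects d2)
```

Each `mux2` uses two AND gates, one OR gate and one NOT gate, so the tree-based `mux4` needs three times as many gates; a flat sum-of-products design would trade gate count against depth.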

No submission is needed. The exercise sheet will be discussed on December 22nd, 2022, in the online exercise session at 10:00:

https://uni-freiburg.zoom.us/j/65775356475?pwd=dmUvei8ybDN4RFlmT1JUZnRtYlBGZz09