# Lab 8, Spring 2016

### Register file and RAM

#### **Important:**

- While collaborations are allowed, names of all the collaborators must be listed in the assignment. Each student should submit individual assignments for grading, despite collaboration.
- The following items should be submitted to *Courseweb* before the next recitation:
  - Solution files (\*.circ files) for this assignment
  - Your answers for Section 3, "Practice Problems"
- Zip the files together and submit the archive. The zip file should be named after your Pitt ID.
- Don't forget to submit your answers for Section 3 "Practice Problems".

In this lab, you will build two sample circuits using Logisim. Logisim can be downloaded from http://www.cburch.com/logisim/download.html. There are two parts to this lab. In the first part, you will design a Register File. In the second part, you will work with RAM.

## 1 Register File

A register file is a subcircuit that lets the surrounding circuitry read from one or more registers and, optionally, write to one or more registers. In this part of the lab, you will build a register file that has two read ports and one write port. That means that your register file circuit should let outside circuitry read two registers at a time while also supporting writing to one register at a time. You should support four registers in total. Each register must hold 16 bits of data. You should use the built-in register component of *Logisim*. Save your register file circuit as lab8part1.circ.

Figure 1

Figure 1 shows, symbolically, what your register file subcircuit looks like. Bold labels indicate that an input or output carries more than 1 bit of data.

- ReadRegister1: specifies which register should be read from. That register's data should be output on ReadData1.
- ReadRegister2: specifies which register should be read from. That register's data should be output on ReadData2.
- WriteRegister: specifies which register should be written to.
- WriteData: the data that will be written to the register specified by WriteRegister. On a rising clock edge, the data will be written to the specified register.
- WriteEnable: if true, WriteData will be written to the register specified by WriteRegister; otherwise, WriteData will be ignored and no register will be written to.
- Clock: the clock signal.
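The port descriptions above can be summarized as a small behavioral model. This is only a sketch in Python, not a Logisim design: the class name `RegisterFile` and its method names are hypothetical, and a call to `rising_edge` stands in for one rising edge of the Clock input.

```python
class RegisterFile:
    """Behavioral sketch: 4 registers x 16 bits, two read ports, one write port."""

    NUM_REGS = 4
    WIDTH = 16
    MASK = (1 << WIDTH) - 1  # keeps stored values to 16 bits

    def __init__(self):
        self.regs = [0] * self.NUM_REGS

    def read(self, read_register1, read_register2):
        # Reads are combinational: the outputs simply follow the select inputs.
        return self.regs[read_register1], self.regs[read_register2]

    def rising_edge(self, write_register, write_data, write_enable):
        # State changes only on a rising clock edge, and only if WriteEnable is 1.
        if write_enable:
            self.regs[write_register] = write_data & self.MASK


rf = RegisterFile()
rf.rising_edge(write_register=2, write_data=0xBEEF, write_enable=1)
rf.rising_edge(write_register=3, write_data=0x1234, write_enable=0)  # ignored
print(rf.read(2, 3))  # (48879, 0)
```

Note that reading needs no clock and no enable in this model, while writing needs both.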

Before starting, ask yourself the following questions:

- 1. How many bits will be output on ReadData1? On ReadData2?
- 2. How many bits specify ReadRegister1? ReadRegister2?
- 3. How many bits specify WriteRegister?
- 4. How many bits specify WriteData?
- 5. How many bits specify WriteEnable?
- 6. Why is WriteEnable necessary?
- 7. Why does the circuit not have ReadEnable inputs?
- 8. Why does the register file have a clock input? Why doesn't it use its own clock instead of requiring a clock from outside circuitry?

To assist you, here are some tips about sizing the different signals:

- 1. Since registers are 16 bits, and since ReadData1 will output the entire contents of a register, ReadData1 will be 16 bits wide. By the same reasoning, ReadData2 will be 16 bits wide.
- 2. Since there are 4 registers in this lab, 2 bits are needed to name a register. 00 is register 0, 01 is register 1, 10 is register 2, and 11 is register 3. ReadRegister1 and ReadRegister2 are therefore each 2 bits wide.
- 3. WriteRegister describes which register is being written to. Therefore, WriteRegister is 2 bits wide.
- 4. WriteData specifies the data that will be written into a register. Since registers are 16 bits, WriteData will be 16 bits wide.
- 5. WriteEnable says whether or not a register write should actually be performed (yes or no). Therefore, WriteEnable is 1 bit wide.

Figure 2: Skeleton circuit from which you can build your register file

Here are some general tips to help you implement the register file:

- 6. The data that should be written to a register must be able to reach any of the 4 registers in the register file (i.e., your circuit must support writing to any register)
- 7. Registers can be made read-only by setting one of their pins appropriately. This can let you control which register, if any, is written to
- 8. Given multiple inputs, a multiplexer lets you select exactly 1 of them. This component is under the plexers category. The number of inputs that a multiplexer chooses from can be configured by changing its select bits property. E.g., select bits equal to 2 lets it pick from 4 inputs. The multiplexer's data bits property controls how many bits are on each of its inputs. For example, if the MUX were choosing one of eight 32-bit inputs, select bits would equal 3 and data bits would be 32.
- 9. A decoder has multiple output wires, but only one of those output wires is true at any given time; its select input controls which one. Decoders are found under the plexers category. A demultiplexer is like a decoder in that it sends its input to exactly 1 output (the other outputs will be set to 0).
- 10. Pins (whether input or output pins) can be made to support multiple bits at a time by changing their data bits property. You can use the poke tool to test out different values for an input pin (e.g., to simulate outside circuitry trying to read from different registers)
- 11. You can poke a register and give it a value. This will let you give each register a different value to make sure that the circuitry reads from the correct register (according to ReadRegister1 and ReadRegister2)
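To make the multiplexer and decoder tips concrete, here is a minimal Python sketch of how the two components behave. The function names are hypothetical, and combining the decoder outputs with WriteEnable is just one possible arrangement, not necessarily the intended solution.

```python
def decoder(select, select_bits):
    """One-hot decoder: exactly one of 2**select_bits outputs is 1."""
    return [1 if i == select else 0 for i in range(2 ** select_bits)]


def mux(inputs, select):
    """Multiplexer: forwards the selected input to the single output."""
    return inputs[select]


def register_enables(write_register, write_enable, select_bits=2):
    # Assumed arrangement: AND each decoder output with WriteEnable so that
    # at most one register is allowed to load on the next clock edge.
    return [bit & write_enable for bit in decoder(write_register, select_bits)]


regs = [0x0000, 0x1111, 0x2222, 0x3333]
print(register_enables(2, 1))  # [0, 0, 1, 0]
print(register_enables(2, 0))  # [0, 0, 0, 0]
print(hex(mux(regs, 3)))       # 0x3333
```

With 2 select bits, both components cover exactly the 4 registers of this lab.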

## 2 Memory

Logisim provides a RAM component, which can store up to $2^{24}$ values, each up to 32 bits wide. To store and load data from RAM, you have to specify an address, which is up to 24 bits wide. The RAM component provides either two separate ports for loads and stores or one single load/store port that is shared by reads and writes. For this problem, we will be using a RAM with 256 8-bit values and a single load/store port. You will have to configure the RAM component appropriately.

Consider the problem of setting every byte of the RAM to a specified value. For example, this might need to be done when a device is first powered on. We can set each byte of RAM to a predetermined value by having a counter generate every address of the memory and storing the predetermined value at each memory location.

The component that looks like a triangle in the skeleton circuit (Figure 3) is called a Controlled Buffer. Its purpose is to put the value of its input at its output only when the controlling signal is high ("1"). When the controlling signal is not high, the output is left floating, which means that another component can control the signal. A controlled buffer thus lets one part of the circuit disconnect itself from another part of the circuit.
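The address-width arithmetic and the controlled-buffer behavior described above can be sketched in Python as follows. This is an illustrative model only: the function names, the use of `None` for a floating wire, and raising an error on a bus conflict are all assumptions of this sketch, not descriptions of how Logisim behaves internally.

```python
FLOATING = None  # stand-in for a wire that nothing is driving


def address_bits(num_locations):
    """Number of address bits needed to name num_locations memory cells."""
    return (num_locations - 1).bit_length()


def controlled_buffer(value, control):
    """Pass the input through when control is 1; otherwise leave the output floating."""
    return value if control else FLOATING


def resolve_bus(*drivers):
    """A shared wire takes the value of its single active driver, if any."""
    active = [d for d in drivers if d is not FLOATING]
    if len(active) > 1:
        raise ValueError("bus conflict: more than one driver is active")
    return active[0] if active else FLOATING


print(address_bits(256))      # 8
print(address_bits(2 ** 24))  # 24
# Only the first buffer is enabled, so it alone determines the bus value.
print(hex(resolve_bus(controlled_buffer(0x42, 1), controlled_buffer(0x99, 0))))  # 0x42
```

So the 256-entry RAM used in this lab needs an 8-bit address, and disabling a buffer frees the data line for some other driver.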

Build a circuit in Logisim that writes the value `0x42` to every memory location. Your circuit should allow the user to reset the counter at any time (via a button). In addition, the circuit should stop writing values to memory after it has written every memory location exactly once. The skeleton circuit shown in Figure 3 is not your final circuit: you will need additional components and will have to change some of the existing ones slightly. Save this circuit file as lab8part2.circ.

Your circuit must detect that all values have been written to. It should then disable the RAM component and/or controlled buffer, so that no more write "commands" are sent to the RAM module and so that no more data is being placed on the data input of the RAM module. By no longer placing data on the RAM component, other circuits (which you do not have to build) could then use the RAM component.

Easily made mistakes for this part of the lab include:

- 1. Correctly detecting that the automated writes should stop, but still placing data into the RAM component and clocking it (e.g., having the circuit do the last write operation over and over again, forever)
- 2. Writing to all spots in RAM except the last one, then stopping all future writes
- 3. Writing to all spots in RAM, detecting that all spots have been written to, but then accidentally also rewriting to the very first spot in RAM one more time (or, writing to the last spot twice) before stopping all future writes

There are several valid ways to have your circuit stop writing values. Pick a method or invent your own:

- 1. Stop clocking the RAM component
- 2. Force the RAM component to become read-only
- 3. Use controlled buffers to disconnect both the counter's output (which determines the address to write) and the data to be written
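The required behavior (every location written exactly once, then no further writes) can be checked against a small cycle-by-cycle simulation. The Python sketch below is an assumption-laden model, not a circuit: `fill_ram` and the `done` flag are hypothetical names, each loop iteration stands for one rising clock edge, and the reset button is omitted for brevity.

```python
SIZE = 256   # 256 locations, so an 8-bit address counter
FILL = 0x42  # value to store in every location


def fill_ram(cycles):
    ram = [0] * SIZE
    writes = [0] * SIZE  # how many times each address has been written
    counter = 0
    done = False         # one bit of state: set once the last address is written
    for _ in range(cycles):  # each iteration models one rising clock edge
        if not done:
            ram[counter] = FILL
            writes[counter] += 1
            if counter == SIZE - 1:
                done = True   # stop here, before the counter can wrap to 0
            else:
                counter += 1
    return ram, writes, done


ram, writes, done = fill_ram(cycles=1000)
print(done, set(ram) == {FILL}, set(writes) == {1})  # True True True
```

Tracking a per-address write count is a convenient way to catch the mistakes listed earlier: a skipped last address shows up as a count of 0, and a repeated first or last write shows up as a count above 1.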

Here are a few hints:

1. RAM components must be set so that they accept writes or reads. Use the technical documentation to find out how to write to RAM
2. Combinational logic cannot cause delays. If for some reason you need to introduce a delay, you will need to rely on sequential logic

Figure 3: Skeleton circuit from which you can build your RAM circuit

## 3 Practice Problems

1. (Q4.1) Consider the following instruction:

Instruction: AND Rd, Rs, Rt

Interpretation: Reg[Rd] = Reg[Rs] AND Reg[Rt]

**Q4.1.1** What are the values of control signals generated by the control in Figure 4.2 for the above instruction?

**Q4.1.2** Which resources (blocks) perform a useful function for this instruction?

**Q4.1.3** Which resources (blocks) produce outputs, but their outputs are not used for this instruction? Which resources produce no outputs for this instruction?

2. (Q4.2) The basic single-cycle MIPS implementation in Figure 4.2 (see textbook) can only implement some instructions. New instructions can be added to an existing Instruction Set Architecture (ISA), but the decision whether or not to do that depends, among other things, on the cost and complexity the proposed addition introduces into the processor datapath and control. The first three problems in this exercise refer to the new instruction:

Instruction: LWI Rt, Rd(Rs)

Interpretation: Reg[Rt] = Mem[Reg[Rd]+Reg[Rs]]

**Q4.2.1** Which existing blocks (if any) can be used for this instruction?

**Q4.2.2** Which new functional blocks (if any) do we need for this instruction?

**Q4.2.3** What new signals do we need (if any) from the control unit to support this instruction?

3. (Q4.3) When processor designers consider a possible improvement to the processor datapath, the decision usually depends on the cost/performance trade-off. In the following three problems, assume that we are starting with a datapath from Figure 4.2 (see textbook), where I-Mem, Add, Mux, ALU, Regs, D-Mem, and Control blocks have latencies of 400 ps, 100 ps, 30 ps, 120 ps, 200 ps, 350 ps, and 100 ps, respectively, and costs of 1000, 30, 10, 100, 200, 2000, and 500, respectively.

Consider the addition of a multiplier to the ALU. This addition will add 300 ps to the latency of the ALU and will add a cost of 600 to the ALU. The result will be 5% fewer instructions executed, since we will no longer need to emulate the MUL instruction.

- **Q4.3.1** What is the clock cycle time with and without this improvement?
- **Q4.3.2** What is the speedup achieved by adding this improvement?
- **Q4.3.3** Compare the cost/performance ratio with and without this improvement.
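
As a reminder of the quantities these three problems involve, the relations below may help. They assume a single-cycle datapath that completes one instruction per clock cycle (CPI = 1). The symbols $IC$ (instruction count) and $T_{\text{clk}}$ (clock cycle time) are notation introduced here rather than taken from the assignment, and reading cost/performance as cost multiplied by execution time is one common convention, likewise an assumption.

$$
T_{\text{exec}} = IC \times T_{\text{clk}}, \qquad
\text{Speedup} = \frac{T_{\text{exec,old}}}{T_{\text{exec,new}}}
= \frac{IC_{\text{old}} \times T_{\text{clk,old}}}{IC_{\text{new}} \times T_{\text{clk,new}}}, \qquad
\frac{\text{Cost}}{\text{Performance}} = \text{Cost} \times T_{\text{exec}}
$$

Here $T_{\text{clk}}$ is set by the slowest path through the datapath, so a change that lengthens a block on that path lengthens every cycle.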
4. (Q4.4) Problems in this exercise assume that logic blocks needed to implement a processor's datapath have the following latencies:

| I-Mem  | Add   | Mux   | ALU   | Regs  | D-Mem  | Sign-Extend | Shift-Left-2 |
|--------|-------|-------|-------|-------|--------|-------------|--------------|
| 200 ps | 70 ps | 20 ps | 90 ps | 90 ps | 250 ps | 15 ps       | 10 ps        |

- **Q4.4.1** If the only thing we need to do in a processor is fetch consecutive instructions (Figure 4.6, see textbook), what would the cycle time be?
- **Q4.4.2** Consider a datapath similar to the one in Figure 4.11 (see textbook), but for a processor that only has one type of instruction: unconditional PC-relative branch. What would the cycle time be for this datapath?
- **Q4.4.3** Repeat 4.4.2, but this time we need to support only conditional PC-relative branches.

The remaining three problems in this exercise refer to the datapath element Shift-Left-2:

- **Q4.4.4** Which kinds of instructions require this resource?
- **Q4.4.5** For which kinds of instructions (if any) is this resource on the critical path?
- **Q4.4.6** Assuming that we only support beq and add instructions, discuss how changes in the given latency of this resource affect the cycle time of the processor. Assume that the latencies of other resources do not change.