## **Module-5**

# **Basic Processing Unit**

## **Some Fundamental Concepts**

- Processor fetches one instruction at a time, and performs the operation specified.
- Instructions are fetched from successive memory locations until a branch or a jump Instruction is encountered.
- Processor keeps track of the address of the memory location containing the next Instruction to be fetched using Program Counter (PC).
- When an instruction is fetched, it is placed in the instruction register(IR) and the IR holds the instruction until its execution is completed.

## To execute an instruction, the processor has to perform the following steps:

• Fetch the contents of the memory location pointed to by the PC. The contents of this location is the instruction to be executed; hence they are loaded into the IR. (Fetch phase)

$$IR \leftarrow [[PC]]$$

Increment the PC to point to the next instruction. Assuming that the memory is byte addressable, the PC is incremented by 4. (Fetch phase)

$$PC \leftarrow [PC] + 4$$

• Carry out the operation specified by the instruction in the IR.(Execution phase)

# Processor Organization (Single bus organization of a datapath inside a processor)



- ALU and all the registers are interconnected via a single common bus.
- The data and address lines of the external memory bus connected to the internal processor bus via the memory data register, MDR, and the memory address register, MAR respectively.
- Register MDR has two inputs and two outputs.
- Data may be loaded into MDR either from the memory bus or from the internal processor bus.
- The data stored in MDR may be placed on either bus.



- The input of MAR is connected to the internal bus, and its output is connected to the external bus.
- The control lines of the memory bus are connected to the instruction decoder and control logic.
- This unit is responsible for issuing the signals that control the operation of all the units inside the processor and for increasing with the memory bus.
- The MUX selects either the output of register Y or a constant value 4 to be provided as input A of the ALU.
- The constant 4 is used to increment the contents of the program counter.

With few exceptions, the operation specified by an instruction can be carried out by performing one or more of the following actions:

- Read the contents of a given memory location and load them into a processor register.
- Read data from one or more processor registers.
- Perform an arithmetic or logic operation and place the result into a processor register.
- Store data from a processor register into a given memory location.

# **Register Transfers**



- Instruction execution involves a sequence of steps in which data are transferred from one register to another.
- For each register two control signals are used to place the contents of that register on the bus or to load the data on the bus into register.



- The input and output of register Ri in and Ri out is set to 1,
- When Ri in is set to 1, the data on the bus areloaded into Ri.
- Similarly, when Ri out is set to 1, the contents of register Ri are placed on the bus.
- While Ri out is equal to 0, the bus can be used for transferring data from other registers.

#### Example: Move R1, R4

Transfer the contents of register R1 to register R4. This can be accomplished as follows.

- Enable the output of registers R1 by setting R1out to 1. This places the contents of R1 on the processor bus.
- Enable the input of register R4 by setting R4in to 1. This loads data from the processor bus into register R4.
- All operations and data transfers within the processor take place wit in time periods defined by the processor clock.
- The control signals that govern a particular transfer are asserted at the start of the clock cycle.



Figure: Input and output gating for one register bit.

# Performing an Arithmetic or Logic Operation

- The ALU is a combinational circuit that has no internal storage.
- ALU gets the two operands from MUX and bus. The result is temporarily storedin register Z.





#### Example: Add R1, R2,R3

Add the contents of register R1 to those of R2 and store the result in R3

- o R1out, Yin
- o R2out, SelectY, Add, Zin
- o Zout, R3in
- All other signals are inactive.
- In step 1, the output of register R1 and the input of register Y are enabled, causing the contents of R1 to be transferred over the bus to Y.
- Step 2, the multiplexer's select signal is set to Select Y, causing the multiplexer to gate the contents of register Y to input A of the ALU.
- At the same time, the contents of register R2 are gated onto the bus and, hence, to input B.
- The function performed by the ALU depends on the signals applied to its control lines.
- In this case, the ADD line is set to 1, causing the output of the ALU to be the sum of the two numbers at inputs A and B.
- This sum is loaded into register Z because its input control signal is activated.



In step 3, the contents of register Z are transferred to the destination register R3. This last transfer cannot be carried out during step 2, because only one register output can be connected to the bus during any clock cycle.

## **Fetching a Word from Memory**

- The processor has to specify the address of the memory location where this information isstored and request a Read operation.
- This applies whether the information to be fetched represents an instruction in a programor an operand specified by an instruction.
- The processor transfers the required address to the MAR, whose output is connected to the address lines of the memory bus.



Figure: Connection and control signals MDR.

- At the same time, the processor uses the control lines of the memory bus to indicate that a Read operation is needed.
- When the requested data are received from the memory they are stored in register MDR, from where they can be transferred to other registers in the processor.
- The response time of each memory access varies (cache miss, memory-mapped I/O etc.)
- To accommodate this, the processor waits until it receives an indication that therequested operation has been completed (Memory-Function-Completed, MFC).



### Example: Move (R1), R2

- The output of MAR is enabled all the time.
- Thus the contents of MAR are always available on the address lines of the memory bus.
- When a new address is loaded into MAR, it will appear on the memory bus at the beginning of the next clock cycle.
- A read control signal is activated at the same time MAR is loaded.
- This means memory read operations requires three steps, which can be described by the signals being activated as follows
  - o R1 out, MAR in, Read
  - MDR inE ,WMFC
  - o MDR out ,R2 in

## **Storing a word in Memory**

- Writing a word into a memory location follows a similar procedure.
- The desired address is loaded into MAR.
- Then, the data to be written are loaded into MDR, and a write command is issued.



Figure: Connection and control signals MDR.



Example: Move R2,(R1) requires the following steps

- o R1 out ,MAR in
- o R2 out ,MDR in ,Write
- o MDR outE ,WMFC

## **Execution of a complete Instruction**

Consider the instruction

Add (R3), R1

This adds the contents of a memory location pointed to by R3 to register R1. Executing this instruction requires the following actions

- Fetch the instruction
- Fetch the first operand (the contents of the memory location pointed to by R3)
- Perform the addition
- Load the result into R1

## Control sequence for execution of the instruction Add (R3),R1.

#### **Step Action**

- 1. PC out, MAR in, Read, Select4, Add, Z in
- 2. Zout, PC in, Yin, WMF C
- 3. MDR out, IR in
- 4. R3 out, MAR in, Read
- 5. R1 out, Y in, WMFC
- 6. MDR out, SelectY, Add, Z in
- 7. Z out, R1 in, End

In step 1, the instruction fetch operation is initiated by loading the contents of the PC into the MAR and sending a Read request to the memory. The Select signal is set to Select4, which causes the multiplexer MUX to select the constant 4. This value is added to the operant at input B, which is the contents of the PC, and the result is stored in register Z. The updated value is moved from register Z back into the PC during step 2, while waiting for memory to respond. In step 3, the word fetched from the memory is loaded into the IR.



#### **Execution of Branch Instructions**

- A branch instruction replaces the contents of PC with the branch target address, which is usually obtained by adding an offset X given in the branch instruction.
- The offset X is usually the difference between the branch target address and the address immediately following the branch instruction.
- Control sequence for an unconditional branch instruction.

### **Step Action**

- 1. PC out, MAR in, Read, Select4, Add, Z in
- 2. Z out, PC in, Y in, WMF C
- 3. MDR out, IR in
- 4. Offset-field-of-IRout, Add, Zin
- 5. Z out, PC in, End

Processing starts, as usual, with the fetch phase. This phase ends when the instruction is loaded into the IR in step 3. The offset value is extracted from the IR by the instruction decoding circuit. Since the value of the updated PC is already available in register Y, the offset X is gated onto the bus in step 4, and an addition operation is performed. The result, which is the branch target address, ia loaded into the PC in step5.

# **Multiple-Bus Organization**

- To reduce the number of steps needed, most commercial processors provide multiple internal paths that enable several transfers to take place in parallel.
- Figure shows a three bus structure used to connect the register and the ALU of the processor.
- All general purpose registers are combined into a single block called the register file. It consists of three ports. There are two outputs, allowing the contents of two different registers to be accessed simultaneously and have their contents placed on bus A and B. The third port allows the data on bus C to be loaded into a third register during the same clock cycle.



- Bus A and B are used to transfer the source operands to the A and B inputs of the ALU, where an arithmetic or logic operation may be performed. The result is transferred to the destination over bus C
- Incrementer unit, which is used to increment the PC by 4.



Figure: Three bus organization of the datapath.

Consider three operand instruction

Example: ADD R4, R5, R6

#### **Step Action**

- 1. PC out, R=B, MARin, Read, IncPC
- 2. WMF C
- 3. MDR outB, R=B, IR in
- 4. R4 outA, R5outB, SelectA, Add, R6in, End



#### **Hardwired Control**

- To execute instructions, the processor must have some means of generating the control signals needed in the proper sequence.
- The hardwired control is used to solve this problem.
- Hardwired system can operate at high speed; but with little flexibility.

The required control signals are determined by the following information:

- Contents of the control step counter
- Contents of the instruction register
- Content of the condition code flags
- External input signals, such as MFC and interrupt requests



Figure: Control unit organization

- The decoder/encoder block is a combinational circuit that generates the required control outputs, depending on the state of all its inputs.
- The step decoder provides a separate signal line for each step or time slot, in the control sequence.



- The output of the instruction decoder consists of a separate line for each machine instruction.
- For any instruction loaded in the IR, one of the output lines INS1 through INSm is set to 1, and all other lines are set to 0.
- The input signals to the encoder block is combined to generate the individual control signals Yin, PCout, Add, End< and so on.
- An example of how the encoder generates the Z in control signal for the processor organization

$$Z in = T 1 + T 6 \cdot ADD + T 4 \cdot BR + ...$$

## **A Complete Processor**

- A complete processor can be designed using the structure shown in figure.
- This structure has an instruction unit that fetches instructions from an instruction cache or from the main memory when the desired instructions are not already in the cache.
- It has separate processing units to deal with integer data and floating point data.
- A data cache is inserted between these units and the main memory.
- The processor is connected to the system bus and, hence, to the rest of the computer, by means of a bus interface.



Figure: Block diagram of a complete processor



**Department of ECE** 

## **Microprogrammed Control**

- Control signals are generated by a program similar to machine language programs.
- Control Word (CW); microroutine; microinstruction
- Control signals are generated for each execution step based on the instruction in the IR.
- In hardwired control, these signals are generated by circuits that interpret the contents of the IR as well as the timing signals derived from a step counter.
- Instead of employing such circuits, it is possible to use a "software" approach, in which the desired setting of the control signals in each step is determined by a program stored in a special memory.
- The control program is called a microprogram to distinguish it from the program being executed by the processor.
- The microprogram is stored on the processor chip in a small and fast memory called the microprogram memory or the control store.



depicts typical **Figure** organization of the hardware needed for microprogrammedcontrol.



- It consists of a microinstruction address generator, which generates the addressto be used for reading microinstructions from the control store.
- The address generatoruses a microprogram counter, uPC, to keep track of control store addresses when readingmicroinstructions from successive locations.
- Themicroinstruction address generator decodes the instruction in the IR to obtain the startingaddress of the corresponding microroutine and loads that address into the µPC.
- This is theaddress that will be used in the following clock cycle to read the control word. As execution proceeds, the microinstruction address generator increments the µPC to read microinstructions from successive locations in the control store.
- One bit in themicroinstruction, which we will call End, is used to mark the last microinstruction in a givenmicroroutine. When End is equal to 1, the address generator returns to the microinstruction, which causes a new machine instruction to be fetched.