# **H05d3A: Computer Architectures**

# **Exercise sessions 1-2-3**

## Goal of these exercise sessions

In these exercise sessions, you will develop a complete microprocessor in Verilog. You start by assembling the necessary components into a single-cycle processor and running a simple program on it. (For the specification of the architecture, please refer to Chapter 4, Section 4.4, or the course slides.) Next, you will increase performance by pipelining the processor. The main objectives of these lab sessions are to ensure the functionality of the CPU and to understand its performance and the hardware resources it requires. The processor will be synthesized in a 32 nm CMOS technology using generic libraries from Synopsys with regular-Vt cells. The clock frequency used for synthesis is 10 MHz, which corresponds to a clock period of 100 ns.

## Information about the tools

To complete your assignment, you will work with a set of tools. These tools are set up for you, and they are typical of the tools used in industrial digital design. The performance of the processor will be evaluated through **cycle-accurate** **simulation** using NC Verilog. The hardware resources needed for its implementation will be evaluated using **synthesis tools** (Design Vision).

### Project workspace

To create your workspace, go to Toledo and download the file *processor\_design.zip*, located in the folder *Exercise Sessions*. Extract it to a location on your system. The project workspace consists of the following folders:

* tools\_scripts: Scripts for running the tools used during the exercises.
* RTL: All the Verilog files that define the logic of the processor.
* SIM: Files for running the simulation of the HDL code.
  + SIM/data: Subfolder that contains the assembly code to run.
* SYNTHESIS: Files for running the synthesis of the processor.

### To run cycle-accurate simulation (to check functional correctness):

To run the simulation of the processor:

* Set up the tool environment: go to the folder tools\_scripts and run the command *source incisiv\_15.20.058.rc*
* Go to the folder SIM and run the command *make nc\_gui* in a terminal. (Optionally, run the command *make nc* to use the tool without the graphical interface; this checks the correctness of the program.) This command compiles all the HDL files listed in SIM/files\_verilog.f and executes the testbench RTL/cpu\_tb.v. This testbench (1) declares an instance of the CPU, (2) loads the instruction memory from the file SIM/data/imem\_content.txt, (3) loads the data memory from the file SIM/data/dmem\_content.txt, and (4) starts the execution of the instructions.
* After running the command, the NC Verilog graphical interface opens. The left part of the window shows the whole hierarchy of the design. To display internal signals for debugging, select the target submodule, right-click the signals you want to display, and select “Send to Waveform Window”. Finally, start the simulation with the run button.
* To change the executed program (that is, the instruction and data memory contents), modify the hexadecimal files imem\_content.txt and dmem\_content.txt in the folder SIM/data, which hold the instruction memory and data memory contents respectively. Several example programs can be found in the folder SIM/data/testcode. Each program is provided as a pair of files, *NAME\_OF\_PROGRAM\_imem\_content.txt* and *NAME\_OF\_PROGRAM\_dmem\_content.txt*, containing the instruction memory and data memory contents respectively. Copy them over SIM/data/*imem\_content.txt* and SIM/data/*dmem\_content.txt*, since those are the files actually loaded by the simulation.
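The four testbench steps listed above (instantiate the CPU, load both memories, start execution) can be pictured with the minimal sketch below. It is not the provided RTL/cpu\_tb.v: the memory arrays, their depth and word width, the reset timing and the relative file paths are assumptions made for illustration, and the CPU instance is left as a comment because its port list is not given here.

```verilog
`timescale 1ns / 1ps

// Illustrative testbench skeleton; all names and sizes are assumptions.
module tb_sketch;
    reg clk    = 1'b0;
    reg arst_n = 1'b0;   // active-low reset, asserted at time 0
    reg enable = 1'b0;

    // Stand-ins for the instruction and data memories (hypothetical depth).
    reg [31:0] imem [0:255];
    reg [31:0] dmem [0:255];

    // 100 ns clock period, matching the 10 MHz synthesis target.
    always #50 clk = ~clk;

    // (1) The real testbench instantiates the CPU here, for example:
    // cpu dut (.clk(clk), .arst_n(arst_n), .enable(enable) /* ... */);

    initial begin
        // (2) and (3): load the hexadecimal memory images
        // (paths assumed relative to the SIM folder).
        $readmemh("data/imem_content.txt", imem);
        $readmemh("data/dmem_content.txt", dmem);

        // (4) Release reset and start execution.
        #100 arst_n = 1'b1;
        enable = 1'b1;

        // Stop after an arbitrary number of cycles.
        #10000 $finish;
    end
endmodule
```

Because `$readmemh` reads the files at simulation start, replacing the two memory files is enough to run a different program without touching the HDL.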

### To run synthesis (for power and area estimates):

To synthesize your Verilog code into a netlist which can be used for area and power simulations:

* Set up the tool environment: go to the folder tools\_scripts and run the command *source synopsys\_2017.09.rc*
* Go to the folder SYNTHESIS. Run the synthesis through the command: *source synthesis.sh*

This command loads the technology libraries (standard cells, memory macros, etc.) as well as the design constraints from the SDC file SYNTHESIS/design\_constraints.sdc. (This file contains the designer's specifications, such as the clock frequency and false paths.)

The netlist obtained through this process is written to the file SYNTHESIS/gate/cpu.v.

The reports of area, power, resources and timing are generated in the folder SYNTHESIS/rep.

## TASKS to be done by you in these exercise sessions:

* The datapath and its submodules have already been implemented for you in Verilog. However, the functionality of the processor must be completed by carrying out the tasks described below.
* The assignments must be completed in groups of 2 people.
* After completing each exercise, show the results to the TAs for evaluation.
* You will hand in the final version of the code together with a small report based on the template that you find in the folder. This counts for 2 points in the final H05d3A grade.
* **This report is handed in through Toledo by April 10th, 2020.**

### 1. Single Cycle processor with adder

Complete the internals of the control unit to handle the following instructions: BEQ, JUMP, LW, SW, ADDI, and the R-type ALU instructions (called ‘ALU\_R’). Set the outputs of the control unit according to the fetched instruction so that the datapath behaves correctly.
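As a rough orientation, a main decoder for these six instruction classes could look like the sketch below. It assumes the standard MIPS opcode values and the textbook 2-bit `alu_op` encoding. The module name, port names and encoding are hypothetical and may not match the control unit provided in the RTL folder, so treat it as an illustration of the technique rather than the expected solution.

```verilog
// Illustrative main decoder; port names and alu_op encoding are assumptions.
module control_unit_sketch (
    input  wire [5:0] opcode,     // instruction[31:26]
    output reg        reg_dst,    // 1: destination is instruction[15:11]
    output reg        branch,
    output reg        mem_read,
    output reg        mem_2_reg,  // 1: write back data memory output
    output reg  [1:0] alu_op,     // 00: add, 01: subtract, 10: use function field
    output reg        mem_write,
    output reg        alu_src,    // 1: second ALU operand is the immediate
    output reg        reg_write,
    output reg        jump
);
    // Standard MIPS opcodes (assumed to apply to this processor).
    localparam [5:0] ALU_R = 6'b000000,
                     JUMP  = 6'b000010,
                     BEQ   = 6'b000100,
                     ADDI  = 6'b001000,
                     LW    = 6'b100011,
                     SW    = 6'b101011;

    always @(*) begin
        // Safe defaults: no state is written and control flow is unchanged.
        reg_dst   = 1'b0;
        branch    = 1'b0;
        mem_read  = 1'b0;
        mem_2_reg = 1'b0;
        alu_op    = 2'b00;
        mem_write = 1'b0;
        alu_src   = 1'b0;
        reg_write = 1'b0;
        jump      = 1'b0;

        case (opcode)
            ALU_R: begin reg_dst = 1'b1; reg_write = 1'b1; alu_op = 2'b10; end
            ADDI:  begin alu_src = 1'b1; reg_write = 1'b1; end
            LW:    begin alu_src = 1'b1; mem_read = 1'b1;
                         mem_2_reg = 1'b1; reg_write = 1'b1; end
            SW:    begin alu_src = 1'b1; mem_write = 1'b1; end
            BEQ:   begin branch = 1'b1; alu_op = 2'b01; end
            JUMP:  begin jump = 1'b1; end
            default: ; // unknown opcode: keep the defaults
        endcase
    end
endmodule
```

Assigning defaults before the `case` statement avoids inferred latches and guarantees that an unsupported opcode never writes a register or memory.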

==> For debugging purposes, you can use the test code *simple\_program* contained in the folder SIM/data, which contains all the supported instructions separated by NOP instructions. Next, the code *MULT1* carries out 5 multiplications and sums all the results. Test this code to ensure that your design is correctly implemented.

==> After obtaining a functional model, run the synthesis and fill in the table in the report.

### 2. Single Cycle processor with multiplier

The processor completed in the previous exercise executes multiplications through additions and shifts, since there is no hardware support to carry out a multiplication in one cycle. To boost performance, your task is to add hardware support for multiplication and to evaluate its impact on performance, area and timing. The processor must be able to execute the multiplication instruction MULT, which has the following format:

| [31:26] Opcode | [25:21] Source reg 1 | [20:16] Source reg 2 | [15:11] Destination reg | [5:0] Function field |
| --- | --- | --- | --- | --- |
| 000000 | XXXXX | XXXXX | XXXXX | 011000 |

Three files might need to be modified for this purpose: control\_unit.v, alu\_control.v and alu.v. To test the performance and correctness of your design, load the assembly test code *MULT2*, which now uses the implemented MULT instruction to carry out the same 5 multiplications and sums as the test code *MULT1*. After updating your design, run the simulation and synthesis again.
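The sketch below illustrates one possible shape of the ALU-side changes. Only the function field value 011000 comes from the instruction format above. The 4-bit ALU operation codes (in particular the code chosen for MULT), the module names and the port names are assumptions and will likely differ from the provided alu\_control.v and alu.v.

```verilog
// Illustrative ALU-side support for MULT; operation codes are assumptions.
module alu_control_sketch (
    input  wire [1:0] alu_op,    // from the main control unit
    input  wire [5:0] funct,     // instruction[5:0]
    output reg  [3:0] alu_ctrl
);
    localparam [3:0] OP_ADD  = 4'b0010,
                     OP_SUB  = 4'b0110,
                     OP_MULT = 4'b0011;  // hypothetical free encoding

    always @(*) begin
        case (alu_op)
            2'b00: alu_ctrl = OP_ADD;   // LW, SW, ADDI
            2'b01: alu_ctrl = OP_SUB;   // BEQ
            default: begin              // R-type: decode the function field
                case (funct)
                    6'b100000: alu_ctrl = OP_ADD;
                    6'b100010: alu_ctrl = OP_SUB;
                    6'b011000: alu_ctrl = OP_MULT;  // MULT
                    default:   alu_ctrl = OP_ADD;
                endcase
            end
        endcase
    end
endmodule

module alu_sketch (
    input  wire [3:0]  alu_ctrl,
    input  wire [31:0] a, b,
    output reg  [31:0] result,
    output wire        zero
);
    always @(*) begin
        case (alu_ctrl)
            4'b0010: result = a + b;
            4'b0110: result = a - b;
            4'b0011: result = a * b;    // keeps the low 32 bits of the product
            default: result = 32'b0;
        endcase
    end

    assign zero = (result == 32'b0);
endmodule
```

The single-cycle multiplier is a large combinational block, so it is worth checking the area and timing reports after synthesis to see how it affects the critical path.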

*Is it necessary to modify the control unit for the insertion of the MULT instruction?*

### 3. PIPELINED processor

In this part of the session, we will convert our processor to a pipelined implementation. For the following exercises, assume that data and control hazards are resolved by inserting NOP instructions.

For a processor with 5 pipeline stages (Instruction Fetch (IF), Instruction Decode (ID), Execute (EXE), Memory (MEM) and Write Back (WB)), identify which hardware resources belong to each stage and which signals should go from stage to stage. Try to match the HDL code with the architecture, and complete the tables below.

|  | IF | ID | EXE | MEM | WB |
| --- | --- | --- | --- | --- | --- |
| Datapath resources | Instruction memory, PC | Control unit, ALU control, register file, regfile\_dest\_mux | ALU, branch unit | Data memory | (Register file), regfile\_data\_mux |

|  | IF > ID | ID > EXE | EXE > MEM | MEM > WB |
| --- | --- | --- | --- | --- |
| Signals | Instruction, PC + 4 | Read data 1, read data 2, immediate extended, branch, mem\_read, mem\_2\_reg, mem\_write, alu\_src, jump, alu\_control | Branch PC, jump PC, ALU result, zero flag, mem\_2\_reg, mem\_write | Read data, ALU result, mem\_2\_reg |

Insert the pipeline registers where necessary, using the module *reg\_arstn\_en*. This module implements a register with variable width, which is set through the parameter DATA\_W. The example below shows such a pipeline register between the Fetch stage (signal *signal\_IF*) and the Decode stage (signal *signal\_ID*). Connect the enable signal of the pipeline registers to the global enable signal called *enable*. After updating your design, run the test code *MULT2* and verify correct functionality. Finally, run the synthesis scripts.

    reg_arstn_en #(.DATA_W(16)) signal_pipe_IF_ID(
        .clk   (clk      ),
        .arst_n(arst_n   ),
        .din   (signal_IF),
        .en    (enable   ),
        .dout  (signal_ID)
    );
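Several signals crossing the same stage boundary can share one register by concatenating them, as in the sketch below. The first module is only a guess at what *reg\_arstn\_en* does, inferred from its name and ports (asynchronous active-low reset, synchronous enable), and is given a different name to avoid clashing with the provided one. The signal names in the second module are hypothetical.

```verilog
// Assumed behaviour of reg_arstn_en: asynchronous active-low reset and
// synchronous enable. The provided module may differ.
module reg_arstn_en_sketch #(
    parameter DATA_W = 32
) (
    input  wire              clk,
    input  wire              arst_n,
    input  wire              en,
    input  wire [DATA_W-1:0] din,
    output reg  [DATA_W-1:0] dout
);
    always @(posedge clk or negedge arst_n) begin
        if (!arst_n)
            dout <= {DATA_W{1'b0}};   // clear on reset
        else if (en)
            dout <= din;              // capture only when enabled
    end
endmodule

// Hypothetical IF/ID boundary: two 32-bit signals share one 64-bit register.
module if_id_sketch (
    input  wire        clk, arst_n, enable,
    input  wire [31:0] instruction_IF, pc_plus4_IF,
    output wire [31:0] instruction_ID, pc_plus4_ID
);
    reg_arstn_en_sketch #(.DATA_W(64)) pipe_IF_ID (
        .clk   (clk),
        .arst_n(arst_n),
        .din   ({instruction_IF, pc_plus4_IF}),
        .en    (enable),
        .dout  ({instruction_ID, pc_plus4_ID})
    );
endmodule
```

Using one instance per signal works equally well. The concatenated form simply keeps the number of instances per stage boundary small, at the cost of having to keep the order of `din` and `dout` consistent.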

### 4. PIPELINED processor with Data-hazard resolution

Since the pipelined processor implemented so far has no support for handling data hazards, more complex code containing branches or data dependencies will not execute correctly. Add hardware support for forwarding and stalling to make such code work. After updating your design, run the test code *MULT3* and verify correct functionality. Finally, run the synthesis scripts and compare the results with the previous implementations.
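A common textbook approach combines a forwarding unit for ALU operands with a load-use stall. The sketch below illustrates that approach under several assumptions: all port names are hypothetical, `rd_MEM` and `rd_WB` are taken to be the already selected write registers of the instructions in the MEM and WB stages, and register 0 is assumed to be hard-wired to zero. It is not tied to the provided datapath.

```verilog
// Illustrative forwarding and load-use detection; all names are assumptions.
module hazard_sketch (
    input  wire [4:0] rs_ID, rt_ID,        // source registers of the instruction in ID
    input  wire [4:0] rs_EX, rt_EX,        // source registers of the instruction in EXE
    input  wire       mem_read_EX,         // instruction in EXE is a load
    input  wire [4:0] rd_MEM, rd_WB,       // write registers in MEM and WB
    input  wire       reg_write_MEM, reg_write_WB,
    output reg  [1:0] forward_a, forward_b, // 00: register file, 10: MEM, 01: WB
    output wire       stall
);
    always @(*) begin
        // First ALU operand: the most recent producer (MEM) has priority.
        forward_a = 2'b00;
        if (reg_write_MEM && (rd_MEM != 5'd0) && (rd_MEM == rs_EX))
            forward_a = 2'b10;
        else if (reg_write_WB && (rd_WB != 5'd0) && (rd_WB == rs_EX))
            forward_a = 2'b01;

        // Second ALU operand: same rule applied to rt.
        forward_b = 2'b00;
        if (reg_write_MEM && (rd_MEM != 5'd0) && (rd_MEM == rt_EX))
            forward_b = 2'b10;
        else if (reg_write_WB && (rd_WB != 5'd0) && (rd_WB == rt_EX))
            forward_b = 2'b01;
    end

    // Load-use hazard: the load in EXE writes rt_EX, and the instruction
    // in ID needs that value one cycle too early, so one bubble is required.
    assign stall = mem_read_EX &&
                   ((rt_EX == rs_ID) || (rt_EX == rt_ID));
endmodule
```

In such a scheme, `stall` would typically hold the PC and the IF/ID register and clear the control signals entering EXE for one cycle, while `forward_a` and `forward_b` drive multiplexers in front of the ALU inputs.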

### 5. Optional Exercise

Modify the processor architecture to improve its performance. Possible techniques include SIMD, control hazard resolution, a hardware loop counter, etc. As a benchmark program, run the test code *OPTIONAL4* and verify correct functionality. Try to reduce the execution time. Finally, run the synthesis scripts for architecture comparisons.