**NATIONAL CHENG KUNG UNIVERSITY**

**College of Electrical Engineering and Computer Science**

**DEPARTMENT OF ELECTRICAL ENGINEERING**

**VLSI System Design (Graduate Level)**

**Fall 2023**

**Summary of Final Project**

**Please don’t just write yes/no if more details are needed,** **and use single-sided printing**

|  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **Simulate at SoC(yes/no)** | | | **yes** | | | | | |
| **Upload 5-min media(yes/no)** | | | **yes** | | | | | |
| **Basic** | | | | | | | | |
| **MCU** | **Pipeline** | | **Stage** | | **Max working Freq.** | | | **Data Width** |
| **5 stages** | | **100 MHz** | | | **32 bits** |
| **Number of Instructions** | | **53** | | | | | |
| **Realized Cache Specification** | | **Direct mapped L1 Cache (instruction and data)**  **Write through**  **Cache Size: 1 KB**  **Block Size (cache line size): 16 bytes**  **Entries (# of blocks): 64** | | | | | |
| **Cache Hit Rate of each program** | | **Prog0 (IM): 99.23%**  **Prog0 (DM\_Read): 95.47%**  **Prog0 (DM\_Write): 21.53%**  **Prog\_PROJECT\_01 (IM): 99.4%**  **Prog\_PROJECT\_01 (DM\_Read): 99.3%**  **Prog\_PROJECT\_01 (DM\_Write): 17.85%**  **Prog\_PROJECT\_02 (IM): 99.6%**  **Prog\_PROJECT\_02 (DM\_Read): 99.4%**  **Prog\_PROJECT\_02 (DM\_Write): 27.61%**  **Prog\_PROJECT\_03 (IM): 99.6%**  **Prog\_PROJECT\_03 (DM\_Read): 99.4%**  **Prog\_PROJECT\_03 (DM\_Write): 32.53%** | | | | | |
| **List of Realized Forwarding in Types and Stages** | | **Read After Write**  **Load Use Data Hazard** | | | | | |
| **Realized Performance Counters (IPC) of each program** | | **Prog0(TA)**  **Cycle:134686**  **Inst:27834**  **IPC:0.206**  **Prog\_PROJECT\_01**  **Cycle: 1182419**  **Inst: 48136**  **IPC: 0.04**  **Prog\_PROJECT\_02**  **Cycle: 2175974**  **Inst: 68917**  **IPC: 0.032**  **Prog\_PROJECT\_03**  **Cycle: 2773564**  **Inst: 81703**  **IPC: 0.029** | | | | | |
| **Interrupt mechanism** | | **External Interrupt: WFI (CPU stall) -> interrupt -> ISR -> mret**  **Timer Interrupt: interrupt -> ISR -> mret** | | | | | |
| **Memory** | **On-chip memory**  **(Total size <= 320KB)** | | **IM** | | | **DM** | | |
| **64 KB** | | | **64 KB** | | |
| **Off-chip memory** | | **SDRAM** | | | **ROM** | | |
| **8 MB** | | | **8 KB** | | |
| **ASPU** | **Max working Freq.** | | **50 MHz** | | | | | |
| **Processing speed (throughput or… )** | | **36.4 frames/second (640x480 images)** | | | | | |
| **Realized Specification of Functionalities in details** | | **H.264 Encoder**  **Prediction mode: all I-frames, intra-frame prediction**  **Chroma: grayscale (black and white)** | | | | | |
| **Comparison with other works if any** | | **N/A** | | | | | |
| **BUS** | **Specify Memory and I/O mapping** | | **Slave** | **Start address** | | | **End address** | |
| **ROM** | **0x0000\_0000** | | | **0x0000\_1FFF** | |
| **IM** | **0x0001\_0000** | | | **0x0001\_FFFF** | |
| **DM** | **0x0002\_0000** | | | **0x0002\_FFFF** | |
| **DRAM** | **0x2000\_0000** | | | **0x207F\_FFFF** | |
| **SCTRL** | **0x1000\_0000** | | | **0x1000\_03FF** | |
| **WDT** | **0x1001\_0000** | | | **0x1001\_03FF** | |
| **DMA** | **0x0003\_000C** | | | **0x0003\_0014** | |
| **EPU** | **0x0010\_0004** | | | **0x0010\_0024** | |
| **Implemented Features of AXI Bus, Level of Realization, Operating Frequency,**  **Outstanding number** | | **40 MHz**  **Supports Burst Mode 1~64**  **3 masters, 8 slaves**  **Outstanding = 1** | | | | | |
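| **System** | **Specify** **Cooperation between CPU, Bus, Memory, ASPU and others** | | **There are 3 masters (M0: Instruction, M1: Data, M2: DMA) and 8 slaves (S0: ROM, S1: IM, S2: DM, S3: SCTRL, S4: WDT, S5: DRAM, S6: EPU, S7: DMA).**   1. **At startup, the CPU fetches the boot instructions from ROM and copies the program instructions and initialization data stored in DRAM to IM and DM.** 2. **The CPU then fetches the program instructions from IM and starts the EPU and DMA. The CPU sets the DMA read and write addresses, and the DMA moves raw image data from DRAM to the EPU, one macroblock per transfer. During the transfer the CPU enters WFI mode. When the DMA finishes, it raises an interrupt; the CPU then sets the read address of the next macroblock and the DMA reads the next macroblock.** 3. **If the EPU's internal buffer becomes full during this process, the DMA instead moves the compressed data from the EPU to DRAM. The CPU again enters WFI mode until the DMA finishes and raises an interrupt, after which the DMA resumes moving raw image data from DRAM to the EPU.** 4. **These steps repeat until all images have been processed.** | | | | | |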
| **Specify Hardware interrupt & Interrupt service routines** | | 1. **SCTRL interrupt: Write data to DRAM when the counter in sensor control is full** 2. **DMA interrupt: mret (nop)** 3. **WDT interrupt: mret (nop)** | | | | | |
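| **Specify Mechanism for Booting from an external ROM** | | **The CPU fetches the boot instructions (boot.c) from ROM and copies the program instructions (main.c) and initialization data stored in DRAM to IM and DM.** | | | | | |
| **Specify Realized DMA(Direct Memory Access) and Usage** | | **The CPU configures the DMA's read address (source address), write address (destination address), amount of data to transfer (data number), and whether the DMA is started (DMA enable). The DMA raises an interrupt after completing each transfer.** | | | | | |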
| **Code analysis (Superlint)** | | | **Warning rate: 632 / 16482 = 3.8%**  **Warning rate excluding asynchronous reset warnings:**  **(632 - 294) / 16482 = 2.1%** | | | | | |
| **System w/ ASPU (yes/no)** | | | **yes** | | | | | |
|  | | | **Synthesis** | | | **APR** | | |
| **clock period** | | | **10.0ns** | | | **10.0ns** | | |
| **Power** | | | **330.9602 mW** | | | **349 mW** | | |
| **Area** | | | **11834500.93 um^2** | | | **11604258.32 um^2** | | |
| **Verification** | **MCU** | **prog0 pass ratio** | **100%** | | | | | |
| **ASPU** | **# and types of Direct test or constrained random test** | **Direct test for the whole system:**  **project01: BlowingBubbles\_416x240 1 frame**  **project02: BlowingBubbles\_416x240 2 frames**  **project03: Rabbit\_320x160 5 frames**  **project04: kunkun\_640x480 2 frames**  **project05: akiyo\_176x144 5 frames**  **project06: foreman\_176x144 10 frames**  **project07: carphone\_176x144 7 frames**  **project08: mobile\_352x288 4 frames**  **project09: asiagodtone\_1280x720 1 frame**  **project10: rickroll\_1280x720 2 frames** | | | | | |
| **Specify types, length, operation conditions of benchmarks** | **Direct test for the whole system:**  **project01: BlowingBubbles\_416x240 1 frame**  **project02: BlowingBubbles\_416x240 2 frames**  **project03: Rabbit\_320x160 5 frames**  **project04: kunkun\_640x480 2 frames**  **project05: akiyo\_176x144 5 frames**  **project06: foreman\_176x144 10 frames**  **project07: carphone\_176x144 7 frames**  **project08: mobile\_352x288 4 frames**  **project09: asiagodtone\_1280x720 1 frame**  **project10: rickroll\_1280x720 2 frames** | | | | | |
| **SYSTEM** | **prog0 PR**  **simulation time** | **1347150 ns** | | | | | |
| **prog1 PR**  **simulation time** | **12307650 ns** | | | | | |
| **Specify types, length, operation conditions of benchmarks** | **Prog0: TA**  **Prog1: TA**  **project01: BlowingBubbles\_416x240 1 frame**  **project02: BlowingBubbles\_416x240 2 frames**  **project03: Rabbit\_320x160 5 frames**  **project04: kunkun\_640x480 2 frames**  **project05: akiyo\_176x144 5 frames**  **project06: foreman\_176x144 10 frames**  **project07: carphone\_176x144 7 frames**  **project08: mobile\_352x288 4 frames**  **project09: asiagodtone\_1280x720 1 frame**  **project10: rickroll\_1280x720 2 frames** | | | | | |
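
The memory and I/O mapping rows in the BUS section can be read as a simple address decoder. The sketch below is only an illustration in Python, not the project's RTL: the address ranges are copied from the table, while the names `MEMORY_MAP`, `decode`, and `region_size` are hypothetical helpers introduced here. It also checks that the ROM, IM, DM, and DRAM ranges match the sizes listed in the Memory rows.

```python
# Address ranges copied from the memory and I/O mapping rows of the summary table.
# MEMORY_MAP, decode and region_size are hypothetical names used for illustration.
MEMORY_MAP = [
    ("ROM",   0x0000_0000, 0x0000_1FFF),
    ("IM",    0x0001_0000, 0x0001_FFFF),
    ("DM",    0x0002_0000, 0x0002_FFFF),
    ("DMA",   0x0003_000C, 0x0003_0014),
    ("EPU",   0x0010_0004, 0x0010_0024),
    ("SCTRL", 0x1000_0000, 0x1000_03FF),
    ("WDT",   0x1001_0000, 0x1001_03FF),
    ("DRAM",  0x2000_0000, 0x207F_FFFF),
]


def decode(addr):
    """Return the name of the slave whose range contains addr, or None if unmapped."""
    for name, start, end in MEMORY_MAP:
        if start <= addr <= end:
            return name
    return None


def region_size(name):
    """Return the size in bytes of the named region (end address is inclusive)."""
    for region, start, end in MEMORY_MAP:
        if region == name:
            return end - start + 1
    raise KeyError(name)


if __name__ == "__main__":
    for addr in (0x0000_0000, 0x0001_0000, 0x2000_0000, 0x0003_0000):
        print(hex(addr), decode(addr))
    print("ROM bytes:", region_size("ROM"))
    print("DRAM bytes:", region_size("DRAM"))
```

An address that falls outside every range, such as `0x0003_0000`, decodes to `None` in this sketch; how the real bus responds to unmapped addresses is not stated in this summary.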

|  |  |
| --- | --- |
| **Advanced** | |
| **Synthesize AXI bus with burst and fully work with IPs** | **N/A** |
| **30 more instructions** | **N/A** |
| **64-bit add/sub, store/load** | **N/A** |
| **I/O PADs** | **N/A** |
| **More cache (L2 or L3)** | **N/A** |
| **dynamic branch prediction** | **N/A** |
| **CRT for more than two IPs** | **Yes (ISA Formal)** |
| **floating-point co-processor** | **N/A** |
| **Bootable by an operating system** | **N/A** |
| **Verify with FPGAs, specify FPGA board, what module has been put on the board and how you confirm results** | **N/A** |
| **Other Properties, please specify** | **ISA Formal Verification on RV32I instructions** |
| **References** | **N/A** |
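
As a worked example for the L1 cache geometry reported in the Basic table (direct mapped, 1 KB, 16-byte blocks, 64 entries), the Python sketch below derives the address field widths. It assumes 32-bit byte addresses and the conventional tag / index / offset layout; neither the field widths nor the helper name `split_address` appear in the report, so treat them as assumptions.

```python
# Cache geometry taken from the summary table; the address layout is an assumption.
CACHE_SIZE_BYTES = 1024
BLOCK_SIZE_BYTES = 16
NUM_ENTRIES = CACHE_SIZE_BYTES // BLOCK_SIZE_BYTES  # 64 blocks, as reported
ADDR_BITS = 32  # assumed from the 32-bit data width

# Both sizes are powers of two, so bit_length() - 1 gives log2.
OFFSET_BITS = BLOCK_SIZE_BYTES.bit_length() - 1        # 4 bits select a byte in a block
INDEX_BITS = NUM_ENTRIES.bit_length() - 1              # 6 bits select a cache entry
TAG_BITS = ADDR_BITS - INDEX_BITS - OFFSET_BITS        # remaining 22 bits form the tag


def split_address(addr):
    """Split a byte address into (tag, index, offset) for the direct-mapped cache."""
    offset = addr & (BLOCK_SIZE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_ENTRIES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print("offset/index/tag bits:", OFFSET_BITS, INDEX_BITS, TAG_BITS)
    # Two DM addresses exactly 1 KB apart share an index but differ in tag,
    # so in a direct-mapped cache they compete for the same entry.
    print(split_address(0x0002_0000))
    print(split_address(0x0002_0400))
```

Under these assumptions, any two addresses that are a multiple of 1 KB apart map to the same entry, which is the usual source of conflict misses in a direct-mapped cache of this size.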