KimiroDev/RISC-V-PROCESSOR

RISC-V 5-Stage Pipelined Processor — Documentation

Architecture Overview

This project implements a 32-bit RISC-V RV32I pipelined processor in Verilog, targeting simulation in ModelSim. It is a classic 5-stage in-order pipeline with forwarding and hazard detection.

The design is split into the following source files:

| File | Description |
|------|-------------|
| dut.v | Top-level pipeline datapath; connects all stages |
| control.v | Main decoder; generates control signals from opcode |
| alu.v | 32-bit ALU with 10 operations |
| alu_control.v | ALU decoder; maps alu_op + funct3 + funct7 to alu_ctrl |
| imm_gen.v | Immediate generator; sign-extends all 5 encoding types |
| register_bank.v | 32x32 register file, write on negedge, read combinational |
| data_memory.v | Byte-addressed data memory with funct3-controlled access width |
| forwarding_unit.v | EX-EX and MEM-EX forwarding |
| hazard_detection.v | Load-use hazard detection and stall generation |
| PC.v | Program counter register |
| instruction_memory.v | Instruction memory, loaded from .mem file |

The processor is instantiated inside top.sv, which provides clock, reset, and memory file loading for simulation.


Pipeline Stages

The pipeline follows the classic 5-stage RISC-V structure. Each stage is separated by a set of pipeline registers that latch values on the rising clock edge.

IF → IF/ID → ID → ID/EX → EX → EX/MEM → MEM → MEM/WB → WB

Stage 1 — Instruction Fetch (IF)

The PC register outputs the current instruction address. The instruction memory reads the instruction at that address combinationally. On each rising clock edge, the PC advances to mux_to_pc, which selects between PC+4, a branch target, or a jump target.

The IF/ID pipeline register latches the instruction and PC address. It is frozen when a load-use stall is detected (if_id_stall) and flushed when a branch is taken or a jump resolves (pipeline_flush).

Signals latched into IF/ID:

  • if_id_instruction — raw 32-bit instruction word
  • if_id_pc_addr — address of the fetched instruction

Stage 2 — Instruction Decode (ID)

The instruction word is decoded by the control unit using the opcode field ([6:0]). The register file is read using rs1 and rs2 fields from the instruction. The immediate generator sign-extends the immediate value according to the instruction encoding type.

The ID/EX pipeline register latches all decoded values. It is flushed on reset, load-use hazard (id_ex_flush), or branch/jump taken (pipeline_flush), inserting a NOP bubble into the pipeline.

Signals latched into ID/EX:

  • Register data: id_ex_rs1, id_ex_rs2
  • Register addresses: id_ex_rs1_addr, id_ex_rs2_addr (for forwarding)
  • Destination register: id_ex_rd
  • Instruction fields: id_ex_funct3, id_ex_funct7
  • Immediate: id_ex_imm
  • PC values: id_ex_pc, id_ex_pc_plus4
  • Control signals: id_ex_alu_op, id_ex_alu_src_a, id_ex_alu_src_b, id_ex_reg_write, id_ex_mem_to_reg, id_ex_mem_read, id_ex_mem_write, id_ex_branch, id_ex_jump
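
The immediate generator's job can be illustrated with a small behavioural model. This is a sketch in Python rather than the project's Verilog, and the function names (sign_extend, imm_gen) are hypothetical; only the bit layouts come from the RV32I encoding itself.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def imm_gen(instr: int) -> int:
    """Return the signed immediate of a 32-bit RV32I instruction word."""
    opcode = instr & 0x7F
    if opcode in (0x13, 0x03, 0x67):      # I-type: OP-IMM, LOAD, JALR
        return sign_extend(instr >> 20, 12)
    if opcode == 0x23:                    # S-type: STORE
        imm = ((instr >> 25) << 5) | ((instr >> 7) & 0x1F)
        return sign_extend(imm, 12)
    if opcode == 0x63:                    # B-type: BRANCH (bit 0 is always 0)
        imm = ((((instr >> 31) & 1) << 12) | (((instr >> 7) & 1) << 11)
               | (((instr >> 25) & 0x3F) << 5) | (((instr >> 8) & 0xF) << 1))
        return sign_extend(imm, 13)
    if opcode in (0x37, 0x17):            # U-type: LUI, AUIPC (upper 20 bits)
        return sign_extend(instr & 0xFFFFF000, 32)
    if opcode == 0x6F:                    # J-type: JAL (bit 0 is always 0)
        imm = ((((instr >> 31) & 1) << 20) | (((instr >> 12) & 0xFF) << 12)
               | (((instr >> 20) & 1) << 11) | (((instr >> 21) & 0x3FF) << 1))
        return sign_extend(imm, 21)
    return 0                              # R-type and others carry no immediate
```

For example, imm_gen(0xFE000EE3) decodes the offset of beq x0, x0, -4.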

Stage 3 — Execute (EX)

The ALU decoder produces a 4-bit alu_ctrl signal from alu_op, funct3, and funct7. The forwarding unit selects the correct ALU inputs — either from the ID/EX registers or forwarded from EX/MEM or MEM/WB. The ALU computes the result, zero flag, and signed/unsigned comparison flags (lt, ltu). The branch target address is computed as PC + imm.
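
A behavioural sketch of the ALU's outputs, written in Python rather than Verilog. The operation names are hypothetical stand-ins for the 4-bit alu_ctrl values, whose actual encoding is not listed here, and computing lt and ltu directly from the operands is an assumption about how alu.v derives them.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(op: str, a: int, b: int):
    """Return (result, flags) for one of the 10 base integer operations."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F                       # shifts use only the low 5 bits
    ops = {
        "add": a + b,
        "sub": a - b,
        "sll": a << shamt,
        "slt": int(to_signed(a) < to_signed(b)),
        "sltu": int(a < b),
        "xor": a ^ b,
        "srl": a >> shamt,
        "sra": to_signed(a) >> shamt,      # Python's >> is arithmetic on negatives
        "or": a | b,
        "and": a & b,
    }
    result = ops[op] & MASK32
    flags = {
        "zero": result == 0,
        "lt": to_signed(a) < to_signed(b),  # assumed: signed compare of operands
        "ltu": a < b,                       # assumed: unsigned compare of operands
    }
    return result, flags
```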

Forwarding encoding (fwd_a, fwd_b):

  • 2'b00 — no forwarding, use ID/EX register data
  • 2'b10 — forward from EX/MEM ALU result
  • 2'b01 — forward from MEM/WB write data

ALU source mux encoding:

  • alu_src_a: 00=RS1, 01=PC, 10=zero (LUI)
  • alu_src_b: 00=RS2, 01=immediate, 10=4 (PC+4)

Signals latched into EX/MEM:

  • ex_mem_alu_result, ex_mem_zero, ex_mem_lt, ex_mem_ltu
  • ex_mem_rs2_data — forwarded RS2 value for stores
  • ex_mem_branch_target — computed branch destination
  • ex_mem_pc_plus4 — for JAL/JALR writeback
  • ex_mem_rd, ex_mem_funct3
  • Control signals: ex_mem_reg_write, ex_mem_mem_to_reg, ex_mem_mem_read, ex_mem_mem_write, ex_mem_branch, ex_mem_jump

Stage 4 — Memory (MEM)

The data memory is accessed for load and store instructions. The funct3 field controls access width (byte, halfword, word) and sign extension. The PC mux is also resolved here — branch condition is evaluated using the ALU flags and funct3, and the correct next PC is selected.
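
How funct3 selects load width and sign extension can be sketched as follows. This is a Python model, not the contents of data_memory.v; the function name load is hypothetical, and little-endian byte order is assumed (the RISC-V default).

```python
def load(mem: bytes, addr: int, funct3: int) -> int:
    """Return the 32-bit register value produced by a load instruction."""
    # funct3: 000=LB, 001=LH, 010=LW, 100=LBU, 101=LHU
    size = {0b000: 1, 0b001: 2, 0b010: 4, 0b100: 1, 0b101: 2}[funct3]
    signed = funct3 in (0b000, 0b001, 0b010)
    raw = int.from_bytes(mem[addr:addr + size], "little", signed=signed)
    return raw & 0xFFFFFFFF                # sign bits fill the upper register bits
```

With memory bytes 0x80 0xFF 0x00 0x01 at address 0, LB returns 0xFFFFFF80 while LBU returns 0x00000080.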

Branch condition logic:

| funct3 | Instruction | Condition |
|--------|-------------|-----------|
| 000 | BEQ | zero == 1 |
| 001 | BNE | zero == 0 |
| 100 | BLT | lt == 1 |
| 101 | BGE | lt == 0 |
| 110 | BLTU | ltu == 1 |
| 111 | BGEU | ltu == 0 |
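
The branch condition table above translates directly into a lookup. The following Python sketch uses a hypothetical function name, and treating the two unused funct3 values as "not taken" is an assumption.

```python
def branch_taken(funct3: int, zero: bool, lt: bool, ltu: bool) -> bool:
    """Evaluate the branch condition from the ALU flags and funct3."""
    conditions = {
        0b000: zero,        # BEQ
        0b001: not zero,    # BNE
        0b100: lt,          # BLT
        0b101: not lt,      # BGE
        0b110: ltu,         # BLTU
        0b111: not ltu,     # BGEU
    }
    # funct3 values 010 and 011 are not branches; assumed to never be taken
    return conditions.get(funct3, False)
```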

PC mux priority:

  1. Jump (ex_mem_jump) → ex_mem_alu_result (computed target)
  2. Branch taken → ex_mem_branch_target
  3. Default → pc_addr + 4
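
The priority order above can be modelled as a chain of checks. This Python sketch uses a hypothetical function name; take_branch stands for the already evaluated branch condition combined with ex_mem_branch.

```python
def next_pc(pc_addr: int, jump: bool, take_branch: bool,
            alu_result: int, branch_target: int) -> int:
    """Select the next PC value, with jump taking priority over branch."""
    if jump:                       # 1. jump: target computed by the ALU
        return alu_result
    if take_branch:                # 2. taken branch: PC + imm from EX
        return branch_target
    return (pc_addr + 4) & 0xFFFFFFFF   # 3. default: sequential fetch
```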

Signals latched into MEM/WB:

  • mem_wb_alu_result, mem_wb_read_data, mem_wb_pc_plus4
  • mem_wb_rd, mem_wb_reg_write, mem_wb_mem_to_reg

Stage 5 — Writeback (WB)

The writeback mux selects the value to write back to the register file based on mem_to_reg:

| mem_to_reg | Source |
|------------|--------|
| 2'b00 | ALU result |
| 2'b01 | Memory read data |
| 2'b10 | PC+4 (JAL/JALR link) |
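
As a minimal Python sketch of this mux (the function name is hypothetical, and the behaviour for the undocumented value 2'b11 is left undefined, so the lookup raises KeyError):

```python
def writeback_value(mem_to_reg: int, alu_result: int,
                    read_data: int, pc_plus4: int) -> int:
    """Select the value written to the register file in WB."""
    sources = {
        0b00: alu_result,   # arithmetic, logic, LUI, AUIPC
        0b01: read_data,    # loads
        0b10: pc_plus4,     # JAL/JALR link address
    }
    return sources[mem_to_reg]   # 0b11 is not documented: KeyError
```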

The register file write occurs on the falling clock edge, one half-cycle after the MEM/WB latch updates on the rising edge.


Hazard Handling

Load-Use Hazard

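Detected by hazard_detection.v when a load instruction is in EX and the immediately following instruction, currently in ID, reads the load's destination register. Resolution: freeze the PC and the IF/ID register for one cycle and flush ID/EX to insert a NOP bubble. After the bubble, the loaded value reaches the dependent instruction through MEM/WB forwarding.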
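
The detection condition can be sketched in Python as below. This follows the textbook formulation, not necessarily the exact logic in hazard_detection.v; the function and parameter names are hypothetical, and excluding x0 as a destination is an assumption.

```python
def load_use_stall(id_ex_mem_read: bool, id_ex_rd: int,
                   if_id_rs1: int, if_id_rs2: int) -> bool:
    """Return True when the instruction in ID must wait for a load in EX."""
    return bool(
        id_ex_mem_read                        # instruction in EX is a load
        and id_ex_rd != 0                     # assumed: x0 never creates a hazard
        and id_ex_rd in (if_id_rs1, if_id_rs2)
    )
```

When this returns True, the PC and IF/ID hold their values and ID/EX is cleared for one cycle.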

Data Hazards (RAW)

Handled by forwarding_unit.v. EX/MEM forwarding takes priority over MEM/WB forwarding when both match the same destination register.
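
A behavioural Python sketch of the selection for one ALU operand, using the fwd_a/fwd_b encoding listed in the Execute stage (2'b10 = EX/MEM, 2'b01 = MEM/WB). The function name is hypothetical, and the x0 exclusion is an assumption rather than something confirmed for forwarding_unit.v.

```python
def forward_select(rs_addr: int,
                   ex_mem_reg_write: bool, ex_mem_rd: int,
                   mem_wb_reg_write: bool, mem_wb_rd: int) -> int:
    """Return the 2-bit forwarding select for a source register in EX."""
    # EX/MEM holds the most recent result, so it is checked first
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == rs_addr:
        return 0b10
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == rs_addr:
        return 0b01
    return 0b00   # no match: use the value latched in ID/EX
```

The same function would be evaluated once with id_ex_rs1_addr to produce fwd_a and once with id_ex_rs2_addr to produce fwd_b.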

Branch Hazard

Branches resolve in the MEM stage. When a branch is taken or a jump executes, two incorrectly fetched instructions are in IF and ID. These are flushed by zeroing IF/ID and ID/EX (pipeline_flush).

Branch penalty: 2 cycles on every taken branch or jump.


Known Issues and Limitations

R-type Forwarding (EX-EX path)

R-type instructions that depend on the result of a previous R-type instruction in the immediately preceding cycle do not forward correctly in all cases. The fwd_a signal (ALU input A) does not always assert when it should, causing the instruction to read a stale value from the register file. The root cause is a 1-cycle misalignment between when id_ex_rs1_addr is valid after a branch flush and when the result is available in the forwarding network. This affects loops and any sequence of dependent R-type instructions.

Workaround: Insert an independent instruction between dependent R-type instructions, or use immediate instructions (addi, etc.) instead.

Branch Penalty Not Hidden

The 2-cycle branch penalty is not mitigated. No branch prediction or delayed branching is implemented. Every taken branch wastes 2 cycles.

No Exception or Interrupt Handling

There is no CSR (Control and Status Register) support, no trap mechanism, and no privilege levels. ecall, ebreak, and all CSR instructions are unsupported.

Memory Map

Instruction and data memory are separate modules (Harvard architecture). Data memory is byte-addressed starting at address 0. There is no memory-mapped I/O.

Instruction Support

The following base RV32I instructions are implemented and tested:

  • I-type: addi, slti, sltiu, xori, ori, andi, slli, srli, srai, lw, lh, lb, lhu, lbu, jalr
  • S-type: sw, sh, sb
  • B-type: beq, bne, blt, bge, bltu, bgeu
  • R-type: add, sub, sll, slt, sltu, xor, srl, sra, or, and (forwarding issues, see above)
  • U-type: lui, auipc
  • J-type: jal

The following are not implemented: fence, ecall, ebreak, all CSR instructions, and the M/A/F/D extensions.


Assembler

A two-pass Perl assembler (rvc.pl) is provided that translates RISC-V assembly to the .mem format expected by the simulation. It supports all base RV32I instructions and the standard pseudoinstructions (nop, li, mv, la, not, neg, j, jr, ret, call, tail, beqz, bnez, blez, bgez, bltz, bgtz, bgt, ble, bgtu, bleu, seqz, snez, sltz, sgtz).

Usage:

perl rvc.pl <input.asm> <output.mem>

Note: Immediate values must be in decimal. Hex literals (0x...) are not currently supported and will be silently misassembled.


Scripting and Simulation Automation

A key focus of this project was building a repeatable, automated simulation flow using Perl and ModelSim, reducing manual steps and enabling consistent regression testing across design iterations.


Assembler (rvc.pl)

A two-pass Perl assembler translates RISC-V assembly source files into the .mem binary format consumed by the ModelSim simulation environment.

Design decisions:

  • Two-pass architecture — Pass 1 strips comments, resolves label addresses, and computes instruction sizes (accounting for pseudoinstructions that expand to 2 words such as li, la, call, and tail). Pass 2 performs the actual encoding with all label addresses known, enabling correct forward references in branches and jumps.
  • Recursive pseudo expansion — Pseudoinstructions are expanded by recursively calling the assemble subroutine with rewritten operands, keeping encoding logic in a single place and avoiding duplication.
  • Immediate encoding correctness — B-type and J-type immediates use the RISC-V scrambled bit layout. The assembler correctly handles the upper/lower split for 32-bit li, accounting for sign extension of the lower 12 bits when constructing the lui upper immediate.
  • Padding — Output is zero-padded to fill the full memory size, matching the array dimensions declared in the HDL memory module.
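
The two immediate-handling points above can be illustrated with a short Python sketch. It is not taken from rvc.pl (which is written in Perl); the function names split_li and encode_b_imm are hypothetical, and only the arithmetic and bit positions follow the RV32I encoding.

```python
def split_li(value: int):
    """Split a 32-bit constant into (lui 20-bit immediate, addi 12-bit immediate)."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:          # addi sign-extends, so the low part becomes negative
        lo -= 0x1000
    hi = ((value - lo) >> 12) & 0xFFFFF   # upper part absorbs the +1 carry
    return hi, lo


def encode_b_imm(offset: int) -> int:
    """Place a branch byte offset into the B-type immediate bit positions."""
    imm = offset & 0x1FFF
    return ((((imm >> 12) & 1) << 31)      # imm[12]   -> instr[31]
            | (((imm >> 5) & 0x3F) << 25)  # imm[10:5] -> instr[30:25]
            | (((imm >> 1) & 0xF) << 8)    # imm[4:1]  -> instr[11:8]
            | (((imm >> 11) & 1) << 7))    # imm[11]   -> instr[7]
```

For li with 0x12345FFF, the low 12 bits sign-extend to -1, so the upper immediate must be 0x12346 rather than 0x12345.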

Supported pseudoinstructions: nop, mv, li (full 32-bit), la, not, neg, seqz, snez, sltz, sgtz, beqz, bnez, blez, bgez, bltz, bgtz, bgt, ble, bgtu, bleu, j, jr, ret, call, tail.

Known limitation: Immediate values must be decimal. Hex literals (0x...) are not currently parsed and will be misassembled silently.


Simulation Runner (run_waveform.pl)

A Perl script automates the full ModelSim compile and simulate flow, replacing a static .do file with a dynamically generated one. This was motivated by the need to support multiple source file configurations without manually editing the simulation script each time.

What it does:

  1. Reads a sources.txt file listing all HDL source paths to compile, allowing the file list to be updated without touching the script itself.
  2. Dynamically constructs a vlog compile command with all source files, then a vsim invocation with the .mem file passed as a plusarg (+MEMFILE=).
  3. Generates a .do script on the fly that includes wave group definitions, a run -all, and wave zoom full, then invokes ModelSim with it via system().
  4. Integrates with VS Code via a tasks.json build task, allowing the simulation to be launched with a single keypress.
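
The generation step can be sketched as follows. This is a Python illustration, not the Perl code in run_waveform.pl; the function names are hypothetical, the sources.txt format (one path per line, with # comments) is an assumption, and the wave group definitions are omitted because their signal lists are not given here.

```python
def read_sources(text: str) -> list:
    """Parse the contents of a sources.txt-style file into a list of paths."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):   # assumed comment syntax
            paths.append(line)
    return paths


def build_do_script(sources: list, memfile: str, top: str = "top") -> str:
    """Return the text of a ModelSim .do script for one simulation run."""
    lines = [
        "vlib work",
        "vmap work work",
        "vlog -sv " + " ".join(sources),
        f"vsim {top} +MEMFILE={memfile}",
        "run -all",
        "wave zoom full",
    ]
    return "\n".join(lines) + "\n"
```

Keeping the script text as a pure function of the source list and the memory file makes it easy to check the generated commands without launching ModelSim.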

Design benefit: Because the source list (sources.txt), the simulation parameters (passed as arguments), and the script logic are kept separate, the runner can be reused across different test programs and design configurations without modification.

Usage:

perl run_waveform.pl scripts/p1.mem

Simulation Environment

The HDL is compiled and simulated using ModelSim (Intel FPGA Edition). The flow is:

vdel -all          ← clean previous compilation
vlib work          ← create work library
vmap work work     ← map library
vlog -sv <files>   ← compile all HDL sources
vsim top +MEMFILE= ← launch simulation with memory initialisation

Wave groups are organised by pipeline stage and module (control, alu, dut_signals, reg_bank) for efficient signal navigation during debug. The +MEMFILE plusarg mechanism allows different test programs to be loaded without recompiling the HDL, which is important for regression testing where the same design is exercised with many different stimulus files.


Directed Stimulus Approach

Each test program is a hand-written RISC-V assembly file that exercises a specific subset of the instruction set. Tests follow a common convention: on completion, register x10 holds 1 for pass or 0 for fail, and the processor halts in an infinite loop (j end). This makes result extraction straightforward — reading x10 from the register file at end of simulation gives an unambiguous pass/fail indicator without requiring a complex self-checking testbench.

This approach mirrors directed stimulus methodology used in gate-level verification, where targeted sequences are written to exercise specific datapath conditions (hazards, forwarding paths, branch penalties) and verify conformance to the ISA specification.
