|corev| Pipeline
|corev| has a 4-stage in-order completion pipeline. The 4 stages are:
- Instruction Fetch (IF): Fetches instructions from memory via an aligning prefetch buffer, capable of fetching 1 instruction per cycle if the instruction-side memory system allows. The IF stage also pre-decodes RVC instructions into RV32I base instructions. See :ref:`instruction-fetch` for details.
- Instruction Decode (ID): Decodes the fetched instruction and performs the required register file reads. Jumps are taken from the ID stage.
- Execute (EX): Executes the instructions. The EX stage contains the ALU, Multiplier and Divider. Branches (with their condition met) are taken from the EX stage. Multi-cycle instructions will stall this stage until they are complete. The address generation part of the load-store-unit (LSU) is contained in EX as well.
- Writeback (WB): Writes the result of ALU, Multiplier, Divider, or Load instructions back to the register file.
:numref:`Cycle counts per instruction type` shows the cycle count per instruction type. Some instructions take a variable number of cycles; this is indicated as a range, e.g. 1..32 means that the instruction takes a minimum of 1 cycle and a maximum of 32 cycles. The cycle counts assume zero stalls on the instruction-side memory interface and zero stalls on the data-side memory interface.
Cycle counts per instruction type

Instruction Type | Cycles | Description |
---|---|---|
Integer Computational | 1 | Integer Computational Instructions are defined in the RISC-V RV32I Base Integer Instruction Set. |
CSR Access | 1 (all the other CSRs) | CSR Access Instructions are defined in 'Zicsr' of the RISC-V specification. |
Load/Store | 1; 2 (non-word aligned word transfer); 2 (halfword transfer crossing word boundary) | Load/Store is handled in 1 bus transaction using both EX and WB stages for 1 cycle each. For misaligned word transfers and for halfword transfers that cross a word boundary, 2 bus transactions are performed using EX and WB stages for 2 cycles each. |
Multiplication | 1 ( 4 ( | |corev| uses a single-cycle 32-bit x 32-bit multiplier with a 32-bit result. The multiplications with upper-word result take 4 cycles to compute. |
Division, Remainder | 3 - 35; 3 - 35; 35 (cpuctrl.dataindtiming is set) | The number of cycles depends on the divisor value (operand b), i.e. on the number of leading zero bits. The minimum number of cycles is 3 when the divisor has no leading zero bits (e.g., 0x80000000). The maximum number of cycles is 35 when the divisor is 0. |
Jump | 2 (PC hardening disabled); 3 (target is a non-word-aligned non-RVC instruction, PC hardening disabled); 3 (PC hardening enabled); 4 (target is a non-word-aligned non-RVC instruction, PC hardening enabled) | Jumps are performed in the ID stage. Upon a jump the IF stage (including prefetch buffer) is flushed. The new PC request will appear on the instruction-side memory interface in the same cycle the jump instruction is in the ID stage. |
mret | 2 (PC hardening disabled); 3 (target is a non-word-aligned non-RVC instruction, PC hardening disabled); 3 (PC hardening enabled); 4 (target is a non-word-aligned non-RVC instruction, PC hardening enabled) | The mret instruction is performed in the ID stage. Upon an mret the IF stage (including prefetch buffer) is flushed. The new PC request will appear on the instruction-side memory interface in the same cycle the mret instruction is in the ID stage. |
Branch (Not-Taken), PC hardening disabled | 1; 3 (cpuctrl.dataindtiming is set); 4 (cpuctrl.dataindtiming is set and target is a non-word-aligned non-RVC instruction) | Any branch where the condition is not met will not stall. |
Branch (Not-Taken), PC hardening enabled | 2; 3 (cpuctrl.dataindtiming is set); 4 (cpuctrl.dataindtiming is set and target is a non-word-aligned non-RVC instruction) | Any branch where the condition is not met will not stall. |
Branch (Taken) | 3; 4 (target is a non-word-aligned non-RVC instruction) | The EX stage is used to compute the branch decision. Any branch where the condition is met will be taken from the EX stage and will cause a flush of the IF stage (including prefetch buffer) and ID stage. |
fence.i | 5; 6 (target is a non-word-aligned non-RVC instruction) | The fence.i instruction is defined in 'Zifencei' of the RISC-V specification. Internally it is implemented as a jump to the instruction following the fence. The jump performs the required flushing as described above. A fence.i instruction will not complete until the external handshake has been completed. |
fence | 5; 6 (target is a non-word-aligned non-RVC instruction) | The fence instruction is implemented as a jump to the instruction following the fence. |
Zba, Zbb, Zbc, Zbs | 1 | All instructions from Zba, Zbb, Zbc, Zbs take 1 cycle. |
Zcmt | 2 (PC hardening disabled); 4 (PC hardening enabled) | Table jumps take 2 cycles without PC hardening. |
Zcmp | 2 - 18 (PC hardening disabled); 2 - 19 (PC hardening enabled) | The number of cycles depends on the number of registers saved or restored by the instructions. |
Zca, Zcb | 1 | Instructions from Zca and Zcb take 1 cycle. |
wfi, wfe | 2 - | Instructions causing sleep will not retire until wakeup. |
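
As a rough illustration of two variable-latency rows in the table above, the following Python sketch models the divide/remainder cycle count and the number of load/store bus transactions. It is not derived from the |corev| RTL. The linear relationship ``3 + leading zeros`` is an assumption that merely matches the two endpoints stated in the table (3 cycles for a divisor with no leading zero bits, 35 cycles for a divisor of 0), the divisor is treated as a raw 32-bit pattern, and all function names are hypothetical.

```python
def leading_zeros32(value: int) -> int:
    """Count the leading zero bits of a 32-bit value (32 when value is 0)."""
    value &= 0xFFFFFFFF
    return 32 - value.bit_length()


def div_cycles(divisor: int, dataindtiming: bool = False) -> int:
    """Estimate the divide/remainder cycle count.

    Assumption: latency grows by one cycle per leading zero bit of the
    divisor, which matches the documented minimum (3) and maximum (35).
    """
    if dataindtiming:
        # Fixed latency when cpuctrl.dataindtiming is set.
        return 35
    return 3 + leading_zeros32(divisor)


def lsu_transactions(address: int, size_bytes: int) -> int:
    """Number of bus transactions for a byte, halfword or word access.

    An access needs 2 transactions only when it crosses a 32-bit word
    boundary; EX and WB are each occupied for that many cycles.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("size_bytes must be 1, 2 or 4")
    first_word = address // 4
    last_word = (address + size_bytes - 1) // 4
    return 1 if first_word == last_word else 2
```

For example, ``lsu_transactions(0x1003, 2)`` models a halfword transfer that starts in the last byte of a word and therefore needs two bus transactions, whereas ``lsu_transactions(0x1001, 2)`` stays within one word.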
The |corev| experiences a 1 cycle penalty on the following hazards.
- Load data hazard (in case the instruction immediately following a load uses the result of that load).
- Jump register (``jalr``) data hazard (in case that a ``jalr`` depends on the result of an immediately preceding non-load instruction).
- An instruction causing an implicit CSR read in ID (``mret``, ``wfi``, ``wfe`` or table jump) while a CSR access instruction or an instruction causing an implicit CSR access is in the WB stage.
- An instruction causing an implicit CSR read in EX while a CSR access instruction or an instruction causing an implicit CSR access is in the WB stage.
- An instruction causing an explicit CSR read in EX while an instruction causing an implicit CSR write is in the WB stage.
- An instruction causing an explicit CSR read in EX while there is a RAW hazard with an explicit CSR write in WB.
The |corev| experiences a 2 cycle penalty on the following hazards.
- Jump register (``jalr``) data hazard (in case that a ``jalr`` depends on the result of an immediately preceding load instruction).
- An instruction causing an implicit CSR read in ID (``mret``, ``wfi``, ``wfe`` or table jump) while a CSR access instruction or an instruction causing an implicit CSR access is in the EX stage.
Note
Implicit CSR reads are reads performed by non-CSR instructions or CSR instructions reading CSR values from another CSR. Explicit CSR reads and writes are CSR instructions accessing the CSR encoded in the instruction word.
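
The register data hazards listed above can be summarized in a small lookup. The sketch below is only an illustration under stated assumptions: it covers the load data hazard and the two ``jalr`` data hazards, it ignores all CSR-related hazards, it only recognizes the RV32I base load mnemonics, and the ``Instr`` type and ``data_hazard_penalty`` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: only the RV32I base loads are recognized in this sketch.
LOADS = {"lb", "lh", "lw", "lbu", "lhu"}


@dataclass(frozen=True)
class Instr:
    mnemonic: str
    rd: Optional[int] = None      # destination register index, if any
    rs: Tuple[int, ...] = ()      # source register indices


def data_hazard_penalty(prev: Instr, curr: Instr) -> int:
    """Stall cycles when curr immediately follows prev in the pipeline."""
    # No dependency if prev writes nothing, writes x0, or curr does not
    # read prev's destination register.
    if prev.rd is None or prev.rd == 0 or prev.rd not in curr.rs:
        return 0
    prev_is_load = prev.mnemonic in LOADS
    if curr.mnemonic == "jalr":
        # jalr depending on a load: 2 cycles; on a non-load: 1 cycle.
        return 2 if prev_is_load else 1
    # Load data hazard: 1 cycle; other results incur no penalty here.
    return 1 if prev_is_load else 0
```

For instance, ``data_hazard_penalty(Instr("lw", rd=5, rs=(2,)), Instr("jalr", rd=1, rs=(5,)))`` returns 2, matching the 2 cycle penalty for a ``jalr`` that depends on an immediately preceding load.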