[rtl] CPU: optimizations and cleanup #462

Merged
merged 6 commits into from Dec 21, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -32,6 +32,7 @@ mimpid = 0x01040312 => Version 01.04.03.12 => v1.4.3.12

 | Date (*dd.mm.yyyy*) | Version | Comment |
 |:-------------------:|:-------:|:--------|
+| 21.12.2022 | 1.7.8.11 | CPU: remove explicit reset-to-don't-care; branch and CSR access check logic optimizations; close further illegal instruction encoding hole; [#462](https://github.com/stnolting/neorv32/pull/462) |
 | 20.12.2022 | 1.7.8.10 | SOC: rework r/w access logic; split read and write accesses into two processes; removed explicit reset-to-don't-care; [#461](https://github.com/stnolting/neorv32/pull/461) |
 | 18.12.2022 | 1.7.8.9 | `mtval` is no longer read-only and can now be written by machine-mode software; [#460](https://github.com/stnolting/neorv32/pull/460) |
 | 17.12.2022 | 1.7.8.8 | :bug: fix incorrect value written to `mepc` when encountering an "instruction access fault" exception; [#458](https://github.com/stnolting/neorv32/pull/458) |
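The changelog version and the `mimpid` value in the hunk header carry the same four fields. As a minimal sketch (not part of this PR), the C helper below unpacks a 32-bit version word into those fields. It assumes each byte holds two BCD digits, which is what the `0x01040312 => v1.4.3.12` example suggests; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four version fields (not a NEORV32 type). */
typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
    unsigned sub;
} hw_version_t;

/* Assumption: each byte stores two BCD digits, e.g. 0x12 -> 12. */
static unsigned bcd_byte_to_uint(uint8_t b) {
    assert((b >> 4) <= 9u && (b & 0x0Fu) <= 9u); /* reject non-BCD nibbles */
    return (unsigned)(b >> 4) * 10u + (unsigned)(b & 0x0Fu);
}

/* Split a version word such as 0x01070811 into 1 / 7 / 8 / 11. */
static hw_version_t decode_hw_version(uint32_t word) {
    hw_version_t v;
    v.major = bcd_byte_to_uint((uint8_t)(word >> 24));
    v.minor = bcd_byte_to_uint((uint8_t)(word >> 16));
    v.patch = bcd_byte_to_uint((uint8_t)(word >> 8));
    v.sub   = bcd_byte_to_uint((uint8_t)(word));
    return v;
}
```

With this sketch, `decode_hw_version(0x01070811u)` yields the fields of the new changelog entry, 1.7.8.11.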
11 changes: 7 additions & 4 deletions docs/datasheet/cpu_csr.adoc
@@ -8,14 +8,15 @@ understood by the assembler/compiler. The *[C]* names are defined by the NEORV32
 used as immediate in plain C code. The *R/W* column shows whether the CSR can be read and/or written.

 .CSRs that are not Implemented
-[NOTE]
+[IMPORTANT]
 All CSR bits that are unused / not implemented / not shown are _hardwired to zero_. All CSRs that are not
-implemented (not supported or disabled) will raise an illegal instruction exception when being accessed.
+implemented, not supported or disabled will raise an illegal instruction exception when being accessed.

 .WARL Behavior
-[NOTE]
+[IMPORTANT]
 All writable CSRs provide **WARL** behavior (write all values; read only legal values). Application software
-should read back a CSR after writing to check if the targeted bits can actually be modified (or are read-only).
+should always read back a CSR after writing to check if the targeted bits can actually be modified (or are
+just read-only).

 .Debug-Mode CSRs
 [NOTE]
@@ -318,6 +319,8 @@ Machine-mode software can discover available `Z*` _sub-extensions_ (like `Zicsr`
 | 0 | _CSR_MCOUNTEREN_CY_ | r/w | **CY**: User-level code is allowed to read `instret[h]` CSRs when set
 |=======================

+If User mode is not implemented, this register is read-only and always returns zero when read.
+
 .HPM Access
 [NOTE]
 Bits 3 to 31 are used to control user-level access to the <<_hardware_performance_monitors_hpm_csrs>>. In the NEORV32
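The WARL note in the documentation hunk above recommends reading a CSR back after every write. The following is a host-side sketch of that check in C, using a hypothetical software model of a CSR instead of real CSR instructions. The struct, the function names and the writable mask are assumptions for illustration only and are not NEORV32 APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a WARL CSR: only the bits set in `writable_mask`
 * can change; all other bits keep their hardwired value. */
typedef struct {
    uint32_t value;
    uint32_t writable_mask;
} warl_csr_t;

/* Write all values: non-writable bits silently ignore the write. */
static void csr_write(warl_csr_t *csr, uint32_t data) {
    assert(csr != NULL);
    csr->value = (csr->value & ~csr->writable_mask) | (data & csr->writable_mask);
}

/* Read only legal values. */
static uint32_t csr_read(const warl_csr_t *csr) {
    assert(csr != NULL);
    return csr->value;
}

/* Write, read back, and report whether every targeted bit now holds the
 * requested value. */
static bool csr_write_checked(warl_csr_t *csr, uint32_t data, uint32_t target_bits) {
    csr_write(csr, data);
    return ((csr_read(csr) ^ data) & target_bits) == 0u;
}
```

On real hardware the two model functions would be replaced by actual CSR write and read accesses; the comparison in `csr_write_checked` stays the same.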
8 changes: 4 additions & 4 deletions rtl/core/neorv32_cpu_alu.vhd
@@ -160,16 +160,16 @@ begin
 -- > "cp_start" is high for one cycle to trigger operation of the according co-processor
 cp_start(4 downto 0) <= ctrl_i(ctrl_cp_trig4_c downto ctrl_cp_trig0_c);

--- co-processor operation done? --
--- > "cp_valid" signal has to be set (for one cycle) one cycle before output data (cp_result) is valid
+-- (iterative) co-processor operation done? --
+-- > "cp_valid" signal has to be set (for one cycle) one cycle before CP output data (cp_result) is valid
 idone_o <= cp_valid(0) or cp_valid(1) or cp_valid(2) or cp_valid(3) or cp_valid(4);

 -- co-processor result --
--- > "cp_result" data has to be always zero unless unique co-processor was actually triggered
+-- > "cp_result" data has to be always zero unless the specific co-processor has been actually triggered
 cp_res <= cp_result(0) or cp_result(1) or cp_result(2) or cp_result(3) or cp_result(4);


--- Co-Processor 0: Shifter Unit (CPU Base ISA) --------------------------------------------
+-- Co-Processor 0: Shifter Unit ('I'/'E' Base ISA) ----------------------------------------
 -- -------------------------------------------------------------------------------------------
 neorv32_cpu_cp_shifter_inst: neorv32_cpu_cp_shifter
 generic map (
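The comments in the ALU hunk explain why every co-processor must drive an all-zero result while idle: the five results are simply OR-ed together. The C sketch below is a purely combinational software model of that rule, with hypothetical type and function names. It ignores the one-cycle offset between `cp_valid` and `cp_result` described in the VHDL comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CP 5 /* five co-processor slots, matching cp_valid(4 downto 0) */

typedef struct {
    bool     valid;  /* models cp_valid(i) */
    uint32_t result; /* models cp_result(i); must be 0 while idle */
} cp_out_t;

typedef struct {
    bool     idone; /* models idone_o */
    uint32_t res;   /* models cp_res */
} cp_combined_t;

/* OR-combine all co-processor outputs, as the ALU does. */
static cp_combined_t cp_combine(const cp_out_t *cp, size_t n) {
    assert(cp != NULL && n == NUM_CP);
    cp_combined_t out = { false, 0u };
    for (size_t i = 0; i < n; i++) {
        out.idone = out.idone || cp[i].valid;
        out.res |= cp[i].result; /* a non-zero idle result would corrupt this */
    }
    return out;
}
```

If exactly one entry carries a non-zero `result`, `res` equals that value; any stray bit driven by an idle co-processor is OR-ed into the final result, which is the failure mode the zero-when-idle requirement prevents.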
4 changes: 2 additions & 2 deletions rtl/core/neorv32_cpu_bus.vhd
@@ -377,8 +377,8 @@ begin
 if (rstn_i = '0') then
 arbiter.pend <= '0';
 arbiter.err <= '0';
-arbiter.pmp_r_err <= '-';
-arbiter.pmp_w_err <= '-';
+arbiter.pmp_r_err <= '0';
+arbiter.pmp_w_err <= '0';
 elsif rising_edge(clk_i) then
 arbiter.pmp_r_err <= ld_pmp_fault;
 arbiter.pmp_w_err <= st_pmp_fault;
161 changes: 74 additions & 87 deletions rtl/core/neorv32_cpu_control.vhd

Large diffs are not rendered by default.

13 changes: 6 additions & 7 deletions rtl/core/neorv32_cpu_cp_bitmanip.vhd
@@ -228,11 +228,11 @@ begin
 begin
 if (rstn_i = '0') then
 ctrl_state <= S_IDLE;
-cmd_buf <= (others => '-');
-rs1_reg <= (others => '-');
-rs2_reg <= (others => '-');
-sha_reg <= (others => '-');
-less_reg <= '-';
+cmd_buf <= (others => '0');
+rs1_reg <= (others => '0');
+rs2_reg <= (others => '0');
+sha_reg <= (others => '0');
+less_reg <= '0';
 clmul.start <= '0';
 shifter.start <= '0';
 valid <= '0';
@@ -402,8 +402,7 @@ begin
 case ctrl_i(ctrl_ir_funct3_2_c downto ctrl_ir_funct3_1_c) is
 when "01" => opb_v := rs1_reg(rs1_reg'left-1 downto 0) & '0'; -- << 1
 when "10" => opb_v := rs1_reg(rs1_reg'left-2 downto 0) & "00"; -- << 2
-when "11" => opb_v := rs1_reg(rs1_reg'left-3 downto 0) & "000"; -- << 3
-when others => opb_v := (others => '-'); -- undefined
+when others => opb_v := rs1_reg(rs1_reg'left-3 downto 0) & "000"; -- << 3
 end case;
 adder_core <= std_ulogic_vector(unsigned(rs2_reg) + unsigned(opb_v));
 end process shift_adder;
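The second bitmanip hunk folds the former "undefined" branch of the shift-adder into the shift-by-3 case. As a rough behavioral sketch in C (function and parameter names are hypothetical), the new selection logic can be modeled as follows, assuming 32-bit operands and that the selector holds `funct3` bits 2 down to 1:

```c
#include <assert.h>
#include <stdint.h>

/* Model of the shift-adder after this change: rs2 + (rs1 << shamt).
 * `funct3_2_1` stands for ctrl_i(ctrl_ir_funct3_2_c downto ctrl_ir_funct3_1_c). */
static uint32_t shift_add(uint32_t rs1, uint32_t rs2, unsigned funct3_2_1) {
    assert(funct3_2_1 <= 3u); /* two-bit selector */
    unsigned shamt;
    switch (funct3_2_1) {
        case 1u:  shamt = 1u; break; /* "01": << 1 */
        case 2u:  shamt = 2u; break; /* "10": << 2 */
        default:  shamt = 3u; break; /* "11" and the otherwise unused "00": << 3 */
    }
    /* Unsigned arithmetic wraps modulo 2^32, like the 32-bit VHDL adder. */
    return rs2 + (rs1 << shamt);
}
```

The selector value "00" now produces the same defined output as "11" instead of a don't-care value; the sketch makes no claim about whether that encoding ever reaches this unit.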
2 changes: 1 addition & 1 deletion rtl/core/neorv32_cpu_cp_cfu.vhd
@@ -113,7 +113,7 @@ begin
 cfu_control: process(rstn_i, clk_i)
 begin
 if (rstn_i = '0') then
-res_o <= (others => '-'); -- no actual reset required
+res_o <= (others => '0');
 control.busy <= '0';
 elsif rising_edge(clk_i) then
 res_o <= (others => '0'); -- default; all CPU co-processor outputs are logically OR-ed
72 changes: 36 additions & 36 deletions rtl/core/neorv32_cpu_cp_fpu.vhd
@@ -358,13 +358,13 @@ begin
 ctrl_engine.state <= S_IDLE;
 ctrl_engine.valid <= '0';
 ctrl_engine.start <= '0';
-fpu_operands.frm <= (others => '-');
-fpu_operands.rs1 <= (others => '-');
-fpu_operands.rs1_class <= (others => '-');
-fpu_operands.rs2 <= (others => '-');
-fpu_operands.rs2_class <= (others => '-');
-funct_ff <= (others => '-');
-cmp_ff <= (others => '-');
+fpu_operands.frm <= (others => '0');
+fpu_operands.rs1 <= (others => '0');
+fpu_operands.rs1_class <= (others => '0');
+fpu_operands.rs2 <= (others => '0');
+fpu_operands.rs2_class <= (others => '0');
+funct_ff <= (others => '0');
+cmp_ff <= (others => '0');
 elsif rising_edge(clk_i) then
 -- arbiter defaults --
 ctrl_engine.valid <= '0';
@@ -1246,24 +1246,24 @@ begin
 begin
 if (rstn_i = '0') then
 ctrl.state <= S_IDLE;
-ctrl.norm_r <= '-';
-ctrl.cnt <= (others => '-');
-ctrl.cnt_pre <= (others => '-');
-ctrl.cnt_of <= '-';
-ctrl.cnt_uf <= '-';
-ctrl.rounded <= '-';
-ctrl.res_exp <= (others => '-');
-ctrl.res_man <= (others => '-');
-ctrl.res_sgn <= '-';
-ctrl.class <= (others => '-');
-ctrl.flags <= (others => '-');
+ctrl.norm_r <= '0';
+ctrl.cnt <= (others => '0');
+ctrl.cnt_pre <= (others => '0');
+ctrl.cnt_of <= '0';
+ctrl.cnt_uf <= '0';
+ctrl.rounded <= '0';
+ctrl.res_exp <= (others => '0');
+ctrl.res_man <= (others => '0');
+ctrl.res_sgn <= '0';
+ctrl.class <= (others => '0');
+ctrl.flags <= (others => '0');
 --
-sreg.upper <= (others => '-');
-sreg.lower <= (others => '-');
-sreg.dir <= '-';
-sreg.ext_g <= '-';
-sreg.ext_r <= '-';
-sreg.ext_s <= '-';
+sreg.upper <= (others => '0');
+sreg.lower <= (others => '0');
+sreg.dir <= '0';
+sreg.ext_g <= '0';
+sreg.ext_r <= '0';
+sreg.ext_s <= '0';
 --
 done_o <= '0';
 elsif rising_edge(clk_i) then
@@ -1626,18 +1626,18 @@ begin
 begin
 if (rstn_i = '0') then
 ctrl.state <= S_IDLE;
-ctrl.cnt <= (others => '-');
-ctrl.sign <= '-';
-ctrl.class <= (others => '-');
-ctrl.rounded <= '-';
-ctrl.over <= '-';
-ctrl.under <= '-';
-ctrl.unsign <= '-';
-ctrl.result <= (others => '-');
-ctrl.result_tmp <= (others => '-');
-sreg.int <= (others => '-');
-sreg.mant <= (others => '-');
-sreg.ext_s <= '-';
+ctrl.cnt <= (others => '0');
+ctrl.sign <= '0';
+ctrl.class <= (others => '0');
+ctrl.rounded <= '0';
+ctrl.over <= '0';
+ctrl.under <= '0';
+ctrl.unsign <= '0';
+ctrl.result <= (others => '0');
+ctrl.result_tmp <= (others => '0');
+sreg.int <= (others => '0');
+sreg.mant <= (others => '0');
+sreg.ext_s <= '0';
 done_o <= '0';
 elsif rising_edge(clk_i) then
 -- defaults --
26 changes: 13 additions & 13 deletions rtl/core/neorv32_cpu_cp_muldiv.vhd
@@ -123,11 +123,11 @@ begin
 begin
 if (rstn_i = '0') then
 ctrl.state <= S_IDLE;
-ctrl.rs2_abs <= (others => '-');
-ctrl.cnt <= (others => '-');
-ctrl.cp_op_ff <= (others => '-');
+ctrl.rs2_abs <= (others => '0');
+ctrl.cnt <= (others => '0');
+ctrl.cp_op_ff <= (others => '0');
 ctrl.out_en <= '0';
-div.sign_mod <= '-';
+div.sign_mod <= '0';
 elsif rising_edge(clk_i) then
 -- defaults --
 ctrl.out_en <= '0';
@@ -223,9 +223,9 @@ begin
 -- no parallel multiplier --
 multiplier_core_parallel_none:
 if (FAST_MUL_EN = false) generate
-mul.dsp_x <= (others => '-');
-mul.dsp_y <= (others => '-');
-mul.dsp_z <= (others => '-');
+mul.dsp_x <= (others => '0');
+mul.dsp_y <= (others => '0');
+mul.dsp_z <= (others => '0');
 end generate;


@@ -270,8 +270,8 @@ begin
 -- no serial multiplier --
 multiplier_core_serial_none:
 if (FAST_MUL_EN = true) generate
-mul.add <= (others => '-');
-mul.p_sext <= '-';
+mul.add <= (others => '0');
+mul.p_sext <= '0';
 end generate;


@@ -314,10 +314,10 @@ begin
 -- no divider --
 divider_core_serial_none:
 if (DIVISION_EN = false) generate
-div.remainder <= (others => '-');
-div.quotient <= (others => '-');
-div.sub <= (others => '-');
-div.res_u <= (others => '-');
+div.remainder <= (others => '0');
+div.quotient <= (others => '0');
+div.sub <= (others => '0');
+div.res_u <= (others => '0');
 div.res <= (others => '0');
 end generate;

3 changes: 2 additions & 1 deletion rtl/core/neorv32_package.vhd
@@ -62,7 +62,7 @@ package neorv32_package is

 -- Architecture Constants (do not modify!) ------------------------------------------------
 -- -------------------------------------------------------------------------------------------
-constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01070810"; -- NEORV32 version - no touchy!
+constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01070811"; -- NEORV32 version - no touchy!
 constant archid_c : natural := 19; -- official RISC-V architecture ID - hands off!

 -- Check if we're inside the Matrix -------------------------------------------------------
@@ -440,6 +440,7 @@ package neorv32_package is
 constant funct3_csrrw_c : std_ulogic_vector(2 downto 0) := "001"; -- csr r/w
 constant funct3_csrrs_c : std_ulogic_vector(2 downto 0) := "010"; -- csr read & set bit
 constant funct3_csrrc_c : std_ulogic_vector(2 downto 0) := "011"; -- csr read & clear bit
+constant funct3_csril_c : std_ulogic_vector(2 downto 0) := "100"; -- undefined/illegal
 constant funct3_csrrwi_c : std_ulogic_vector(2 downto 0) := "101"; -- csr r/w immediate
 constant funct3_csrrsi_c : std_ulogic_vector(2 downto 0) := "110"; -- csr read & set bit immediate
 constant funct3_csrrci_c : std_ulogic_vector(2 downto 0) := "111"; -- csr read & clear bit immediate
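The new `funct3_csril_c` constant names the one `funct3` value in the CSR group that has no defined operation. The C sketch below illustrates how a decoder might classify the `funct3` field using the encodings listed in this hunk. The enum and function names are hypothetical, and the actual check in `neorv32_cpu_control.vhd` is not rendered on this page, so this is not a description of that logic. Treating "000" as "not a CSR access" is an assumption based on the RISC-V base ISA, where that value is used by the environment/system instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the funct3 field of a CSR instruction. */
typedef enum {
    CSR_ACC_NONE,    /* "000": not a CSR access (assumption, see above) */
    CSR_ACC_WRITE,   /* "001" csrrw, "101" csrrwi */
    CSR_ACC_SET,     /* "010" csrrs, "110" csrrsi */
    CSR_ACC_CLEAR,   /* "011" csrrc, "111" csrrci */
    CSR_ACC_ILLEGAL  /* "100": undefined/illegal (funct3_csril_c) */
} csr_access_t;

static csr_access_t classify_csr_funct3(uint8_t funct3) {
    assert(funct3 <= 7u); /* three-bit field */
    switch (funct3) {
        case 0x1: case 0x5: return CSR_ACC_WRITE;
        case 0x2: case 0x6: return CSR_ACC_SET;
        case 0x3: case 0x7: return CSR_ACC_CLEAR;
        case 0x4:           return CSR_ACC_ILLEGAL;
        default:            return CSR_ACC_NONE;
    }
}

/* Bit 2 selects the immediate variants, but only together with a
 * non-zero operation in bits 1:0. */
static bool csr_uses_immediate(uint8_t funct3) {
    return (funct3 & 0x4u) != 0u && (funct3 & 0x3u) != 0u;
}
```

Listing "100" as an explicit case makes the encoding hole visible in the decoder rather than leaving it to a default branch.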