Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Processor doesn't park itself in a know good state prior to going to sleep #744

Closed
mikaelsky opened this issue Dec 5, 2023 · 3 comments · Fixed by #746
Closed

Processor doesn't park itself in a know good state prior to going to sleep #744

mikaelsky opened this issue Dec 5, 2023 · 3 comments · Fixed by #746
Assignees
Labels
enhancement New feature or request HW hardware-related

Comments

@mikaelsky
Copy link
Collaborator

Describe the bug
When utilizing clock gating driven be the sleep signal the processor hangs on wakeup. This is root caused to be because the fetch engine ends up hanging in the IF_PENDING state which it never exists as the response from the memory system never arrives.
In sleep state the fetch engine is holding the RE signal on the instruction bus high, which prevents non-processor bus managers from taking the bus.

To Reproduce
Implement clock gating for the riscv and peripherals as seperate controls. Have the riscv clock gate but have the memories still run. This happens in systems where the processor can go to sleep but an external port like I2C access the memories while the processor is asleep.
This will allow the memories to respond and move on, locking out the riscv fetch engine.

Expected behavior
When going to sleep the processor should park itself in a know good state from which it can safely resume.

Environment:

  • OS: RHEL 7
  • GCC Version RISCV 13.2

Additional context
As I'm still on the older version of the core I can't push the implemented fix. The fix is somewhat simple and involves touch the fetch engine and execute engine in neorv32_cpu_control.vhd.
With the provided the changes the processor safely parks itself prior to issuing sleep and safely resumes at PC+4 after going out of sleep.

Changes to the cpu_sleep indicator. We only assert sleep when the processor is in CPU_SLEEP and the fetch engine is safely parked in IF_WAIT.

-- MM
--  ctrl_o.cpu_sleep    <= '1' when (execute_engine.state = CPU_SLEEP) else '0';
  ctrl_o.cpu_sleep    <= '1' when ((execute_engine.state = CPU_SLEEP) and (fetch_engine.state = IF_WAIT)) else '0';

Changes to the fetch engine. This ensures that IF_PENDING goes to IF_WAIT to park the fetch engine when we go to sleep. We are using the execute engine current and next state being CPU_SLEEP to prevent IF_PENDING for going back to IF_REQUEST when the next processor state is sleep.
In IF_WAIT we hang around until the processor is out of sleep.

        when IF_PENDING => -- wait for bus response and write instruction data to prefetch buffer
        -- ------------------------------------------------------------
          if (fetch_engine.resp = '1') then -- wait for bus response
            fetch_engine.pc        <= std_ulogic_vector(unsigned(fetch_engine.pc) + 4);
            fetch_engine.unaligned <= '0';
            if (fetch_engine.restart = '1') or (fetch_engine.reset = '1') then -- restart request (fast)
              fetch_engine.state <= IF_RESTART;
            -- do not trigger new instruction fetch when a branch instruction is being executed (wait for branch destination)
            elsif ((execute_engine.ir(instr_opcode_msb_c downto instr_opcode_lsb_c+2) = opcode_branch_c(6 downto 2)) and
                   (execute_engine.ir(31) = '1')) or -- predict: taken if branching backwards
                  (execute_engine.ir(instr_opcode_msb_c downto instr_opcode_lsb_c+2) = opcode_jal_c(6 downto 2)) or    -- always taken
                  (execute_engine.ir(instr_opcode_msb_c downto instr_opcode_lsb_c+2) = opcode_jalr_c(6 downto 2)) then -- always taken
              fetch_engine.state <= IF_WAIT;
              -- MM changes
            elsif ((execute_engine.state = CPU_SLEEP) or (execute_engine.state_nxt = CPU_SLEEP))  then -- we are going to sleep
              fetch_engine.state <= IF_WAIT;
            else -- request next instruction word
              fetch_engine.state <= IF_REQUEST;
            end if;
          end if;

        when IF_WAIT => -- wait for branch instruction
        -- ------------------------------------------------------------
        -- MM changes
          if (execute_engine.state /= CPU_SLEEP) then 
            if (fetch_engine.restart = '1') or (fetch_engine.reset = '1') then -- restart request (fast) if taken branch
              fetch_engine.state <= IF_RESTART;
            else
              fetch_engine.state <= IF_REQUEST;
            end if;
          end if;
@stnolting
Copy link
Owner

I never thought about using the sleep signal for some clock gating... but I like the idea!

How do you do the clock gating? If you turn off the CPU's clock during sleep mode then it won't be able to wake up again since the internal interrupt logic would also be shut down?! 🤔

This is root caused to be because the fetch engine ends up hanging in the IF_PENDING state which it never exists as the response from the memory system never arrives.

Changes to the cpu_sleep indicator. We only assert sleep when the processor is in CPU_SLEEP and the fetch engine is safely parked in IF_WAIT.

This makes sense. I will add this "feature" to the CPU. Thanks for that!

As I'm still on the older version of the core [...]

I can see that 😅 The fetch engine (and also the execute engine) got a whole make-over resulting in a smaller size and even higher f_max. Maybe you should consider an update. 😉

@stnolting stnolting self-assigned this Dec 6, 2023
@stnolting stnolting added enhancement New feature or request HW hardware-related labels Dec 6, 2023
@mikaelsky
Copy link
Collaborator Author

mikaelsky commented Dec 6, 2023

Working on on-boarding the update, will get there :)

As you said the processor doesn't have the ability to go out of sleep with no clock, but thats where the "magic" happens. Its a fundamentally simple combinatorial function that you can apply inside the cpu_control block.

Something like this:
assign wakeup = (|(csr.mie & {FIRQ, nonFIRQ})) | debug_mode;
So just a bitwise and of the csr.mie with all the IRQs which is then has an reduction or operator applied. We then or that with debug_mode.
A note here is that WFI will cause the processor to wake even if mstatus.mie is disabled. If mstatus.mie is disabled the WFI should go PC+4 vs taking the trap.

The clock gating circuit can either use a library cell or just use a plain old gate function:

logic clock_enable;
always @(negedge clk_i or negedge rst_n_i) begin
  if (rst_n_i == 1'b0)
    clock_enable <= 1'b1; // default to clock on
  else begin
    clock_enable <= 1'b1; // assume the clock is always on
    if ((sleep == 1'b1) && (wakeup == 1'b0))
      clock_enable <= 1'b0; // turn off the clock
  end
end
assign riscv_clk = clock_enable && clk_i; // gate off the clock after the falling edge of clk_i

@stnolting
Copy link
Owner

That's a really smart solution!

But there is one big problem: the FIRQs are high for just a single cycle when firing. And this single cycle is to short to bring the CPU out of sleep mode and to re-enable the clock.

One possible solution would be to split the CPU's clock domain into a switchable one and an always-on one (just for the the interrupt logic). However, I would like to keep all the CPU logic inside one clock domain - otherwise we would need at least two FPGA clock tiles for all the logic that might increase critical path length.

However, I really like the idea to turn of the CPU clock while being in sleep mode. So I would suggest to move the "clock enable" logic outside of the CPU: disable the CPU clock when the sleep signals gets high and re-enable the clock as soon as there is a rising-edge on any of the interrupt lines (FIRQ, non-FIRQs and debug-mode entry-IRQ.

@stnolting stnolting linked a pull request Dec 9, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request HW hardware-related
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants