Suspend Resume
The main SRAM for Precursor is battery-backed, and is persistent even when the SoC is fully powered down.
Therefore, not all servers need to be aware of suspend/resume: only servers which store state in volatile hardware registers or RAM require coordination. Thus, most "application-level" servers will be unaware of suspend/resume, as their state is stored in non-volatile memory.
Hardware servers which need to store volatile state (e.g. CSR configurations) will need to do the following:
- Most hardware-facing servers have an `implementation` block which allows for two different views of the server, one for actual hardware, and one for emulation ("hosted mode"). The `implementation` block which touches the hardware CSRs will need to allocate a `RegManager` structure, which is the backing storage for the hardware CSRs. The structure takes a parameter that specifies the number of registers or bitfields that need to be backed up. As a rough initial guess, a `_NUMREGS` constant is generated by the UTRA which is the number of registers in the CSR (don't forget to wrap it in curly braces so Rust knows it's a `const_generic` and not a type). Once the code is stabilized, this can be trimmed down. If too few entries are allocated, the manager will respond with a panic when trying to push the register that can't fit.
- All CSRs that need backing should be added to the `RegManager` structure using a `.push()` method. The `.push()` method takes as an argument the `Register` or `Field` you wish to store.
- A pair of methods should be added to the `implementation` block that correspond to a `suspend` or a `resume` operation. In the simplest case, these methods simply call the `.suspend()` or `.resume()` auto-traits on the `RegManager` structure.
- An `Opcode` should be added to the server's API for a Suspend callback. By convention, we name the Opcode `SuspendResume`.
- In the initialization routine, the server will allocate a `Susres` object, and call `.hook_suspend_callback()` on the object, handing it a local connection `CID` for the incoming `SuspendResume` message.
- In the main loop, the `SuspendResume` opcode should be handled. This is the general form of the handler:
```rust
Some(api::Opcode::SuspendResume) => xous::msg_scalar_unpack!(msg, token, _, _, _, {
    implementation.suspend();
    susres.suspend_until_resume(token).expect("couldn't execute suspend/resume");
    implementation.resume();
}),
```
That's it. From the server's standpoint, when a `SuspendResume` message comes in, it comes with a `token` which is used to help the suspend/resume manager tally who is ready for the suspend operation.
The `suspend_until_resume()` call looks like it does nothing, but in fact it blocks execution until the system powers down. On power-up, the system resumes execution within the routine, and then returns back to the hardware server, which is why the `.resume()` call is the next method to be invoked.
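The save/restore pattern described above can be modeled in plain Rust. This is a minimal sketch under stated assumptions, not the real `RegManager`: `FakeCsr`, `BackingStore`, `FAKE_NUMREGS` and `round_trip` are hypothetical names, and a `HashMap` stands in for the hardware CSR page. It only illustrates the fixed-capacity const generic, the panic on overflow, and the suspend/resume round trip.

```rust
use std::collections::HashMap;

/// Stand-in for a block of hardware CSRs: register offset -> value.
#[derive(Default)]
struct FakeCsr {
    regs: HashMap<usize, u32>,
}

/// Hypothetical backing store holding at most `N` register snapshots.
struct BackingStore<const N: usize> {
    entries: [Option<(usize, u32)>; N],
    len: usize,
}

impl<const N: usize> BackingStore<N> {
    fn new() -> Self {
        BackingStore { entries: [None; N], len: 0 }
    }

    /// Registers an offset for backup; panics when the store is full.
    fn push(&mut self, offset: usize) {
        assert!(self.len < N, "backing store too small for register {:#x}", offset);
        self.entries[self.len] = Some((offset, 0));
        self.len += 1;
    }

    /// Copies every tracked register out of the (fake) hardware.
    fn suspend(&mut self, csr: &FakeCsr) {
        for entry in self.entries.iter_mut().flatten() {
            entry.1 = *csr.regs.get(&entry.0).unwrap_or(&0);
        }
    }

    /// Writes every saved value back into the (fake) hardware.
    fn resume(&self, csr: &mut FakeCsr) {
        for (offset, value) in self.entries.iter().flatten() {
            csr.regs.insert(*offset, *value);
        }
    }
}

/// Hypothetical register count, standing in for a generated `_NUMREGS` constant.
const FAKE_NUMREGS: usize = 2;

/// Saves two registers, wipes the fake hardware, then restores it.
fn round_trip() -> FakeCsr {
    let mut csr = FakeCsr::default();
    csr.regs.insert(0x0, 0xdead_beef);
    csr.regs.insert(0x4, 0x1234);
    // Curly braces mark the argument as a const expression rather than a type.
    let mut store = BackingStore::<{ FAKE_NUMREGS }>::new();
    store.push(0x0);
    store.push(0x4);
    store.suspend(&csr);
    csr.regs.clear(); // models the SoC losing power
    store.resume(&mut csr);
    csr
}
```

Pushing a third register into this two-entry store would hit the `assert!` and panic, which mirrors the behavior described for an undersized allocation.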
It's assumed that there is a process called the `susres` server which coordinates suspend/resume.
The process has a thread that runs an `execution_gate` server. This is a unique SID whose sole purpose is to receive blocking scalars, and block until a resume happens. The "resume" state is coordinated by some `AtomicBool` value within the `susres` server.
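The gating idea can be sketched with standard-library threads. This is only an illustration under assumptions: `wait_at_gate` and `run_gate_demo` are hypothetical names, and a polling loop on an `AtomicBool` stands in for the blocking scalar that a real subscriber would send to the `execution_gate` server.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Blocks the caller until `resumed` becomes true.
/// Polling is a stand-in for a blocking scalar message.
fn wait_at_gate(resumed: &AtomicBool) {
    while !resumed.load(Ordering::Acquire) {
        thread::sleep(Duration::from_millis(1));
    }
}

/// Spawns `n` subscribers that block at the gate, opens the gate,
/// and returns how many subscribers got through.
fn run_gate_demo(n: usize) -> usize {
    let resumed = Arc::new(AtomicBool::new(false));
    let passed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let resumed = Arc::clone(&resumed);
            let passed = Arc::clone(&passed);
            thread::spawn(move || {
                wait_at_gate(&resumed);
                passed.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    // While the gate is closed, no subscriber makes progress.
    thread::sleep(Duration::from_millis(20));
    assert_eq!(passed.load(Ordering::SeqCst), 0);

    // The "resume" event: flip the flag and let everyone continue.
    resumed.store(true, Ordering::Release);
    for handle in handles {
        handle.join().unwrap();
    }
    passed.load(Ordering::SeqCst)
}
```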
The `susres` server also owns a software interrupt. The interrupt handler has a structure like this:
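```rust
fn susres_handler() {
    if !resume_register() {
        // Suspend path: power down and never return from here.
        shutdown_system();
        loop {}
    } else {
        // Resume path: tell the susres main loop that we are back.
        try_send_message(sus_main_cid, ResumeMessage);
    }
}
```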
The `ticktimer` is augmented to split out some CSRs to a different virtual memory page, so that the `susres` server can manage the `ticktimer` state directly. This ensures that system time is precisely kept between suspend and resume operations. A strong assumption was made about the monotonicity of the ticktimer (it's a 64-bit milliseconds counter, so in practice it will never roll over), and so by allowing `susres` to manage the ticktimer, we can get fine-grained accuracy on the `ticktimer` state without having to do funny games with inter-process thread scheduling.
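As a quick check on the "never roll over" claim, the arithmetic can be written out. The function name is hypothetical and a 365.25-day year is assumed; the result is on the order of hundreds of millions of years.

```rust
/// Years until a 64-bit millisecond counter wraps, assuming 365.25-day years.
fn years_until_rollover() -> u64 {
    // 1000 ms * 60 s * 60 min * 24 h * 365.25 days
    const MS_PER_YEAR: u64 = 1000 * 60 * 60 * 24 * 36525 / 100;
    u64::MAX / MS_PER_YEAR
}
```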
The "clean suspend" marker is a page in RAM that should contain at a minimum:
- A random nonce
- The `BtSeed` of the FPGA
- A hash of the above
The purpose of this marker is to make sure we don't try to resume from a "random" state of RAM. The potential failure mode we'd like to avoid is that we had a partial power-down, such that RAM state had decayed, but not sufficiently so to cause the hash check to fail. Thus for the "random nonce" perhaps we should fill most of the page with random data, under the theory that it becomes increasingly unlikely that we miss an actual power outage event as more bits are included in the hash check.
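One way to picture the marker check is the sketch below. Everything in it is an assumption made for illustration: the struct layout, the field sizes (`NONCE_LEN`, `SEED_LEN`), and the names are hypothetical, and `DefaultHasher` is not a cryptographic hash; it is used only to keep the example self-contained.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

const NONCE_LEN: usize = 64; // hypothetical size; a real marker might fill most of a page
const SEED_LEN: usize = 8; // hypothetical size of the FPGA seed

/// Hypothetical marker layout: nonce, FPGA seed, and a digest over both.
struct CleanSuspendMarker {
    nonce: [u8; NONCE_LEN],
    bt_seed: [u8; SEED_LEN],
    digest: u64,
}

/// Digest over nonce and seed. DefaultHasher is a NON-cryptographic stand-in.
fn digest_of(nonce: &[u8], bt_seed: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(nonce);
    hasher.write(bt_seed);
    hasher.finish()
}

impl CleanSuspendMarker {
    fn new(nonce: [u8; NONCE_LEN], bt_seed: [u8; SEED_LEN]) -> Self {
        let digest = digest_of(&nonce, &bt_seed);
        CleanSuspendMarker { nonce, bt_seed, digest }
    }

    /// True only if the stored contents still hash to the stored digest
    /// and the stored seed matches the currently loaded FPGA image.
    fn is_valid(&self, current_seed: &[u8; SEED_LEN]) -> bool {
        &self.bt_seed == current_seed
            && digest_of(&self.nonce, &self.bt_seed) == self.digest
    }
}
```

A longer nonce gives RAM decay more bits to corrupt, so a partial power loss is more likely to change at least one hashed bit and fail the check.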
- During the first boot, a copy of the kernel's computed arguments is kept in the loader page, and the loader page is mapped as used, preventing it from being overwritten in Xous.
- A Suspend request is sent to the `susres` server
- Callbacks to all suspend subscribers to prep for suspend
- Suspend subscribers handle the suspend request per their own implementation, but at a minimum they all guarantee this behavior:
  - store hardware registers
  - send a `SuspendReady` scalar to the `susres` server indicating it is ready for being suspended
  - send a blocking scalar `SuspendingNow` to the `susres` `execution_gate` server
  - the subscriber thread that owns the CSR page does not modify registers after the `SuspendReady` message is sent (this may include disabling any interrupt handlers by setting `EV_EVENT` to 0)
- `susres` server waits until the sooner of all suspend subscribers indicating `SuspendReady` or a timeout.
- record if any hardware did not successfully suspend within the timeout
- set the "clean suspend" marker in RAM. Note that this record should be derived from `BtSeed` as well, so we can catch if the FPGA image has updated during the suspend (in which case we should have a clean boot).
- note the PID of the `susres` server in the "clean suspend" marker, so we know which process to resume into.
- `susres` server ensures the resume bit is cleared, and trips the interrupt to execute the `susres_handler()` noted above. This causes the kernel to save the last thread context and shut the system down somewhere inside the interrupt handler.
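The "all subscribers ready, or a timeout" step in the sequence above can be sketched with a standard-library channel. This is an illustrative model, not the `susres` implementation: `wait_for_ready` is a hypothetical name, tokens are plain `u32` values, and an `mpsc` channel stands in for the `SuspendReady` scalar messages.

```rust
use std::collections::HashSet;
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// Waits until every expected token has reported ready, or until `timeout`
/// elapses, whichever comes first. Returns the tokens that never reported,
/// sorted ascending (empty means every subscriber suspended in time).
fn wait_for_ready(rx: &Receiver<u32>, expected: &[u32], timeout: Duration) -> Vec<u32> {
    let deadline = Instant::now() + timeout;
    let mut pending: HashSet<u32> = expected.iter().copied().collect();

    while !pending.is_empty() {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(token) => {
                pending.remove(&token);
            }
            // Timed out, or every sender went away: stop waiting.
            Err(_) => break,
        }
    }

    let mut late: Vec<u32> = pending.into_iter().collect();
    late.sort_unstable();
    late
}
```

The returned list corresponds to the "record if any hardware did not successfully suspend within the timeout" step.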
- Power on, parse kernel args to figure out how big RAM is, etc.
- Check for the clean suspend marker; if it is absent, do a cold boot. Either way, zero out the marker. If it is present, extract the `susres` PID.
- Re-initialize kernel peripherals (e.g. TRNG)
- Note if we had a clean suspend in the `susres` register, set up the resume interrupt, and trigger it (but interrupts are still masked -- we will handle this later)
- Flip the bit on the "resume" hardware `susres` register; ensure the ticktimer is paused so the user-space code can reload it right away.
- Reload the backup kernel arguments; patch the PID of `susres` server into SATP.
- Boot into the kernel with the resume argument set; the asm.S post-amble for the loader contains the code that sets up the SATP and brings us into virtual memory mode.
- A separate asm.S pre-amble for the kernel checks the resume argument; if true, it sets up the system as if it were entering an interrupt context in the `susres` handler by setting the default stack pointer, enabling interrupts, and setting `scause` so an interrupt appears to be triggered.
- Jump to the interrupt dispatch routine in Xous, e.g. `_start_trap_rust`
- Xous enters the `susres` resume interrupt handler, but with the `resume` hardware bit set, causing it to pick the resume path.
- The `susres` server gets the resume message, sets the `AtomicBool` that gates the `execution_gate` to ungate its execution
- All the blocking scalars from the `SuspendReady` call are unblocked. This allows thread execution to resume, at which point servers restore their hardware registers.
- Execution resumes as normal.
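The marker check at the top of the resume sequence can be modeled as a small decision function. This is a sketch under assumptions: `BootPath`, `choose_boot_path` and `PID_OFFSET` are hypothetical, the marker page is a plain byte slice, and the validity check is passed in as a boolean rather than computed here.

```rust
/// Hypothetical byte offset of the susres PID within the marker page.
const PID_OFFSET: usize = 0;

#[derive(Debug, PartialEq)]
enum BootPath {
    Cold,
    Resume { susres_pid: u8 },
}

/// Decides between cold boot and resume, then zeroes the marker either way
/// so that a stale marker can never be honored on a later boot.
fn choose_boot_path(marker: &mut [u8], marker_is_valid: bool) -> BootPath {
    let path = if marker_is_valid && marker.len() > PID_OFFSET {
        BootPath::Resume { susres_pid: marker[PID_OFFSET] }
    } else {
        BootPath::Cold
    };
    marker.fill(0);
    path
}
```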
This is the meat of the code that enables the MMU, given that the tables have been set up already: https://github.com/betrusted-io/xous-core/blob/main/loader/src/asm.S#L26-L54
With RISC-V, there are three modes: Machine, Supervisor, and User. Machine mode is always physical, Supervisor and User depend on the state of a bit in... I want to say MSTATUS. So what you have to do is set MSTATUS such that when you return from an interrupt, it goes into Supervisor mode. You set the return-from-interrupt address to be the address of main. When that happens, the MMU gets enabled and you enter Supervisor mode.
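For reference, in the RISC-V privileged specification the field that selects the privilege level entered on `mret` is `mstatus.MPP` (bits 12:11), with `0b01` meaning Supervisor; address translation itself is switched on through the `satp` register. The sketch below only shows the bit manipulation on a plain integer; the function and constant names are hypothetical and nothing here touches a real CSR.

```rust
const MSTATUS_MPP_SHIFT: u32 = 11;
const MSTATUS_MPP_MASK: u32 = 0b11 << MSTATUS_MPP_SHIFT;
const MPP_SUPERVISOR: u32 = 0b01;

/// Returns `mstatus` with MPP set so that `mret` lands in Supervisor mode.
/// All other bits are preserved.
fn mstatus_return_to_supervisor(mstatus: u32) -> u32 {
    (mstatus & !MSTATUS_MPP_MASK) | (MPP_SUPERVISOR << MSTATUS_MPP_SHIFT)
}
```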
This is the loader-to-kernel jump point: https://github.com/betrusted-io/xous-core/blob/main/loader/src/main.rs#L1095-L1110
... and who is responsible for saving and restoring them.
- x0-x31 + pc: stored on trap kernel/asm.S, managed by kernel/riscv/process.rs
- f0-f31: not implemented
Let's work back from the source code to figure out what we can affect in a "restore" context. Here is a map of writable CSRs on the VexRiscV (as read out of the source code / these aren't analyzed yet):
Writable CSRs on the vexriscv
These are not used by Xous, because Xous does not use machine mode:
- 0xBC0: zz_256 -> masks externalInterruptArray_regNext for machine mode -> MIM
- 0x303: mideleg
- 0x302: medeleg
- 0x304: can write both mie/sie; 0x104 (sie) used by Xous
The "sstatus" registers are maintained by the kernel, and do not need an explicit "restore":
- 0x300: mstatus/sstatus/status
- 0x100: sstatus/status
These need to be restored by the loader, prior to Xous resume:
- 0x180: satp. This needs to have a PID set to the `susres` PID, as that's the process we are resuming into. The rest does not need to be touched, as the kernel occupies a megapage that sits at the top of every process (so any PID will have valid kernel pages at the right spot).
- 0x9c0: zz_258 -> masks externalInterruptArray_regNext for supervisor mode -> SIM. The value is stored in `SIM_BACKING` in kernel/src/arch/riscv/irq.rs.
- 0x344, 0x144 (true alias): sip - read prior to handling an interrupt. Does not seem to side-effect clearing the bit. No explicit restore is required, just need to trigger the resume software interrupt and the normal mechanisms should "do the right thing".
- 0x104: sie - static values loaded in
- In addition, the loader should block until the TRNG kernel port shows availability of data; then the first entry should be read and discarded as it is an invalid pipeline value.
- 0x105: stvec. This was not found in the code review, but this also needs to be set up properly. This is statically mapped to `_start_trap`, and is necessary for the interrupt handler to return as it returns by triggering an "instruction page fault" (e.g. returning to a known "bad" instruction page, and using that mechanism as the dedicated return-from-interrupt handler)
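The satp item in the list above can be illustrated with the Sv32 register layout from the RISC-V privileged specification: MODE in bit 31, a 9-bit ASID in bits 30:22, and the root page table PPN in bits 21:0. The sketch assumes the PID is carried in the ASID field; `satp_with_pid` is a hypothetical helper operating on a plain integer, not loader code.

```rust
const SATP_ASID_SHIFT: u32 = 22;
const SATP_ASID_MASK: u32 = 0x1FF << SATP_ASID_SHIFT;

/// Returns `satp` with its ASID field replaced by `pid`.
/// The MODE bit and the PPN field are left untouched.
fn satp_with_pid(satp: u32, pid: u8) -> u32 {
    (satp & !SATP_ASID_MASK) | ((pid as u32) << SATP_ASID_SHIFT)
}
```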
This is an ad-hoc list of things that keep me up at night when I think about suspend/resume.
- I2C - split transactions in progress
- Audio - currently playing audio buffer
- Engine25519 - currently computing, plus microcode/computation state
- SHA - currently computing, plus digest state
- AES - CPU AES registers?
- Memlcd - redraw in progress
- SPINOR - erase/program in progress
- JTAG - eFuse operation in progress