Suspend Resume
The main SRAM for Precursor is battery-backed, and is persistent even when the SoC is fully powered down.
Therefore, not all servers need to be aware of suspend/resume: only servers which store state in volatile hardware registers or RAM require coordination. Thus, most "application-level" servers will be unaware of suspend/resume, as their state is stored in non-volatile memory.
Hardware servers which need to store volatile state (e.g. CSR configurations) will need to do the following:
- Most hardware-facing servers have an `implementation` block which allows for two different views of the server, one for actual hardware, and one for emulation ("hosted mode"). The `implementation` block which touches the hardware CSRs will need to allocate a `RegManager` structure, which is the backing storage for the hardware CSRs. The structure takes a parameter that specifies the number of registers or bitfields that need to be backed up. As a rough initial guess, a `_NUMREGS` constant is generated by the UTRA which is the number of registers in the CSR (don't forget to wrap it in curly braces so Rust knows it's a `const_generic` and not a type). Once the code is stabilized, this can be trimmed down. If too few entries are allocated, the manager will respond with a panic when trying to push the register that can't fit.
- All CSRs that need backing should be added to the `RegManager` structure using a `.push()` method. The `.push()` method takes as an argument the `Register` or `Field` you wish to store.
- A pair of methods should be added to the `implementation` block that correspond to a `suspend` or a `resume` operation. In the simplest case, these methods simply call the `.suspend()` or `.resume()` auto-traits on the `RegManager` structure.
- An `Opcode` should be added to the server's API for a Suspend callback. By convention, we name the Opcode `SuspendResume`.
- In the initialization routine, the server will allocate a `Susres` object, and call `.hook_suspend_callback()` on the object, handing it a local connection `CID` for the incoming `SuspendResume` message.
- In the main loop, the `SuspendResume` opcode should be handled. This is the general form of the handler:

    Some(api::Opcode::SuspendResume) => xous::msg_scalar_unpack!(msg, token, _, _, _, {
        implementation.suspend();
        susres.suspend_until_resume(token).expect("couldn't execute suspend/resume");
        implementation.resume();
    }),

That's it. From the server's standpoint, when a `SuspendResume` message comes in, it comes with a `token` which is used to help the suspend/resume manager tally who is ready for the suspend operation. The `suspend_until_resume()` call looks like it does nothing, but in fact it blocks execution until the system powers down. On power-up, the system resumes execution within the routine, and then returns back to the hardware server, which is why the `.resume()` call is the next method to be invoked.
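
To make the save/restore idea concrete, here is a minimal, self-contained sketch of a register backing store. It is not the real `RegManager`: `BackingStore`, `FakeCsrBank`, and `FAKE_NUMREGS` are hypothetical names, and the CSRs are modeled as a plain array rather than UTRA-generated accessors. It only illustrates the push / suspend / resume flow and the panic when too few entries are allocated.

```rust
/// Hypothetical register count; stands in for a UTRA-generated `_NUMREGS` constant.
pub const FAKE_NUMREGS: usize = 8;

/// Hypothetical stand-in for a block of hardware CSRs (a plain array here).
pub struct FakeCsrBank {
    pub regs: [u32; FAKE_NUMREGS],
}

/// Illustrative backing store with room for `N` registers (not the real RegManager).
pub struct BackingStore<const N: usize> {
    offsets: [usize; N], // which registers are managed
    saved: [u32; N],     // values captured at suspend time
    used: usize,         // number of valid entries in `offsets`
}

impl<const N: usize> BackingStore<N> {
    pub fn new() -> Self {
        BackingStore { offsets: [0; N], saved: [0; N], used: 0 }
    }

    /// Add a register to the managed set; panics if the store is already full.
    pub fn push(&mut self, offset: usize) {
        assert!(self.used < N, "backing store full: allocate more entries");
        self.offsets[self.used] = offset;
        self.used += 1;
    }

    /// Copy every managed register out of the (simulated) hardware.
    pub fn suspend(&mut self, csr: &FakeCsrBank) {
        for i in 0..self.used {
            self.saved[i] = csr.regs[self.offsets[i]];
        }
    }

    /// Write every saved value back into the (simulated) hardware.
    pub fn resume(&self, csr: &mut FakeCsrBank) {
        for i in 0..self.used {
            csr.regs[self.offsets[i]] = self.saved[i];
        }
    }
}
```

In this sketch a store sized by a constant would be declared as `BackingStore::<{ FAKE_NUMREGS }>::new()`, with the curly braces marking the argument as a const generic.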
It's assumed that there is a process called the susres server which coordinates suspend/resume.
The process has a thread that runs an execution_gate server. This is a unique SID whose sole purpose is to receive blocking scalars, and block until a resume happens. The "resume" state is coordinated by some AtomicBool value within the susres server.
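
The gating behavior can be sketched with ordinary threads. This is only an analogy under stated assumptions: each spawned thread stands in for a subscriber blocked on a scalar sent to the `execution_gate`, the polling loop stands in for the blocking message, and `run_gate_demo` is a hypothetical name. It returns how many threads had passed the gate before and after the flag was set.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Spawn `subscribers` threads that wait on a shared "resumed" flag, then
/// open the gate. Returns (threads past the gate before, threads past after).
pub fn run_gate_demo(subscribers: usize) -> (usize, usize) {
    let resumed = Arc::new(AtomicBool::new(false));
    let passed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..subscribers)
        .map(|_| {
            let resumed = Arc::clone(&resumed);
            let passed = Arc::clone(&passed);
            thread::spawn(move || {
                // Stand-in for a blocking scalar: do not proceed until resume.
                while !resumed.load(Ordering::Acquire) {
                    thread::yield_now();
                }
                passed.fetch_add(1, Ordering::SeqCst);
            })
        })
        .collect();

    // No thread can have incremented the counter while the flag is false.
    let before = passed.load(Ordering::SeqCst);

    // The "resume" event: flip the flag that gates execution.
    resumed.store(true, Ordering::Release);
    for handle in handles {
        handle.join().expect("subscriber thread panicked");
    }
    (before, passed.load(Ordering::SeqCst))
}
```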
The susres server also owns a software interrupt. The interrupt handler has a structure like this:

    fn susres_handler() {
        if !resume_register() {
            // Suspend path: power the system down; execution does not continue here.
            shutdown_system();
            loop {}
        } else {
            // Resume path: tell the main susres thread that we are back.
            try_send_message(sus_main_cid, ResumeMessage);
        }
    }

The ticktimer is augmented to split out some CSRs to a different virtual memory page, so that the susres server can manage the ticktimer state directly. This ensures that system time is precisely kept between suspend and resume operations. A strong assumption was made about the monotonicity of the ticktimer (it's a 64-bit milliseconds counter, so in practice it will never roll over), and so by allowing susres to manage the ticktimer, we can get fine-grained accuracy on the ticktimer state without having to do funny games with inter-process thread scheduling.
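
As a rough check on the "never roll over" claim, and to illustrate what managing the ticktimer state across suspend might look like, here is a small sketch. `FakeTickTimer`, `save_ticktimer`, and `restore_ticktimer` are hypothetical; the real ticktimer is a hardware block accessed through CSRs.

```rust
/// Milliseconds in a 365-day year, used only for the rollover estimate.
const MS_PER_YEAR: u64 = 1000 * 60 * 60 * 24 * 365;

/// Whole years before a 64-bit millisecond counter starting at zero wraps.
pub fn years_until_rollover() -> u64 {
    u64::MAX / MS_PER_YEAR
}

/// Hypothetical model of the ticktimer state that `susres` would manage.
pub struct FakeTickTimer {
    pub ms: u64,
    pub paused: bool,
}

/// Pause the counter and capture its value just before power-down.
pub fn save_ticktimer(timer: &mut FakeTickTimer) -> u64 {
    timer.paused = true;
    timer.ms
}

/// Reload the captured value and let the counter run again after resume.
pub fn restore_ticktimer(timer: &mut FakeTickTimer, saved_ms: u64) {
    timer.ms = saved_ms;
    timer.paused = false;
}
```

The estimate comes out in the hundreds of millions of years, which is why rollover can be ignored in practice.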
The "clean suspend" marker is a page in RAM that should contain at a minimum:
- A random nonce
- The `BtSeed` of the FPGA
- A hash of the above
The purpose of this marker is to make sure we don't try to resume from a "random" state of RAM. The potential failure mode we'd like to avoid is that we had a partial power-down, such that RAM state had decayed, but not sufficiently so to cause the hash check to fail. Thus for the "random nonce" perhaps we should fill most of the page with random data, under the theory that it becomes increasingly unlikely that we miss an actual power outage event as more bits are included in the hash check.
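
A minimal sketch of such a marker follows. Everything concrete in it is an assumption made for illustration: the 32-byte nonce, the 8-byte `bt_seed` width, the field layout, and above all the hash. FNV-1a is used only because it needs no dependencies; it is not cryptographic, and a real implementation would presumably use a stronger hash over a much larger nonce, as discussed above.

```rust
/// FNV-1a (64-bit), used here only as a dependency-free placeholder hash.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical layout of the marker: nonce, FPGA seed, hash of both.
pub struct CleanSuspendMarker {
    pub nonce: [u8; 32],
    pub bt_seed: [u8; 8],
    pub hash: u64,
}

/// Hash the nonce followed by the seed.
fn marker_hash(nonce: &[u8; 32], bt_seed: &[u8; 8]) -> u64 {
    let mut buf = [0u8; 40];
    buf[..32].copy_from_slice(nonce);
    buf[32..].copy_from_slice(bt_seed);
    fnv1a64(&buf)
}

/// Build the marker written to RAM just before suspend.
pub fn build_marker(nonce: [u8; 32], bt_seed: [u8; 8]) -> CleanSuspendMarker {
    let hash = marker_hash(&nonce, &bt_seed);
    CleanSuspendMarker { nonce, bt_seed, hash }
}

/// On boot: accept the marker only if the seed matches the running FPGA
/// image and the stored hash still matches the stored contents.
pub fn is_clean(marker: &CleanSuspendMarker, current_bt_seed: &[u8; 8]) -> bool {
    marker.bt_seed == *current_bt_seed
        && marker.hash == marker_hash(&marker.nonce, &marker.bt_seed)
}
```

A single decayed bit in the nonce, or a changed seed after an FPGA update, makes `is_clean` return `false` in this sketch, which corresponds to falling back to a cold boot.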
- During the first boot, a copy of the kernel's computed arguments is kept in the loader page, and the loader page is mapped as used, preventing it from being overwritten in Xous.
- A Suspend request is sent to the `susres` server
- Callbacks to all suspend subscribers to prep for suspend
- Suspend subscribers handle the suspend request per their own implementation, but at a minimum they all guarantee this behavior:
  - store hardware registers
  - send a `SuspendReady` scalar to the `susres` server indicating it is ready for being suspended
  - send a blocking scalar `SuspendingNow` to the `susres` `execution_gate` server
  - the subscriber thread that owns the CSR page does not modify registers after the `SuspendReady` message is sent (this may include disabling any interrupt handlers by setting `EV_EVENT` to 0)
- `susres` server waits until the sooner of all suspend subscribers indicating `SuspendReady` or a timeout.
- record if any hardware did not successfully suspend within the timeout
- set the "clean suspend" marker in RAM. Note that this record should be derived from `BtSeed` as well, so we can catch if the FPGA image has updated during the suspend (in which case we should have a clean boot).
- note the PID of the `susres` server in the "clean suspend" marker, so we know which process to resume into.
- `susres` server ensures the resume bit is cleared, and trips the interrupt to execute the `susres_handler()` noted above. This causes the kernel to save the last thread context and shut the system down somewhere inside the interrupt handler.
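
The "wait until the sooner of all subscribers ready or a timeout" step can be sketched with a standard channel. This is an illustration only: the `mpsc` channel stands in for `SuspendReady` scalars arriving at the `susres` server, tokens are modeled as plain `u32` values, and `wait_for_ready` is a hypothetical name. It returns the tokens that never reported, which is the information needed to record hardware that did not suspend in time.

```rust
use std::collections::BTreeSet;
use std::sync::mpsc::Receiver;
use std::time::{Duration, Instant};

/// Collect readiness tokens until every expected token has arrived or the
/// timeout expires. Returns the tokens that did not report, in sorted order.
pub fn wait_for_ready(rx: &Receiver<u32>, expected: &[u32], timeout: Duration) -> Vec<u32> {
    let mut pending: BTreeSet<u32> = expected.iter().copied().collect();
    let deadline = Instant::now() + timeout;

    while !pending.is_empty() {
        // Time left before the overall deadline (zero once it has passed).
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(token) => {
                pending.remove(&token);
            }
            // Timed out, or every sender has gone away: stop waiting.
            Err(_) => break,
        }
    }
    pending.into_iter().collect()
}
```

An empty result means every subscriber reported in time; a non-empty one lists the stragglers.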
- Power on, parse kernel args to figure out how big RAM is, etc.
- Check for the clean suspend marker; if it is absent or invalid, do a cold boot. If it is valid, extract the `susres` PID. Either way, zero out the marker.
- Re-initialize kernel peripherals (e.g. TRNG)
- Note if we had a clean suspend in the `susres` register, set up the resume interrupt, and trigger it (but interrupts are still masked -- we will handle this later)
- Flip the bit on the "resume" hardware `susres` register; ensure the ticktimer is paused so the user-space code can reload it right away.
- Reload the backup kernel arguments; patch the PID of the `susres` server into SATP.
- Boot into the kernel with the resume argument set; the asm.S post-amble for the loader contains the code that sets up the SATP and brings us into virtual memory mode.
- A separate asm.S pre-amble for the kernel checks the resume argument; if true, it sets up the system as if it were entering an interrupt context in the `susres` handler by setting the default stack pointer, enabling interrupts, and setting `scause` so an interrupt appears to be triggered.
- Jump to the interrupt dispatch routine in Xous, e.g. `_start_trap_rust`
- Xous enters the `susres` resume interrupt handler, but with the `resume` hardware bit set, causing it to pick the resume path.
- The `susres` server gets the resume message, sets the `AtomicBool` that gates the `execution_gate` to ungate its execution
- All the blocking scalars from the `SuspendingNow` call are unblocked. This allows thread execution to resume, at which point servers restore their hardware registers.
- Execution resumes as normal.
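
The loader's first decision in the steps above, cold boot versus resume with the marker always zeroed, can be sketched as follows. The page layout is invented for the example (the `susres` PID is assumed to live in byte 0), `choose_boot_path` and `BootPath` are hypothetical names, and the validity check is passed in as a closure rather than implemented.

```rust
/// Outcome of inspecting the clean suspend marker at power-on.
#[derive(Debug, PartialEq)]
pub enum BootPath {
    Cold,
    Resume { susres_pid: u8 },
}

/// Decide how to boot, then zero the marker page regardless of the outcome
/// so that a stale marker can never be honored twice.
/// Assumption: byte 0 of the page holds the `susres` PID.
pub fn choose_boot_path(marker_page: &mut [u8], is_clean: impl Fn(&[u8]) -> bool) -> BootPath {
    let path = if !marker_page.is_empty() && is_clean(marker_page) {
        BootPath::Resume { susres_pid: marker_page[0] }
    } else {
        BootPath::Cold
    };
    marker_page.fill(0);
    path
}
```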
This is the meat of the code that enables the MMU, given that the tables have been set up already: https://github.com/betrusted-io/xous-core/blob/main/loader/src/asm.S#L26-L54
With RISC-V, there are three modes: Machine, Supervisor, and User. Machine mode is always physical, Supervisor and User depend on the state of a bit in... I want to say MSTATUS. So what you have to do is set MSTATUS such that when you return from an interrupt, it goes into Supervisor mode. You set the return-from-interrupt address to be the address of main. When that happens, the MMU gets enabled and you enter Supervisor mode.
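
For reference, in the RISC-V privileged specification the privilege level that `mret` returns to is held in the `MPP` field of `mstatus` (bits 12:11), while address translation for Supervisor and User mode is switched on through the `MODE` field of `satp`. The sketch below shows only the `MPP` bit manipulation on a plain integer; the helper names are hypothetical, and the actual CSR writes are done in the assembly linked above.

```rust
/// Bit position of the MPP (Machine Previous Privilege) field in mstatus.
const MSTATUS_MPP_SHIFT: u32 = 11;
/// Two-bit mask covering mstatus bits 12:11.
const MSTATUS_MPP_MASK: u32 = 0b11 << MSTATUS_MPP_SHIFT;

/// Privilege encodings used by the MPP field.
#[derive(Clone, Copy)]
pub enum Privilege {
    User = 0b00,
    Supervisor = 0b01,
    Machine = 0b11,
}

/// Return `mstatus` with MPP replaced, so that a later `mret` would drop
/// into the requested privilege level.
pub fn with_mpp(mstatus: u32, target: Privilege) -> u32 {
    (mstatus & !MSTATUS_MPP_MASK) | ((target as u32) << MSTATUS_MPP_SHIFT)
}
```

Setting `MPP` to Supervisor and pointing the return address at the kernel entry is what lets a single `mret` both change privilege and begin executing with translation active.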
This is the loader-to-kernel jump point: https://github.com/betrusted-io/xous-core/blob/main/loader/src/main.rs#L1095-L1110
... and who is responsible for saving and restoring them.
- x0-x31 + pc: stored on trap kernel/asm.S, managed by kernel/riscv/process.rs
- f0-f31: not implemented
Let's work back from the source code to figure out what we can affect in a "restore" context. Here is a map of writable CSRs on the VexRiscV (as read out of the source code / these aren't analyzed yet):
Writable CSRs on the vexriscv
These are not used by Xous, because Xous does not use machine mode:
- 0xBC0: zz_256 -> masks externalInterruptArray_regNext for machine mode -> MIM
- 0x303: mideleg
- 0x302: medeleg
- 0x304: can write both mie/sie; 0x104 (sie) used by Xous
The "sstatus" registers are maintained by the kernel, and do not need an explicit "restore":
- 0x300: mstatus/sstatus/status
- 0x100: sstatus/status
These need to be restored by the loader, prior to Xous resume:
- 0x180: satp. This needs to have a PID set to the `susres` PID, as that's the process we are resuming into. The rest does not need to be touched, as the kernel occupies a megapage that sits at the top of every process (so any PID will have valid kernel pages at the right spot).
- 0x9c0: zz_258 -> masks externalInterruptArray_regNext for supervisor mode -> SIM. The value is stored in `SIM_BACKING` in kernel/src/arch/riscv/irq.rs.
- 0x344, 0x144 (true alias): sip - read prior to handling an interrupt. Does not seem to side-effect clearing the bit. No explicit restore is required, just need to trigger the resume software interrupt and the normal mechanisms should "do the right thing".
- 0x104: sie - static values loaded in
- In addition, the loader should block until the TRNG kernel port shows availability of data; then the first entry should be read and discarded as it is an invalid pipeline value.
- 0x105: stvec. This was not found in the code review, but this also needs to be set up properly. This is statically mapped to `_start_trap`, and is necessary for the interrupt handler to return as it returns by triggering an "instruction page fault" (e.g. returning to a known "bad" instruction page, and using that mechanism as the dedicated return-from-interrupt handler)
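
The `satp` patch described in the list above can be sketched as a bit-field update. The field layout is the standard Sv32 one (MODE in bit 31, ASID in bits 30:22, PPN in bits 21:0). That the PID is carried in the ASID field is an assumption made for this example, and `patch_satp_pid` is a hypothetical helper, not a function from the loader.

```rust
/// Bit position of the ASID field in an Sv32 satp value.
const SATP_ASID_SHIFT: u32 = 22;
/// Nine-bit mask covering satp bits 30:22.
const SATP_ASID_MASK: u32 = 0x1FF << SATP_ASID_SHIFT;

/// Replace only the ASID field of `satp`, leaving MODE and PPN untouched.
/// Assumption: the process ID is stored in the ASID field.
pub fn patch_satp_pid(satp: u32, pid: u8) -> u32 {
    (satp & !SATP_ASID_MASK) | ((pid as u32) << SATP_ASID_SHIFT)
}
```

Note that a real resume would also need the page-table root for the target process; this sketch covers only the field update itself.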
This is an ad-hoc list of things that keep me up at night when I think about suspend/resume.
- I2C - split transactions in progress
- Audio - currently playing audio buffer
- Engine25519 - currently computing, plus microcode/computation state
- SHA - currently computing, plus digest state
- AES - CPU AES registers?
- Memlcd - redraw in progress
- SPINOR - erase/program in progress
- JTAG - eFuse operation in progress