Zcheri_mode "loose" redefinition #133

Open
wants to merge 2 commits into
base: main

sorear
Contributor

@sorear sorear commented Feb 25, 2024

Meet the functional requirements of a hybrid ABI with 4 bits of architectural state and no use of omnipresent capability metadata. Independent of but strongly synergistic with #130.

Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
We now have three models for the implementation of MODESW: storing the
mode in the pcc metadata, as in ISAv9; storing the mode in the pcc LSB,
as in Morello; and storing the mode in a loosely associated bit.

Using the metadata is robust but uses a scarce metadata bit (especially
on RV32) for a bit with extremely low entropy, since legacy code by
definition does not make heavy use of code capabilities.

Storing the mode in the address LSB is superficially attractive, since
it is unused for code capabilities, but it is incompatible with existing
RISC-V design decisions (silently masking LSB on legacy mode jalr and
writes to xepc).
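
To make the LSB incompatibility concrete, here is a minimal C sketch, not part of the patch. It models the base RISC-V behaviour that JALR clears bit 0 of the computed target and that bit 0 of xepc is always zero. The `tag_with_mode` helper and the choice of bit 0 as a mode tag are hypothetical, used only to show that such a tag would not survive either path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding, for illustration only: mode carried in address bit 0. */
static uint64_t tag_with_mode(uint64_t addr, unsigned mode) {
    return (addr & ~(uint64_t)1) | (mode & 1u);
}

/* Base RISC-V JALR: the target is (rs1 + imm) with the least-significant bit cleared. */
static uint64_t jalr_target(uint64_t rs1, int64_t imm) {
    return (rs1 + (uint64_t)imm) & ~(uint64_t)1;
}

/* xepc: bit 0 is always zero, so a write silently drops it. */
static uint64_t xepc_after_write(uint64_t value) {
    return value & ~(uint64_t)1;
}

/* Self-check: a mode tag in bit 0 is lost on both paths. */
static void lsb_mode_is_lost(void) {
    uint64_t fp = tag_with_mode(0x80001000u, 1);
    assert(fp == 0x80001001u);
    assert(jalr_target(fp, 0) == 0x80001000u);
    assert(xepc_after_write(fp) == 0x80001000u);
}
```

Calling `lsb_mode_is_lost()` passes without tripping an assertion, which is the point: existing hardware behaviour already discards the bit that an LSB-encoded mode would need.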

Were it made possible to use the LSB of function pointers and return
addresses in that way, it would be most useful for mode switches which
do not affect the ABI and can be mixed freely at the level of possibly
indirect calls. We have two very different ABIs and only one useful
instruction set for each ABI, so there is little to be gained by
decoupling the ABI from the implementation instruction set of a
function.

Assuming, then, that the entry instruction set of a function is
determined by the ABI used to call it, it is natural to decouple the
choice of instruction set from the function address. We define in this
proposal that code which is jumped to via a capability always uses
Capability mode; to jump to code in Legacy mode, use a trap return or a
trampoline within the legacy pcc bounds.

By requiring trap handlers to use Capability mode and not storing mode
metadata for CSRs other than xepc, this proposal achieves the smallest
architectural state of the three options, only 4 bits for D+M+S+U.

Fixes riscv#31.
Fixes riscv#34.
Fixes riscv#63.

Signed-off-by: Stefan O'Rear <sorear@fastmail.com>
@andresag01
Collaborator

andresag01 commented Feb 26, 2024

My understanding of this proposed change is:

  • The CHERI mode is associated with the PCC, but it is no longer encoded in the capability format
  • New bits in RV's xstatus and dcsr are used to track the CHERI mode when taking or returning from an exception

Then, if I understand correctly, the mode can only be changed in the following ways:

  • Executing a mode change instruction (MODESW)
  • Returning to a lower privileged mode

@sorear: Is this correct?

The idea of keeping the CHERI mode in the capability's encoding is nice because then all the information you need to jump to some code (i.e. permissions, address, bounds, mode, etc.) is in one place and it's inherently synchronized by design.

However, in the scheme proposed in this change, the mode is decoupled from the capability, so there are two things to track and keep in sync. For example, if you wanted to context switch a process, you'd need to retrieve the pcc, then the mode, and then store them both separately. Also, if you wanted truly hybrid code (e.g. because your program is Capability but there is this one library that's Legacy), you'd need to do something like MODESW followed by a jump to call the library and then MODESW back when you return, which is inconvenient.

@jrtc27: I think you had comments about this when the idea last came up?

@sorear
Contributor Author

sorear commented Feb 27, 2024

My understanding of this proposed change is:

Correct.

Then, if I understand correctly, the mode can only be changed in the following ways:

Taking a trap at any privilege also changes the mode, as does JALR.MODE.
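
A minimal C model of the transitions discussed so far may help, under stated assumptions. It treats MODESW as a toggle, collapses all privilege levels into one, and represents the xstatus bit as a single `saved` field; none of these names come from the patch. JALR.MODE is left out because its exact semantics are not described in this thread.

```c
#include <assert.h>

enum cheri_mode { MODE_LEGACY = 0, MODE_CAPABILITY = 1 };

struct hart_model {
    enum cheri_mode current; /* mode the hart is executing in */
    enum cheri_mode saved;   /* previous mode, kept in an xstatus bit across a trap */
};

/* MODESW, modeled here as a toggle (an assumption of this sketch). */
static void modesw(struct hart_model *h) {
    h->current = (h->current == MODE_CAPABILITY) ? MODE_LEGACY : MODE_CAPABILITY;
}

/* Trap entry: remember the interrupted mode; handlers run in Capability mode. */
static void trap_enter(struct hart_model *h) {
    h->saved = h->current;
    h->current = MODE_CAPABILITY;
}

/* Trap return: resume in whatever mode was saved next to xepc. */
static void trap_return(struct hart_model *h) {
    h->current = h->saved;
}

/* Self-check: Legacy code is interrupted and resumed in Legacy mode. */
static void legacy_code_survives_a_trap(void) {
    struct hart_model h = { MODE_LEGACY, MODE_CAPABILITY };
    trap_enter(&h);
    assert(h.current == MODE_CAPABILITY && h.saved == MODE_LEGACY);
    trap_return(&h);
    assert(h.current == MODE_LEGACY);
}
```

In this model a trap return is also the way to enter Legacy mode from a handler: set `saved` to `MODE_LEGACY` before returning.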

The idea of keeping the CHERI mode in the capability's encoding is nice because then all the information you need to jump to some code (i.e. permissions, address, bounds, mode, etc.) is in one place and it's inherently synchronized by design.

I agree completely about the benefits of having as much information as possible relevant to the secure interpretation of a capability in the capability itself.

We're already trading off capability semantic completeness against capability metadata size by not including an ASID in the metadata. I suspect that if you decided to spend a bit to catch errors, a bit of ASID would help more than a capability mode bit, which will virtually always be 1: legacy code does not use capabilities for its function pointers and return addresses, so the only places legacy mode code capabilities appear are in capability-mode-to-legacy-mode switch code and trap code.

Wide availability of a feature is its own kind of niceness. That's the direction I'm hoping to move this in.

For example, if you wanted to context switch a process, you'd need to retrieve the pcc, then the mode and then store them both separately.

The saved mode is in sstatus, which has been part of the context switch on Linux and seL4 since the beginning and appears to also be in the CheriBSD switch code. A context switch without sstatus will break quite a few extensions.
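
As a rough illustration of that point, the C sketch below saves and restores sstatus as a whole, and the saved mode comes along without a dedicated field. The bit position `SSTATUS_SAVED_CHERI_MODE`, the struct names, and the use of a plain integer for sepc (rather than a capability) are all assumptions of this sketch, not anything taken from Linux, seL4, or CheriBSD.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position; this thread does not fix one. */
#define SSTATUS_SAVED_CHERI_MODE (UINT64_C(1) << 40)

struct csr_file   { uint64_t sstatus; uint64_t sepc; }; /* live supervisor CSRs (simplified) */
struct thread_ctx { uint64_t sstatus; uint64_t sepc; }; /* per-thread saved copy */

static void ctx_save(struct thread_ctx *ctx, const struct csr_file *csr) {
    ctx->sstatus = csr->sstatus; /* the saved mode bit travels with the rest of sstatus */
    ctx->sepc = csr->sepc;
}

static void ctx_restore(struct csr_file *csr, const struct thread_ctx *ctx) {
    csr->sstatus = ctx->sstatus;
    csr->sepc = ctx->sepc;
}

static int saved_mode(const struct csr_file *csr) {
    return (csr->sstatus & SSTATUS_SAVED_CHERI_MODE) != 0;
}

/* Self-check: the mode round-trips through a switch to another thread and back. */
static void mode_round_trips_with_sstatus(void) {
    struct csr_file csr = { SSTATUS_SAVED_CHERI_MODE, 0x10000 };
    struct thread_ctx a;
    ctx_save(&a, &csr);
    csr.sstatus = 0;      /* another thread runs, interrupted in Legacy mode */
    csr.sepc = 0x20000;
    assert(saved_mode(&csr) == 0);
    ctx_restore(&csr, &a);
    assert(saved_mode(&csr) == 1);
    assert(csr.sepc == 0x10000);
}
```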

Also, if you wanted truly hybrid code (e.g. because your program is Capability but there is this one library that's Legacy), you'd need to do something like MODESW followed by a jump to call the library and then MODESW back when you return, which is inconvenient.

You already need nontrivial codegen to spill all call-saved capability registers, reestablish csp as a capability after the call, and switch to a stack covered by ddc if that wasn't a given. Two MODESW fit right in.
