c18n: [DRAFT] Rework implementation to be interrupt-safe #2079
base: dev
874a94c to 0ecef41
9680cbd to 773358e
e195871 to c0da4f1
sys/cheri/cheri.h
Outdated
struct cheri_c18n_info {
	uint8_t version;
	size_t stats_size;
	struct rtld_c18n_stats * __capability stats;
Should this be __kerncap or does that complicate coredump support? I don't think only purecap rtld touches it in userspace?
Non-purecap RTLD wouldn't even have c18n compiled in, so using __kerncap makes no practical difference, if I'm understanding the problem correctly.
Sounds like it should be __kerncap as the annotation will go away once we're purecap-only.
struct proc *p;
struct cheri_c18n_info info;
int error;
void *buffer;
If you initialize this to NULL you don't need two labels for the exit path.
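The suggestion can be illustrated with a minimal userspace sketch. It is not the handler from this PR: `copy_stats` and its `fail_early` parameter are hypothetical, and libc `malloc`/`free` stand in for the kernel allocator. Because `buffer` starts out as NULL and `free(NULL)` is a no-op, every error path can jump to the same label.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the handler under review: stage `len` bytes
 * from `src` in a temporary buffer, then copy them to `dst`.
 * `fail_early` simulates an error detected before the allocation.
 */
int
copy_stats(const void *src, void *dst, size_t len, int fail_early)
{
	void *buffer = NULL;	/* NULL until allocated, so cleanup is always safe */
	int error = 0;

	if (fail_early) {
		error = EINVAL;
		goto out;	/* nothing allocated yet; free(NULL) does nothing */
	}

	buffer = malloc(len);
	if (buffer == NULL) {
		error = ENOMEM;
		goto out;
	}

	memcpy(buffer, src, len);
	memcpy(dst, buffer, len);

out:
	/* Single exit path: releases the buffer if it was ever allocated. */
	free(buffer);
	return (error);
}
```

In the kernel the same shape should apply, since free(9) is documented to do nothing when passed NULL, but that is worth confirming against the tree this PR targets.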
}

buffer = malloc(info.stats_size, M_TEMP, M_WAITOK);
n = proc_readmem(curthread, p, (__cheri_addr vm_offset_t)info.stats,
We can't go blindly trusting the address here. At a minimum we need to check the capability or the process could leak secrets with a bad value.
How is such a check performed? Could you point me to an example?
You can use __CAP_CHECK to verify that the data is in range. That macro should likely be altered to require that the capability be unsealed as well as tagged. You should also check that it has load permission.
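To make the requested checks concrete, here is a plain-C model of them. It is an assumption-laden sketch, not the `__CAP_CHECK` macro and not CHERI code: `struct model_cap`, `MODEL_PERM_LOAD`, and `model_cap_readable` are all hypothetical names, and a real capability is a hardware type whose fields are read with compiler builtins rather than struct members. The model only shows the order of the tests: tagged, unsealed, load permission, then bounds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a capability; not a CHERI type. */
struct model_cap {
	bool tag;		/* validity tag */
	bool sealed;		/* sealed capabilities must not be dereferenced */
	uint32_t perms;		/* permission bits */
	uint64_t base;		/* lower bound */
	uint64_t length;	/* size of the bounded region */
	uint64_t address;	/* current address (cursor) */
};

#define	MODEL_PERM_LOAD	0x1u	/* hypothetical permission bit */

/* Return true if `len` bytes may be loaded starting at cap->address. */
bool
model_cap_readable(const struct model_cap *cap, uint64_t len)
{
	uint64_t offset;

	if (!cap->tag)
		return (false);		/* untagged: not a capability at all */
	if (cap->sealed)
		return (false);		/* sealed: cannot be used for loads */
	if ((cap->perms & MODEL_PERM_LOAD) == 0)
		return (false);		/* no load permission */
	if (cap->address < cap->base)
		return (false);		/* cursor below the lower bound */
	offset = cap->address - cap->base;
	if (offset > cap->length)
		return (false);		/* cursor above the upper bound */
	/* Written as a subtraction so the comparison cannot overflow. */
	return (len <= cap->length - offset);
}
```

With this model, a tagged, unsealed capability covering 0x100 bytes permits a 0x100-byte read from its base but rejects a 0x101-byte read, and clearing the tag rejects every read.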
__CAP_CHECK does require the capability to be tagged, which in this case causes it to always fail because info.stats is always untagged.
I don't understand why we need to do this check though. Wouldn't the userspace just leak its own memory if it sets a bad value?
Why is info.stats untagged? That seems completely wrong.
Causing program secrets to be trivially available by sysctl seems like a bad idea.
It seems that proc_readmem uses the UIO_SYSSPACE flag which does a bcopynocap_c underneath, stripping all tags.
Sounds like we need an _c variant.
I wonder whether we want to extend uio_rw with UIO_READ_CAP and UIO_WRITE_CAP variants, so that we can honor it in both uiomove_flags and uiomove_fromphys (and all possible variations) without having to carry around an extra flag. I think when we set UIO_READ/WRITE we already know whether we expect capabilities to be there, and the current scheme should work fine in most cases as we don't preserve capabilities by default.
I do think you want it orthogonal yes. Extending uio_rw is probably the cleanest way. My only worry is if there is any code doing if (uio->uio_rw == UIO_READ) else /* WRITE */ instead of using switch statements. A quick grep does show many, but also lots of KASSERTs that would catch this I think?
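The worry about two-way tests can be shown with a small sketch. Everything here is hypothetical: `enum model_uio_rw` is a stand-in for an extended `uio_rw`, the `_CAP` names are taken from the proposal above, and the enumerator values are arbitrary. The point is only that a test written as "READ, else WRITE" silently misclassifies a capability-preserving read, whereas a `switch` names every case.

```c
#include <stdbool.h>

/* Hypothetical extended direction enum; a stand-in, not the kernel's uio_rw. */
enum model_uio_rw {
	MODEL_UIO_READ,
	MODEL_UIO_WRITE,
	MODEL_UIO_READ_CAP,	/* read that preserves capability tags */
	MODEL_UIO_WRITE_CAP	/* write that preserves capability tags */
};

/* Fragile pattern: anything that is not plain READ is assumed to be a write. */
bool
is_read_fragile(enum model_uio_rw rw)
{
	if (rw == MODEL_UIO_READ)
		return (true);
	else /* WRITE */
		return (false);
}

/* Robust pattern: every direction is listed explicitly. */
bool
is_read(enum model_uio_rw rw)
{
	switch (rw) {
	case MODEL_UIO_READ:
	case MODEL_UIO_READ_CAP:
		return (true);
	case MODEL_UIO_WRITE:
	case MODEL_UIO_WRITE_CAP:
		return (false);
	}
	return (false);	/* unreachable for valid values */
}

/* Whether the transfer should keep tags, derived from the same enum. */
bool
preserves_tags(enum model_uio_rw rw)
{
	return (rw == MODEL_UIO_READ_CAP || rw == MODEL_UIO_WRITE_CAP);
}
```

Here `is_read_fragile(MODEL_UIO_READ_CAP)` returns false while `is_read(MODEL_UIO_READ_CAP)` returns true, which is the class of bug an audit of existing `uio_rw` comparisons would need to rule out.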
66fd8d4 to c73231e
3dd5952 to b4b5479
Exposes LD_COMPARTMENT_STATS that exports a set of compartmentalisation-related statistics to a user-specified file.
The trampoline and other parts of RTLD are refactored to be interrupt-safe. The trusted frame is redesigned to allow trampolines to perform tail-calls that do not push a trusted frame. The new design also no longer relies on a region of metadata at the bottom of each compartment's stack.
/*
 * Error handling here is wrong. If ENOEXEC, really want to print
 * output indicating no information, which this function signature
 * doesn't currently support. This is because the process probably
 * simply doesn't have c18n in use
 */
name[0] = CTL_KERN;
name[1] = KERN_PROC;
name[2] = KERN_PROC_C18N;
name[3] = kp->ki_pid;
error = sysctl(name, nitems(name), *pp, lenp, NULL, 0);
if (error != 0 && errno != ESRCH && errno != EPERM &&
    errno != ENOEXEC) {
	warn("sysctl(kern.proc.c18n)");
	goto out_free;
}
if (error != 0)
	goto out_free;
return (0);
@rwatson Do we need to fix the error handling here?
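One possible shape for a fix, offered as a sketch rather than as what the PR does: separate the outcomes that the quoted code folds into a single `goto out_free`, so the caller can tell "no c18n information" apart from a real failure. The `enum c18n_query_result` type and `classify_c18n_sysctl` function are hypothetical names; only the errno values come from the code above.

```c
#include <errno.h>

/* Hypothetical result type; none of these names exist in the PR. */
enum c18n_query_result {
	C18N_QUERY_OK,		/* sysctl succeeded, data is valid */
	C18N_QUERY_NO_INFO,	/* process probably does not use c18n */
	C18N_QUERY_SKIP,	/* process gone or not permitted: stay quiet */
	C18N_QUERY_FAILED	/* unexpected error: caller should warn */
};

/*
 * Classify the outcome of the sysctl call from its return value and
 * the errno captured immediately afterwards.
 */
enum c18n_query_result
classify_c18n_sysctl(int error, int saved_errno)
{
	if (error == 0)
		return (C18N_QUERY_OK);
	switch (saved_errno) {
	case ENOEXEC:
		return (C18N_QUERY_NO_INFO);
	case ESRCH:
	case EPERM:
		return (C18N_QUERY_SKIP);
	default:
		return (C18N_QUERY_FAILED);
	}
}
```

A caller could then print a "no information" row for `C18N_QUERY_NO_INFO` and call `warn()` only for `C18N_QUERY_FAILED`, which is the distinction the code comment says the current function signature cannot express.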
Edit: The c18n statistics part of this PR is a bit stalled, so I pulled out the interrupt-safe changes to #2090 which will hopefully get merged soon.
This PR builds upon #2012 and #2032 and the real content is in the very last commit entitled c18n: Rework implementation to be interrupt-safe. This is not meant to be merged but is a stable implementation needing feedback. I do hope it can be merged in the next release if time permits.
This commit completely refactors the trampoline and how stack switching works. The purecap and benchmark ABI implementations now both use a dedicated register to store the trusted stack (ddc and rddc respectively). This makes the trampolines look identical (modulo register names) on both ABIs. No metadata recording the current top of the stack is stored at the bottom of each compartment's stack. Instead, the stack lookup table now stores that information.
The signal handling mechanism has been rewritten to handle (rare) cases where c18n code, in particular trampolines, is interrupted. All c18n code paths that could be interrupted have been audited and it is believed that they can all be handled correctly, although testing for that is hard.