Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

c18n: [DRAFT] Rework implementation to be interrupt-safe #2079

Open
wants to merge 7 commits into
base: dev
Choose a base branch
from

Conversation

dpgao
Copy link
Contributor

@dpgao dpgao commented Apr 6, 2024

Edit: The c18n statistics part of this PR is a bit stalled, so I pulled out the interrupt-safe changes to #2090 which will hopefully get merged soon.

This PR builds upon #2012 and #2032 and the real content is in the very last commit entitled c18n: Rework implementation to be interrupt-safe. This is not meat to be merged but is a stable implementation needing feedback. I do hope it can be merged in the next release if time permits.

This commit completely refactors the trampoline and how stack switching works. The purecap and benchmark ABI implementations now both use a dedicated register to store the trusted stack (ddc and rddc respectively). This makes the trampolines look identical (modulo register names) on both ABIs. No metadata recording the current top of the stack is stored at the bottom of each compartment's stack. Instead, the stack lookup table now stores that information.

The signal handling mechanism has been rewritten to handle (rare) cases where c18n code, in particular trampolines, is interrupted. All c18n code paths that could be interrupted have been audited and it is believed that they can all be handled correctly, although testing for that is hard.

@dpgao dpgao force-pushed the c18n-nobot branch 6 times, most recently from 874a94c to 0ecef41 Compare April 10, 2024 17:01
@dpgao dpgao changed the title c18n: [WIP] Do not store stack metadata at its bottom c18n: Rework implementation to be interrupt-safe Apr 10, 2024
@dpgao dpgao force-pushed the c18n-nobot branch 2 times, most recently from 9680cbd to 773358e Compare April 11, 2024 17:14
@dpgao dpgao force-pushed the c18n-nobot branch 2 times, most recently from e195871 to c0da4f1 Compare April 19, 2024 15:45
libexec/rtld-elf/rtld_c18n.c Show resolved Hide resolved
struct cheri_c18n_info {
uint8_t version;
size_t stats_size;
struct rtld_c18n_stats * __capability stats;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be __kerncap or does that complicate coredump support? I don't think only purecap rtld touches it in userspace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar enough with the mechanisms involved here. Perhaps @rwatson and @bsdjhb can comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-purecap RTLD wouldn't even have c18n compiled in, so using __kerncap makes no practical difference, if I'm understanding the problem correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like it should be __kerncap as the annotation will go away once we're purecap-only.

sys/kern/kern_proc.c Outdated Show resolved Hide resolved
sys/kern/kern_proc.c Outdated Show resolved Hide resolved
struct proc *p;
struct cheri_c18n_info info;
int error;
void *buffer;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you initialize this to NULL you don't need two labels for the exit path.

sys/sys/imgact.h Outdated Show resolved Hide resolved
}

buffer = malloc(info.stats_size, M_TEMP, M_WAITOK);
n = proc_readmem(curthread, p, (__cheri_addr vm_offset_t)info.stats,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't go blindly trusting the address here. At a minimum we need to check the capability or the process could leak secrets with a bad value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is such a check performed? Could you point me to an example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use __CAP_CHECK to verify that the data is in range. That macro should likely be altered to require that the capability be unsealed as well as tagged. You should also check that it has load permission.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__CAP_CHECK does require the capability to be tagged, which in this causes it to always fail because info.stats is always untagged.

I don't understand why we need to do this check though. Wouldn't the userspace just leak it's own memory if it sets a bad value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is info.stats untagged? That seems completely wrong.

Causing program secrets to be trivial available by sysctl seems like a bad idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that proc_readmem uses the UIO_SYSSPACE flag which does a bcopynocap_c underneath, stripping all tags.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we need an _c variant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we want to extend uio_rw with UIO_READ_CAP UIO_WRITE_CAP variants, so that we can honor it in both uiomove_flags and uiomove_fromphys (and all possible variations) without having to carry around an extra flag. I think when we set the UIO_READ/WRITE we already know whether we expect capabilities to be there and the current scheme should work fine in most cases as we don't preserve capabilities by default.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think you want it orthogonal yes. Extending uio_rw is probably the cleanest way. My only worry is if there is any code doing if (uio->uio_rw == UIO_READ) else /* WRITE */ instead of using switch statements. A quick grep does show many, but also lots of KASSERT's that would catch this I think?

sys/sys/elf_common.h Outdated Show resolved Hide resolved
sys/sys/sysctl.h Outdated Show resolved Hide resolved
lib/libprocstat/libprocstat.c Outdated Show resolved Hide resolved
@dpgao dpgao force-pushed the c18n-nobot branch 2 times, most recently from 66fd8d4 to c73231e Compare April 22, 2024 17:06
@dpgao dpgao force-pushed the c18n-nobot branch 3 times, most recently from 3dd5952 to b4b5479 Compare April 23, 2024 22:32
dpgao and others added 4 commits April 24, 2024 12:10
Exposes LD_COMPARTMENT_STATS that exports a set of
compartmentalisation-related statistics to a user-specified file.
The trampoline and other parts of RTLD are refactored to be
interrupt-safe. The trusted frame is redesigned to allow trampolines to
perform tail-calls that do not push a trusted frame.

The new design also no longer relies on a region of metadata at the
bottom of each compartment's stack.
Comment on lines +387 to +405
/*
* Error handling here is wrong. If ENOEXEC, really want to print
* output indicating no information, which this function signature
* doesn't currently support. This is because the process probably
* simply doesn't have c18n in use
*/
name[0] = CTL_KERN;
name[1] = KERN_PROC;
name[2] = KERN_PROC_C18N;
name[3] = kp->ki_pid;
error = sysctl(name, nitems(name), *pp, lenp, NULL, 0);
if (error != 0 && errno != ESRCH && errno != EPERM &&
errno != ENOEXEC) {
warn("sysctl(kern.proc.c18n)");
goto out_free;
}
if (error != 0)
goto out_free;
return (0);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwatson Do we need to fix the error handling here?

}

buffer = malloc(info.stats_size, M_TEMP, M_WAITOK);
n = proc_readmem(curthread, p, (__cheri_addr vm_offset_t)info.stats,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__CAP_CHECK does require the capability to be tagged, which in this causes it to always fail because info.stats is always untagged.

I don't understand why we need to do this check though. Wouldn't the userspace just leak it's own memory if it sets a bad value?

@dpgao dpgao changed the title c18n: Rework implementation to be interrupt-safe c18n: [DRAFT] Rework implementation to be interrupt-safe Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants