Add support for Rust syscalls/descriptors, and implement pipe() in Rust #1068

Merged
stevenengler merged 13 commits into shadow:dev from rust-syscalls on Jan 21, 2021

Contributor

@stevenengler stevenengler commented Jan 14, 2021

Issue: #1033

The goals of this PR are:

  1. Support writing syscalls and new descriptor types entirely in Rust.
  2. Don't break existing C syscalls/descriptors so that we can use the old versions for performance comparisons.
  3. Allow us to incrementally move existing syscalls/descriptors to Rust since we can't convert everything at once.
  4. New Rust types must be compatible with Shadow's descriptor table and event notification system.
  5. Be careful about performance when using internal mutability and reference counting since Rust requires these wrappers to be thread-safe.

The main changes are the following:

  1. Rename the C Descriptor type to LegacyDescriptor.
  2. Add a Rust Descriptor type.
  3. Modify the descriptor table to store Rust CompatDescriptor objects, which store either a Descriptor or LegacyDescriptor.
    pub enum CompatDescriptor {
        New(Descriptor),
        Legacy(SyncSendPointer<c::LegacyDescriptor>),
    }
  4. Rather than storing the file descriptor and file description (Linux "struct file") as the same object, the Rust version splits them up into two objects: the Descriptor and PosixFile objects. This allows the descriptor to be duplicated, and simplifies the integration of Rust descriptors and C descriptors.
    pub struct Descriptor {
        file: Arc<AtomicRefCell<PosixFile>>,
        flags: i32,
    }
  5. The Epoll descriptors were modified to work with CompatDescriptor objects rather than LegacyDescriptor objects so that they support Rust Descriptor objects as well.
  6. The Epoll descriptor's "watching" and "ready" tables now use the (fd, ptr) tuple as the key rather than just the fd. This is similar to how Linux implements epoll, and will be useful once we add support for duplicating file descriptors.
  7. Add the Rust PipeFile type to replace the C Channel type. This requires implementing the pipe(), read(), write(), and close() syscall handlers in Rust. If these Rust syscall handlers are operating on a LegacyDescriptor, they switch to the C syscall handler instead.
  8. A flag in src/main/host/syscall_handler.c allows you to enable/disable the Rust syscall handlers at build time (enabled by default). This might be useful for benchmarking.
  9. Add some simple tests for pipe() and for writing to and reading from these pipes. (These tests don't pass when using the C syscall handlers.)
  10. Some maintenance on the Rust/C bindings.
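
As a rough illustration of item 6, the sketch below keys a watch table on both the fd and the identity of the file object. It is not Shadow's Epoll code: `File`, `WatchKey`, and the `watching` map are hypothetical names, and the pointer is stored as an address used only for comparison.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an open file object (not a Shadow type).
struct File {
    name: &'static str,
}

/// Hypothetical table key: the fd number plus the address of the file object.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct WatchKey {
    fd: i32,
    file_addr: usize,
}

impl WatchKey {
    fn new(fd: i32, file: &Arc<File>) -> Self {
        // The address is used only as an identity; it is never dereferenced.
        Self { fd, file_addr: Arc::as_ptr(file) as usize }
    }
}

fn main() {
    let pipe = Arc::new(File { name: "pipe" });
    let socket = Arc::new(File { name: "socket" });

    let mut watching: HashMap<WatchKey, &'static str> = HashMap::new();

    // fd 5 refers to the pipe.
    watching.insert(WatchKey::new(5, &pipe), pipe.name);
    // fd 6 is a duplicate that refers to the same pipe: a separate entry.
    watching.insert(WatchKey::new(6, &pipe), pipe.name);
    // The same fd number paired with a different file is also a separate entry.
    watching.insert(WatchKey::new(5, &socket), socket.name);

    assert_eq!(watching.len(), 3);
    assert_eq!(watching[&WatchKey::new(5, &socket)], "socket");
}
```

With a key of only the fd, the first and third inserts would collide, which is the situation the tuple key avoids.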

This PR is best reviewed by individual commits.

Changed the ordering of the C/Rust binding generation to make it
more reliable.

Renamed the `cbindings` module to `cshadow` since the contents of
`cbindings` are actually Rust bindings for C code.

Added some additional documentation.

This helps to prevent circular dependencies between C headers.

@stevenengler added the "Component: Main" and "Type: Enhancement" labels on Jan 14, 2021

codecov bot commented Jan 14, 2021

Codecov Report

Merging #1068 (ae65bd4) into dev (be97c7b) will decrease coverage by 0.14%.
The diff coverage is 78.75%.

@@            Coverage Diff             @@
##              dev    #1068      +/-   ##
==========================================
- Coverage   55.15%   55.01%   -0.15%     
==========================================
  Files         129      129              
  Lines       19358    19471     +113     
  Branches     4611     4645      +34     
==========================================
+ Hits        10677    10712      +35     
- Misses       5895     5974      +79     
+ Partials     2786     2785       -1     
Flag Coverage Δ
tests 55.01% <78.75%> (-0.15%) ⬇️

Impacted Files Coverage Δ
src/main/host/syscall/uio.c 0.00% <0.00%> (ø)
src/main/host/syscall_condition.c 70.40% <11.11%> (-6.26%) ⬇️
src/main/host/syscall/unistd.c 43.35% <45.45%> (-14.46%) ⬇️
src/main/host/descriptor/channel.c 32.14% <50.00%> (-33.34%) ⬇️
src/main/host/syscall/epoll.c 61.70% <64.70%> (-0.94%) ⬇️
src/main/host/network_interface.c 67.39% <66.66%> (ø)
src/main/host/syscall/fileat.c 9.65% <66.66%> (ø)
src/main/host/syscall/timerfd.c 36.11% <75.00%> (ø)
src/main/host/process.c 74.43% <79.41%> (-0.34%) ⬇️
src/main/host/syscall_handler.c 53.33% <80.00%> (ø)
... and 25 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@sporksmith sporksmith left a comment

Looks great! Just some minor nits

@@ -121,17 +121,17 @@ static void _descriptortable_trimIndicesTail(DescriptorTable* table) {
}
}

-bool descriptortable_remove(DescriptorTable* table, LegacyDescriptor* descriptor) {
+bool descriptortable_remove(DescriptorTable* table, int handle) {
Contributor

optional - Maybe worth having a stronger type here, to avoid mixing with other such handles? e.g. struct DescriptorHandle { int val; }.
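
The suggestion above is a C wrapper struct. A rough Rust analogue is the newtype pattern, sketched below. `DescriptorHandle` takes its name from the comment; the `DescriptorTable` here is a simplified, hypothetical stand-in and not Shadow's table.

```rust
/// Newtype around the raw fd index, so it cannot be mixed up with other integers.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DescriptorHandle(i32);

impl DescriptorHandle {
    /// Rejects negative values, which are never valid descriptor indices.
    fn new(raw: i32) -> Option<Self> {
        if raw >= 0 {
            Some(Self(raw))
        } else {
            None
        }
    }

    fn raw(self) -> i32 {
        self.0
    }
}

/// Simplified, hypothetical table: one optional entry per index.
struct DescriptorTable {
    entries: Vec<Option<String>>,
}

impl DescriptorTable {
    /// Returns true if an entry was present at the handle and was removed.
    fn remove(&mut self, handle: DescriptorHandle) -> bool {
        match self.entries.get_mut(handle.raw() as usize) {
            Some(slot) => slot.take().is_some(),
            None => false,
        }
    }
}

fn main() {
    let mut table = DescriptorTable {
        entries: vec![None, Some("pipe".to_string())],
    };
    let handle = DescriptorHandle::new(1).expect("index is non-negative");

    assert!(table.remove(handle));
    assert!(!table.remove(handle)); // already removed

    // table.remove(1); // would not compile: expected `DescriptorHandle`, found integer
}
```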

Contributor Author

Ah, the code in "descriptor_table.c" uses the name "index" and the code in "process.c" uses the name "handle", but they both mean the same thing (the "fd"). I renamed this back to "index" to make it more clear.

@@ -165,6 +165,12 @@ static void _syscallhandler_post_syscall(SysCallHandler* sys, long number,
scr = syscallhandler_##s(sys, args); \
_syscallhandler_post_syscall(sys, args->number, #s, &scr); \
break
#define HANDLE_NEW(s) \
Contributor

Throughout, is there a more descriptive tag we can use than new? maybe rust? e.g. HANDLE_RUST?

Contributor Author

It is now HANDLE_RUST.


/// An invocation will fail to compile if the provided type is not Send.
#[cfg(test)]
fn verify_send<T: Send>() {}
Contributor

Nice idea. I experimented with this a little and found a way to do it with an extra trait definition instead. I slightly prefer this way since it's purely static, and makes it easier to put it near the type definition in an idiomatic way. I don't feel super strongly about it vs leaving it as-is though.

Example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e3a40bbd6bc607b67dbacb9f1c687e3b

trait IsSend : Send {}
trait IsSync : Sync {}

struct Foo {}

// Compiles
impl IsSend for Foo {}
impl IsSync for Foo {}

struct Bar {x: *mut i32}
// Doesn't compile
impl IsSend for Bar {}
impl IsSync for Bar {}

If desired, we could #[cfg(test)]-guard the impls as well, though I think no code will end up in the final binary anyway, e.g. https://godbolt.org/z/z5n9rr

Contributor Author

Cool! I definitely like this way better.

Member

@robgjansen robgjansen left a comment

Overall this is great - a big move in the right direction :) I'm glad we have good testing in place to give us confidence that things are still working given the number of changes here.

EpollWatchObject watchObject;

/* for varible scoping */
if (true) {
Member

Another option is to refactor into a small static function, which helps us avoid too much nesting (which makes things easier to read/follow) and also serves as a form of documentation. FWIW, I usually prefer small functions over bare blocks.

#define NATIVE(s) \
case SYS_##s: \
debug("native syscall %ld " #s, args->number); \
scr = (SysCallReturn){.state = SYSCALL_NATIVE}; \
break

/* Comment out this line to use the C syscall handlers. */
#define USE_RUST_SYSCALLS
Member

I'm not sure, do we want this as a command line option instead of #define?

Or maybe a better question is, what is the plan for eventually removing the c syscall handlers? Is it just until we feel the rust versions are "stable"?

Contributor Author

Yeah this may be better as the inverse (ex: USE_C_SYSCALLS) which defaults to unset, and then can be enabled as a gcc flag with -D USE_C_SYSCALLS.

Right now the Rust syscall handlers still depend on the C syscall handlers if the descriptor is a "legacy descriptor", so some of the C syscall handler code (for example syscallhandler_read) will have to stay until all of the descriptor types are implemented in Rust.
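
A minimal sketch of that dependency, assuming simplified stand-in types: the Rust handler handles the `New` variant itself and hands the `Legacy` variant to the C handler. `HandledBy` and `legacy_c_read` are hypothetical; in Shadow the legacy branch would call into the C code (for example `syscallhandler_read`) through the bindings, and the legacy variant holds a pointer rather than a plain struct.

```rust
/// Stand-in for the Rust descriptor; here it just buffers some bytes.
struct Descriptor {
    data: Vec<u8>,
}

/// Stand-in for a pointer to a C LegacyDescriptor.
struct LegacyDescriptor {
    id: u32,
}

enum CompatDescriptor {
    New(Descriptor),
    Legacy(LegacyDescriptor),
}

/// Hypothetical result type recording which implementation handled the call.
#[derive(Debug, PartialEq)]
enum HandledBy {
    Rust(usize),
    LegacyC(u32),
}

/// Hypothetical placeholder for the call into the C read() handler.
fn legacy_c_read(desc: &LegacyDescriptor) -> HandledBy {
    HandledBy::LegacyC(desc.id)
}

/// Rust read() handler: serves Rust descriptors, defers legacy ones to C.
fn read(desc: &mut CompatDescriptor, buf: &mut [u8]) -> HandledBy {
    match desc {
        CompatDescriptor::New(d) => {
            let n = buf.len().min(d.data.len());
            buf[..n].copy_from_slice(&d.data[..n]);
            d.data.drain(..n);
            HandledBy::Rust(n)
        }
        CompatDescriptor::Legacy(l) => legacy_c_read(l),
    }
}

fn main() {
    let mut buf = [0u8; 3];

    let mut new_desc = CompatDescriptor::New(Descriptor { data: b"hello".to_vec() });
    assert_eq!(read(&mut new_desc, &mut buf), HandledBy::Rust(3));
    assert_eq!(&buf, b"hel");

    let mut legacy = CompatDescriptor::Legacy(LegacyDescriptor { id: 7 });
    assert_eq!(read(&mut legacy, &mut buf), HandledBy::LegacyC(7));
}
```

As long as any descriptor type still lives behind the `Legacy` variant, the second match arm needs a C handler to call, which is why the C code cannot be removed yet.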

Contributor Author

It can now be set with a CLI flag. For example, ./setup build --debug --test --clean --use-c-syscalls.

Contributor Author

stevenengler commented Jan 20, 2021

Thanks for the reviews! I made some comments (and sorry if there were a lot of emails, I also triggered GitHub's abuse prevention system several times) and also fixed/improved a few other things I found.

@sporksmith Can you take a look at 85bbb75 and let me know if those changes look good to you?

Contributor

@sporksmith sporksmith left a comment

LGTM, other than that I'd still prefer the increment in compatdescriptor_getPosixFile to be a bit more explicit (see inline comment)

Added a new Rust `Descriptor` type and modified the descriptor table
to store `CompatDescriptor` objects, which can store either this new
Rust descriptor type or the legacy C descriptor type.

Added a new Rust `PosixFile` object which is referenced by `Descriptor`
objects. Also added the structure of the C interface that Shadow will
later use to integrate this new Rust type with legacy descriptors such
as `Epoll`. This interface is defined but not yet implemented.

Rather than indexing only by the fd, the epoll descriptor 'watching'
and 'ready' tables now index their entries by a tuple of the fd and a
pointer to the object. This is similar to how Linux implements epoll,
and will give us more expected behaviour. It is also on the path to
adding support for watching Rust descriptor objects.

The `Epoll` descriptor type can now watch for events on both
`LegacyDescriptor` objects and `PosixFileArc` objects.

Shadow syscalls can now block on Rust `PosixFile` objects.
@stevenengler force-pushed the rust-syscalls branch 2 times, most recently from 01d63c7 to 18bf02a on January 21, 2021 15:06

This commit implements a `FilePipe` type which supports the `pipe()`
syscall completely in Rust.

Use the `--use-c-syscalls` flag to build Shadow without the Rust
syscall handlers. For example:

./setup build --debug --test --clean --use-c-syscalls

All flags were being stored in the descriptor object, but only the
FD_CLOEXEC should be. The rest should be stored in the posix file
object itself. Also improved the comments.
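
The split described in this commit message can be sketched as follows. This is an illustration under assumptions, not the PR's code: `std::sync::Mutex` stands in for the `AtomicRefCell` used in the PR so the example needs no external crate, the `status_flags` field and `dup` method are hypothetical, and the flag constants are the Linux x86-64 values defined locally.

```rust
use std::sync::{Arc, Mutex};

// Linux x86-64 values, defined locally to keep the sketch dependency-free.
const FD_CLOEXEC: i32 = 1;
const O_NONBLOCK: i32 = 0o4000;

/// Stand-in for the file description; holds the flags shared by all duplicates.
struct PosixFile {
    status_flags: i32,
}

/// Stand-in for the descriptor; holds only per-descriptor flags (FD_CLOEXEC).
struct Descriptor {
    file: Arc<Mutex<PosixFile>>,
    flags: i32,
}

impl Descriptor {
    fn new(file: PosixFile) -> Self {
        Self { file: Arc::new(Mutex::new(file)), flags: 0 }
    }

    /// Hypothetical dup(): shares the file, but the copy starts without FD_CLOEXEC.
    fn dup(&self) -> Self {
        Self { file: Arc::clone(&self.file), flags: 0 }
    }
}

fn main() {
    let mut a = Descriptor::new(PosixFile { status_flags: 0 });
    a.flags |= FD_CLOEXEC;

    let b = a.dup();
    b.file.lock().unwrap().status_flags |= O_NONBLOCK;

    // The status flag set through `b` is visible through `a`: same file object.
    assert_eq!(a.file.lock().unwrap().status_flags & O_NONBLOCK, O_NONBLOCK);
    // FD_CLOEXEC belongs to the descriptor, so it is not shared.
    assert_eq!(a.flags & FD_CLOEXEC, FD_CLOEXEC);
    assert_eq!(b.flags & FD_CLOEXEC, 0);
}
```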
@stevenengler stevenengler merged commit eab0f69 into shadow:dev Jan 21, 2021
@stevenengler stevenengler deleted the rust-syscalls branch January 21, 2021 16:02