Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upWIP: bincode IPC #1124
WIP: bincode IPC #1124
Conversation
Note: this adds a dependency on `time` to `webrender_traits`. The purpose of this is to evaluate the overhead of different IPC implementations and serialization strategies. In particular, we need to replace the current implementation, as it is unsound; it just blindly transmutes, leaving room for a malicious process to create Undefined representations of e.g. enums. We take 4 time stamps: * before serialization to `Vec<u8>` happens * after serialization * before deserialization * after deserialization Because some of these time stamps happen in webrender_traits and cross-process, we need to take them manually and send the first two over IPC. Currently I'm just stuffing them in one of the descriptors, since that's easy and minimally invasive (but Firefox's headers will need to be updated, I think). We consider the time to perform the message send cross-process to be the time between serialization ending and deserialization starting. Overall IPC overhead is the whole process. We also record how large the two sent buffers are, as this may change significantly with a Proper Serialization Mechanism. In my preliminary testing basically all overhead is currently in the actual message send process (about 0.5ms on Mac in Servo). There's sometimes some overhead from deserialization, which is probably stuffing the auxlist in a map. About 200k is being sent for the display list, and 100k for the aux list.
|
I've only profiled Servo so far. Summary: Mean IPC time goes from 0.5ms to 0.9ms with this change. Most of these losses are in deserialization time; some of it is in serialization time. It looks like most of this is when processing DisplayItems, and not AuxiliaryLists (need to dig into this, but it makes sense since AuxiliaryLists are relatively simple). Note however that (de)serialization costs are suppressed by a surprising reduction in message send time, which appears to be the result of a reduction in the size of the message sent. Currently DisplayItems are ~twice as big as they could be in memory, and bincode manages to realize this size reduction. In principle this means transmute-based IPC could be reduced to ~0.3ms by just optimizing the layout of DisplayItems? However this may improve (de)serialization time too. AuxiliaryLists seem to already be perfectly compact. All in all this is pretty promising, considering the current implementation is completely naive. There's probably interesting wins to find in bincode::deserialize and DisplayItem layout. Data: transmute:
bincode:
(edit: note that the numbers don't perfectly sum because I was mostly eye-balling the serial/send/deserial means -- close enough for horse grenades) |
|
CC @TyOverby |
|
There are a few optimizations that I'm going to be adding to bincode in the nearish future that might help webrender out:
|
|
@gankro thanks for the experiment and analysis! Machine performance: Stability/portability: possible undefined behavior? |
|
@kvark it's not so much a stability/portability issue as a "blatant hole in the multi-process security model". If the content process is hijacked, it can feed arbitrary bytes into the buffer, leading to undefined enum descriminants. This can in turn lead to trying to execute non-existent jump table slots (to name one obvious result of UB). |
@kvark: what do you mean by this? |
|
@TyOverby @gankro ok, makes sense. Would it be reasonable to look into asserting for the enum discriminants (as in - |
That would be news to me! As far as I can tell, the only unsafe is this one which is just an optimization avoiding zeroing out a Vec<u8>. Looking at the servo structs that are being serialized, I'm fairly certain that you never hit this code path to begin with, but I'd be happy to introduce a crate feature that doesn't include the optimization. On top of this, bincode has been through several days of fuzzing, never hitting any vulnerabilities (it found a few panics that were immediately fixed). I'm using bincode in networked applications, so I'm pretty concerned about the security aspect of the project. |
|
@TyOverby oh, right! My mistake then, thanks for correcting me :) |
|
@kvark bincode would be strongly preferred because we don't need to maintain it, but if I can't get the overhead low enough, the next avenue of exploration would be a fork of bincode that "happens" to match the Rust ABI exactly so that LLVM hopefully optimizes everything into transmute + validation. |
|
|
|
Replacing with new PR. |
Gankra commentedApr 17, 2017
•
edited by larsbergstrom
(based on my other PR that adds profiling; posting this now to discuss results)
Note: this adds a
bincodedep towebrender-traitsThis replaces the unsound transmute-based IPC with sound bincode-based IPC (results in subsequent comments).
This change is