Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming there were no functionality changes
LGTM. Some asides: Regarding safety-- in the general sense-- we still are not doing anything about invalid inputs (either from an adversary or a compiler defect). All these routines will still gladly interpret, say, invalid descriptors. I mentioned in the past that we need some way to validate the various descriptors. Even something as simple as a storing a "magic number" with a descriptor will at least help guard against compiler problems (but not adversaries). For example, I recently fixed a defect where we did not dereference a pointer to a runtime descriptor of some sort. By chance, the tests still happen to pass-- just by bad luck. When I happened to be adding more functionality and tests, I was getting strange behavior and crashes in the runtime. This was difficult to track down, but would have been simple with a magic number check. But the more concerning problem is how to make sure adversaries can't send in bad data. |
Work towards #249
These patches are primarily about moving all the vector types and functions into the vector module, and converting all the free functions into methods on those types.
Many existing functions take a pair of
MoveType
andMoveUntypedVector
, pair them and perform a typed operation. This is always unsafe because it requires the two to agree. After these patches, this pairing is always performed in the unsafe constructors of e.g.TypedMoveBorrowedRustVec
, which allows some of the subsequent method calls to be declared safe.During this process I discovered that slight refactorings could drastically change the number of executed rbpf instructions, sometimes exhausting the cpu budget in the bitvector test specifically. Certainly due to whims of the llvm optimizer. So I tracked the perf changes between commits - indicated in some of the commit messages. I did not attempt to maintain perfect parity with existing instruction counts, just to keep them from ballooning too much.
I think the optimizer has particular difficulty seeing all the way through from the constructor to the destructor of the big TypedMoveBorrowedRustVec enum - when it does it can avoid switching multiple times on it. But that's just speculation. I haven't looked closely. I did put
#[inline(always)]
on thenew
function of this type as I saw huge decreases in instruction counts doing so - usuallyinline(always)
is not recommended as the llvm inliner tends to make better decisions than people do, but in this case it looked like too big a win to pass up.I also added docs to the vector module.
This patch also replaces all uses of
&mut AnyValue
with*mut AnyValue
as the former is not sound: safely writing to a&mut AnyValue
could result in the actual type having an invalid representation, so writing to anAnyValue
must be unsafe (though no code actually does this). I've updated the docs to indicate this.