This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

Runtime vector refactoring #261

Merged
merged 34 commits into from Aug 10, 2023

brson (Collaborator) commented Jul 29, 2023

Work towards #249

These patches are primarily about moving all the vector types and functions into the vector module, and converting all the free functions into methods on those types.

Many existing functions take a pair of MoveType and MoveUntypedVector, pair them and perform a typed operation. This is always unsafe because it requires the two to agree. After these patches, this pairing is always performed in the unsafe constructors of e.g. TypedMoveBorrowedRustVec, which allows some of the subsequent method calls to be declared safe.
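To make the pairing idea concrete, here is a minimal sketch. It does not use the runtime's real definitions: `ElemType`, `UntypedVec`, and this two-variant `TypedBorrowedVec` are hypothetical, simplified stand-ins for `MoveType`, `MoveUntypedVector`, and `TypedMoveBorrowedRustVec`. The point is only that the "tag and buffer must agree" obligation is discharged once, in the unsafe constructor, after which the methods can be safe.

```rust
use std::slice;

/// Hypothetical stand-in for `MoveType`: just an element-type tag.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ElemType {
    U8,
    U64,
}

/// Hypothetical stand-in for `MoveUntypedVector`: a type-erased buffer.
pub struct UntypedVec {
    ptr: *const u8,
    len: usize, // number of elements, not bytes
}

impl UntypedVec {
    pub fn from_u8s(v: &[u8]) -> Self {
        UntypedVec { ptr: v.as_ptr(), len: v.len() }
    }

    pub fn from_u64s(v: &[u64]) -> Self {
        UntypedVec { ptr: v.as_ptr() as *const u8, len: v.len() }
    }
}

/// Typed view produced by pairing a type tag with an untyped vector.
pub enum TypedBorrowedVec<'a> {
    U8(&'a [u8]),
    U64(&'a [u64]),
}

impl<'a> TypedBorrowedVec<'a> {
    /// # Safety
    ///
    /// `ty` must be the real element type of `v`, and `v` must point to
    /// `len` initialized, properly aligned elements that outlive `'a`.
    pub unsafe fn new(ty: ElemType, v: &'a UntypedVec) -> Self {
        match ty {
            ElemType::U8 => Self::U8(unsafe { slice::from_raw_parts(v.ptr, v.len) }),
            ElemType::U64 => {
                Self::U64(unsafe { slice::from_raw_parts(v.ptr as *const u64, v.len) })
            }
        }
    }

    /// Safe: the constructor already established that the view is valid.
    pub fn len(&self) -> usize {
        match self {
            Self::U8(s) => s.len(),
            Self::U64(s) => s.len(),
        }
    }

    /// Safe for the same reason; sums the elements as `u64`.
    pub fn sum(&self) -> u64 {
        match self {
            Self::U8(s) => s.iter().map(|&x| u64::from(x)).sum(),
            Self::U64(s) => s.iter().sum(),
        }
    }
}
```

A caller writes `unsafe { TypedBorrowedVec::new(ElemType::U64, &raw) }` once and then uses `len()` and `sum()` without further unsafe blocks.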

During this process I discovered that slight refactorings could drastically change the number of executed rbpf instructions, sometimes exhausting the CPU budget, in the bitvector test specifically. This is certainly due to the whims of the LLVM optimizer. So I tracked the perf changes between commits, as indicated in some of the commit messages. I did not attempt to maintain perfect parity with existing instruction counts, just to keep them from ballooning too much.

I think the optimizer has particular difficulty seeing all the way through from the constructor to the destructor of the big TypedMoveBorrowedRustVec enum; when it does, it can avoid switching on the enum multiple times. But that's just speculation. I haven't looked closely. I did put #[inline(always)] on the constructor (new) of this type, as I saw huge decreases in instruction counts from doing so. Usually inline(always) is not recommended, as the LLVM inliner tends to make better decisions than people do, but in this case it looked like too big a win to pass up.
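As an illustration of where the attribute goes, here is a small sketch with hypothetical names (`TypedLen`, `byte_len_for`, and the numeric tags are not from the runtime). The constructor switches on a tag to build an enum, and a method switches on the enum again. Forcing the constructor inline puts both matches in one function body, which gives the optimizer a chance to merge them; whether it actually does is not guaranteed and has to be measured, as with the instruction counts above.

```rust
/// Hypothetical typed view: the variant records the element type,
/// the payload records the element count.
pub enum TypedLen {
    U8(usize),
    U64(usize),
    U128(usize),
}

impl TypedLen {
    /// First switch: raw tag -> enum variant.
    /// Inlining is forced so callers see this match directly.
    #[inline(always)]
    pub fn new(tag: u8, len: usize) -> Option<Self> {
        match tag {
            0 => Some(Self::U8(len)),
            1 => Some(Self::U64(len)),
            2 => Some(Self::U128(len)),
            _ => None, // unknown tag
        }
    }

    /// Second switch: enum variant -> element size in bytes.
    pub fn byte_len(&self) -> usize {
        match self {
            Self::U8(n) => *n,
            Self::U64(n) => n * 8,
            Self::U128(n) => n * 16,
        }
    }
}

/// Constructs and consumes the enum in one place, so with `new` inlined
/// both switches are visible to the optimizer at once.
pub fn byte_len_for(tag: u8, len: usize) -> Option<usize> {
    TypedLen::new(tag, len).map(|v| v.byte_len())
}
```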

I also added docs to the vector module.

This patch also replaces all uses of &mut AnyValue with *mut AnyValue as the former is not sound: safely writing to a &mut AnyValue could result in the actual type having an invalid representation, so writing to an AnyValue must be unsafe (though no code actually does this). I've updated the docs to indicate this.
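A minimal sketch of the difference, assuming an opaque one-byte placeholder type (the `AnyValue` below is a hypothetical stand-in, and `write_u64` and `store_u64` are illustrative helpers, not runtime functions). With `&mut AnyValue`, safe code could assign through the reference or swap two such references, overwriting part of a value whose real type is something else. With `*mut AnyValue`, every write has to go through an unsafe block where the caller vouches for the real type.

```rust
/// Hypothetical opaque placeholder for a value whose real type is only
/// known at runtime. It is never meant to be read or written as itself.
#[allow(dead_code)]
#[repr(transparent)]
pub struct AnyValue(u8);

/// Writes a `u64` through a type-erased pointer.
///
/// # Safety
///
/// `dst` must point to a live, properly aligned, writable `u64`.
pub unsafe fn write_u64(dst: *mut AnyValue, value: u64) {
    // The cast restores the real type; only the caller can know it is right.
    unsafe { dst.cast::<u64>().write(value) }
}

/// Safe wrapper: here the real type is statically known, so erasing it
/// and writing it back as a `u64` is sound.
pub fn store_u64(slot: &mut u64, value: u64) {
    let erased: *mut AnyValue = (slot as *mut u64).cast();
    unsafe { write_u64(erased, value) }
}
```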

brson added 30 commits July 29, 2023 22:46

ksolana (Collaborator) left a comment

Assuming there were no functionality changes

nvjle commented Aug 1, 2023

LGTM.

Some asides:
Regarding bit_vector, mentioned above: I also ran into the problem of exceeding the computational budget when originally implementing generics and getting the bit_vector stdlib tests working. While LLVM can do a better job, it should be noted that the bit_vector module is absurdly, horribly inefficient-- and there is nothing LLVM can do about this. I cannot imagine a real program using this module as it exists today. Clearly it was written as a first effort for baseline functionality, and not as a serious, production facility. Even ignoring the inefficient way the Move code itself was written, all of the functions should be natives for such fine-grained bit diddling (which can also control computation costs). The other inefficiency is that it also relies on std::vector, which just is not very efficient to begin with (in the speed-of-light sense) because of the layering needed in move-native (i.e., everything you see in this patch: reinterpreting Move vectors as Rust, transfers to and fro, etc.).

Regarding safety-- in the general sense-- we still are not doing anything about invalid inputs (either from an adversary or a compiler defect). All these routines will still gladly interpret, say, invalid descriptors. I mentioned in the past that we need some way to validate the various descriptors. Even something as simple as storing a "magic number" with a descriptor would at least help guard against compiler problems (but not adversaries). For example, I recently fixed a defect where we did not dereference a pointer to a runtime descriptor of some sort. The tests still happened to pass-- just by bad luck. When I was adding more functionality and tests, I got strange behavior and crashes in the runtime. This was difficult to track down, but would have been simple with a magic number check. But the more concerning problem is how to make sure adversaries can't send in bad data.
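A sketch of what such a magic number check could look like. The `Descriptor` layout, the `DESCRIPTOR_MAGIC` value, and the error type are all assumptions made for illustration; none of them come from the runtime. As noted above, a check like this only catches accidental corruption such as a missed dereference in generated code. An adversary can simply write the expected constant.

```rust
/// Hypothetical marker stored at the start of every descriptor.
/// The value is arbitrary (ASCII "MOVE").
pub const DESCRIPTOR_MAGIC: u32 = 0x4D4F_5645;

/// Hypothetical runtime descriptor with a leading magic number.
#[repr(C)]
pub struct Descriptor {
    magic: u32,
    size: u64,
    align: u64,
}

#[derive(Debug, PartialEq)]
pub enum DescriptorError {
    BadMagic(u32),
    BadAlign(u64),
}

impl Descriptor {
    /// What a correct compiler would emit: the magic is always set.
    pub const fn new(size: u64, align: u64) -> Self {
        Descriptor { magic: DESCRIPTOR_MAGIC, size, align }
    }

    /// Cheap sanity check to run before interpreting any other field.
    pub fn validate(&self) -> Result<(), DescriptorError> {
        if self.magic != DESCRIPTOR_MAGIC {
            return Err(DescriptorError::BadMagic(self.magic));
        }
        if !self.align.is_power_of_two() {
            return Err(DescriptorError::BadAlign(self.align));
        }
        Ok(())
    }

    /// Reads `size` only after the descriptor has passed validation.
    pub fn checked_size(&self) -> Result<u64, DescriptorError> {
        self.validate()?;
        Ok(self.size)
    }
}
```

If a pointer-to-descriptor is mistakenly treated as the descriptor itself, the first four bytes are very unlikely to equal the magic value, so the defect surfaces as an immediate `BadMagic` error rather than as strange behavior later on.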

@brson brson merged commit ef376eb into anza-xyz:llvm-sys Aug 10, 2023
4 checks passed

4 participants