Replace Get function address #36
Conversation
TheDan64
self-requested a review
Apr 1, 2018
...is this an April fools joke?
TheDan64
reviewed
Apr 1, 2018
| /// A wrapper around a function pointer which ensures the symbol being pointed | ||
| /// to doesn't accidentally outlive its execution engine. | ||
| #[derive(Debug, Clone)] | ||
| pub struct Symbol<F> { |
TheDan64
Apr 1, 2018
•
Owner
Is Symbol what LLVM calls it? (I do recall a LLVMGetSymbol function and don't want to confuse these symbols with those) I feel like Function or even RuntimeFunction might be a more descriptive name...
If LLVM calls them symbols, then maybe get_function_symbol and FunctionSymbol are good names?
TheDan64
Apr 1, 2018
•
Owner
Also, is it possible to constrain F to std::ops::Fn or something? So you can't just put any type there? I also wonder if this (nightly only) call method could be used.. https://doc.rust-lang.org/nightly/std/ops/trait.Fn.html#required-methods
Michael-F-Bryan
Apr 2, 2018
Author
Contributor
I just called it Symbol because that's the name libloading uses. I'm not particularly attached to the name though.
I like the idea of constraining F to Fn. That and the assert_eq!() lower down should mean the only real type you can get out is a function pointer.
TheDan64
reviewed
Apr 1, 2018
| @@ -181,7 +226,13 @@ impl ExecutionEngine { | |||
| return Err(FunctionLookupError::FunctionNotFound); | |||
| } | |||
|
|
|||
| Ok(address) | |||
| assert_eq!(size_of::<F>(), size_of::<usize>(), | |||
TheDan64
Apr 1, 2018
•
Owner
Doesn't transmute already fail to compile if the sizes are different though? Jk, this makes sense again
Michael-F-Bryan
Apr 2, 2018
Author
Contributor
Yeahhh.... The transmute_copy() lower down makes me feel pretty dirty because it's essentially just a pointer cast. It lets us get around the "F may vary in size" transmute error, but it's kinda hacky.
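The size check plus `transmute_copy()` combination discussed here can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: `address_to_fn`, `double`, and `call_via_address` are hypothetical names, and an ordinary `extern "C"` function stands in for a JIT-compiled one.

```rust
use std::mem::{size_of, transmute_copy};

/// Reinterpret a raw address as a function pointer of type `F`.
/// (Hypothetical helper, not part of the library.)
///
/// # Safety
/// `address` must point at a live function whose signature and calling
/// convention match `F`.
unsafe fn address_to_fn<F>(address: usize) -> F {
    // `transmute::<usize, F>` is rejected for a generic `F` because the
    // compiler cannot prove the sizes match, so check at runtime instead.
    assert_eq!(size_of::<F>(), size_of::<usize>(), "F must be pointer-sized");
    // Copy the bits of `address` out as an `F`; effectively a pointer cast.
    unsafe { transmute_copy(&address) }
}

// Stand-in for a function that would normally come from the JIT.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

fn call_via_address() -> i32 {
    // Take the address the way a lookup-by-name API might return it.
    let address = double as *const () as usize;
    let f: extern "C" fn(i32) -> i32 = unsafe { address_to_fn(address) };
    f(21)
}
```

The runtime assertion only guards against types of the wrong size; it cannot tell whether the signature is right, which is why the conversion stays `unsafe`.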
TheDan64
Apr 2, 2018
•
Owner
Ok; this is fine. If we find that Symbol conflicts with the LLVM idea of a symbol we can rename it when we cross that bridge I suppose. Backward compatibility isn't a big concern at the moment.
Edit: Realized I replied to the wrong comment
TheDan64
reviewed
Apr 1, 2018
| assert!(xor(false, true)); | ||
| assert!(!xor(true, true)); | ||
| unsafe { | ||
| type BoolFunc = fn(bool, bool) -> bool; |
TheDan64
Apr 1, 2018
•
Owner
I guess it doesn't matter in this particular instance, but shouldn't the type here be an extern "C" fn?
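To make the point about calling conventions concrete: in Rust, `fn(bool, bool) -> bool` and `extern "C" fn(bool, bool) -> bool` are different types. The sketch below assumes the JIT-compiled function uses the C calling convention (LLVM's default); `xor` here is an ordinary Rust function standing in for it, and the type aliases are illustrative names.

```rust
use std::any::TypeId;

// Stand-in for the JIT-compiled function, declared with the C ABI.
extern "C" fn xor(a: bool, b: bool) -> bool {
    a ^ b
}

type RustAbiFn = fn(bool, bool) -> bool;
type CAbiFn = extern "C" fn(bool, bool) -> bool;

/// The two pointer types differ, so the ABI is part of the type.
fn abis_are_distinct_types() -> bool {
    TypeId::of::<RustAbiFn>() != TypeId::of::<CAbiFn>()
}

fn call_with_c_abi() -> bool {
    // Coercing `xor` to `RustAbiFn` would be a compile error (mismatched ABI).
    let f: CAbiFn = xor;
    f(false, true)
}
```

When the pointer comes from a transmuted address, the compiler cannot catch a mismatched ABI, so spelling out `extern "C"` is the caller's responsibility.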
This may not actually work, but I wonder if For example:
Symbol<F> {
    ee: ...,
    address: usize,
}
impl<F> Deref for Symbol<F> {
    type Target = F;
    fn deref(&self) -> &Self::Target {
        transmute::<&F>(&self.address) // The references are important to keep rust seeing it as living on the struct
    }
}
Edit: in retrospect, I don't think you can make the deref method unsafe since it's part of a trait, huh? I guess this is just what you proposed but inverted
TheDan64
reviewed
Apr 1, 2018
|
This looks reasonable. My remaining question is |
TheDan64
assigned
Michael-F-Bryan
Apr 1, 2018
I really think the
I believe the
I looked at that as well. It'd be pretty much ideal, although I'd like it if there was a way to make the |
|
I think I've found a nice solution. We weren't able to constrain Instead, I created a "sealed" If you try to retrieve a function pointer which isn't marked
It'd be nice to use the |
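The "sealed" trait idea mentioned above can be sketched along these lines. Everything here is an assumption for illustration: the names `UnsafeFunctionPointer`, `private::Sealed`, `impl_unsafe_fn!`, and `require_unsafe_fn` are hypothetical, and the PR's real trait may differ in name and in how many arities it covers.

```rust
mod private {
    /// Private supertrait: code outside this module cannot name it, so it
    /// cannot implement `UnsafeFunctionPointer` for its own types.
    pub trait Sealed {}
}

/// Hypothetical marker for the types a `get_function()`-style API accepts.
pub trait UnsafeFunctionPointer: private::Sealed + Copy {}

// Implement the marker only for `unsafe extern "C"` function pointers,
// one arity at a time.
macro_rules! impl_unsafe_fn {
    ($($arg:ident),*) => {
        impl<Output, $($arg),*> private::Sealed
            for unsafe extern "C" fn($($arg),*) -> Output {}
        impl<Output, $($arg),*> UnsafeFunctionPointer
            for unsafe extern "C" fn($($arg),*) -> Output {}
    };
}

impl_unsafe_fn!();
impl_unsafe_fn!(A);
impl_unsafe_fn!(A, B);
impl_unsafe_fn!(A, B, C);

/// Only compiles when `F` is one of the sealed function pointer types.
pub fn require_unsafe_fn<F: UnsafeFunctionPointer>(f: F) -> F {
    f
}

// Example function with the expected calling convention and unsafety.
unsafe extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}
```

Passing a plain `fn(i32, i32) -> i32`, or an `extern "C" fn` that is not marked `unsafe`, to `require_unsafe_fn` fails with an unsatisfied trait bound, which matches the "funny type errors" the doc comment below warns about.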
TheDan64
reviewed
Apr 2, 2018
Yep, agreed.
Yeah.
Does the fn trait allow for run time generation of the args? If so it might be an interesting counterpart to these changes (though it doesn't need to be included in this PR. Just saying we should open an issue for it if it's a possibility) since this work is all compile time checks but that is very rigid for a compiler. We could put it behind a nightly only attribute (if that's a thing) or just feature gate it for nightly users. (Though obviously not being able to mark the function as unsafe is a huge drawback, but just saying)
Wow. Blown away by this approach. This is super cool. |
| /// "C"` functions can be retrieved via the `get_function()` method. If you | ||
| /// get funny type errors then it's probably because you have specified the | ||
| /// wrong calling convention or forgotten to specify the retrieved function | ||
| /// is `unsafe`. |
| /// to doesn't accidentally outlive its execution engine. | ||
| #[derive(Clone)] | ||
| pub struct Symbol<F> { | ||
| pub(crate) execution_engine: Rc<LLVMExecutionEngineRef>, |
TheDan64
Apr 2, 2018
Owner
I think we should create an actual ExecutionEngine object, to ensure it is properly destroyed if a symbol somehow has the last remaining reference? (If this is a lot of work to change, I can do it after we merge this PR)
Michael-F-Bryan
Apr 7, 2018
Author
Contributor
I was originally planning to go down the path of embedding an Rc<ExecutionEngine> in the Symbol tying it back to the original execution engine, but that turned out to be pretty annoying and cumbersome.
Really, we just want the Drop logic that will call the LLVM execution engine destructor when Rc<LLVMExecutionEngineRef>'s strong count goes to 1, so I extracted that out into an ExecEngineInner(Rc<LLVMExecutionEngineRef>) struct that both Symbol and ExecutionEngine use.
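A rough sketch of that shared-handle design, under stated assumptions: `RawEngine` is a stand-in for the raw LLVM handle and merely records whether it was "disposed", and `get_symbol` and `engine_outlived_by_symbol` are hypothetical names. Only the `ExecEngineInner` name and the `strong_count == 1` check come from the discussion.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for the raw LLVM handle; it only records whether it was disposed.
struct RawEngine {
    disposed: Rc<Cell<bool>>,
}

/// Shared handle embedded in both the engine and every symbol loaded from it.
#[derive(Clone)]
struct ExecEngineInner(Rc<RawEngine>);

impl Drop for ExecEngineInner {
    fn drop(&mut self) {
        // This runs before the inner `Rc` is decremented, so a count of 1
        // means this is the last handle alive.
        if Rc::strong_count(&self.0) == 1 {
            // Real code would call the LLVM destructor here.
            self.0.disposed.set(true);
        }
    }
}

struct ExecutionEngine {
    inner: ExecEngineInner,
}

struct Symbol {
    _inner: ExecEngineInner, // keeps the engine alive while the symbol exists
    address: usize,
}

impl ExecutionEngine {
    fn get_symbol(&self, address: usize) -> Symbol {
        Symbol { _inner: self.inner.clone(), address }
    }
}

/// Returns whether the engine was disposed (after dropping the
/// `ExecutionEngine`, after dropping the last `Symbol`).
fn engine_outlived_by_symbol() -> (bool, bool) {
    let disposed = Rc::new(Cell::new(false));
    let raw = RawEngine { disposed: Rc::clone(&disposed) };
    let engine = ExecutionEngine { inner: ExecEngineInner(Rc::new(raw)) };

    let symbol = engine.get_symbol(0x1000);
    assert_eq!(symbol.address, 0x1000);

    drop(engine);
    let after_engine = disposed.get(); // the symbol still holds a handle
    drop(symbol);
    (after_engine, disposed.get())
}
```

With this layout the destructor logic lives in one place, and neither `Symbol` nor `ExecutionEngine` needs to know which of them is dropped last.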
TheDan64
reviewed
Apr 2, 2018
| /// unsafe { | ||
| /// let test_fn = ee.get_function::<unsafe extern "C" fn() -> f64>("test_fn").unwrap(); | ||
| /// let return_value = test_fn(); | ||
| /// assert_eq!(return_value, 64.0); |
TheDan64
Apr 2, 2018
•
Owner
nit: assert_eq doesn't need to be in unsafe block. Maybe return the value? IE
let return_value = unsafe {
let test_fn = ee.get_function::<unsafe extern "C" fn() -> f64>("test_fn").unwrap();
test_fn()
};
assert_eq!(return_value, 64.0);
If you think it's less clear that way, feel free to leave it as is.
Michael-F-Bryan
Apr 7, 2018
•
Author
Contributor
It doesn't really need to be inside the unsafe block, but I wrote it that way to show that test_fn, and anything we get from it, is innately unsafe and requires a bit of extra attention.
I remember reading a while back that although you technically only need to wrap unsafe calls in an unsafe block, it's a good idea to include surrounding lines if they depend on the unsafe invariants being upheld.
TheDan64
reviewed
Apr 2, 2018
| impl<F> Debug for Symbol<F> { | ||
| fn fmt(&self, f: &mut Formatter) -> fmt::Result { | ||
| f.debug_tuple("Symbol") | ||
| .field(&"<unnamed>") |
TheDan64
Apr 2, 2018
•
Owner
Maybe we should store the function name from get_function for debugging purposes? You should be able to store a &str that lives as long as the input &str, I think.
Michael-F-Bryan
Apr 7, 2018
Author
Contributor
Hmm... I'm not sure whether that'll be nice to do. We're keeping track of the execution engine by using an Rc, and mixing both a reference (with lifetimes) and a reference-counted pointer feels kinda weird.
It also means your function name must outlive the symbol, which isn't really practical or ergonomic. Imagine trying to store a loaded symbol in a struct when the name isn't known until runtime (e.g. the user provided it), it'd be way too easy to get stuck in the rabbit hole that is self-referential types.
TheDan64
Apr 7, 2018
•
Owner
The &str is tied to the input to the function call, not the execution engine though. But yeah. I guess that's fine
The readme example also needs to get updated to reflect these changes
|
@TheDan64 I've updated the logic around making sure a I've also reworked the README example and copied it to the |
TheDan64
reviewed
Apr 7, 2018
| @@ -89,7 +91,7 @@ impl ExecutionEngine { | |||
| /// | |||
| /// assert_eq!(result, 128.); | |||
| /// ``` | |||
| pub fn add_global_mapping(&self, value: &AnyValue, addr: usize) { | |||
| pub fn add_global_mapping(&mut self, value: &AnyValue, addr: usize) { | |||
TheDan64
Apr 7, 2018
Owner
I wonder if adding mut will just make the API more difficult to use, it doesn't actually add any guarantees since we're pretty much just working with interior mutability anyway...
Michael-F-Bryan
Apr 9, 2018
Author
Contributor
You raise a good point. Initially it felt quite strange to be mutating an execution engine without declaring it as mut, but it's probably better to remove the &mut self to stay consistent with the rest of the library.
This is coming along nicely. Thank you for all of the effort you've put in! I just have a few more questions and then we should be all set to merge once those are resolved.
TheDan64
reviewed
Apr 7, 2018
|
|
||
| if Rc::strong_count(&self.execution_engine) == 1 { | ||
| impl Clone for ExecutionEngine { |
TheDan64
Apr 7, 2018
Owner
Should the EE really get a publicly accessible clone method? Clone generally means reallocating a new copy, but here we're just making a new reference to the same EE. Feels a bit counter intuitive to me
TheDan64
Apr 7, 2018
Owner
The only reason why Module has Clone implemented is because LLVM provides a function to do just that
Michael-F-Bryan
Apr 9, 2018
Author
Contributor
We end up using the equivalent of Clone in a couple of places around the code base to create a new reference to the same underlying LLVM execution engine, so all I did was extract this into an impl for the Clone trait.
I think people will understand this better if we mention an ExecutionEngine is actually just a reference-counted pointer to an underlying LLVM execution engine. But I don't mind deleting (or hiding with #[doc(hidden)]) this Clone impl, it's up to you really.
TheDan64
Apr 11, 2018
Owner
Feel free to add documentation for it; preferably on the clone implementation itself (assuming rustdoc will make this visible, otherwise on the EE itself I guess)
Michael-F-Bryan
Apr 11, 2018
Author
Contributor
Rustdoc lets you override the docs for trait implementations, but I'll probably mention that an ExecutionEngine is a reference-counted pointer anyway.
Michael-F-Bryan
added some commits
Apr 1, 2018
Michael-F-Bryan
force-pushed the
Michael-F-Bryan:get-function-addr
branch
from
9dd5898
to
63d840c
Apr 21, 2018
|
@TheDan64 I think I've addressed all our previous concerns and just rebased onto |
codecov
bot
commented
Apr 21, 2018
•
Codecov Report
@@ Coverage Diff @@
## master #36 +/- ##
==========================================
+ Coverage 50.13% 50.83% +0.69%
==========================================
Files 38 37 -1
Lines 2960 2939 -21
==========================================
+ Hits 1484 1494 +10
+ Misses 1476 1445 -31
Sure thing, I should have time to do a final review this weekend
TheDan64
approved these changes
Apr 23, 2018
@Michael-F-Bryan this looks great. Thanks for all the time you put into this PR!
TheDan64
merged commit ebf8be2
into
TheDan64:master
Apr 23, 2018
Michael-F-Bryan
deleted the
Michael-F-Bryan:get-function-addr
branch
Apr 23, 2018
Woohoo! Now I just need to get back to my little calc example. I'm worried LLVM/inkwell will be too easy to use, so the actual LLVM lowering part will be dwarfed by everything else
Michael-F-Bryan commented Apr 1, 2018
Description
After the discussion in #5 I propose we replace the ExecutionEngine::get_function_address() method with a get_function() which does the address transmutation for the user.
I believe this is still a fundamentally unsafe operation (there are no guarantees the signature is correct or it's not doing unsafe stuff), but at least this version ensures a function pointer doesn't accidentally outlive its parent execution engine.
How This Has Been Tested
Updated the old tests to use the new API and added an example to the function's documentation which exercises the function.
Breaking Changes
The get_function_address() method has now been replaced with get_function().