Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Get function address #36

Merged
merged 10 commits into from Apr 23, 2018

Conversation

Projects
None yet
2 participants
@Michael-F-Bryan
Copy link
Contributor

Michael-F-Bryan commented Apr 1, 2018

Description

After the discussion in #5 I propose we replace the ExecutionEngine::get_function_address() method with a get_function() which does the address transmutation for the user.

I believe this is still a fundamentally unsafe operation (there are no guarantees the signature is correct or it's not doing unsafe stuff), but at least this version ensures a function pointer doesn't accidentally outlive its parent execution engine.

How This Has Been Tested

Updated the old tests to use the new API and added an example to the function's documentation which exercises the function.

Breaking Changes

The get_function_address() method has now been replaced with get_function().

@TheDan64 TheDan64 self-requested a review Apr 1, 2018

@TheDan64

This comment has been minimized.

Copy link
Owner

TheDan64 commented Apr 1, 2018

...is this an April fools joke? 😄

/// A wrapper around a function pointer which ensures the symbol being pointed
/// to doesn't accidentally outlive its execution engine.
#[derive(Debug, Clone)]
pub struct Symbol<F> {

This comment has been minimized.

@TheDan64

TheDan64 Apr 1, 2018

Owner

Is Symbol what LLVM calls it? (I do recall a LLVMGetSymbol function and don't want to confuse these symbols with those) I feel like Function or even RuntimeFunction might be a more descriptive name...

If LLVM calls them symbols, then maybe get_function_symbol and FunctionSymbol are good names?

This comment has been minimized.

@TheDan64

TheDan64 Apr 1, 2018

Owner

Also, is it possible to constrain F to std::ops::Fn or something? So you can't just put any type there? I also wonder if this (nightly only) call method could be used.. https://doc.rust-lang.org/nightly/std/ops/trait.Fn.html#required-methods

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 2, 2018

Author Contributor

I just called it Symbol because that's the name libloading uses. I'm not particularly attached to the name though.

I like the idea of constraining F to Fn. That and the assert_eq!() lower down should mean the only real type you can get out is a function pointer.

@@ -181,7 +226,13 @@ impl ExecutionEngine {
return Err(FunctionLookupError::FunctionNotFound);
}

Ok(address)
assert_eq!(size_of::<F>(), size_of::<usize>(),

This comment has been minimized.

@TheDan64

TheDan64 Apr 1, 2018

Owner

This is a neat trick!

This comment has been minimized.

@TheDan64

TheDan64 Apr 1, 2018

Owner

Doesn't transmute already fail to compile if the sizes are different though? Jk, this makes sense again

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 2, 2018

Author Contributor

Yeahhh.... The transmute_copy() lower down makes me feel pretty dirty because it's essentially just a pointer cast. It lets us get around the "F may vary in size" transmute error, but it's kinda hacky.

This comment has been minimized.

@TheDan64

TheDan64 Apr 2, 2018

Owner

Ok; this is fine. If we find that Symbol conflicts with the LLVM idea of a symbol we can rename it when we cross that bridge I suppose. Backward compatibility isn't a big concern at the moment.

Edit: Realized I replied to the wrong comment 😂

assert!(xor(false, true));
assert!(!xor(true, true));
unsafe {
type BoolFunc = fn(bool, bool) -> bool;

This comment has been minimized.

@TheDan64

TheDan64 Apr 1, 2018

Owner

I guess it doesn't matter in this particular instance, but shouldn't the type here be an extern "C" fn?

@TheDan64

This comment has been minimized.

Copy link
Owner

TheDan64 commented Apr 1, 2018

This may not actually work, but I wonder if get_function doesn't need to be unsafe if we move the logic around. Do you think it's feasible to store the address until it's actually needed, and then do the transmute?

For example:

Symbol<F> {
    ee: ...,
    address: usize,
}

impl <F> Deref for Symbol {
    type Target = F;

    fn deref(&self) -> &Self::Target {
        transmute::<&F>(&self.address) // The references are important to keep rust seeing it as living on the struct
    }
}

Edit: in retrospect, I don't think you can make the deref method unsafe since its part of a trait, huh? I guess this is just what you proposed but inverted

@TheDan64
Copy link
Owner

TheDan64 left a comment

This looks reasonable. My remaining question is deref will still return an unsafe function, right?

@Michael-F-Bryan

This comment has been minimized.

Copy link
Contributor Author

Michael-F-Bryan commented Apr 2, 2018

This may not actually work, but I wonder if get_function doesn't need to be unsafe if we move the logic around. Do you think it's feasible to store the address until it's actually needed, and then do the transmute?

I really think the get_function method needs to be unsafe to call. We're essentially leaving it up to the user to choose the function signature, so to mark it as safe would probably be unsound.

This looks reasonable. My remaining question is deref will still return an unsafe function, right?

I believe the Deref impl means a Symbol<F> will be unsafe if F is unsafe (e.g. unsafe fn()). There isn't really any way to specify that a function is unsafe to call, so I guess making it unsafe to create a function is about the best we can do 😞

I wonder if we could use the nightly call from https://doc.rust-lang.org/nightly/std/ops/trait.Fn.html#required-methods ? Seems to be runtime function-ish

I looked at that as well. It'd be pretty much ideal, although I'd like it if there was a way to make the Fn* traits unsafe to call. I was hesitant to propose implementing the Fn* traits though, seeing as it'll confine inkwell to be nightly-only, and that's a pain.

@Michael-F-Bryan

This comment has been minimized.

Copy link
Contributor Author

Michael-F-Bryan commented Apr 2, 2018

I think I've found a nice solution. We weren't able to constrain F to be Fn because that then requires you to name the arguments and then things get kinda annoying.

Instead, I created a "sealed" UnsafeFunctionPointer trait and implemented it only for unsafe extern "C" functions.

If you try to retrieve a function pointer which isn't marked unsafe or extern "C" you now get an error message like this:

error[E0277]: the trait bound `fn(): inkwell::execution_engine::UnsafeFunctionPointer` is not satisfied
  --> tests/test_execution_engine.rs:44:34
   |
44 |         assert!(execution_engine.get_function::<fn()>("func").is_ok());
   |                                  ^^^^^^^^^^^^ the trait `inkwell::execution_engine::UnsafeFunctionPointer` is not implemented for `fn()`
   |
   = help: the following implementations were found:
             <unsafe extern "C" fn() -> Output as inkwell::execution_engine::UnsafeFunctionPointer>=

It'd be nice to use the on_unimplemented attribute to give a better error message, but the compiler's suggestions aren't too shabby.

@TheDan64
Copy link
Owner

TheDan64 left a comment

I really think the get_function method needs to be unsafe to call. We're essentially leaving it up to the user to choose the function signature, so to mark it as safe would probably be unsound.

Yep, agreed.

I believe the Deref impl means a Symbol will be unsafe if F is unsafe (e.g. unsafe fn()). There isn't really any way to specify that a function is unsafe to call, so I guess making it unsafe to create a function is about the best we can do

Yeah.

I looked at that as well. It'd be pretty much ideal, although I'd like it if there was a way to make the Fn* traits unsafe to call. I was hesitant to propose implementing the Fn* traits though, seeing as it'll confine inkwell to be nightly-only, and that's a pain.

Does the fn trait allow for run time generation of the args? If so it might be an interesting counterpart to these changes (though it doesn't need to be included in this PR. Just saying we should open an issue for it if it's a possibility) since this work is all compile time checks but that is very rigid for a compiler. We could put it behind a nightly only attribute (if that's a thing) or just feature gate it for nightly users. (Though obviously not being able to mark as unsafe function is a huge drawback, but just saying)

I think I've found a nice solution...

Wow. Blown away by this approach. This is super cool. 👍 We already use sealed traits a bunch in inkwell, here's hoping they get an official representation in rust, since they feel pretty hacky today.

/// "C"` functions can be retrieved via the `get_function()` method. If you
/// get funny type errors then it's probably because you have specified the
/// wrong calling convention or forgotten to specify the retrieved function
/// is `unsafe`.

This comment has been minimized.

@TheDan64

TheDan64 Apr 2, 2018

Owner

I think this should say as unsafe?

/// to doesn't accidentally outlive its execution engine.
#[derive(Clone)]
pub struct Symbol<F> {
pub(crate) execution_engine: Rc<LLVMExecutionEngineRef>,

This comment has been minimized.

@TheDan64

TheDan64 Apr 2, 2018

Owner

I think we should create an actual ExecutionEngine object, to ensure it is properly destroyed if a symbol somehow has the last remaining reference? (If this is a lot of work to change, I can do it after we merge this PR)

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 7, 2018

Author Contributor

I was originally planning to go down the path of embedding an Rc<ExecutionEngine> in the Symbol tying it back to the original execution engine, but that turned out to be pretty annoying and cumbersome.

Really, we just want the Drop logic that will call the LLVM execution engine destructor when Rc<LLVMExecutionEngineRef>'s strong count goes to 1, so I extracted that out into a ExecEngineInner(Rc<LLVMExecutionEngineRef>) struct that both Symbol and ExecutionEngine use.

/// unsafe {
/// let test_fn = ee.get_function::<unsafe extern "C" fn() -> f64>("test_fn").unwrap();
/// let return_value = test_fn();
/// assert_eq!(return_value, 64.0);

This comment has been minimized.

@TheDan64

TheDan64 Apr 2, 2018

Owner

nit: assert_eq doesn't need to be in unsafe block. Maybe return the value? IE

let return_value = unsafe {
    let test_fn = ee.get_function::<unsafe extern "C" fn() -> f64>("test_fn").unwrap();

    test_fn()
};
assert_eq!(return_value, 64.0);

If you think it's less clear that way, feel free to leave it as is.

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 7, 2018

Author Contributor

It doesn't really need to be inside the unsafe block, but the reason I wrote it that way to show that the test_fn and using anything we get from it is innately unsafe, and requires a bit of extra attention.

I remember reading a while back that although you technically only need to wrap unsafe calls in an unsafe block, it's a good idea to include surrounding lines if they depend on the unsafe invariants being upheld.

impl<F> Debug for Symbol<F> {
fn fmt(&self, f: &mut Formatter) -> fmt::Result {
f.debug_tuple("Symbol")
.field(&"<unnamed>")

This comment has been minimized.

@TheDan64

TheDan64 Apr 2, 2018

Owner

Maybe we should store the function name from get_function for debugging purposes? You should be able to store a &str that lives as long as the input &str, I think.

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 7, 2018

Author Contributor

Hmm... I'm not sure whether that'll be nice to do. We're keeping track of the execution engine by using an Rc, and mixing both a reference (with lifetimes) and a reference-counted pointer feels kinda weird.

It also means your function name must outlive the symbol, which isn't really practical or ergonomic. Imagine trying to store a loaded symbol in a struct when the name isn't known until runtime (e.g. the user provided it), it'd be way too easy to get stuck in the rabbit hole that is self-referential types.

This comment has been minimized.

@TheDan64

TheDan64 Apr 7, 2018

Owner

The &str is tied to the input to the function call, not the execution engine though. But yeah. I guess that's fine

@TheDan64

This comment has been minimized.

Copy link
Owner

TheDan64 commented Apr 7, 2018

The readme example also needs to get updated to reflect these changes

@Michael-F-Bryan

This comment has been minimized.

Copy link
Contributor Author

Michael-F-Bryan commented Apr 7, 2018

@TheDan64 I've updated the logic around making sure a Symbol properly destroys the underlying LLVM execution engine if it's the last person with a reference to it.

I've also reworked the README example and copied it to the examples/ directory.

@@ -89,7 +91,7 @@ impl ExecutionEngine {
///
/// assert_eq!(result, 128.);
/// ```
pub fn add_global_mapping(&self, value: &AnyValue, addr: usize) {
pub fn add_global_mapping(&mut self, value: &AnyValue, addr: usize) {

This comment has been minimized.

@TheDan64

TheDan64 Apr 7, 2018

Owner

I wonder if adding mut will just make the API more difficult to use, it doesn't actually add any guarantees since we're pretty much just working with interior mutability anyway...

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 9, 2018

Author Contributor

You raise a good point. Initially it felt quite strange to be mutating an execution engine without declaring it as mut, but it's probably better to remove the &mut self to stay consistent with the rest of the library.

This comment has been minimized.

@TheDan64

TheDan64 Apr 11, 2018

Owner

Yeah, I think we should remove it

@TheDan64

This comment has been minimized.

Copy link
Owner

TheDan64 commented Apr 7, 2018

This is coming along nicely. Thank you for all of the effort you've put in! I just have a few more questions and then we should be all set to merge once those are resolved.


if Rc::strong_count(&self.execution_engine) == 1 {
impl Clone for ExecutionEngine {

This comment has been minimized.

@TheDan64

TheDan64 Apr 7, 2018

Owner

Should the EE really get a publicly accessible clone method? Clone generally means reallocating a new copy, but here we're just making a new reference to the same EE. Feels a bit counter intuitive to me

This comment has been minimized.

@TheDan64

TheDan64 Apr 7, 2018

Owner

The only reason why Module has Clone implemented is because LLVM provides a function to do just that

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 9, 2018

Author Contributor

We end up using the equivalent of Clone in a couple places around the code base to create a new reference to the same underlying LLV execution engine so all I did was extract this into an impl for the Clone trait.

I think people will understand this better if we mention an ExecutionEngine is actually just a reference-counted pointer to an underlying LLVM execution engine. But I don't mind deleting (or hiding with #[doc(hidden)]) this Clone impl, it's up to you really.

This comment has been minimized.

@TheDan64

TheDan64 Apr 11, 2018

Owner

Feel free to add documentation for it; preferably on the clone implementation itself (assuming rustdoc will make this visible, otherwise on the EE itself I guess)

This comment has been minimized.

@Michael-F-Bryan

Michael-F-Bryan Apr 11, 2018

Author Contributor

Rustdoc lets you override the docs for trait implementations, but I'll probably mention that an ExecutionEngine is a reference-counted pointer anyway.

Michael-F-Bryan added some commits Apr 1, 2018

@Michael-F-Bryan Michael-F-Bryan force-pushed the Michael-F-Bryan:get-function-addr branch from 9dd5898 to 63d840c Apr 21, 2018

@Michael-F-Bryan

This comment has been minimized.

Copy link
Contributor Author

Michael-F-Bryan commented Apr 21, 2018

@TheDan64 I think I've addressed all our previous concerns and just rebased onto master, would you be able to do another review and let me know if there's anything else we need to do for this PR?

@codecov

This comment has been minimized.

Copy link

codecov bot commented Apr 21, 2018

Codecov Report

Merging #36 into master will increase coverage by 0.69%.
The diff coverage is 81.81%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #36      +/-   ##
==========================================
+ Coverage   50.13%   50.83%   +0.69%     
==========================================
  Files          38       37       -1     
  Lines        2960     2939      -21     
==========================================
+ Hits         1484     1494      +10     
+ Misses       1476     1445      -31
Impacted Files Coverage Δ
src/module.rs 57.92% <100%> (ø) ⬆️
src/execution_engine.rs 38.46% <80.95%> (+8.61%) ⬆️
src/lib.rs 62.5% <0%> (-0.72%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ab5817...63d840c. Read the comment docs.

@TheDan64

This comment has been minimized.

Copy link
Owner

TheDan64 commented Apr 21, 2018

Sure thing, I should have time to do a final review this weekend

@TheDan64
Copy link
Owner

TheDan64 left a comment

@Michael-F-Bryan this looks great. Thanks for all the time you put into this PR!

@TheDan64 TheDan64 merged commit ebf8be2 into TheDan64:master Apr 23, 2018

3 checks passed

codecov/patch 81.81% of diff hit (target 50.13%)
Details
codecov/project 50.83% (+0.69%) compared to 0ab5817
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@Michael-F-Bryan Michael-F-Bryan deleted the Michael-F-Bryan:get-function-addr branch Apr 23, 2018

@Michael-F-Bryan

This comment has been minimized.

Copy link
Contributor Author

Michael-F-Bryan commented Apr 23, 2018

Woohoo! 🎉

Now I just need to get back to my little calc example. I'm worried LLVM/inkwell will be too easy to use, so the actual LLVM lowering part will be dwarfed by everything else 😜

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.