Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic lib loading #14

Merged
merged 23 commits into from
Dec 4, 2020
Merged

Dynamic lib loading #14

merged 23 commits into from
Dec 4, 2020

Conversation

RDruon
Copy link
Contributor

@RDruon RDruon commented Dec 1, 2020

Hello,
I started working on using libloading with bindgen dynamic lib support.
So far it seems to work but the dynamic lib loading is not handle as it should be. I am not familiar with Rust so I am not sure how it should be done.

I added a new() method to the NVML struct (in lib.rs). This new() method will take care of loading the .so file through libloading and use it for every call to NVML method.
the tricky part is for other struct. For example with the Device struct (in device.rs) I reload the library every time a call is made to one of the Device's methods.
I can either implement the same new() method for every struct or maybe you have a better idea (Singleton ?).

@Cldfire
Copy link
Owner

Cldfire commented Dec 2, 2020

Hi, thanks so much for getting this started! I really appreciate it 😄

Everything worked great on my machine 🎉 so it looks like dynamically loading NVML at runtime isn't far off.

I've done some work on top of your branch over here; do you mind if I push to your branch so we can work together on this?

Here's an overview of what I've done:

  • Grabbed a more up-to-date copy of the nvml.h header and re-genned the bindings with that
    • Also updated the copy you've added to the repo; in the past I've tried to avoid adding the header to the repo, but looking around the internet I see that everyone else tosses it in their repos, so I guess we can too?
  • Changed the name of the libloading wrapper struct that bindgen generates from nvml to NvmlLib, to be more idiomatic
  • Returned to just having init and init_with_flags as the constructors for the NVML struct. Splitting it up into new and init means it's possible for a library user to obtain an NVML instance and call functions without initializing NVML properly, which isn't what we want
  • Started changing the way we call the dynamically loaded function symbols. Bindgen generates convenience accessors for them, making it quick and easy to call them, but the tradeoff is they'll cause the library to panic if an error occurred loading the symbol for whatever reason. Instead, we need to check and properly handle the Result types that are loaded into the fields of the NvmlLib struct, allowing us to return an error to the caller if we're unable to access the symbol

I also cleaned up the issue you mentioned:

I added a new() method to the NVML struct (in lib.rs). This new() method will take care of loading the .so file through libloading and use it for every call to NVML method.
the tricky part is for other struct. For example with the Device struct (in device.rs) I reload the library every time a call is made to one of the Device's methods.

So, you were off to the right start: NVML is the entry point to the crate and the first thing the user will construct. As a result, it's the place we'll need to load and store the NvmlLib wrapper type.

We don't need to load the wrapper type again for the other structs, though, or create a singleton; we just need to make sure that all of the other structs have a reference to an NVML instance through which they can get access to the NvmlLib stored inside.

As an example, previously the Device struct was defined like this:

pub struct Device<'nvml> {
    device: nvmlDevice_t,
    _phantom: PhantomData<&'nvml NVML>,
}

I had no need for a reference to an NVML instance in this struct prior to now, so I was using PhantomData instead. The PhantomData type allowed us to "store" the 'nvml lifetime in the Device struct, ensuring that the Device can't outlive the NVML instance it was created from.

Now that we need access to stuff the NVML instance holds onto, though, we can change the definition of Device to the following:

pub struct Device<'nvml> {
    device: nvmlDevice_t,
    pub nvml: &'nvml NVML,
}

This removes the usage of PhantomData in favor of simply storing an actual reference to an NVML instance, giving us access to the NvmlLib type contained within and resolving the issue. (I made similar changes for all the other structs that needed this.)

As far as the next steps that need to be taken care of here:

  • Finish converting all of the code that accesses function symbols to access them in a way that doesn't panic if they failed to load
  • Get this working on Windows
  • The crate can now compile on macOS (although it will fail at runtime), so we don't need to stop compilation on macOS anymore
  • Double-check the newly generated bindings and make sure they're all good (I know off the top of my head there are some other manual modifications I needed to make in the past to the bindings; I should probably document those)

I'll chip away at it all over the course of the next few days!

@Cldfire Cldfire linked an issue Dec 2, 2020 that may be closed by this pull request
@Cldfire
Copy link
Owner

Cldfire commented Dec 2, 2020

Also, I have no idea why CI failed to run here; maybe restarting it will help?

EDIT: and now it worked, very odd

@RDruon
Copy link
Contributor Author

RDruon commented Dec 2, 2020

Hello Jarek,

Thanks a lot for all the modification! I have push them to my dynamic branch.
I will take a look at the function symbols error handling for other struct based on what you did for the lib.rs file.
I don't have access to a Windows system with a NVIDIA GPU unfortunately, neither macOS.

Copy link
Owner

@Cldfire Cldfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for going through and updating the rest of it!

I don't have access to a Windows system with a NVIDIA GPU unfortunately, neither macOS.

No worries; I have access to both, so I'll take care of that part.

src/event.rs Outdated Show resolved Hide resolved
@Cldfire
Copy link
Owner

Cldfire commented Dec 3, 2020

Works great on Windows as well (other than that ^ small issue).

EDIT: and compiles on macOS, erroring at runtime. Sweet!

This import library is no longer required since we are now loading the dll at runtime
This would introduce the ability to unsafely store function pointers contained
within the lib after the `NVML` instance gets destroyed
Otherwise `self.lib` could get accessed again by the `Drop` impl if
`lib.__library.close()` returned an error
Copy link
Owner

@Cldfire Cldfire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think this is ready to merge! There's some work left to be done before I can release this, but I can finish that outside of this PR.

Thanks again for all your work here!

@Cldfire Cldfire merged commit 57225a0 into Cldfire:master Dec 4, 2020
@Cldfire Cldfire mentioned this pull request Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Load NVML lib at runtime
2 participants