Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust-sdk): Context collection #203

Closed
wants to merge 2 commits into from
Closed

Conversation

ok-nick
Copy link

@ok-nick ok-nick commented Jan 6, 2023

Example implementation of the idea proposed in #202.

The goal here is to treat Context as it is treated internally; a collection of plugins. This will allow users to store and call plugins whenever they want, without them being freed and without having to cache them again.

A lot of the code doesn't make use of the libraries conventions, I threw it together as an example. Let me know what you think and if we should go ahead with implementing this.

@ok-nick ok-nick changed the title Schemes feat(rust-sdk): Reusable plugin Scheme and Context collection Jan 6, 2023
@ok-nick
Copy link
Author

ok-nick commented Jan 6, 2023

I don't think schemes are necessary, we could probably remove them in favor of the old API, while still reforming the Context to be a collection.

@zshipko
Copy link
Contributor

zshipko commented Jan 6, 2023

Thanks! I agree that treating the Context as a collection of plugins is helpful is some cases, but I'm not sure how schemes are different from just using the manifest?

Adding some way to persist a plugin like Plugin::into_raw(self) would make it so Context::get and Context::remove could be added without the need for schemes. I think this would accomplish the same thing:

impl<'a> Plugin<'a> {
  pub fn into_raw(self) -> PluginIndex {
    let id = self.id;
    std::mem::forget(self);
    id
  }
}

This allows you to retain the ID of the plugin without dropping it, which means you could then opt-in to using Context::get and Context::remove to handle plugins.

@ok-nick ok-nick changed the title feat(rust-sdk): Reusable plugin Scheme and Context collection feat(rust-sdk): Context collection Jan 6, 2023
@ok-nick
Copy link
Author

ok-nick commented Jan 6, 2023

@zshipko While that works, it feels very unnatural and unclear, not too Rust-like. Imagine you have a plugin that you want to create but not use until later:

Plugin::new(...).into_raw();

Could you tell by this code that the Plugin is going to be cached? We could rename it to .cache(), but still, the plugin is already cached even before we called .cache().

How about we introduce a new struct named TempPlugin or PluginHandle that, when dropped, will free it from the Context? It would look something like this:

Plugin::builder()
    .with_wasi(true)
    .build_temp(&context)

@zshipko
Copy link
Contributor

zshipko commented Jan 6, 2023

I want to keep the default behavior for Plugin as it is to avoid too much API churn.

What about instead of into_raw calling it keep? I think that's used in the Rust PDK for something similar.

Inverting your build_temp example could work too: adding a PersistentPlugin and PluginBuilder::build_persistent function. But the extra struct feels somewhat unnecessary since the PluginIndex kind of already serves as the persistent handle.

@ok-nick
Copy link
Author

ok-nick commented Jan 6, 2023

The problem is when you retrieve a Plugin from the Context you have to call .keep() every time.

@zshipko
Copy link
Contributor

zshipko commented Jan 6, 2023

Ah, that's true. You could add a private field to Plugin called persistent (or similar) that is used to toggle the behavior of drop, it would be set to true when the plugin is retrieved using Context::get but false otherwise.

@ok-nick
Copy link
Author

ok-nick commented Jan 6, 2023

You can do that but it would be odd to have a function that only works once then serves no purpose.

I think the build_persistent is the way to go and we can use traits to avoid repeating ourselves.

How often do people need a plugin temporarily loaded vs persisted? Is it really worth supporting an API that won't be used? In my case, I call the plugins at random times during the programs execution which is why I need it persisted. I feel like in most cases, when you need to remove a plugin, it would be coordinated at a specific time.

@zshipko
Copy link
Contributor

zshipko commented Jan 6, 2023

The most typical use case is to create the plugin and store it somewhere to call later, then when it goes out of scope it gets destroyed just like any other Rust value. The Context is used primarily for error reporting and other book-keeping we don't want to put too much emphasis on it - I think removing it altogether would be more in-line with our long term goals.

Maybe if you shared some code from your project I could help you accomplish what you need?

@ok-nick
Copy link
Author

ok-nick commented Jan 7, 2023

If Context is just an abstraction over multiple plugins, then maybe we should make it optional? Errors could be contained in the plugin itself, then if the user wants a structure to handle multiple plugins they can construct a PluginGroup/Context. That way a user can opt-in to persisted plugins with a backing storage.

As far as my code, it's just a system that calls all plugins sharing a protocol and aggregates the results. My code works how it is, but I don't see why I'm making a storage on top of a storage, when I can just use the inner one. Not to mention having to create my own id system to keep track of plugins.

@zshipko
Copy link
Contributor

zshipko commented Jan 7, 2023

That sounds good to me - making Context optional could simplify the usage when no context is needed! One big job of the context is to make errors available when creating a plugin fails, that might make it trickier.

I think it may require some deeper changes to the runtime but I would be curious to see what that might look like.

@ok-nick
Copy link
Author

ok-nick commented Jan 7, 2023

Here are some ways we can return errors to the caller without a Context in the C interface (note that PluginIndex will not be used anymore, thus we must pass pointers to the Plugin struct directly):

  1. Return a C struct containing whether it succeeded or failed along with the result. The only problem with this is that we need a new struct per error which is a little inconvenient, but arguably better. It can be minimized using a macro.

An error would look something like this:

#[repr(C)]
pub struct CreatePluginResult {
  success: bool,
  result: *mut Plugin,
}

impl Display for CreatePluginResult {
  fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), fmt::Error> { 
    write!(f, "Error creating Plugin")
  }
}

With a function to get the error string as such:

pub fn error_string<T: Display>(error: T) -> *mut c_char {
  todo!() // return display function
}

And a macro to make defining errors easier:

c_error!(
  #[error("Error creating Plugin")]
  CreatePluginResult: *mut Plugin,
)
  1. Return an error code and pass the success in an out parameter. The caller gets the error string via a separate function.

An error would look something like this:

#[repr(C, i32)]
pub enum Error {
  CreatePluginResult = 1,
}

impl Display for CreatePluginResult {
  fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), fmt::Error> { 
    match *self {
      Error::CreatePluginResult => write!(f, "Error creating Plugin"),
    }
  }
}

The function to get the error string would be the same as the previous and a macro could also be made, though I don't think it'd be necessary. The only difference here that you have to remember is that the result is being passed as a pointer in the parameter.

Another thing is that the library should use a crate like thiserror instead of anyhow. anyhow is meant for binaries and there is no way to know what error happened, only that an error did happen.

@zshipko
Copy link
Contributor

zshipko commented Jan 7, 2023

Interesting, the idea of returning *mut Plugin directly is something I've looked at a few times and would probably be a little simpler to reason about but we're not looking to switch to thiserror. The out argument could always be used to return an error string to the caller, that is pretty typical in C libraries.

You have to keep in mind that any change like this will bubble up to all the SDKs - for example, using an out parameter from Javascript is do-able, but it's tricky, Ruby has yet another way of handling it and I'm not even sure what that would look like in C# - there are reasons things are the way they are when you consider how many languages are supported. Yes the API could be nicer, but it part of the design process was making it usable from many languages.

I would suggest you try to make some of the API changes you're looking for and change a few SDKs to match to get a feeling of how usable the API is from the different languages we support.

@ok-nick
Copy link
Author

ok-nick commented Jan 7, 2023

If the language supports FFI, it surely supports out parameters, it's such a widely used C convention. Perhaps it is a little tricky to get right, but that will all be hidden from the user anyways.

Also, the #[error("blah")] was not thiserror syntax, it was just an example of how we can define a custom macro, I just tried to make it look more idiomatic. Though, on the topic of thiserror, I do believe it is something that should eventually be integrated into the library. If not thiserror, then at least something to differentiate between errors so that callers can handle accordingly.

If anyone is familiar with FFI in any of the supported SDKs, feel free to give your thoughts.

Edit: I’d also like to mention that if out parameters are really that tricky, then all the more reason to use the first approach.

@ok-nick
Copy link
Author

ok-nick commented Jan 7, 2023

I came up with an example of what the code would look like with the new error system (1st method).
Here is the macro to make defining errors easier:

macro_rules! c_error {
    (
        $(
            $error_msg:literal: $error_arg:ty,
            $error_name:ident: $result:ty,
        ),+
    ) => {
        $(
            #[repr(C)]
            pub struct $error_name {
                pub success: bool,
                pub result: $result,
                pub error_arg: $error_arg,
            }

            impl Display for $error_name {
                fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), fmt::Error> {
                    write!(f, $error_msg, self.error_arg.unwrap())
                }
            }
        ),+
    }
}

Creating a new error:

c_error!(
    "Error creating Plugin: {:?}": Option<anyhow::Error>,
    CreatePluginResult: *mut Plugin,
);

What the C function would look like:

#[no_mangle]
pub unsafe extern "C" fn extism_plugin_new(
    wasm: *const u8,
    wasm_size: Size,
    with_wasi: bool,
) -> CreatePluginResult {
    trace!("Call to extism_plugin_new with wasm pointer {:?}", wasm);
    let data = std::slice::from_raw_parts(wasm, wasm_size as usize);
    match Plugin::new(data, with_wasi) {
        Ok(mut x) => {
            // TODO: forget x
            CreatePluginResult {
                success: true,
                result: &mut x as *mut _,
                error_arg: None,
            }},
        Err(e) => {
            error!("Error creating Plugin: {:?}", e);
            CreatePluginResult {
                success: false,
                result: ptr::null_mut(),
                error_arg: Some(e),
            }
        }
    }
 }

I'm not a huge fan of the error_arg and it should definitely be abstracted away. Perhaps we can make a builder or constructor for each error type or something of the sort.

@ok-nick
Copy link
Author

ok-nick commented Jan 7, 2023

Something like this could make creating the errors easier:

// success
CreatePluginResult::success(x) // convert it to pointer internally?
// error
CreatePluginResult::error(err)

If you also want to save a bit of memory you can use a union instead of a struct for the error type.

@nilslice
Copy link
Member

@ok-nick - I think there are is more to this PR than meets the eye here, and there's probably more context that would be helpful to know in order for us to really understand whether or not it's something we'd want to accept upstream.

We created the EIP repo for these kinds of ideas, and if you would like to, I'd encourage you to flesh out these ideas a bit more in the form of a proposal.

These proposals should ultimately be as thorough as EIP-005, which goes in depth about how all Extism components will be versioned, including the various components, the complexity, and the thoughtful rationale for the versioning system we've landed on.

I'm going to close out this PR with preference for a proposal, and after that has been accepted, I think everyone involved here will have more clarity about the how/what/why mentioned above.

Thanks!

@nilslice nilslice closed this Jan 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants