Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Host Functions support for Java SDK #248

Merged
merged 16 commits into from
Mar 2, 2023

Conversation

Zwiterrion
Copy link
Contributor

No description provided.

Hi,

We want to use Extism on our project
[Otoroshi](https://github.com/MAIF/otoroshi) but we need to run it on
jdk11

This pull request makes everything run smoothly on jdk11

If you have any suggestion about this pull request, i'm open to it

Thanks for your time
@Zwiterrion
Copy link
Contributor Author

Zwiterrion commented Feb 16, 2023

Hi @bhelx, @nilslice

I'm stuck on implementing JAVA hosts functions.

I feel like i'm on the right way but i'm not getting the right output when running my plugin with host function.

The core of my host function is:

(ExtismCurrentPlugin plugin,
  LibExtism.ExtismVal[] params,
  LibExtism.ExtismVal[] results,
  JsonElement userData) -> {
    System.out.println(plugin.inputString(params[0]));

    int offs = plugin.alloc(4);
    Pointer mem = plugin.memory();
    mem.write(offs, "test".getBytes(), 0, 4);
    results[0].value.i64 = offs;

    System.out.println("Hello from Java!");
}

Here the output of the execution.

{"count": 4}
Hello from Java!
test
ExtismVal{t=1, value=ExtismValUnion{i32=0, i64=13, f32=0.0, f64=0.0}}
Exit Host function
Plugin output length: 0, output: 

At the end of the execution of my host function, i'm able to retrieve the test value.

... (line 52: HostFunction.java)
System.out.println(LibExtism.INSTANCE.extism_current_plugin_memory(currentPlugin).getString(arraysOfOutputs[0].value.i64));

...

System.out.println(outputs);
System.out.println("Exit Host function");

But when the plugin is ending, the result is empty because the LibExtism.INSTANCE.extism_plugin_output_length returns 0 as data length while LibExtism.INSTANCE.extism_plugin_output_data returns the expected content.

If I force the output length value by replacing with a raw value (that is not desirable)

 int length = LibExtism.INSTANCE.extism_plugin_output_length(contextPointer, index);
Pointer output = LibExtism.INSTANCE.extism_plugin_output_data(contextPointer, index);
return output.getByteArray(0, 30);

I get the final result containing my test value.

{"count": 4}
Hello from Java!
test
ExtismVal{t=1, value=ExtismValUnion{i32=0, i64=13, f32=0.0, f64=0.0}}
Exit Host function
Plugin output length: 30, output: �{"count": 4}test�������������

I think that I missed something on the implementation, if someone has a idea it's welcome.

@bhelx
Copy link
Contributor

bhelx commented Feb 16, 2023 via email

@bhelx
Copy link
Contributor

bhelx commented Feb 16, 2023

Got a chance to repro this. It does seem like, yes, the output isn't being registered just for this one test. Going to do a little more testing.

@bhelx
Copy link
Contributor

bhelx commented Feb 16, 2023

Did a little experimenting with Zach. We narrowed it down to where the problem is. I wrote this little plugin to look at what the host function is returning from the plugin's perspective:

#![no_main]

use extism_pdk::{*, bindings::extism_alloc};
use serde::{Deserialize, Serialize};

const VOWELS: &[char] = &['a', 'A', 'e', 'E', 'i', 'I', 'o', 'O', 'u', 'U'];

#[derive(Serialize, Deserialize)]
struct Output {
    pub count: i32,
}

extern "C" {
    fn hello_world(input: i64) -> i64;
}

#[plugin_fn]
pub unsafe fn count_vowels<'a>(input: String) -> FnResult<String> {
    let output = "hello world";
    let memory = Memory::from_bytes(output.as_bytes());
    let output = unsafe { hello_world(memory.offset as i64) };
    Ok(output.to_string())
}

It appears let output here is always 0. But logging the outputs in the host function, we can see they are being set. This suggests to us that some kind of pass by value is happening regarding the output array. The original object still has 0 as a default.

When you set the output value here: https://github.com/extism/extism/pull/248/files#diff-d93dc91be87e544e8ba1116ce83c15697ae8b5c56991f62e57cfb7bf461cefe8R58

you need to ensure you are mutating the reference, and not creating a new value. We looked into ExtismVal and ExtismValUnion but this has challenged my java knowledge.

Does that help at all? I think if you can maybe log out object ids and figure out where some value might be being copied, you can find it. If you are still stuck happy to do some more debugging on my end. It is so close!

@Zwiterrion
Copy link
Contributor Author

Thanks for your experiments. I spent the day to update this outputs parameter and it finally works. I'm not expert to mapping C type with JNA and I forgot to define the type used in my union implementation. This oversight prevented rust from getting the correct output value.

For the definition of the host function, I've seen on other sdks that user data is either a pointer or a buffer. I chose to let the user pass JSON as data to be the most generic, but let me know if that's a good idea or not.

@bhelx
Copy link
Contributor

bhelx commented Feb 17, 2023

@Zwiterrion excellent! Changes work for me too. I will review today.

Comment on lines 30 to 59
class ExtismVal extends Structure {
protected static class ByReference extends ExtismVal implements Structure.ByReference{
public ByReference() {
super();
}

public ByReference(Pointer ptr) {
super(ptr);
}
}

public ExtismVal() {
super();
read();
}

public ExtismVal(Pointer ptr) {
super(ptr);
read();
}

@Override
protected List<String> getFieldOrder() {
return Arrays.asList("t", "v");
}

public int t;
public ExtismValUnion v;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this implementation is wild, glad you figured it out 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I searched and tested but in fact it is no longer useful.

new ExtismCurrentPlugin(currentPlugin),
(LibExtism.ExtismVal []) inputs.toArray(nInputs),
outputs,
data == null ? Optional.empty() : Optional.of(new JsonParser().parse(data.getString(0)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but I think we want this to be more flexible. For example, what if we want to pass in a database connection or a large block of memory as user data? I will see if I can come up with a solution today.

};

String str = "test";
Pointer hostUserData = new Memory(str.length() + 1);
Copy link
Contributor

@bhelx bhelx Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a slight misunderstanding about the user data here. it is not something the plugin typically accesses. You can think of it like applied arguments in a closure, or some properties in a class. In this case it's for the java code, and not the plugin code.

Here is an example of passing in a db connection: https://github.com/extism/extism-sqlite-host-function

here you could pass in things specific to the user. E.G. a session or something.

I'm taking a look at strategies to handle this in Java today and I'll send a PR into this one if i get something working.

Copy link
Contributor

@bhelx bhelx Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still experimenting with this but wanted to get your thoughts @Zwiterrion .

What would you think of HostFunction being generic over some type for host user data. The programmer could define the userdata type they want and stuff whatever they want in it. I think this might work by creating some class that extends the Structure class which we should be able to turn into a pointer then pull back out again when java gets control. Here is a diff outlining the idea:

diff --git a/java/src/main/java/org/extism/sdk/HostFunction.java b/java/src/main/java/org/extism/sdk/HostFunction.java
index f6a18ba..a1bbe9b 100644
--- a/java/src/main/java/org/extism/sdk/HostFunction.java
+++ b/java/src/main/java/org/extism/sdk/HostFunction.java
@@ -2,11 +2,13 @@ package org.extism.sdk;
 
 import com.google.gson.JsonParser;
 import com.sun.jna.Pointer;
+import com.sun.jna.Structure;
 
 import java.util.Arrays;
 import java.util.Optional;
 
-public class HostFunction {
+
+public class HostFunction<T extends HostUserData> {
 
     private final LibExtism.InternalExtismFunction callback;
 
@@ -18,9 +20,9 @@ public class HostFunction {
 
     public final LibExtism.ExtismValType[] returns;
 
-    public final Pointer userData;
+    public final HostUserData userData;
 
-    public HostFunction(String name, LibExtism.ExtismValType[] params, LibExtism.ExtismValType[] returns, ExtismFunction f, Pointer userData) {
+    public HostFunction(String name, LibExtism.ExtismValType[] params, LibExtism.ExtismValType[] returns, ExtismFunction f, HostUserData userData) {
 
         this.name = name;
         this.params = params;
@@ -35,11 +37,13 @@ public class HostFunction {
 
             LibExtism.ExtismVal[] outputs = (LibExtism.ExtismVal []) outs.toArray(nOutputs);
 
+            HostUserData d = new HostUserData(data);
+
             f.invoke(
                     new ExtismCurrentPlugin(currentPlugin),
                     (LibExtism.ExtismVal []) inputs.toArray(nInputs),
                     outputs,
-                    data == null ? Optional.empty() : Optional.of(new JsonParser().parse(data.getString(0)))
+                    d
             );
 
             for (LibExtism.ExtismVal output : outputs) {
@@ -54,7 +58,7 @@ public class HostFunction {
                 Arrays.stream(this.returns).mapToInt(r -> r.v).toArray(),
                 this.returns.length,
                 this.callback,
-                userData,
+                userData.getPointer(),
                 null
         );
     }
diff --git a/java/src/test/java/org/extism/sdk/PluginTests.java b/java/src/test/java/org/extism/sdk/PluginTests.java
index ee983d4..4069204 100644
--- a/java/src/test/java/org/extism/sdk/PluginTests.java
+++ b/java/src/test/java/org/extism/sdk/PluginTests.java
@@ -18,6 +18,17 @@ import static org.assertj.core.api.Assertions.assertThat;
 import static org.extism.sdk.TestWasmSources.CODE;
 import static org.junit.jupiter.api.Assertions.assertThrows;
 
+class MyUserData extends HostUserData {
+  private String data1;
+  private int data2;
+
+  public MyUserData(String data1, int data2) {
+    super();
+    this.data1 = data1;
+    this.data2 = data2;
+  }
+}
+
 public class PluginTests {
 
     // static {
@@ -125,9 +136,8 @@ public class PluginTests {
             returns[0].v.i64 = offs;
         };
 
-        String str = "test";
-        Pointer hostUserData = new Memory(str.length() + 1);
-        hostUserData.setString(0, str);
+
+        MyUserData hostUserData = new MyUserData("test", 42);
 
         HostFunction hello_world = new HostFunction(
                 "hello_world",

And the HostUserData object:

package org.extism.sdk;

import com.sun.jna.Pointer;
import com.sun.jna.Structure;

public class HostUserData extends Structure {
  public HostUserData(Pointer ptr) {
    super(ptr);
    read();
  }

  public HostUserData() {
    super();
  }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure how to make all this work just trying to express the idea right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With other languages, like python, we have something like a concept of handles: https://github.com/extism/extism/blob/main/python/extism/extism.py#LL205C30-L205C45

this allows us to pass a handle to python objects across the ffi boundary, then when we come back we can retried those python objects from the handle. We need the equivalent thing. I asked ChatGPT what it thinks the equivalent is and it said Callback https://java-native-access.github.io/jna/4.2.1/com/sun/jna/Callback.html

so perhaps we can put some data in a callback and pull it out when we get back into java world?

int nOutputs,
Pointer data
);
}

@Structure.FieldOrder({"t", "v"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, this is much simpler!

@Zwiterrion
Copy link
Contributor Author

Zwiterrion commented Feb 18, 2023

@bhelx That's a great idea, I follow your idea (with some tweaks) and it works. I can create a class that inherits from HostUserData with all the fields I want.

class MyUserData extends HostUserData {
      private String data1;
      private int data2;

      public MyUserData(String data1, int data2) {
          super();
          this.data1 = data1;
          this.data2 = data2;
      }
  }

  ExtismFunction helloWorldFunction = (ExtismFunction<MyUserData>) (plugin, params, returns, data) -> {
      System.out.println("Hello from Java Host Function!");
      System.out.println(String.format("Input string received from plugin, %s", plugin.inputString(params[0])));

      int offs = plugin.alloc(4);
      Pointer mem = plugin.memory();
      mem.write(offs, "test".getBytes(), 0, 4);
      returns[0].v.i64 = offs;

      System.out.println(String.format("Host user data, %s, %d", data.data1, data.data2));
  };

  HostFunction hello_world = new HostFunction<>(
          "hello_world",
          parametersTypes,
          resultsTypes,
          helloWorldFunction,
          new MyUserData("test", 2)
  );

The HostUserData class just needs to inherit from PointerType, not from structure, and it works!

@bhelx
Copy link
Contributor

bhelx commented Feb 22, 2023

The HostUserData class just needs to inherit from PointerType, not from structure, and it works!

That makes a lot more sense! Awesome that it works!

new MyUserData("test", 2)
);

HostFunction[] functions = {hello_world};
Copy link
Contributor

@bhelx bhelx Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the best way to do this in java as my java knowledge is quite outdated, but is it possible to make the generic type optional? So you could pass none or null to the userdata param? Something like this:

        HostFunction null_user_data = new HostFunction(
                "hello_world",
                parametersTypes,
                resultsTypes,
                helloWorldFunction,
                null
        );

There are definitely cases where a host function doesn't need a user data so it might be cumbersome to have someone make a type. Or maybe we provide some kind of none type for them if there is no generic way to handle it.

@bhelx
Copy link
Contributor

bhelx commented Feb 22, 2023

@Zwiterrion this is looking good and I'm not seeing any major issues right now. Is it ready to be taken out of draft? Let me know what is left to be done I can maybe pick up some tasks. Or if you think it's good enough at this point we can get it tested and merged.

@Zwiterrion
Copy link
Contributor Author

@bhelx I just replaced the user data by an Java optional. The user can pass an Optional.empty() or an Optional.of(<user defined type>). I added two more tests. The first to check the presence of the expected host function hello_world. The second to define a host function without user data.

@Zwiterrion Zwiterrion marked this pull request as ready for review February 24, 2023 09:01
@bhelx
Copy link
Contributor

bhelx commented Feb 28, 2023

@Zwiterrion going to review tomorrow. If all is well will get it merged this week! Let me know if there are any last minute changes you want to make.

@Zwiterrion
Copy link
Contributor Author

@bhelx I already work with host functions and saw that the same error in my code and in theshouldInvokeFunctionFromByteArrayWasmSource() test. I just changed the encoder/decoder which converts the manifest from bytes to json and now it is works. The data did not match any variant of untagged enum Wasm at line 8 column 3 disappeared.

@bhelx
Copy link
Contributor

bhelx commented Mar 2, 2023

Alright, just tested again. I think it's in a good spot to merge. If you have any more changes feel free to put in a follow up. We're going to try to release a 0.3 in the coming weeks so there is still some time to make any tweaks or bug fixes.

@bhelx bhelx merged commit 3e69cee into extism:main Mar 2, 2023
@bhelx
Copy link
Contributor

bhelx commented Mar 2, 2023

Thanks so much!

@Zwiterrion Zwiterrion deleted the feature/java-host-functions branch January 30, 2024 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants