feat(compiler): FFI and C# bindings for RVM, incl. host-await built-ins #672
kusha wants to merge 3 commits into microsoft:main
Allow hosts to register function names at compile time so that calls to those names emit HostAwait instructions directly, enabling natural syntax like fetch(x) instead of __builtin_host_await(x, "fetch").
- Add host_await_builtins map and register_host_await_builtin() to Compiler
- Validate arg_count == 1 and reject reserved __builtin_host_await name
- Extend determine_call_target() resolution: explicit > registered > user > builtin
- Both explicit and registered paths emit identical HostAwait bytecode
- Add compile_from_policy_with_host_await() entry point in rules.rs
- Extended test harness with HostAwaitBuiltinSpec and args assertion
- 9 YAML test cases: suspend/resume, run-to-completion, multiple names, queue, shadowing, object packing, arg_count rejection, reserved name rejection, standard builtin override
- Documentation: instruction-set.md, architecture.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Mark Birger <birgerm@yandex.ru>
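The registration checks and the resolution order listed in this commit message can be pictured with a small Rust sketch. It is not the code from this PR: the `CallTarget` enum, the `user_functions` field, the error type, and both method signatures are assumptions made for illustration. Only the names `host_await_builtins`, `register_host_await_builtin`, and `determine_call_target` and the stated rules come from the message.

```rust
use std::collections::HashMap;

/// Where a call site resolves to. Variant names are hypothetical.
#[derive(Debug, PartialEq, Eq)]
enum CallTarget {
    /// `__builtin_host_await(arg, "id")` written out in the policy.
    ExplicitHostAwait,
    /// A name the host registered at compile time.
    RegisteredHostAwait,
    /// A function defined by the policy itself.
    UserFunction,
    /// A standard built-in.
    Builtin,
}

const RESERVED_NAME: &str = "__builtin_host_await";

/// Simplified stand-in for the compiler; only the fields needed here.
#[derive(Default)]
struct Compiler {
    host_await_builtins: HashMap<String, usize>,
    user_functions: Vec<String>, // assumed representation
}

impl Compiler {
    /// Rejects the reserved name and any arg_count other than 1,
    /// returning an error instead of panicking.
    fn register_host_await_builtin(&mut self, name: &str, arg_count: usize) -> Result<(), String> {
        if name == RESERVED_NAME {
            return Err(format!("`{}` is reserved", RESERVED_NAME));
        }
        if arg_count != 1 {
            return Err(format!("`{}`: arg_count must be 1, got {}", name, arg_count));
        }
        self.host_await_builtins.insert(name.to_string(), arg_count);
        Ok(())
    }

    /// Resolution order: explicit > registered > user > builtin.
    fn determine_call_target(&self, name: &str) -> CallTarget {
        if name == RESERVED_NAME {
            CallTarget::ExplicitHostAwait
        } else if self.host_await_builtins.contains_key(name) {
            CallTarget::RegisteredHostAwait
        } else if self.user_functions.iter().any(|f| f == name) {
            CallTarget::UserFunction
        } else {
            CallTarget::Builtin
        }
    }
}
```

Under this ordering a registered name shadows a user function or standard built-in of the same name, which matches the "shadowing" and "standard builtin override" cases in the YAML tests.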
Expose registered host-await builtins, Program compilation, and RVM accessors through the FFI layer and C# bindings.
FFI (bindings/ffi/src/rvm.rs):
- compile_from_modules with host-await builtin registration
- set/get host-await responses, argument, identifier
- RegorusHostAwaitBuiltin C struct
C# (bindings/csharp/Regorus/):
- Program.CompileFromModules overloads with HostAwaitBuiltin[]
- Rvm.SetHostAwaitResponses, GetHostAwaitArgument, GetHostAwaitIdentifier
- HostAwaitBuiltin readonly struct, ExecutionMode enum
- ModuleMarshalling: PinnedUtf8Strings, PinnedHostAwaitBuiltins
Rust (src/rvm/vm/machine.rs):
- get_host_await_argument() and get_host_await_identifier() accessors
Docs: README examples, API.md reference, vm-runtime.md accessors
Tests: suspendable + run-to-completion C# scenarios
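As a rough illustration of the two VM accessors named above, the sketch below returns the identifier and argument only while the machine is suspended on a host await. The `ExecutionState` enum, the `Vm` struct, and the `Value` alias are hypothetical stand-ins; the real state representation and return types in `machine.rs` are not shown on this page.

```rust
/// Placeholder for the VM's value type (an assumption for this sketch).
type Value = String;

/// Hypothetical execution state; the real VM has its own representation.
enum ExecutionState {
    Running,
    HostAwaitSuspended { identifier: Value, argument: Value },
}

struct Vm {
    state: ExecutionState,
}

impl Vm {
    /// Identifier of the builtin that suspended the VM, if it is suspended.
    const fn get_host_await_identifier(&self) -> Option<&Value> {
        match &self.state {
            ExecutionState::HostAwaitSuspended { identifier, .. } => Some(identifier),
            _ => None,
        }
    }

    /// Argument passed to the suspended builtin, if the VM is suspended.
    const fn get_host_await_argument(&self) -> Option<&Value> {
        match &self.state {
            ExecutionState::HostAwaitSuspended { argument, .. } => Some(argument),
            _ => None,
        }
    }
}
```

Returning `Option` lets a host call either accessor in any state and treat `None` as "not suspended", rather than having to check the state first.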
fa62435 to 54fbb82
Pull request overview
This PR extends the RVM host-await feature end-to-end by wiring “registered host-await builtins” through the Rust compiler, VM accessors, the FFI layer, and the C# bindings, with accompanying docs and tests.
Changes:
- Add compiler support to register host-awaitable builtin names at compile time and emit `HostAwait` for natural function-call syntax.
- Expose host-await suspension inspection (identifier/argument) and run-to-completion response preloading via FFI + C# APIs.
- Add Rust YAML regression cases plus new C# binding tests and documentation updates.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/rvm/rego/mod.rs | Extends YAML harness to pass registered host-await builtins into compilation and (optionally) assert host-await arguments in suspendable mode. |
| tests/rvm/rego/cases/registered_host_await.yaml | Adds regression coverage for registered builtins (shadowing, queueing, arg-count validation, overrides). |
| src/rvm/vm/machine.rs | Adds VM accessors for host-await identifier/argument when suspended. |
| src/languages/rego/compiler/rules.rs | Adds compile_from_policy_with_host_await and wires registrations into compilation. |
| src/languages/rego/compiler/mod.rs | Adds storage + registration API for host-awaitable builtins (with validation). |
| src/languages/rego/compiler/function_calls.rs | Updates call resolution/emission to support registered host-await builtins and emit identifier literals. |
| docs/rvm/vm-runtime.md | Documents new VM host-await accessors. |
| docs/rvm/instruction-set.md | Documents registered host-await builtin behavior and resolution order. |
| docs/rvm/architecture.md | Describes explicit vs registered HostAwait emission paths. |
| bindings/ffi/src/rvm.rs | Adds FFI structs/APIs for compile-with-builtins, response preloading, and suspension JSON accessors. |
| bindings/csharp/Regorus/Rvm.cs | Adds C# wrappers for host-await inspection + response preloading. |
| bindings/csharp/Regorus/Program.cs | Adds CompileFromModules overload supporting host-await builtin registrations; refactors compile paths. |
| bindings/csharp/Regorus/NativeMethods.cs | Adds P/Invoke declarations and FFI struct for host-await builtins + new VM functions. |
| bindings/csharp/Regorus/ModuleMarshalling.cs | Generalizes UTF-8 string pinning helper and adds host-await builtin marshalling. |
| bindings/csharp/Regorus/Compiler.cs | Introduces public HostAwaitBuiltin struct for registration. |
| bindings/csharp/Regorus.Tests/RvmProgramTests.cs | Adds suspendable + run-to-completion tests for registered host-await builtins. |
| bindings/csharp/README.md | Adds usage examples for registered host-await in both execution modes. |
| bindings/csharp/API.md | Documents new/expanded Program/Rvm/HostAwaitBuiltin APIs. |
                values.push_back(val);
            }
        }

        guard.set_host_await_responses(core::iter::once((id_value, values)));
        Ok(())
regorus_rvm_set_host_await_responses calls RegoVM::set_host_await_responses, which clears the entire host-await response map. As a result, calling this function multiple times (e.g., to preload responses for multiple identifiers) will drop previously set responses for other identifiers, making run-to-completion with multiple different host-await builtins impossible via this API. Consider changing the VM API usage to clear/replace only the queue for the specified identifier (or provide an FFI that accepts multiple identifiers in one call).
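The difference the comment points at can be shown with plain collections. The sketch below models the response store as a map from identifier to queue; that shape, the `Responses` alias, and both function names are assumptions for illustration, not the VM's actual types.

```rust
use std::collections::{HashMap, VecDeque};

/// Assumed shape of the response store: identifier -> queued JSON responses.
type Responses = HashMap<String, VecDeque<String>>;

/// Replace-all semantics, as the review comment describes the current call:
/// every earlier identifier is dropped before the new entries are queued.
fn set_all(map: &mut Responses, entries: Vec<(String, Vec<String>)>) {
    map.clear();
    for (id, values) in entries {
        map.insert(id, VecDeque::from(values));
    }
}

/// Per-identifier semantics: only the queue for `id` is replaced,
/// so responses for other identifiers survive repeated calls.
fn set_for_identifier(map: &mut Responses, id: &str, values: Vec<String>) {
    map.insert(id.to_string(), VecDeque::from(values));
}
```

Calling `set_all` once per identifier leaves only the last identifier in the map, which is the failure mode described above. Either `set_for_identifier`, or a single `set_all` call carrying every identifier, avoids it.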
    /// Clears any previously configured responses, then queues the
    /// provided values for the given identifier.
    /// </summary>
    /// <param name="identifier">The builtin identifier.</param>
The XML doc says this method "clears any previously configured responses" but the underlying FFI currently clears all identifiers’ queues, not just the specified identifier. Either clarify in the docs that this overwrites the entire response set, or adjust the native API so callers can preload responses for multiple identifiers without losing earlier ones.
Suggested change:
    - /// Clears any previously configured responses, then queues the
    - /// provided values for the given identifier.
    - /// </summary>
    - /// <param name="identifier">The builtin identifier.</param>
    + /// Replaces the entire previously configured HostAwait response set,
    + /// then queues the provided values for the given identifier.
    + /// This method does not preserve responses queued for other identifiers.
    + /// </summary>
    + /// <param name="identifier">The builtin identifier whose responses will be queued after the reset.</param>
    buffer[i] = new RegorusHostAwaitBuiltin
    {
        name = namePinned.Pointer,
        arg_count = (UIntPtr)builtins[i].ArgCount,
Casting builtins[i].ArgCount (an int) to UIntPtr will reinterpret negative values as very large unsigned values. Consider validating ArgCount is non-negative (and within a reasonable range) before casting, or changing HostAwaitBuiltin.ArgCount to an unsigned type to match the FFI surface.
Suggested change:
    - buffer[i] = new RegorusHostAwaitBuiltin
    - {
    -     name = namePinned.Pointer,
    -     arg_count = (UIntPtr)builtins[i].ArgCount,
    + var argCount = builtins[i].ArgCount;
    + if (argCount < 0)
    + {
    +     throw new ArgumentOutOfRangeException(nameof(builtins), $"Host await builtin at index {i} has a negative {nameof(HostAwaitBuiltin.ArgCount)}.");
    + }
    + buffer[i] = new RegorusHostAwaitBuiltin
    + {
    +     name = namePinned.Pointer,
    +     arg_count = (UIntPtr)argCount,
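The same hazard exists on the Rust side of the boundary and is easy to demonstrate there. The following sketch is only an analogy to the C# cast discussed above: it contrasts a wrapping `as` conversion from a signed count with a checked conversion that rejects negative input. Both function names are made up for this example.

```rust
use std::convert::TryFrom;

/// Wrapping conversion: a negative count silently becomes a huge unsigned value.
fn unchecked_arg_count(arg_count: i32) -> usize {
    arg_count as usize
}

/// Checked conversion: negative counts are reported as an error instead.
fn checked_arg_count(arg_count: i32) -> Result<usize, String> {
    usize::try_from(arg_count)
        .map_err(|_| format!("arg_count must be non-negative, got {}", arg_count))
}
```

With these definitions `unchecked_arg_count(-1)` yields `usize::MAX`, while `checked_arg_count(-1)` returns an error, which is the behaviour the suggested `ArgumentOutOfRangeException` provides on the C# side.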
        result := kv_store(input.key, input.value)
    query: data.demo.result
    # Registration panics because arg_count must be 1.
Comment says registration "panics", but the compiler path returns a compilation error (want_error) rather than panicking. Adjust the comment to reflect that this is an expected compile-time failure.
Suggested change:
    - # Registration panics because arg_count must be 1.
    + # Registration is rejected because arg_count must be 1.
Summary
Exposes the registered host-await builtins, `Program` compilation, and RVM runtime accessors through the FFI layer and C# bindings. This is the companion to #667 (registered host-await builtins in the compiler/VM), making the feature usable from C# consumers.
Motivation
The previous PR added registered host-await builtins to the Rust compiler and VM. However, the FFI boundary and C# bindings only exposed the raw `__builtin_host_await` path. This PR bridges the gap so C# consumers can:
- Register host-await builtins through `Program.CompileFromModules`, which emits `HostAwait` instructions directly.
- Call `Rvm.SetHostAwaitResponses` to queue responses before execution.
- Call `Rvm.GetHostAwaitIdentifier()` and `Rvm.GetHostAwaitArgument()` to determine which builtin suspended and with what argument.
Changes
Rust VM (`src/rvm/vm/machine.rs`)
- `get_host_await_argument()` and `get_host_await_identifier()`: `const fn` accessors that return the argument/identifier when the VM is in a HostAwait-suspended state.
FFI (`bindings/ffi/src/rvm.rs`)
- `RegorusHostAwaitBuiltin`: `#[repr(C)]` struct for passing builtin registrations across FFI.
- `regorus_compile_from_modules`: extended with optional host-await builtin array.
- `regorus_rvm_set_host_await_responses`: pre-load response queue for run-to-completion mode.
- `regorus_rvm_get_host_await_argument` / `regorus_rvm_get_host_await_identifier`: JSON accessors for suspension state.
C# Bindings (`bindings/csharp/Regorus/`)
- `HostAwaitBuiltin`: readonly struct wrapping a builtin name and argument count.
- `ExecutionMode`: enum (`RunToCompletion`, `Suspendable`).
- `Program.CompileFromModules`: new overloads accepting `HostAwaitBuiltin[]`.
- `Rvm.SetHostAwaitResponses`: queue JSON responses for a given identifier.
- `Rvm.GetHostAwaitArgument()` / `GetHostAwaitIdentifier()`: read suspension state.
- `ModuleMarshalling`: `PinnedUtf8Strings` and `PinnedHostAwaitBuiltins` helpers for safe FFI marshalling.
Documentation
- `bindings/csharp/README.md`: added suspendable and run-to-completion examples with registered builtins.
- `bindings/csharp/API.md`: added `Program`, `Rvm`, `HostAwaitBuiltin`, `ExecutionMode` API reference.
- `docs/rvm/vm-runtime.md`: documented the new VM accessors.
Tests (`bindings/csharp/Regorus.Tests/RvmProgramTests.cs`)
- `RegisteredHostAwait_Suspendable_SuspendAndResume`: registers `get_account`, suspends, verifies identifier and argument, resumes with a response.
- `RegisteredHostAwait_RunToCompletion_WithPreloadedResponses`: registers `translate`, pre-loads a response, verifies end-to-end execution.
Notes
- Entry point marshalling reused for JSON strings: The existing `PinnedUtf8Strings` helper (originally written for pinning entry point string arrays across the FFI boundary) is repurposed to also marshal the JSON response strings in `SetHostAwaitResponses`. Same pattern: pin an array of null-terminated UTF-8 pointers, pass pointer + length to Rust.
- `CompileFromEngine` does not support host-await builtins: Only `CompileFromModules` accepts `HostAwaitBuiltin[]`. This is a scoping choice to keep the PR smaller; there is no technical constraint preventing it. The engine-based path can be extended in a follow-up if needed.
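For readers unfamiliar with the "pointer + length" pattern mentioned in the notes, here is a minimal sketch of what the receiving Rust side of such a call might look like. The field names `name` and `arg_count` follow the C# initializer quoted in the review. The field types, the `read_builtins` helper, and its error type are assumptions, not the FFI code from this PR.

```rust
use std::ffi::{c_char, CStr};
use std::slice;

/// C-compatible registration record; layout assumed from the C# side.
#[repr(C)]
pub struct RegorusHostAwaitBuiltin {
    /// Null-terminated UTF-8 name, pinned by the caller for the call's duration.
    pub name: *const c_char,
    /// Argument count (`UIntPtr` on the C# side).
    pub arg_count: usize,
}

/// Copies `len` records starting at `ptr` into owned Rust values.
///
/// # Safety
/// `ptr` must point to `len` valid records whose `name` pointers are
/// null-terminated strings that stay alive for the duration of the call.
unsafe fn read_builtins(
    ptr: *const RegorusHostAwaitBuiltin,
    len: usize,
) -> Result<Vec<(String, usize)>, String> {
    if len == 0 {
        return Ok(Vec::new()); // an empty array is allowed to carry a null pointer
    }
    if ptr.is_null() {
        return Err("builtin array pointer is null".to_string());
    }
    let items = unsafe { slice::from_raw_parts(ptr, len) };
    let mut out = Vec::with_capacity(len);
    for item in items {
        if item.name.is_null() {
            return Err("builtin name pointer is null".to_string());
        }
        // Validate UTF-8 and copy, so nothing borrows caller memory afterwards.
        let name = unsafe { CStr::from_ptr(item.name) }
            .to_str()
            .map_err(|e| e.to_string())?;
        out.push((name.to_owned(), item.arg_count));
    }
    Ok(out)
}
```

Copying the strings into owned values before returning means the caller only has to keep its buffers pinned for the length of the call, which is what a pinning helper on the managed side is there to guarantee.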