Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE] first draft of wasi:http@0.3.0 #101

Closed
wants to merge 5 commits into from

Conversation

dicej
Copy link
Collaborator

@dicej dicej commented Feb 14, 2024

This is intended to start a conversation about what we expect wasi:http@0.3.0 to look like once we have support for streams, futures, stream errors, and asynchronous lifts and lowers in the component model. In addition, I'm planning to implement a prototype of this interface using isyswasfa in the near future, so I want to make sure the design is at least plausible.

High level description:

  • The incoming-handler and outgoing-handler interfaces have been combined into a single handler interface.
  • The incoming- and outgoing- variations of request and response have been combined.
  • I've added a option<request-options> field to request since it would be awkward to leave it as a parameter of handler.handle (e.g. what would it mean to receive such a parameter for an incoming request or for a request passed from one component to the other without any use of the network?).
  • I've added a request-options-error (analogous to header-error) to distinguish between unsupported fields and immutable handles.

I do not intend for this to be merged into the main branch of the wasi-http repo (since we'll presumably be doing 0.2.x work there), but I would like for it to live somewhere we can iterate on it. A few options come to mind:

  • Keep it in my fork of the wasi-http repo for now
  • Make a long-lived branch in the official repo for it
  • Put it in another repo under the WebAssembly org (e.g. wasi-http-0.3.0)
  • Other ideas?

This is intended to start a conversation about what we expect wasi:http@0.3.0 to
look like once we have support for `stream`s, `future`s, stream `error`s, and
asynchronous lifts and lowers in the component model.  In addition, I'm planning
to implement a prototype of this interface using
[isyswasfa](https://github.com/dicej/isyswasfa) in the near future, so I want to
make sure the design is at least plausible.

High level description:

- The `incoming-handler` and `outgoing-handler` interfaces have been combined
  into a single `handler` interface.

- The `incoming-` and `outgoing-` variations of `request` and `response` have
  been combined.  Likewise for `body`.

- I've added a `option<request-options>` field to `request` since it would be
  awkward to leave it as a parameter of `handler.handle` (e.g. what would it
  mean to receive such a parameter for an incoming request or for a request
  passed from one component to the other without any use of the network?).

- I've added a `request-options-error` (analogous to `header-error`) to
  distinguish between unsupported fields and immutable handles.

I do not intend for this to be merged into the `main` branch of the `wasi-http`
repo (since we'll presumably be doing 0.2.x work there), but I _would_ like for
it to live somewhere we can iterate on it.  A few options come to mind:

- Keep it in my fork of the `wasi-http` repo for now
- Make a long-lived branch in the official repo for it
- Put it in another repo under the `WebAssembly` org (e.g. `wasi-http-0.3.0`)
- Other ideas?

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
wit/types.wit Outdated Show resolved Hide resolved
@lukewagner
Copy link
Member

Very exciting to see these simplifications, even if only hypothetical atm.

This moves the methods of that resource directly to `request` and `response`.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
wit/types.wit Outdated Show resolved Hide resolved
wit/types.wit Show resolved Hide resolved
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
Copy link
Member

@lukewagner lukewagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good so far!

wit/types.wit Outdated Show resolved Hide resolved
wit/types.wit Outdated Show resolved Hide resolved
wit/types.wit Outdated Show resolved Hide resolved
wit/types.wit Outdated Show resolved Hide resolved
wit/types.wit Outdated Show resolved Hide resolved
Signed-off-by: Joel Dice <joel.dice@fermyon.com>
- Rather than assume the existence of a built-in WIT `error` type, I've switched
  back to using `wasi:io/error/error` for the time being so I don't have to
  teach `wit-parser` and friends about the new built-in type (yet).  The
  built-in type is just a vague idea at this point, so it's too early to try to
  use it, plus it's unlikely to be very interesting compared to `stream` and
  `future`.

- Fix a cut-and-paste issue for `request`'s `options` method, which was both
  misnamed and misdocumented.

Signed-off-by: Joel Dice <joel.dice@fermyon.com>
@dicej
Copy link
Collaborator Author

dicej commented Feb 29, 2024

I've just finished adding a "mock" implementation of this draft (i.e. one suitable for testing but which doesn't use the network) to isyswasfa. The service and middleware examples illustrate the structure of a non-trivial application that includes body streaming and trailers. In other words, that's roughly what I would expect a real wasi:http@0.3.0 Rust app to look like modulo the naming of the "isyswasfa" pieces, which would instead be canon or WIT built-ins in the real version.

A few observations based on the implementation experience:

  • Now that there's a single wasi:http/handler interface, if middleware wants to both forward requests to another component and make outbound requests (or wants to be able to forward requests to multiple distinct components), we'll need some way for it to import multiple wasi:http/hander-like things. This has also come up in the context of virtualization, and AFAIK all proposed solutions are currently theoretical and require e.g. WIT changes.
  • It would be nice if we had a take-trailers: func() -> option<future<trailers>> method on both request and response, which would allow middleware to forward trailers to a new request/response without having to pipe the body as well (see the middleware example above where we're forced to do this). Either that, or allow mutating the headers of a request or response so we don't have to create a new one from scratch (which involves copying everything over, including trailers) when all we want to do is add/remove/edit headers. Thoughts or concerns?
  • In practice, it would be great to have WIT-level "hints" about whether a binding generator should generate async or sync bindings for each function. For example, getters and setters on resources probably shouldn't be async by default, whereas e.g. request.finish definitely should be. And when I say "default", I mean that the app developer can always override it and say "no, really, I want to make async calls to request.method" or "no really, I want to make sync calls to request.finish", or even "please generate both a sync and an async version for this imported function". The hints would only affect the defaults for the bindings generator without creating any sort of "function coloring" at the component type level.
    • Currently, I'm hard-coding which wasi:http functions get sync bindings and which ones get async bindings, which is awkward.

@lukewagner
Copy link
Member

Great work! That's really exciting.

we'll need some way for it to import multiple wasi:http/hander-like things

Yes, that makes sense. I think the WIT feature we need to add here is CM/#287, so you'd be able to write:

world middleware {
  import a: wasi:http/handler;
  import b: wasi:http/handler;
  ...
}

which would then produce a component-type:

(component
  (import "a" (implements "wasi:http/handler") (instance ...))
  (import "b" (implements "wasi:http/handler") (instance ...))
  ...
)

where (implements ...) is neither part of the name nor the type; it's just a semantic hint (that enables WIT roundtripping but also perhaps improved DX in various composition scenarios).

It would be nice if we had a take-trailers: func() -> option<future> method on both request and response, which would allow middleware to forward trailers to a new request/response without having to pipe the body as well (see the middleware example above where we're forced to do this).

So just to check my understanding: is the problem that, if you immediately call finish on the body, it doesn't give you an option<future<trailers>>, instead you get an async call that you can wait on to get an option<trailers> (no future)? I wonder if the fix here is to have some built-in way to get a future<T*> from any lowered async call to a function returning T*. I could imagine a bunch of different ABI variations, not sure which one is best, but it seems imminently doable.

Either that, or allow mutating the headers of a request or response so we don't have to create a new one from scratch (which involves copying everything over, including trailers) when all we want to do is add/remove/edit headers.

#103 is proposes adding a take-headers for extracting headers that become mutable (once they are disconnected from the request/response). Are you taking that addition into account and still it's insufficient? (Alternatively, it seems like we could just make all fields other than Content-Length mutable; the current rules are just trying to be regular, but Content-Length is the only one that really wants the current rules.)

In practice, it would be great to have WIT-level "hints" about whether a binding generator should generate async or sync bindings for each function.

Yes, agreed. I was thinking of adding a nonblocking function attribute (so, e.g., you could have foo: nonblocking func(s: string) -> string) that would not just hint, but actually enforce the invariant that the callee did not transitively call a blocking import. I think it's important for this not to be the default (b/c it really hinders what's possible w.r.t virtualizability) so that interface designers think about it. But the WIT syntactic sugar for: constructors, getters and setters could set it by default (b/c for many languages, if the bindings generator has to worry about async, it won't be able to actually generate the bindings you want it to generate because they are all synchronous).

@dicej
Copy link
Collaborator Author

dicej commented Feb 29, 2024

we'll need some way for it to import multiple wasi:http/hander-like things

Yes, that makes sense. I think the WIT feature we need to add here is CM/#287

Yeah, that looks great.

It would be nice if we had a take-trailers: func() -> option method on both request and response, which would allow middleware to forward trailers to a new request/response without having to pipe the body as well (see the middleware example above where we're forced to do this).

So just to check my understanding: is the problem that, if you immediately call finish on the body, it doesn't give you an option<future<trailers>>, instead you get an async call that you can wait on to get an option<trailers> (no future)? I wonder if the fix here is to have some built-in way to get a future<T*> from any lowered async call to a function returning T*. I could imagine a bunch of different ABI variations, not sure which one is best, but it seems imminently doable.

In the current draft, the finish method is on request/response, and it is static and consumes its argument, which means any child handles must have already been dropped, including the body. So if I want to take a request and make a new request that's exactly the same as the old one (same body, trailers, etc.) but with different headers, I must spawn a task that copies the body buffer-by-buffer, then call finish, get the trailers, and send them on to the new request. In other words, I can't just take the body from the first request and give it to the second request and then call finish on the first request, because that will trap. I'm not sure how being able to get a future<T> from a lowered async call would help in this case -- would the returned future<T> somehow own the request passed to finish and keep it alive so that the body can stay alive, too?

Either that, or allow mutating the headers of a request or response so we don't have to create a new one from scratch (which involves copying everything over, including trailers) when all we want to do is add/remove/edit headers.

#103 is proposes adding a take-headers for extracting headers that become mutable (once they are disconnected from the request/response). Are you taking that addition into account and still it's insufficient? (Alternatively, it seems like we could just make all fields other than Content-Length mutable; the current rules are just trying to be regular, but Content-Length is the only one that really wants the current rules.)

If I want to pass the original request to the imported wasi:http/handler instead of creating a brand new request and passing that, the only way I can modify headers is if I get mutable access to the headers of the original request (i.e. not a disconnected version), so yes, I think we'd need the mutable-except-for-content-length approach.

In practice, it would be great to have WIT-level "hints" about whether a binding generator should generate async or sync bindings for each function.

Yes, agreed. I was thinking of adding a nonblocking function attribute (so, e.g., you could have foo: nonblocking func(s: string) -> string) that would not just hint, but actually enforce the invariant that the callee did not transitively call a blocking import. I think it's important for this not to be the default (b/c it really hinders what's possible w.r.t virtualizability) so that interface designers think about it. But the WIT syntactic sugar for: constructors, getters and setters could set it by default (b/c for many languages, if the bindings generator has to worry about async, it won't be able to actually generate the bindings you want it to generate because they are all synchronous).

👍

@lukewagner
Copy link
Member

[...] so yes, I think we'd need the mutable-except-for-content-length approach.

Good point; I think you're right. This isn't clear-cut in 0.2 because you end up being forced to create a new resource anyways, but in 0.3, it seems clear.

So if I want to take a request and make a new request that's exactly the same as the old one (same body, trailers, etc.) but with different headers

I think the mutable-except-for-content-length design fixes this exact issue, but if we tweak it slightly (maybe I want to create new headers but use the same body), it does seem like a design flaw if you have to wait to copy over the body stream just so you can forward the trailers. I know I proposed removing the intermediate body resource type, but I think this is a good argument for adding it back again. Because then you could have consume: func() -> result<body> and then in one simple transfer of ownership, you could pass it on to a new request or response. Does that sound right to you?

@dicej
Copy link
Collaborator Author

dicej commented Feb 29, 2024

I know I proposed removing the intermediate body resource type, but I think this is a good argument for adding it back again. Because then you could have consume: func() -> result<body> and then in one simple transfer of ownership, you could pass it on to a new request or response. Does that sound right to you?

Yeah, that should work. If we do that and leave the finish static methods as they are, then we'd also need the "async-call-to-future<T>" feature so we can call finish without awaiting it and pass the call as a future<trailers> to the constructor for the new request or response.

@lukewagner
Copy link
Member

Agreed!

@dicej
Copy link
Collaborator Author

dicej commented Mar 6, 2024

Does this look plausible?

diff --git a/wit/deps/http/types.wit b/wit/deps/http/types.wit
index c859a38..9793aca 100644
--- a/wit/deps/http/types.wit
+++ b/wit/deps/http/types.wit
@@ -225,6 +225,43 @@ interface types {
   /// Trailers is an alias for Fields.
   type trailers = fields;
 
+  /// Represents an HTTP Request or Response's Body.
+  ///
+  /// A body has both its contents - a stream of bytes - and a (possibly empty)
+  /// set of trailers, indicating that the full contents of the body have been
+  /// received. This resource represents the contents as a `stream<u8>` and the
+  /// delivery of trailers as a `trailers`, and ensures that the user of this
+  /// interface may only be consuming either the body contents or waiting on
+  /// trailers at any given time.
+  resource body {
+  
+    /// Construct a new `body` with the specified stream and trailers.
+    constructor(
+      %stream: stream<u8>,
+      trailers: option<future<trailers>>
+    );
+  
+    /// Returns the contents of the body, as a stream of bytes.
+    ///
+    /// The returned `stream` is a child: it must be dropped before the parent
+    /// `body` is dropped, or consumed by `body.finish`.
+    ///
+    /// This invariant ensures that the implementation can determine whether the
+    /// user is consuming the contents of the body, waiting on the trailers to
+    /// be ready, or neither. This allows for network backpressure is to be
+    /// applied when the user is consuming the body, and for that backpressure
+    /// to not inhibit delivery of the trailers if the user does not read the
+    /// entire body.
+    ///
+    /// This function may be called multiple times as long as any `stream`s
+    /// returned by previous calls have been dropped first.
+    %stream: func() -> result<stream<u8>>;
+
+    /// Takes ownership of `body`, and returns a `trailers`.  This function will
+    /// trap if a `stream` child is still alive.
+    finish: static func(this: body) -> result<option<trailers>, error-code>;
+  }
+  
   /// Represents an HTTP Request.
   resource request {
 
@@ -245,8 +282,7 @@ interface types {
     /// to reject invalid constructions of `request`.
     constructor(
       headers: headers,
-      body: stream<u8>,
-      trailers: option<future<trailers>>,
+      body: body,
       options: option<request-options>
     );
 
@@ -302,25 +338,8 @@ interface types {
     /// component by e.g. `handler.handle`.
     headers: func() -> headers;
 
-    /// Returns the contents of the body, as a stream of bytes.
-    ///
-    /// The returned `stream` is a child: it must be dropped before the parent
-    /// `request` is dropped, or consumed by `request.finish`.
-    ///
-    /// This invariant ensures that the implementation can determine whether the
-    /// user is consuming the contents of the body, waiting on the trailers to
-    /// be ready, or neither. This allows for network backpressure is to be
-    /// applied when the user is consuming the body, and for that backpressure
-    /// to not inhibit delivery of the trailers if the user does not read the
-    /// entire body.
-    ///
-    /// This function may be called multiple times as long as any `stream`s
-    /// returned by previous calls have been dropped first.
-    body: func() -> result<stream<u8>>;
-
-    /// Takes ownership of `request`, and returns a `trailers`.  This function will
-    /// trap if a `stream` child is still alive.
-    finish: static func(this: request) -> result<option<trailers>, error-code>;
+    /// Takes ownership of the `request` and returns the `body`.
+    consume: static func(this: request) -> body;
   }
 
   /// Parameters for making an HTTP Request. Each of these parameters is
@@ -375,8 +394,7 @@ interface types {
     ///   for the Response.
     constructor(
       headers: headers,
-      body: stream<u8>,
-      trailers: option<future<trailers>>
+      body: body,
     );
 
     /// Get the HTTP Status Code for the Response.
@@ -396,24 +414,7 @@ interface types {
     /// component by e.g. `handler.handle`.
     headers: func() -> headers;
 
-    /// Returns the contents of the body, as a stream of bytes.
-    ///
-    /// The returned `stream` is a child: it must be dropped before the parent
-    /// `response` is dropped, or consumed by `response.finish`.
-    ///
-    /// This invariant ensures that the implementation can determine whether the
-    /// user is consuming the contents of the body, waiting on the trailers to
-    /// be ready, or neither. This allows for network backpressure is to be
-    /// applied when the user is consuming the body, and for that backpressure
-    /// to not inhibit delivery of the trailers if the user does not read the
-    /// entire body.
-    ///
-    /// This function may be called multiple times as long as any `stream`s
-    /// returned by previous calls have been dropped first.
-    body: func() -> result<stream<u8>>;
-
-    /// Takes ownership of `response`, and returns a `trailers`.  This function will
-    /// trap if a `stream` child is still alive.
-    finish: static func(this: response) -> result<option<trailers>, error-code>;
+    /// Takes ownership of the `response` and returns the `body`.
+    consume: static func(this: response) -> body;
   }
 }

@lukewagner
Copy link
Member

@dicej Looks generally in the right direction! A few thoughts:

First, the comments about the body.stream and body.finish functions are correct, but they should be automatically implied simply by the C-M rules of stream (and thus not something that has to be manually mentioned for each WASI interface or manually implemented (it would be enforced as part of the C-M bindings). In particular, the call to body.stream will not return until the returned stream is dropped, and the ongoing body.stream call will hold onto the borrow of the body until then, and thus it won't be possible to drop or finish the body while there is any active child stream. (If you want to keep the comments just so new-comers understand the rules, that's totally fine; maybe just add a note that this is an automatic consequence of the C-M.)

Next, for the case where I want to simply read the body bytes of a request or response into my component, I wonder if, instead of having static consume functions that take ownership of the parent request/response, what if we instead had body methods that return a child handle to the body resource (just like the headers method returns a child handle) with the same child rules (you have to drop any handle to the body before dropping or transferring ownership of the parent request/response). This allows a component A to be given a request, read a few bytes of the body, close the stream, drop the body handle, then transfer ownership of the request to component B which can do likewise. This scenario might arise concretely with, e.g., request batching.

For the case where I want to extract the body from one request/response to use it to zero-copy construct another request/response, I wonder if we should also consider the headers and the into-parts idea Adam initially proposed in #103 (that consumes the parent request/response and returns all its contents). Because you can just call getters to access the value fields (like the method), I would think into-parts would only return the owned child resources (headers and body). What's neat is that the child rules ensure that we don't have to worry about dangling child headerss, bodys and streams.

WDYT?

@dicej
Copy link
Collaborator Author

dicej commented Mar 6, 2024

All sounds good to me.

Now I'm thinking we should aim to get this merged into the repo so we can iterate on it properly (without interfering with the 0.2.x work). Should I add a wit-0.3.0 directory and put it there, perhaps?

@lukewagner
Copy link
Member

Sounds good to me; maybe wit-0.3.0-draft? (Btw, I could be wrong, but I don't think we need dates in the WIT package versions, just -draft would be fine (unless you want them for your own experimental prototype purposes.)

@dicej
Copy link
Collaborator Author

dicej commented Mar 7, 2024

I've opened a new PR that puts the draft in its own directory: #106

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants