Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for supporting JavaScript method inheritance using Rust traits #3

Closed

Conversation

Pauan
Copy link

@Pauan Pauan commented Aug 8, 2018

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for writing up this RFC @Pauan! It is so helpful to have an example/strawman proposal to guide the discussion and get things started.


General questions for everybody:

Do we want these INode and IEventTarget etc traits to be object safe? Having generic methods breaks that. I think we can get away without object safety, since I don't think trait objects for these things will be generally useful (see inline comment below).

For the WebIDL frontend, are we expecting to stop emitting normal methods and only emit trait methods? If not, will we break method resolution, or will it and type inference do the Right Thing? I'd have to investigate further by playing around with some code, but maybe someone else already has a good intuition for the behavior of method resolution and TI.

Do we expect "normal" users using the proc-macro frontend to use this stuff? Does it make sense to limit the scope to only the WebIDL frontend?


Thanks again @Pauan for writing this up!

As an example, it should be possible to use the
[`appendChild`](https://developer.mozilla.org/en-US/docs/Web/API/Node/appendChild)
method with all of the classes which inherit from `Node` (e.g. `HTMLElement`,
`HTMLDivElement`, `SVGElement`, and many more).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to clarify whether the goal is

  1. to enable writing generic functions like fn takes_element<E: IElement>(elem: E) {...}, or
  2. to enable ergonomically calling inherited methods without upcasting with Into or From (e.g. span.appendChild(...) rather than let element: Element = span.into(); element.appendChild(...);), or
  3. both.

I personally think that (1) should be a non goal, since there is no benefit for user code to do fn takes_element<E: IElement>(elem: E) {...} over fn takes_element(elem: Element) {...} other than potentially saving an into() at call sites. The downside is that we will get monomorphizations that don't pay for their extra code size bloat with better optimization potential because all the calls are going to be the same anyways and aren't inlinable (since they are FFI calls).

Nor does it make sense to do fn takes_element(elem: &IElement) {...} 99.99% of the time, since our investigation into overridden methods showed that it doesn't happen often and the few times it does, you probably want the original method anyways. It also implies indirect calls, and potentially boxing (if taking a Box<IElement>).

Therefore, I think we should focus on (2) and we should better reflect that in this motivation section.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your assessment that we should focus on (2).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we accomplish 2, doesn't that necessarily mean that 1 will be supported as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think that (1) should be a non goal, since there is no benefit for user code to do fn takes_element<E: IElement>(elem: E) {...} over fn takes_element(elem: Element) {...} other than potentially saving an into() at call sites.

I thought about this some more, and doesn't that statement completely invalidate this RFC? Since you said that writing <E: IElement> "only saves an into() at call sites", that's also true with the current status quo, since people can always write html_element.into().append_child(node), right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I would like to make an argument in favor of users using generic trait bounds.

I wrote a DOM library called dominator. It contains a DomBuilder struct.

This struct is intentionally designed so that it can work with a wide variety of HTML elements, with correct static typing.

For example, it can work with Node, HTMLElement, SVGElement, HTMLInputElement, etc. That is accomplished by having it take a generic A type argument.

Then it contains a bunch of methods which are bounded by traits:

IEventTarget
INode
IElement
IHtmlElement

The end result is that users can do things like this:

html!("input" => HtmlInputElement, {
    .attribute("foo", "bar")
    .style("qux", "corge")
})

And that macro will be expanded into this:

{
    let x: DomBuilder<HtmlInputElement> = DomBuilder::new(create_element_ns("input", HTML_NAMESPACE));
    x.attribute("foo", "bar")
     .style("qux", "corge")
     .into_dom()
}

Most of the details are unimportant, but the important thing is that you can use the html! macro with a wide variety of DOM types, and then call methods (such as attribute or style) on those DOM types, and everything will be fully statically typed: you will get a compile-error if you call the wrong method on the wrong DOM type.

The generic bounds like <A: IElement> are needed to allow the same method to be called on multiple different DOM types. And code bloat isn't a problem because I've carefully made sure that all of the methods are small and inlined.

I think perhaps you are underestimating how useful generic code is, even for external stuff like the DOM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding html_element.into().append_child(...), this won't work because the rust compiler doesn't know what you are trying to convert into. This is always a downside with generic return values. You would have to expand it as

let node: Node = html_element.into();
node.append_child(...)

As for generic arguments, I agree with you that they are useful and far more ergonomic, but they also fit into a bigger picture of function overloading (along with optional arguments and union type arguments). While I would probably have gone down a path much more similarly to what you are proposing, I think there is a general desire to keep the bindings pretty low level, and so far that has meant avoiding generics. This is definitely something that I want to revisit in the future in the form of a unified function overloading proposal, but I don't think that it should be a blocking concern at the moment.

#[wasm_bindgen]
extern {
#[wasm_bindgen(method, structural, js_name = addEventListener)]
fn add_event_listener(this: &JsValue, name: &str, listener: &Closure<FnMut(Event)>, use_capture: bool);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than generating this: &JsValue receivers, I think we would want to use the name of the super class that introduces the method. I fear that using JsValue might break host bindings support.

The structural attribute will certainly break host bindings support, as it forces a JS shim between the wasm and the native DOM function. We shouldn't automatically add that attribute. Is there a reason you added it here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in the "Unresolved Questions" section, I wasn't sure how to support non-structural uses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe most of the issues you are having with structural would be handled by placing the methods on the corresponding interface type rather than on JsValue (as per @fitzgen's suggestion).

It is absolutely essential that we don't use structural everywhere.

fn append_child<A: INode>(this: &JsValue, child: A) -> A;

#[wasm_bindgen(method, structural, js_name = removeChild)]
fn remove_child<A: INode>(this: &JsValue, child: A) -> A;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think extern functions can be generic. In the general FFI case, it doesn't make sense because a generic function is actually many functions, one for each type that the generic is instantiated with, and the extern function can't know all the the types that it will be instantiated as by some downstream user. Rust even disallow generics in extern functions:

error[E0044]: foreign items may not have type parameters
 --> src/lib.rs:2:5
  |
2 |     fn gen<A: Into<usize>>(a: A);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ can't have type parameters
  |
  = help: use specialization instead of type parameters by replacing them with concrete types like `u32`

To make this kind of generics sugar work, we would want to as_ref() to concrete types before the extern function call, something like this:

#[wasm_bindgen]
extern {
    #[wasm_bindgen(method, js_name = appendChild)]
    fn append_child(this: &Node, child: &Node);
    // ...
}

pub trait INode: AsRef<Node> + IEventTarget {
    fn append_child<N: INode>(&self, child: &N) {
        append_child(self.as_ref(), child.as_ref());
    }
    // ...
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also relates a bit to method overloading (which occurs at least for constructors), in that the generic argument here is really serving as a way to handle multiple signatures. I would say that any generic bindings should be discussed as a future extension to the bindings. So for now I would use the following signature which should be forward-compatible with @fitzgen suggestion:

pub trait INode: AsRef<Node> + IEventTarget {
    fn append_child(&self, child: &Node) {
        // ...
    }
    // ...
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think extern functions can be generic.

To be clear, the generic is on an extern which is annotated with wasm_bindgen, so wasm-bindgen could replace the generic extern with a non-generic extern.

For example, this:

#[wasm_bindgen]
extern {
    fn foo<A: Foo>(A) -> A;
}

...can be compiled into something like this:

#[wasm_bindgen]
extern {
    fn __generated_foo(JsValue) -> JsValue;
}

#[inline]
fn foo<A: Foo>(value: A) -> A {
    __generated_foo(value.into()).unchecked_into()
}

(Where Foo extends from JsCast)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ohanar The purpose of the generic with append_child is to retain type information, because the append_child method always returns the argument you pass in, so if you pass in an HTMLElement you get back an HTMLElement.

Having it instead always return a Node causes a loss of type information, which can only be re-obtained by using dyn_into, with some runtime cost and loss of ergonomics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So two things here.

  • First is that there is no way to know that what more specific return type you have on a method if you have a more specific input type by examining the WebIDL. Consider the following snippet:
interface Node : EventTarget {
    Node appendChild(Node node);
    ...
};

There is no way to know that if you actually pass an HTMLElement into the appendChild method you get back an HTMLElement. This could be addressed by adding our own custom attribute to the operation (e.g. WbgReturnType=typeof(node)), but that would require us to comb through many files and add such annotation (in addition to the work involved in codegen).

  • Secondly, the return value for this particular method is actually pretty useless. Interface objects are passed by (shared) reference, not by value. In the case that you pass in an HTMLElement, you would still have complete access to the original object after the method call, meaning that you don't need the return value to have access to that particular HTMLElement.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First is that there is no way to know that what more specific return type you have on a method if you have a more specific input type by examining the WebIDL. Consider the following snippet:

That's a really good point! It's true that in practice the appendChild method always returns its argument, but that isn't actually specified in the WebIDL.

That's a good argument for avoiding the generics for now, since in my opinion the Rust bindings should be close to the WebIDL bindings.

Secondly, the return value for this particular method is actually pretty useless.

Yes, but according to the WebIDL it does return something, so unless you add in a custom attribute it will generate a return value automatically.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but according to the WebIDL it does return something, so unless you add in a custom attribute it will generate a return value automatically.

Completely agree, it should still return a Node, and we shouldn't change that (regardless of how useful it is).

Then the trait methods are rewritten so that they call the extern functions (using `as_ref()` to convert `&self` to a `&JsValue`),
and they are marked as `#[inline]`.

Now that we have defined the desired traits and methods, we can add the methods to a type like this:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting users to write these impls by hand? Or is this automatically generated? If so, how does wasm-bindgen know when to emit an impl? Which new or existing attributes trigger that, and on which items do these attributes need to appear?

Copy link
Author

@Pauan Pauan Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I think just supporting WebIDL is enough, but eventually users will be writing the traits (and impls) by hand (which is why I chose a convenient syntax for it).

Even though WebIDL is by far the biggest usage of class-inheritance, there are plenty of other uses of class-based inheritance in JavaScript.

If we want to instead focus solely on WebIDL for now, I can change the RFC to be more focused on that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for a new attribute, I had considered that, but the default impls make it so easy to apply the traits, so I figured an attribute was overkill.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is fine to include mock syntax for what macros might look like in the future, but I think we should definitely focus on WebIDL, and make it clear that macro support is not part of this RFC, merely an example of how it could potentially be extended in the future.

would have to duplicate all of the `EventTarget`, `Node`, and `Element` methods on `HtmlElement`, etc.

This is an incredible amount of duplication, so it's only really feasible for a tool which automatically
generates the methods (such as the WebIDL generator). Trying to do this duplication by hand is unmaintainable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, it means we will end up emitting larger .wasm binaries that have more imports, and larger JS glue that grabs what ultimately end up being the same functions many times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessarily, if most of the implementations are along the lines of the following:

impl HtmlElement {
    #[inline]
    fn append_child(&self, node: &Node) -> Node {
        let this: &Node = self.as_ref();
        this.append_child(node)
    }
}

And have the real implementation lie in a Node impl block.


```rust
// Now all of the methods work!
use web_sys::traits::*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a second .rs file emitted by the WebIDL frontend?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could also be an inline module, e.g.:

// ... all the bindings that we currently generate + trait definitions
pub mod traits {
    pub use super::{
        INode,
        IHtmlElement,
        // ...
    };
}

It would be easy enough to track each of the interfaces to add such an inline module at the end of the generated module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of whether it's a separate .rs or an inline module, either way I expect it to be auto-generated by the WebIDL generator.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All I was saying is that it would be pretty straightforward to implement an extra traits module.

@ohanar
Copy link
Member

ohanar commented Aug 10, 2018

Regarding @fitzgen's questions.

  • I don't think we really need object safety. However, unless I'm forgetting something, the main thing prevents object safety are static methods/attributes, and I don't really think there is anything gained by putting the static methods/attributes in a trait over directly on the type.
  • Regardless of what the type system does, I think we should stop emitting normal methods for the sake of consistency (at least for non-static ones, see previous bullet).
  • I would say that for the moment we should only worry about WebIDL. We could likely add macro support in the future if there seems to be demand, but I don't think it is that important at the moment.


## Mixins

The above technique works great for class-based inheritance, but it can also be used to support mixins.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The WebIDL spec explicitly says that mixins are not meant to be exposed in language bindings, but are merely an editorial tool. In particular, for the same reason as just stuffing everything onto JsValues, this could be incompatible with the host-bindings proposal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the methods are a part of a mixin and not a class, how will that work with the host-bindings proposal?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixins are a shorthand way of including methods and/or attributes in multiple interfaces.

Consider the webgl vs webgl2 contexts. Many of the methods are the same, but there is no real relation between them from a class hierarchy sense. To simplify spec specification, you can use place all the redundant declarations in a mixin and then include it for each of the contexts. This is the same as just copying and pasting all of the redundant declarations into each of their interface declarations.

We already support mixins (and partial interfaces, which are similar in concept) in our webidl crate, so I would really just remove the section on mixins from the RFC.

impl INode for Node {}
```

Because the traits contain default implementations, this is all that is needed to make it work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the main difference if we want to stop emitting normal methods for interface objects, and only expose the methods on traits would be here. The default implementations would be more or less the same (modulo @fitzgen comment about how they shouldn't be on JsValues), but the impl for an interface object for the corresponding interface trait would contain the real implementation. E.g.

impl IEventTarget for Node {} // Use default implementation
impl INode for Node {
    fn append_child(&self, node: &Node) -> Node {
        // the messy generated extern stuff would be in here.
    }
    // ...
}

Copy link
Author

@Pauan Pauan Aug 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does IEventTarget have default impls, but INode doesn't?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to write up something a bit more clear on this tomorrow.

@ohanar
Copy link
Member

ohanar commented Aug 10, 2018

Got some time to write this up while proctoring an exam.

Consider the following snippet of WebIDL:

interface EventTarget {
    void addEventListener(DOMString type, EventListener listener);
};
interface Node : EventTarget {
    Node appendChild(Node node);
};

Right now we more or less get the following generated API (not complete, but most of what is relevant):

pub struct EventTarget { ... }
impl EventTarget {
    pub fn add_event_listener(&self, type_: &str, listener: &EventListener) {
        // ffi mess goes here
    }
}
pub struct Node { ... }
impl Node {
    pub fn append_child(&self, node: &Node) -> Node {
        // ffi mess goes here
    }
}
impl AsRef<EventTarget> for Node { ... }

As I understand it this proposal as currently written would change this to the following (ignoring generic stuff, which seems to be going away):

pub struct EventTarget { ... }
pub trait IEventTarget: AsRef<JsValue> {
    #[inline]
    pub fn add_event_listener(&self, type_: &str, listener: &EventListener) {
        let this: &JsValue = self.as_ref();
        this.add_event_listener(type_, listener)
    }
}
impl IEventTarget for EventTarget {}
impl JsValue { // How is this being done in the web-sys crate? Would break orphan rules.
    fn add_event_listener(&self, type_: &str, listener: &EventListener) {
         // structural ffi mess here
    }
}
pub struct Node { ... }
pub trait INode: IEventTarget + AsRef<JsValue> {
    #[inline]
    fn append_child(&self, node: &Node) -> Node {
        let this: &JsValue = self.as_ref();
        this.append_child(node)
    }
}
impl INode for Node {}
impl JsValue { // Again, orphan issues.
    fn append_child(&self, node: &Node) -> Node {
        // structural ffi mess here
    }
}
impl AsRef<EventTarget> for Node { ... }

Beyond the issues with structural, I just realized while writing this that there would likely be orphan rule issues.

I would propose the following output (largely based on the current proposal, but should address both the structural issues and the orphan rule issues):

pub struct EventTarget { ... }
pub trait IEventTarget: AsRef<EventTarget> {
    #[inline]
    pub fn add_event_listener(&self, type_: &str, listener: &EventListener) {
        let this: &EventTarget = self.as_ref();
        this.add_event_listener(type_, listener)
    }
}
impl IEventTarget for EventTarget {
    fn add_event_listener(&self, type_: &str, listener: &EventListener) {
         // ffi mess here
    }
}
pub struct Node { ... }
pub trait INode: IEventTarget + AsRef<Node> {
    #[inline]
    fn append_child(&self, node: &Node) -> Node {
        let this: &Node = self.as_ref();
        this.append_child(node)
    }
}
impl INode for Node {
    fn append_child(&self, node: &Node) -> Node {
        // ffi mess here
    }
}
impl IEventTarget for Node {}
impl AsRef<EventTarget> for Node { ... }

The main difference here is that we only cast up to where a method is introduced, rather than all the way up to JsValue and we override the default implementations for a type corresponding to a trait (rather than stuffing the ffi stuff into JsValues).

@Pauan
Copy link
Author

Pauan commented Aug 12, 2018

@ohanar How is that better than this?

pub struct EventTarget {
    fn add_event_listener(&self, type_: &str, listener: &EventListener) {
         // ffi mess here
    }
}

pub trait IEventTarget: AsRef<EventTarget> {
    #[inline]
    fn add_event_listener(&self, type_: &str, listener: &EventListener) {
        let this: &EventTarget = self.as_ref();
        this.add_event_listener(type_, listener)
    }
}

impl IEventTarget for EventTarget {}
pub struct Node {
    fn append_child(&self, node: &Node) -> Node {
        // ffi mess here
    }
}

pub trait INode: IEventTarget + AsRef<Node> {
    #[inline]
    fn append_child(&self, node: &Node) -> Node {
        let this: &Node = self.as_ref();
        this.append_child(node)
    }
}

impl IEventTarget for Node {}
impl INode for Node {}

P.S. This RFC proposes to use extern functions, not methods on JsValue, so there's no issue with orphan rules.

@Pauan
Copy link
Author

Pauan commented Aug 12, 2018

By the way, I am working on a rewrite of this RFC (taking into account all of the feedback), but I've been a bit busy with Monster Hunter World. I plan to have it done by tomorrow.

@ohanar
Copy link
Member

ohanar commented Aug 12, 2018

@Pauan That works equally well, assuming rust will choose the normal method over the trait method when they both have the same name. You could also have the normal method have a prefix or suffix (e.g. append_child_impl), which would avoid any issues with name resolution.

@Pauan
Copy link
Author

Pauan commented Aug 13, 2018

I rewrote the RFC to take into account all of the feedback.

@ohanar
Copy link
Member

ohanar commented Aug 13, 2018

Thanks @Pauan! This looks pretty good to me. One thing that I would make sure is very clear in the RFC, is that this is a breaking change compared to the current impromptu API, but that it is also easy for a user to fix (by adding a use web_sys::traits::*;).

Copy link
Member

@ohanar ohanar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still the unresolved question of what we do with static methods and constants. I would say that these should stick on the type, rather than on the trait, with the main motivation being that the new static method (i.e. the bare constructor) will likely have ambiguity issues if placed on the traits. (Also, there isn't really anything to be gained by putting them on the traits.)

};
```

Based upon that WebIDL spec, the WebIDL generator will generate this Rust code:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nitpick, the WebIDL frontend doesn't actual use the wasm-bindgen macros, it directly generates code. We often think about what is equivalent between the two, but they fundamentally both go directly to codegen. E.g. we would say that the snippet above might de-sugar into the extern block below, but it doesn't actually ever create the extern block below, it creates the same post-processed output of the extern block.

```

However, due to the potential for code bloat, this sort of pattern is *not* encouraged
by this RFC.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make clear that the user should take a Node argument instead?

fn remove_event_listener(this: &EventTarget, type: &str, callback: &Closure<FnMut(Event)>, options: bool);

#[wasm_bindgen(method, js_name = dispatchEvent)]
fn dispatch_event(this: &EventTarget, event: Event);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have a bool return.

#[wasm_bindgen]
extern {
#[wasm_bindgen(method, js_name = addEventListener)]
fn add_event_listener(this: &EventTarget, type: &str, callback: &Closure<FnMut(Event)>, options: bool);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly what will be generated due to recently merged support for union arguments. (Regardless, this isn't yet supported since callback interfaces don't have support yet.)

Maybe use slightly a slightly simplified WebIDL snippet for this RFC (the full thing isn't really needed to give an example).


3. This trait has an `AsRef<Type>` constraint

4. If the WebIDL interface extends from another interface, then that is also added as a constraint (e.g. `INode` inherits from `IEventTarget`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make clear that you only put direct inheritance as a constraint? For example IElement would not have the direct constraint of IEventTarget + INode, but only INode (the IEventTarget constraint would be implied).

other than WebIDL, because of the maintenance burden.

And because class-based inheritance is used outside of WebIDL, we want to be able to support non-WebIDL use
cases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other alternative is to use Deref to model single inheritance. This doesn't suffer from the same drawbacks mentioned here, but it is generally considered an anti-pattern.


----

This proposal requires a lot of boilerplate (to define the inherit and trait methods). This is acceptable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe move this stuff out of unresolved questions and into a new Future Extensions section?

@Pauan
Copy link
Author

Pauan commented Aug 15, 2018

@ohanar There is still the unresolved question of what we do with static methods and constants.

Do you have any examples of static methods? I could only find constants.

At first I agreed with you, however I tested it and you can use Node.ELEMENT_NODE, Element.ELEMENT_NODE, and HTMLElement.ELEMENT_NODE, even though it's defined only on Node.

I don't know if that will still be true with the host bindings proposal, but it is the web reality today.

Since the WebIDL seems to treat constructors and constants as different, I think we can do the same with Rust: put constants onto the trait, but put the new constructor as an inherit impl.

(Also, my understanding is that the interfaces like Node, Element, HTMLElement, etc. don't have constructors anyways)

On the other hand, I don't see much benefit to putting them into the trait, since the constant will be the same for all of the types, so I'm fine with just defining Node::ELEMENT_NODE

So I guess the question is whether we want to be super-pedantic with following the WebIDL exactly or not.

Also, are there any situations where a constant or static method is overridden by a sub-class? That would be a good reason to put them onto the trait.

@alexcrichton
Copy link
Contributor

Thanks for taking the time to write this up @Pauan!

I'm personally a little hesitant to explore this sort of trait hierarchy and/or abstractions in web-sys iteslf. I see web-sys as very similar to libc in that we're exposing the functionality of the web platform but the crate doesn't have a goal of being ergonomic to use. In that sense the number one goal for web-sys is to ensure that it's low-level and doesn't prevent any legitimate patterns from being expressed (like how libc doesn't give you wrappers, just function definitions).

Now we can't entirely do that because web-sys is inherently a bit higher than libc, so as @ohanar mentioned we've got a lot of ad-hoc API decisions already made today. I think though that using traits as proposed here pushes us too far in the direction of attempting to provide an abstraction that may not work well in the long run.

Some concrete thoughts as to why I think we should stick with the low-level bindings we have today are:

  • Today's construction, while unergonomic, is explicit and helps you understand where methods are actually being routed and such. It's not a goal of web-sys to be ergonomic but rather provide the foundations for a future crate to provide an ergonomic API. It's envisioned that, like libc, long term you barely use web-sys at higher levels of the stack.

  • The traits introduced here are sort of an abstraction that doesn't really exist on the JS side of things. That's mostly because the JS side of things doesn't map super well to Rust (class inheritance vs traits), but the abstraction makes me a bit wary. It seems to encourage patterns like T: INode when otherwise you could today you'd simply take &Node or if you really wanted to stretch it you'd take T: AsRef<Node>. Additionally the traits make it difficult to call and explore methods in Rust because you have to import them (maybe all of them!) and rustdoc you've got to explore a bit more to find the relevant traits and such.

  • I'm pretty worried about the object safety aspect here. Both in that we want it and we don't want it! For example, T: INode seems like an antipattern in that it's generating far too much code (lots of monomorphizations) when otherwise you'd just want to take &Node or whichever concrete type is in play. Furthermore, even if it were object safe, that means we'd have &INode which also produces code bloat because the vtable contains far more functionality than is likely being used, and will prevent LLVM from dead-code eliminating all the extra function pointers.

  • Given the above bullet, the main thing that traits are buying us is the ability to use method calls to call superclass methods. That seems like a bit of an abuse of the trait system in Rust (in that it should provide us more than just this one benefit in theory) and I think doesn't outweigh the downsides of importing traits and such.

Overall I think it's good motivation to strive to make calling superclass methods more ergonomic, but I don't think we want to bend over backwards to satisfy this goal. The goal of ergonomics will always be secondary for the web-sys crate, so we should prioritize fundamental abilities like #2 which provide the ability to cast between types (both checked and unchecked) while also providing guarantees about subclasses you can cast to.

@alexcrichton
Copy link
Contributor

er sorry, didn't mean to close!

@alexcrichton alexcrichton reopened this Aug 15, 2018
@fitzgen
Copy link
Member

fitzgen commented Aug 16, 2018

I was originally very gung-ho about the prospect of using traits to represent inherited base methods, but seeing it all written out in all of its complexity, I have to agree with Alex's last comment.

However! It has also made me reconsider Deref upcasting, where

#[wasm_bindgen]
extern {
    pub type Base;

    #[wasm_bindgen(extends = Base)]
    pub type Derived;
}

would also generate

impl Deref for Derived {
    type Target = Base;

    fn deref(&self) -> &Base { ... }
}

We originally dismissed this as an anti-pattern, but it would not only enable base method calls without casts, it would also enable passing references to bases as arguments:

// Can do this:
derived.base_method();
// Instead of this:
let base: &Base = derived.as_ref();
base.base_method();

// Also, can do this!
takes_base(&derived);
// Instead of this:
takes_base(derived.as_ref());

The simplified version of this proposal does not handle the latter, and still introduces more complexity than emitting a Deref impl does.

The "disadvantages" and "discussion" section from the anti-pattern link aren't enough to dissuade me here -- the DOM is already designed how it is and we are forced to match it. If we were designing something Rust-y from scratch, then no I wouldn't suggest Deref. But we aren't in that situation, so I've come around to considering Deref a valid choice.

What do y'all think?

@Pauan
Copy link
Author

Pauan commented Aug 16, 2018

@alexcrichton Additionally the traits make it difficult to call and explore methods in Rust because you have to import them (maybe all of them!)

That is covered by the RFC, it's not a problem because the user can simply do use web_sys::traits::*;

and rustdoc you've got to explore a bit more to find the relevant traits and such.

That is a good point! I'll add it to the drawbacks section.

Given the above bullet, the main thing that traits are buying us is the ability to use method calls to call superclass methods.

That's not quite true: I made an argument that the trait bounds are useful for more than just calling superclass methods. Perhaps I should add that argument to the RFC.


@fitzgen The simplified version of this proposal does not handle the latter, and still introduces more complexity than emitting a Deref impl does.

I think the complexity of traits is about the same as Deref (in terms of the complexity for the user). There is some surprising behavior with Deref, so although it may be simpler to define a Deref impl, the user still has to consider the complexity when using it.

However, I agree that Deref is a solid alternative, and it does have some significant advantages (and it lacks most of the disadvantages mentioned in that anti-pattern article). So I'll add it to the Alternatives section.

@ohanar
Copy link
Member

ohanar commented Aug 16, 2018

In principle, I'm fine with the proposed trait system not being in web-sys. However, to have another crate do it requires processing the same webidl, and generating the same method signatures (including overloaded names) inside of traits rather than on normal methods. It feels a bit silly to put this logic anywhere else than the webidl crate, at which point, why not just include it in web-sys?

As for the Deref alternative, I still find it a compelling alternative. It also could be introduced at a later date in a backwards-compatible fashion.

@Pauan
Copy link
Author

Pauan commented Sep 1, 2018

I removed some of the downsides of Deref from the RFC. At this point the biggest downside of Deref is the fact that it has very poor handling of overridden methods.

So essentially it's a choice between these two (largely) incompatible strategies:

  1. Traits handle overridden methods correctly, they're more "Rust-y", but they're also a tiny bit more complicated to create and use. The docs are also not great for traits.

  2. Deref has much better docs, is a tiny bit more convenient to create and use, but handles overridden methods in an extremely poor manner.

Honestly, I think both options are bad, but traits at least handle every situation correctly, whereas Deref lulls you into a false sense of correctness.

@alexcrichton
Copy link
Contributor

Thanks for the update and sorry for the delay! I personally remain unconvinced that traits are the best option here. I don't understand how traits solve the overriden method downside of Deref (they seem to both suffer the same problem), and while it's true that WebIDL shouldn't be different than the ecosystem it seems like this still comes up so rarely in practice that it's not worth designing everything around it.

I would personally be in favor of either using Deref for method inheritance or not doing anything at all for the first few releases of web-sys. We can always figure out a way to add something later (we can make breaking changes after all!). In the meantime though a trait-based solution to me still feels too bloated for web-sys in terms of cognitive overhead, code size impact, and implementation.

@Pauan
Copy link
Author

Pauan commented Sep 13, 2018

@alexcrichton Oops, I accidentally removed that section when I did some cleanup.

First, to explain the problem with Deref:

let foo: Foo = ...;
let bar: Bar = ...;
fn my_fn(x: &Foo) -> bool { x.some_method() }

// Calls Foo::some_method
my_fn(&foo);

// Also calls Foo::some_method
my_fn(&bar);

As you can see, it calls Foo::some_method even though we passed in a Bar (we wanted it to call Bar::some_method).

This problem happens even if we use AsRef<Foo>:

fn my_fn<A: AsRef<Foo>>(x: &A) -> bool { x.as_ref().some_method() }

// Calls Foo::some_method
my_fn(&foo);

// Also calls Foo::some_method
my_fn(&bar);

However, traits do solve it:

fn my_fn<A: IFoo>(x: &A) -> bool { x.some_method() }

// Calls Foo::some_method
my_fn(&foo);

// Calls Bar::some_method
my_fn(&bar);

This works because Bar can implement the IFoo trait (and thus override some_method):

impl IFoo for Bar {
    fn some_method(&self) -> bool {
        ...
    }
}

As far as I know, this is the only way to override methods in Rust.

while it's true that WebIDL shouldn't be different than the ecosystem it seems like this still comes up so rarely in practice that it's not worth designing everything around it.

I disagree that it comes up rarely enough that we can just ignore it. There are plenty of JS libraries which use class-inheritance, and so we need some sort of solution for them.

The solution might be "use Deref in general, using traits only when it is necessary to override methods", I don't know, but we need some sort of solution. Simply saying "we don't support that" isn't good.

I agree that we don't need to solve it right now, but we do need a solution eventually.

I would personally be in favor of either using Deref for method inheritance or not doing anything at all for the first few releases of web-sys.

I agree, after examining the issue thoroughly, I'm personally in favor of postponing this RFC (i.e. doing nothing), until we have more real-world experience.

And who knows, maybe one of the inheritance RFCs which are floating around will get implemented in the Rust compiler, thus solving the issue completely.

@alexcrichton
Copy link
Contributor

Ok yes that makes sense, but note that a solution to all of this is "simply use structural". The reason that it doesn't naturally work is that we reach into prototype chains to bind callers, but if we were to instead use structural method invocation then Deref would work just fine. That means that for WebIDL we wouldn't mind too much and for other libraries if this pattern comes up then Deref works well and authors just have to bind APIs as structural that may be overridden. It's a bit slower, but all the various knobs for performance can always be tweaked.

I would personally prefer to keep pushing on this in the sense that ideally we would not postpone this RFC but rather accept the Deref solution. The trait-based solution (I think) is too complicated and risky to implement at this time (makes sense to postpone) but the Deref solution I still believe is lightweight, functional, and extensible to the future to be implemented today.

@Pauan
Copy link
Author

Pauan commented Sep 16, 2018

@alexcrichton Oh! That's really interesting! I hadn't realized that Deref + structural has correct behavior with overriding methods. That seems like a really good option (not for WebIDL, which doesn't override methods, but for other projects which require overriding methods).

@michiel-de-muynck
Copy link

Hi. I've read the RFC and the comments and I have a few comments of my own.

Maybe first a word of who I am and where I'm coming from: I'm creating a tool that transpiles TypeScript definition files to rust. Typescript definition files are kind of similar to WebIDL, but they support the full flexibility of the TypeScript type system, which goes pretty far beyond WebIDL or even Rust's type system (just scroll through this page to get an idea), so I won't be able to transpile everything accurately. But I want to do as much as possible.

I'd like to support both stdweb and wasm-bindgen, and I'd like to compile to "idiomatic" stdweb and wasm-bindgen code, which is why I'm here: to find out what idiomatic wasm-bindgen code should be. Seems like that's not set in stone yet.

So, on to my comments:

Ergonomics

It's not a goal of web-sys to be ergonomic but rather provide the foundations for a future crate to provide an ergonomic API.

I think it would be better if web-sys did have ergonomics as a goal, even if it were only a minor goal. If the ergonomics don't come in the way of correctness or performance, then I don't see the benefit of splitting web-sys into an unergonomic version and an ergonomic version. Especially because, as @ohanar mentioned, the implementation of that ergonomic version would have to parse all the same WebIDL definitions as web-sys does, so the benefit of the unergonomic web-sys crate is minimal. And as long as this hypothetical ergonomic version doesn't exist, then (as @Pauan mentioned), people that want to write bindings for some JavaScript package will copy what web-sys does. It's simply the first place you think to look.

Overridden methods

Like everyone in this thread, I like the Deref solution the most. It's elegant, doesn't cause any naming conflicts (in the trait solution, what if a package has both a class Foo and a class IFoo?), and it just works except for the issue with overridden methods. Nothing new there.

However, I disagree with the previous claims in this thread that the trait solution does solve problem of overridden methods. While I admit that traits handle the situation better than the Deref solution, it doesn't solve the issue fully. Let me explain.

In my discussion, I'll use mostly the same names that @Pauan used in his example here. Specifically, I'll assume that there's

  • a base class Foo with method some_method, that has
    • a subclass Bar, which overrides some_method, and has
      • a subclass Qux
      • a subclass Quux
      • many more subclasses
    • another subclass Baz

The quick summary of the issue, if I understand it correctly, is that there is no way to make a function that accepts a "Foo-or-subclass-of-Foo" as an argument and that calls some_method on it, expecting to call the overridden method if the Foo is a Bar (or subclass of it). The function will instead always call Foo::some_method, as it accepts a &Foo.

The solution with traits does in fact solve this specific issue. With the trait solution, you can make a generic function that accepts a T: IFoo, and if you properly implement IFoo for Bar and all its subclasses, then this generic function will call the right some_method.

But note that you have to implement IFoo properly for Bar and all subclasses of Bar. In all those impls (for Bar, Qux, Quux, etc,) you have to specify that you want some_method to call Bar's version of some_method. That's a lot of implementations of some_method. You'd think you could specify it once, in IBar, by "overriding" IFoo::some_method in IBar::some_method, but in Rust, traits can't override methods from supertraits. To Rust, IFoo::some_method and IBar::some_method are entirely unrelated functions that happen to have the same name. If you actually do this and try to call some_method, you get an error: "multiple applicable items in scope".

Having to define many impls is still just an inconvenience, it technically still works. But only in this case, not in all cases. What if I want to store "Foo-or-subclass-of-Foo" instances in a Vec, and then call some_method on them later? I have to store them as Foo. What if a javascript function that I want to write bindings for returns a Foo (which means it could return any subclass of Foo)? I have to make it return Foo. In both cases, the traits won't help the right some_method get called (you need to do instanceof).

By using traits and generic functions you can carry compile-time information of whether a certain Foo is a Bar (or subclass of it) into a function, but if that information isn't there at compile time to begin with, the traits don't help at all.

Everything structural?

I think this problem with overridden methods is deeper than this choice between Deref versus traits. The real problem is that in Javascript, methods are inherently structural. User-defined methods at least. There are still technically host and native objects, where I admit that I'm not sure about them and their methods:

  • Host objects: (window, document, etc.) I think their types can't be extended, or at least shouldn't be (it's like undefined behavior I think). So I think their methods can be marked non-structural with a clear conscience.

  • Native objects: (String, Array, etc.) I really don't know if it's ok to extend these types or not. A quick test shows that it's possible, but from what I read on the internet the consensus seems to be "it's not a good idea". So not marking their methods structural is maybe ok? Not sure.

  • User-defined classes: These are the problem. There's no such thing as a "final" class that can't be extended or a "final" method that can't be overridden. There are some tricks with Object.freeze() that may work, but I don't know of anyone that actually does that in practice. In TypeScript, the proposal to add a final keyword was rejected. In almost every JavaScript library, every ES6 class they define can be extended and every method can be overridden by the user.

web-sys deals mostly with host objects, but this RFC is not just about web-sys but about wasm_bindgen in general, right? And outside of web-sys and js-sys, it's all user-defined classes.

So if I'm making wasm_bindgen bindings to a JavaScript/TypeScript library, should I mark every method of every class it defines structural? If I do that, the sad thing would be that that would disable host bindings from being used to make method calls faster. And on top of that, overriding methods is not very common in Javascript. But I think I will still mark every method structural regardless, for 3 reasons:

  1. The potential bugs that can be caused by a method being wrongly not marked structural are very silent. You won't get any compilation errors or warnings. At runtime, everything will work fine except for that method call, and even then, you often won't get an exception or a crash. It just calls the wrong function. And worst of all it calls a function that may be very similar in behavior to the behavior of the intended function, so even when it happens you might not notice that anything is wrong. It seems very plausible for the bugs that this causes to slip under the radar all the way into production.

  2. If the bug gets detected, fixing it is not easy. Unless I'm mistaken, in the current wasm_bindgen it's not possible to call a method "structurally" if it's not marked structural. If I didn't mark a method structural in my bindings when it should have been, the only possible fix (barring some horrible instanceof-ing and casting), is to either submit a bug report to me and wait for me to mark the method structural, or to fork my bindings and make the change themselves.

  3. Unless I'm mistaken, the only advantage of not marking a method structural is performance: it can make use of host bindings. There's no (reasonable) code that breaks when marking a non-structural method structural, right? And are non-structural methods even strictly faster than structural ones? In steady-state I think the answer is yes, but at start-up, am I wrong in thinking that some prototype-chain walking needs to happen for each non-structural method that doesn't need to happen for structural methods?

For these 3 reasons, I think I will take the conservative stance to just mark every method in all my generated bindings structural. Any method could be overridden, even though very few methods actually are.

I wonder if it would be better for wasm_bindgen as a whole to take the same conservative stance: instead of assuming that a method is not structural unless it is explicitly marked otherwise, shouldn't we assume that a method is structural unless specifically marked otherwise? I mean, if someone marked a method with nothing other than #[wasm_bindgen(method)], can we assume that they know about the pitfalls with overridden methods? I think we should not, and we should provide the behavior that always gives the right answer, and let those who do know that their method won't be overridden (such as web-sys) opt-in to the host-binding optimization by explicitly marking the method not structural?

I think a good name for "not structural" could be final, since that's basically what it means right?

Top type

One last, pretty minor comment: In the Deref solution, if you have a variable x: HTMLElement, then *x is an Element, **x is a Node, and ***x is an EventTarget. EventTarget has no superclass. Should it impl Deref?

I think maybe it should, and that Target should be JsValue. It makes it so that if you have a function and one of its arguments can be any legal JavaScript value, and for some reason you don't want to make that function generic, you can take a &JsValue, and then you can pass any class instance into it.

That's a very minor and non-compelling argument. But for some reason it also sort of "feels" like if we use Deref to cast to supertypes, then the classes without superclasses should Deref to the top type, JsValue.

I apologize if anything I wrote is wrong, I'm still pretty much a beginner at wasm_bindgen.

@Pauan
Copy link
Author

Pauan commented Sep 30, 2018

in the trait solution, what if a package has both a class Foo and a class IFoo?

In that case the best thing would probably be to put them into separate modules. I do agree that the IFoo naming convention is pretty messy though.

But note that you have to implement IFoo properly for Bar and all subclasses of Bar.

That is true, which is why for casual usage we would have to implement some sort of macro to smooth it out.

What if I want to store "Foo-or-subclass-of-Foo" instances in a Vec, and then call some_method on them later? I have to store them as Foo.

You would use Vec<Box<IFoo>> (or Rc, or Arc, or similar). This is standard trait polymorphism in Rust.

By the way, Deref does not fix that problem at all, since it only applies to references. So without traits you would need to first use into() to convert it into a Foo (and you would have to use structural methods, or runtime downcasting).

On the other hand, with traits, Vec<Box<IFoo>> does the right thing (even with non-structural methods, and without requiring any downcasting). Traits simply work correctly, both in a way that is intuitive for Rust programmers, and also in a way that is correct for JS. That's their biggest benefit over Deref.

What if a javascript function that I want to write bindings for returns a Foo (which means it could return any subclass of Foo)? I have to make it return Foo. In both cases, the traits won't help the right some_method get called (you need to do instanceof).

That is a good point, it's something that stdweb has struggled with. Though it's unrelated to trait vs Deref, since the same issue exists in both.

If you mark the method as structural, then both Deref and traits work correctly in that situation.

So it would be more correct to say that it's an issue with non-structural methods.

I do agree that if methods are marked as structural, then that negates a lot of the benefit of traits (and thus makes Deref seem more compelling).

The real problem is that in Javascript, methods are inherently structural.

This isn't actually true. It is true that the easiest and shortest syntax is structural, but it is entirely possible to use JavaScript methods in a non-structural way:

// Structural
someBar.someMethod(...);

// Non-structural
Foo.prototype.someMethod.call(someBar, ...);

This is a common technique when needing to call the super-class' method inside of the sub-class' method:

Bar.prototype.someMethod = function (...) {
    Foo.prototype.someMethod.call(this, ...)
};

So if I'm making wasm_bindgen bindings to a JavaScript/TypeScript library, should I mark every method of every class it defines structural?

I would personally say yes.

If I do that, the sad thing would be that that would disable host bindings from being used to make method calls faster.

Yes, that is the trade-off: if you know there won't be any overridden methods, then it's faster because it can statically jump directly to the appropriate method implementation. But if you need runtime dynamicism, then you will need to pay a performance cost.

I'll note that traits do not have that trade-off: you can get full performance even with overridden methods. Instead the trade-off with traits is different: you're trading compiled code size for performance.

The potential bugs that can be caused by a method being wrongly not marked structural are very silent. You won't get any compilation errors or warnings.

I agree, the tiny performance gain is not worth the silent bugs.

Unless I'm mistaken, in the current wasm_bindgen it's not possible to call a method "structurally" if it's not marked structural.

That's a good point, though I believe it should be possible to use the Reflect bindings to make dynamic method calls (though it's pretty clunky).

We can't test it yet, but I believe the performance of using Reflect should be the same as a regular (structural) method call in JS.

Unless I'm mistaken, the only advantage of not marking a method structural is performance: it can make use of host bindings.

The host bindings are still up in the air, but my understanding is that they will only make certain specific APIs faster, they won't magically make all JS methods faster.

When host bindings are implemented, it'll probably be necessary to add in a new wasm_bindgen attribute for them anyways, since they would require special imports (among other things).

So even with host bindings, any calls to a regular JS method will be the same performance as doing foo.someMethod(...) in JS.

So the performance of non-structural doesn't really have anything to do with host bindings, it's about pre-caching the prototype method so that the browsers can (hopefully) optimize it better. That's true with or without host bindings.

The connection between non-structural and host bindings is more about preventing us from accidentally writing very dynamic code which will be incompatible with host bindings.

There's no (reasonable) code that breaks when marking a non-structural method structural, right?

Since structural mode is (by far) the most common type of method call in JS, that seems like a reasonable assumption.

And are non-structural methods even strictly faster than structural ones?

That's a good question. Benchmarks would be good.

In steady-state I think the answer is yes, but at start-up, am I wrong in thinking that some prototype-chain walking needs to happen for each non-structural method that doesn't need to happen for structural methods?

Perhaps, though that can quickly be offset by the need to lookup structural methods multiple times (especially if the browser doesn't cache the method immediately).

Browsers are pretty damn good at optimizing prototype lookups though, so I doubt there's a significant performance difference between structural and non-structural.

As said, benchmarks would be good.

instead of assuming that a method is not structural unless it is explicitly marked otherwise, shouldn't we assume that a method is structural unless specifically marked otherwise?

I agree, I think structural should be the default. Yes in some cases it might be slower, but it's never any slower than standard JS method calls (which JS programmers already use all the time, non-structural method calls are rare). And I think the correctness is much more compelling than some (possible) minor speed boost.

I think a good name for "not structural" could be final, since that's basically what it means right?

I think that's a good topic for a new RFC (since it's pretty far off-topic from this RFC).

I think maybe it should, and that Target should be JsValue.

Hmm, I had assumed that that was the case, but it seems I didn't spell it out clearly in the RFC (probably because the RFC focused on traits).

In any case I agree, the "top-level" classes should Deref to JsValue (which also means that all sub-classes indirectly can Deref to JsValue).

@michiel-de-muynck
Copy link

michiel-de-muynck commented Sep 30, 2018

Thanks for the reply @Pauan. I agree with pretty much all of it. Anything I don't mention in my reply here, I agree with.

But note that you have to implement IFoo properly for Bar and all subclasses of Bar.

That is true, which is why for casual usage we would have to implement some sort of macro to smooth it out.

Is it possible for a macro to do this? I mean, such a macro would have to understand the entire superclass chain of a type, including all the methods defined on all those superclasses, to know which methods are overridden and therefore need to be implemented explicitly. Do macros have that kind of information?

You would use Vec<Box<IFoo>>

I admit, I didn't consider trait objects at all in my post. Trait objects do partially solve this problem, but there's a big flaw with trait objects: you can't call methods of supertraits on a trait object. With a Box<IBar> you can't call any methods of IFoo. (EDIT: this is not true. See below). You can't upcast a Box<IBar> to a Box<IFoo> either. (not entirely true either. See below) Maybe this will change in the future, but that's how trait objects work in Rust today.

// Non-structural
Foo.prototype.someMethod.call(someBar, ...);

Does anyone do this besides for calling the super method in an overriding method? Or more specifically, since wasm_bindgen's focus is on creating bindings for APIs, do you know of any APIs where this is the expected or intended way to use that API?

I'll note that traits do not have that trade-off: you can get full performance even with overridden methods.

Only for methods not marked structural. For methods marked structural, Deref and traits have the same performance, right?

Instead the trade-off with traits is different: you're trading compiled code size for performance.

Can this code size bloat be avoided with sufficient inlining or is it unavoidable?

I think that's a good topic for a new RFC (since it's pretty far off-topic from this RFC).

I don't agree that it's off-topic. If we make it so that methods are assumed to be structural unless marked otherwise (marked how? final?), then I'm in favor of the Deref solution, because structural fixes all the issues that the Deref solution has (besides the subjective ones like "it's not the intended use-case of Deref" etc.).

On the other hand, if we assume that methods are non-structural unless marked otherwise, which we do now, then I'm in favor of the trait solution, but only as a form of "damage control". The trait solution would make it slightly harder to shoot yourself in the foot, but it's certainly still possible.

@Pauan
Copy link
Author

Pauan commented Oct 1, 2018

Is it possible for a macro to do this? I mean, such a macro would have to understand the entire superclass chain of a type, including all the methods defined on all those superclasses, to know which methods are overridden and therefore need to be implemented explicitly. Do macros have that kind of information?

Hmmm, you're right, it may be too difficult for a macro. Which I guess means that even though traits technically solve the issue, they do so at extreme annoyance for sub-classes.

but there's a big flaw with trait objects: you can't call methods of supertraits on a trait object. With a Box<IBar> you can't call any methods of IFoo.

Could you explain more? I did some testing, and it seems to work fine.

You can't upcast a Box<IBar> to a Box<IFoo> either.

I couldn't find a way to upcast, so you may be right, but since it's possible to call IFoo methods on a Box<IBar>, I don't think upcasting is necessary?

Does anyone do this besides for calling the super method in an overriding method? Or more specifically, since wasm_bindgen's focus is on creating bindings for APIs, do you know of any APIs where this is the expected or intended way to use that API?

I don't know of any, no.

For methods marked structural, Deref and traits have the same performance, right?

I think the answer is a bit complicated. Deref has to do some unchecked coercions to the right type (and it has to use references), whereas traits do not. However, those unchecked coercions can probably be inlined, so the performance is probably the same (excluding the code bloat from generics).

Can this code size bloat be avoided with sufficient inlining or is it unavoidable?

Yes, inlining can fix the code size bloat (at least sometimes). You can also use trait objects to avoid the code bloat (but that's probably worse performance than Deref).

I don't agree that it's off-topic.

Both structural and non-structural methods exist right now, so it's already possible to just mark every method as structural.

There are situations where structural makes sense for Deref and traits, and also situations where non-structural makes sense for Deref and traits.

The default is just about ergonomics and convenience, so which one is default shouldn't affect the Deref vs trait decision, in my opinion.

However, making structural the default probably doesn't require an RFC, a normal issue on the wasm-bindgen repo is probably good enough (unless @alexcrichton or @fitzgen have an objection to that).

@michiel-de-muynck
Copy link

you can't call methods of supertraits on a trait object.

Could you explain more? I did some testing, and it seems to work fine.

I apologize, I was wrong. I read this Github issue which reports that upcasting trait objects isn't possible and links to threads and discussions about multiple-trait trait objects (&(A+B+C)), which aren't even possible at all, and from that I concluded that calling the methods of a supertrait on a trait object would also be impossible. But that's not true. I've edited my post.

It seems like there are even ways to make upcasting work, by essentially manually adding methods to the vtable for the purpose of upcasting. Seems clunky, but I guess it works. Still doesn't help in the situation where a JS function returns a Foo-or-subclass-of-Foo, and the Rust side never has any information about which subclass it is, so never gets the opportunity to construct the trait object.

@alexcrichton
Copy link
Contributor

Thanks for the post here @migi, I think you bring up some interesting points!

I wanted to elaborate a bit more on the structural-vs-not change as I do also think it's pretty relevant to something like this. The initial intent behind only adding structural after the fact (aka your final by default) is that wasm-bindgen was designed for the upcoming host bindings proposal in wasm.

In host bindings function calls can be flagged with a "this" parameter, which means that wasm-bindgen will actually need zero shims for JS final methods. We are emitting a polyfill today, however, which is indeed a shim. The intention here is that the wasm module instantiation fills in an import with a function that is precisely what we want to invoke, no shims involved.

Now that all sounds nice, but how does it actually perform today? I haven't ever measured myself! I've been slowing trying to work on a small suite of benchmarks with wasm-bindgen and such, and I've recently added one for structural-vs-final. The UI is pretty terrible, but you can run the benchmark by visiting this page and then clicking "Run test" on the "structural vs not" line. Eventually you'll see two bars below. The left-hand bar is final, and the right hand bar is structural.

The measurements I got in a few browsers were:

  • Firefox - structural is 3% faster
  • Chrome - structural is 4% slower
  • Edge - structural is 16% slower

I've been getting sort of wildly different results though over time, sometimes Firefox is seemingly up to 15% faster with structural than without. I'm curious to see what others get here!

In any case, I feel that we should let empirical data drive us towards a conclusion about whether to use structural by default or not. I think it would be fine to consider changing, but I think we'll want to make sure to have strong motivation to do so.

@Pauan
Copy link
Author

Pauan commented Oct 2, 2018

@alexcrichton I might be wrong, but my understanding is that the host bindings API will also work with structural (and structural would be the default for JS methods). I'm not sure if the host bindings API even supports non-structural methods!

(Edit: @lukewagner corrected me on this, host bindings are always non-structural, but it's possible to use Reflect.get + Reflect.apply for structural methods)

So as I said, it's not about "host bindings vs not host bindings", it's just about the trade-off of performance (looking up the prototype once, with non-structural) vs correctness (with structural).

P.S. It's great to see some benchmarks! The results are pretty much what I expected: not much difference between structural and non-structural: browsers are really great at optimizing prototype chain lookups (except Edge, apparently).

I tested out the benchmark, I get similar results as you.


@lukewagner Sorry for pinging you in here, but I have five important questions:

  1. With the upcoming host bindings API, if a WebAssembly module imports a method foo and calls it with a bar anyref (as the first argument), is that equivalent to calling bar.foo() in JS, or would it be more like SomeClass.prototype.foo.call(bar)?

    I assume it's the same as bar.foo() (i.e. that host binding method calls are exactly the same as JS method calls, including the prototype chain lookup), is that correct?

  2. Also, is it even possible to do the equivalent of SomeClass.prototype.foo.call(bar) with the host bindings API? (In other words, call the specific method of a specific class, without any prototype chain lookup).

  3. Let's say you have a JS class Foo and a JS class Bar extends Foo, and Bar overrides a method on Foo...

    class Foo {
        foo() { ... }
    }
    
    class Bar extends Foo {
        foo() { ... }
    }

    ...and a WebAssembly module imports the method foo, and calls foo on an anyref object which happens to be an instance of Bar, will it call Foo.prototype.foo or Bar.prototype.foo?

  4. Also, given the above Foo and Bar classes, can the WebAssembly module import a single foo method and then call it on instances of both Foo and Bar?

    Or would it need two separate imports, one import for Foo.prototype.foo, and a second import for Bar.prototype.foo?

  5. Is it even possible to import the specific method of a specific class (e.g. Foo.prototype.foo)?

@lukewagner
Copy link

Great questions!

  1. WebAssembly imports function values on instantiation and thus necessarily calls the same exact function on each call to that import, so, even with Host Bindings, that would correspond to the SomeClass.prototype.foo.call(args), assuming one had done const foo = SomeClass.prototype.foo; once up front and the actual call was foo.call(args).
  2. Yes, if you boil that down to two steps: (1) grab a function, (2) call the function (using Host Bindings to do f.call(import args) instead of the default f(import args) (which passes undefined as the receiver).
  3. With the simplistic imports described above, you'd have to pick a specific foo function at instantiation time, so every import call would call exactly that function.
  4. Yep, just as if you did let foo = ...; foo.call(a); foo.call(b);.
  5. Yep, it all depends on how you construct the values of the import object.

FWIW, this is all based on the straightforward Host Binding that has been proposed (CALL_THIS in the (probably soon to be refreshed) explainer). If we wanted a way for wasm to make a polymorphic method call (where the callee function was determined by a proto-walk), one could imagine a fancier Host Binding that effectively did a Reflect.apply(Reflect.get(obj, 'foo'), arg0, [args1...N]), where foo was a literal in the Host Binding, and this could be pretty well. However my question would be: if we're doing polymorphic dispatch out to JS anyway, then is there that much of a win. A motivating use case is if we needed to do a polymorphic method dispatch so we could call directly from wasm into a Web API, but I'm not currently aware of any such cases (yet).

@Pauan
Copy link
Author

Pauan commented Oct 2, 2018

@lukewagner Thanks a lot for answering my questions! So I was mistaken about the host bindings, they do indeed correspond to non-structural methods in wasm-bindgen.

But as you said, it's possible for structural methods to use host bindings (by using Reflect.apply and Reflect.get), so my point to @alexcrichton does still stand.

Another question: when you are importing a method, how do you specify which class it should use? In other words, how would you import Foo.prototype.foo? The method name obviously goes into the import list as usual, but where do you specify the class?

Or are you saying that it will still be necessary to create a JS shim that basically does export const foo = Foo.prototype.foo; (and wasm then imports that foo variable)?

However my question would be: if we're doing polymorphic dispatch out to JS anyway, then is there that much of a win.

One of our motivations is to be able to (ideally) remove all JS shims and just use host bindings to directly call APIs. Even if we can't use host bindings 100% of the time, we want to be able to use them as much as possible (for improved performance, and smaller file size, due to not needing JS shims).

There are situations in wasm-bindgen where we need to do prototype lookups (i.e. the same as foo.bar() in JS). So having an easy and efficient way to do that with host bindings would be useful, though perhaps it's not necessary in practice (since we can always use Reflect.get + Reflect.apply)

@lukewagner
Copy link

But as you said, it's possible for structural methods to use host bindings (by using Reflect.apply and Reflect.get), so my point to @alexcrichton does still stand.

I also like (and originally advocated for) the original idea of using the intended wasm Host Bindings semantics. But if there's a significant performance advantage to generating obj.foo() for now, then I can see the reasoning for just emitting that now and later, when wasm offers an even faster way to call Web APIs, either making technically-but-maybe-not-in-practice breaking change for calls to Web APIs, or through some new opt-in syntactic option.

Another question: when you are importing a method, how do you specify which class it should use? In other words, how would you import Foo.prototype.foo?

By default, wasm is simply given values directly (from the import object) or, later, with ESM integration, from some other module's exports. To avoid the boilerplate of plucking a bunch of properties off various global objects/ctors/prototypes, one could imagine yet another Host Binding to do this directly as part of the JS wasm instantiation process, but again this likely wouldn't be a performance improvement, just glue-code reduction. If Built-in Modules become real, then wasm could import those directly without glue. Furthermore, if builtin modules reflected existing builtins currently on the global (a variation of getOriginals), then potentially no glue would be needed for any builtin. But this is all rather speculative at the moment, so I think we need to wait a bit to see (and participate) in how that develops.

One of our motivations is to be able to (ideally) remove all JS shims and just use host bindings to directly call APIs.

I like that goal, although there's obviously a balance between spec/engine complexity and supporting all the possible use cases. So at least for a "Host Bindings MVP", I'm interested to focus on just the set of things that remove JS glue thunks between wasm and Web APIs, allowing fast, direct calls.

@alexcrichton
Copy link
Contributor

The discussion here is pretty long at this point and a bit winding with respect to alternative, so I've opened up a formal alternative to this PR of using Deref while also switching to structural by default. That hopefully summarizes all that's been said here as well!

@alexcrichton
Copy link
Contributor

For those interested, we also had a discussion in the last working group meeting about this RFC recapping a few things.

@Pauan
Copy link
Author

Pauan commented Oct 6, 2018

@lukewagner Thanks a lot, that's very useful information!

So my understanding is that JS shims will still be necessary even after the host-bindings proposal is implemented.

The only thing the host-bindings proposal buys us is the ability to make function calls with this (and the anyref proposal lets us avoid manual stack/heap allocations of JS objects).

So in order to avoid JS shims, we would need something even further, such as the built-in modules proposal.

That definitely makes structural seem a lot better, since shims will still be needed even with final + host-bindings.

@alexcrichton
Copy link
Contributor

alexcrichton commented Oct 29, 2018

I'd like to now propose that we enter the Final Comment Period for this RFC.

Disposition: close

@rustwasm/core members to sign off:

This is done at the same time of proposing that we merge #5

@fitzgen
Copy link
Member

fitzgen commented Oct 30, 2018

In favor of closing this RFC and pursuing #5

@alexcrichton
Copy link
Contributor

#5 has finished its FCP and is now merged, so I'm going to close this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants