Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for customising instanceof behaviour #1405

Merged
merged 2 commits into from
Apr 12, 2019

Conversation

RReverser
Copy link
Member

This allows types to define custom Rust code as an override for JsCast::instanceof.

  • Fixes Specialise dyn_ref to improve value detection #1367 for types that have built-in cross-realm checks, allowing dyn_ref and dyn_into to accept values from other realms. In particular:
    • .instanceof::<Array>() will now automatically use Array::is_array instead of actual ... instanceof Array.
    • .instanceof::<Function>() will use JsValue::is_function. This also simplifies Function::try_from to being an alias for dyn_ref::<Function>().
    • .dyn_into::<JsString>(), .dyn_into::<Number>(), .dyn_into::<Boolean>() will also now use corresponding checks from JsValue.
      In theory, these 3 can break checks if someone has been using boxed primitives (new String(...), new Number(...), new Boolean(...)), but these are extremely rare in real-world JS code and are already not support by JsValue methods. If we do want to support them, I believe it should be done at the JsValue level in as_f64, as_bool, as_string too.
  • Fixes .dyn_into::<Symbol> / `.dyn_ref::<Symbol> don't work as expected #1370 - .instanceof::<Symbol>() will now use JsValue::is_symbol - previously it would return false for any value, even actual Symbols, making it impossible to use .dyn_ref::<Symbol>() and .dyn_into::<Symbol>().

cc @alexcrichton @fitzgen

@RReverser
Copy link
Member Author

RReverser commented Mar 28, 2019

In theory, these 3 can break checks if someone has been using boxed primitives

Heh, looks like some tests actually check this. What do we want to do about this? Do we want to add such support to corresponding JsValue::* methods, ignore the boxed variants or somehow combine checks?

@RReverser
Copy link
Member Author

It also seems strangely inconsistent that Boolean and Number have fn new bindings to create a boxed primitive, but JsString doesn't... @alexcrichton any chance we could soft-deprecate these on existing primitives?

@fitzgen
Copy link
Member

fitzgen commented Mar 28, 2019

Ideally we would have some raw layer that just does the instanceof check and matches JS semantics exactly, and then on top of that we would add some sort of "is-a?" predicate that Does What I Mean.

I'm not sure how to do this in a backwards compat way.

@RReverser
Copy link
Member Author

RReverser commented Mar 28, 2019

I'm not sure how to do this in a backwards compat way.

We can surely separate them by adding

trait JsCast {
  // ...existing definitions

  fn is_a(val: &JsValue) -> bool {
    Self::instanceof()
  }
}

and then overriding this extra method via attribute, but I see little benefit in that because dyn_ref, dyn_into etc. shouldn't blindly use instanceof IMO, and we would just end up with two methods...

@alexcrichton
Copy link
Contributor

Can we perhaps draw inspiration from JS somehow? How does JS typically handle these differences idiomatically? Can we basically make the idiomatic thing in JS the same idiomatic thing in Rust?

@RReverser
Copy link
Member Author

Can we basically make the idiomatic thing in JS the same idiomatic thing in Rust?

That's what this PR attempts to do.

Unfortunately, despite few proposals in the past, there is no central API for cross-realm checks, but the common consensus is that you should use them where available (like in case with typeof or Array.isArray or Node's Buffer.isBuffer etc.).

For other objects, which don't have corresponding method, the common alternative is to use string tag checks like Object.prototype.toString.call(...) === '[object ...]' but for now I decided not to go there as it adds potentially heavy runtime cost, whereas built-ins are free.

@Pauan
Copy link
Contributor

Pauan commented Apr 1, 2019

@RReverser is correct: there are many ways to check whether an object is a particular type, so there isn't a single "correct" way to do it in JS.

Instead you need to use typeof or Array.isArray or instanceof or {}.prototype.toString.call depending on the situation. So I agree with him that being able to customize the behavior of instanceof is needed.

However, since instanceof is an actual operator in JS, I think it would be far less confusing if it was renamed to something else, like is_a. That avoids any confusion with the instanceof operator. And then we should deprecate the instanceof method.

@alexcrichton
Copy link
Contributor

I think that sounds like the best of all worlds yeah where we:

  • Add a new method, is_a to JsCast
    • For now, make it a default method calling instanceof for backwards compatibility reasons
  • Deprecate instanceof, recommending is_a
  • Change all dyn_* methods to use is_a
  • Allow customizing is_a (as this PR does)
  • Take is_a customizations in this PR for Array/String etc

How's that sound?

@fitzgen
Copy link
Member

fitzgen commented Apr 1, 2019

That plan sounds great to me.

In an ideal world, I'd change the semantics of JsCast::instanceof be exactly JS instanceof, but I think that is backwards incompatible.

@RReverser
Copy link
Member Author

@alexcrichton Sounds perfectly reasonable to me! I'll update the PR.

@RReverser
Copy link
Member Author

In an ideal world, I'd change the semantics of JsCast::instanceof be exactly JS instanceof, but I think that is backwards incompatible.

@fitzgen I think that's what @alexcrichton suggests too? (keeping instanceof as an actual JS instanceof for now)

@fitzgen
Copy link
Member

fitzgen commented Apr 1, 2019

We implement JsCast::instanceof for some things like JsValue that have no JS equivalent.

@RReverser
Copy link
Member Author

JsCast::instanceof for some things like JsValue that have no JS equivalent.

Ohh. Yeah, I think these would have to be deprecated.

@RReverser
Copy link
Member Author

Btw, do we want to actually call it is_a or just is would be enough? (just to avoid a/an confusion)

@Pauan
Copy link
Contributor

Pauan commented Apr 1, 2019

@fitzgen In an ideal world, I'd change the semantics of JsCast::instanceof be exactly JS instanceof, but I think that is backwards incompatible.

Personally, I don't see any use for JsCast::instanceof, it's almost never what you want.

On the other hand, a JsValue::instanceof would be useful. It would just use the JS instanceof operator, and nothing else:

impl JsValue {
    pub fn instanceof(&self, class: &JsValue) -> bool {
        // intrinsic which uses `instanceof` in JS
    }
}

This would be useful as a low-level building block, but isn't intended to be used directly (instead JsCast::is_a is preferred).

@RReverser
Copy link
Member Author

RReverser commented Apr 1, 2019

On the other hand, a JsValue::instanceof would be useful. It would just use the JS instanceof operator, and nothing else:

I was thinking about it too, but for that we need to solve #1406 first (I raised it with this particular issue in mind too).

@Pauan
Copy link
Contributor

Pauan commented Apr 1, 2019

@RReverser Right, but JsValue::instanceof can be added in a separate PR, so it isn't really related to this PR, so we can decide that later.

My point was basically just that JsCast::instanceof isn't needed, so we can just deprecate it.

@RReverser
Copy link
Member Author

My point was basically just that JsCast::instanceof isn't needed, so we can just deprecate it.

Maybe better to deprecate in the same PR as JsValue::instanceof is added (in case someone does want a stable way to do that, for whatever weird reason)?

@RReverser
Copy link
Member Author

Also, as far as I can tell, for now we don't want to use O.p.toString on types without real cross-realm checks so we still need instanceof for now as a fallback.

@Pauan
Copy link
Contributor

Pauan commented Apr 1, 2019

@RReverser Except right now JsCast::instanceof doesn't always use instanceof, it's a weird mishmash of some things. So I'm fine with deprecating it in the same release as JsCast::is

Also, as far as I can tell, for now we don't want to use O.p.toString on types without real cross-realm checks so we still need instanceof for now as a fallback.

Why don't we want to use toString?

@RReverser
Copy link
Member Author

RReverser commented Apr 1, 2019

Why don't we want to use toString?

See above:

  1. @alexcrichton: > For now, make it a default method calling instanceof for backwards compatibility reasons
  2. me: > for now I decided not to go there as it adds potentially heavy runtime cost, whereas built-ins are free.

Aside from perf impact, toString is also not 100% reliable, although for different reasons.

@RReverser
Copy link
Member Author

Except right now JsCast::instanceof doesn't always use instanceof, it's a weird mishmash of some things.

Aside from JsValue, when else it doesn't use instanceof?

@Pauan
Copy link
Contributor

Pauan commented Apr 1, 2019

Aside from JsValue, when else it doesn't use instanceof?

I think just JsValue.

for now I decided not to go there as it adds potentially heavy runtime cost, whereas built-ins are free.

I created some benchmarks: https://jsperf.com/array-type-checking3

The builtin gets ~32,891,703 ops/sec. Using toString gets ~22,678,401 ops/sec.

A cost? Sure, but that's still 22 million ops/sec. So in the situations where a builtin doesn't exist, correctness is more important in my opinion.

toString is also not 100% reliable, although for different reasons.

This is the first time I've heard of that, could you explain more?

@alexcrichton
Copy link
Contributor

@RReverser trying to follow the discussion, were you on board with the previous plan? I was looking at #893 again today and realized that this'd be a great way to fix that, so was curious if you're still up for pursuing this change!

@RReverser
Copy link
Member Author

trying to follow the discussion, were you on board with the previous plan?

I was :) See #1405 (comment)

There have been just few follow-up questions which can be summarised as:

  1. Can we use a more generic is instead of is_a / is_an?
  2. Do we want to switch to Object.prototype.toString approach for everything else or do we keep instanceof as a default behaviour for now?

@alexcrichton
Copy link
Contributor

Since this is in theory not called too much (rather the dyn_* methods would be preferred) the name doesn't matter too too much so I'd be wary of taking such a small name as is, but I also don't have much against it. If we're reframing this as "this is a loose check to see if the types are compatible and will work together" then I think it's fine for us to change the internal behavior at any time, so for now I'd probably stick with instanceof and if necessary we could switch to Object.prototype.toString

@RReverser
Copy link
Member Author

so for now I'd probably stick with instanceof and if necessary we could switch to Object.prototype.toString

Oh, actually, this brings me back to another question from the beginning of this PR - do we really want to continue to support boxed variants of Number and Boolean? If so, current change to typeof will break them (because for boxed variants it returns object as expected).

Given that we don't support boxed variant of JsString and in general boxing primitives into objects is usually a bad idea and rarely happens in real-world code, I'd suggest to soft-deprecate fn new for these two types, but the final decision is not up to me.

@alexcrichton
Copy link
Contributor

So long as "general idiomatic patterns continue to work" I don't have strong feelings one way or another. I don't really know what it means to not support them or to to support them.

@Pauan
Copy link
Contributor

Pauan commented Apr 4, 2019

@alexcrichton In JS there is a distinction between primitive and boxed values:

  • "foo" is a primitive string.
  • new String("foo") is a boxed string.

Most of the time they behave the same, but there are some differences:

typeof "foo" === "string"
typeof new String("foo") === "object"
"foo" instanceof String === false
new String("foo") instanceof String === true

On the other hand, these are both true:

{}.toString.call("foo") === "[object String]"
{}.toString.call(new String("foo")) === "[object String]"

So you can use that to reliably detect both.

Right now, since JsString uses instanceof, that means that it's not actually possible to use string primitives like "foo" as a JsString.

If it used typeof then it would only work with primitive strings.

If it used {}.toString.call then it would work with both.

Boxed strings are very rare in JS. So using either typeof or {}.toString.call is fine, but using instanceof is not okay.

Since {}.toString.call supports both, it seems like the obvious choice, but it does have a bit of a performance cost, so using typeof may be better.

On the other hand, some types in JS don't work with typeof, so in those cases I recommend using {}.toString.call, since it's the most reliable method I know of for type detection (and it works cross-realm).

@RReverser
Copy link
Member Author

Sorry for the delay here - I think we agreed with the plan for this PR for now, I just need few days to finish as working on few projects in parallel :) (as far as I understand, there is no rush since no one raised these issues before anyway, so hopefully few days is fine)

@RReverser
Copy link
Member Author

Ok I fixed issues and decided to rename the method to is_type_of (with a signature T::is_type_of(&JsValue) -> bool) - @alexcrichton you said, it's unlikely to be called directly often and the longer name is preferable, and this seems to reflect what it does the best.

I've also added a default fallback to existing instanceof instead of replacing it altogether, as discussed above.

@RReverser
Copy link
Member Author

One more note: I haven't added custom check to type Object yet because it seems to be a breaking change and is tracked separately in #1366.

@RReverser
Copy link
Member Author

Noticed that, just like Function::try_from, JsString::try_from can also be updated to being an alias for dyn_ref instead of an unsafe code, so updated it too.

@alexcrichton
Copy link
Contributor

Nice! Could this perhaps switch the direction of the arguments though? Instead of T::is_type_of(&obj) could that be written as obj.is_type_of::<T>()? (sort of like instanceof today). That'd then require a seaparate method which isn't typically called by users but would be used to do the T::some_other_name(&obj).

Additionally, could the documentation be updated to indicate when you might want to use is_instance_of vs is_type_of?

@RReverser
Copy link
Member Author

That'd then require a seaparate method

Yeah this was the primary reason I didn't do it - from the discussion above I've got an impression that we don't want to encourage this method to be often called directly?

Also note that semantically it's in obj.is_type_of::<T>() that the name would have to be different, not in T::is_type_of -> T::some_other_name, because the direction of the name is rather opposite from instanceof ("T is type of value" vs "value is instance of T").

But, anyway, I guess we could add obj.has_type::<T>, the question is only whether we actually need it...

@alexcrichton
Copy link
Contributor

I don't really mind too much about the naming, but I suspect that we'll want to recommend that this is called almost all the time instead of is_instance_of. Most users will use dyn_* which won't matter much, but in some cases I suspect that is_instance_of is in used, and we'll want to make sure the recommended form is still of the form self.method_name::<T>() to ensure it reads left-to-right

@RReverser
Copy link
Member Author

Ok I can add obj.has_type::<T>() then.

@RReverser
Copy link
Member Author

(this was just a rebase to fix conflicts for now)

@RReverser
Copy link
Member Author

@alexcrichton Ok done and looks like CI passed.

@alexcrichton alexcrichton merged commit 657b97b into rustwasm:master Apr 12, 2019
@alexcrichton
Copy link
Contributor

Thanks @RReverser!

@RReverser RReverser deleted the instanceof branch April 12, 2019 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants