Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make binding (mostly) idempotent #46

Open
dead-claudia opened this issue Feb 22, 2017 · 16 comments
Open

Make binding (mostly) idempotent #46

dead-claudia opened this issue Feb 22, 2017 · 16 comments

Comments

@dead-claudia
Copy link
Contributor

As mentioned previously in multiple places here previously, the following behavior is highly unintuitive:

const obj = {foo() { return this }}
function bar() { return this }

assert(::obj.foo !== ::obj.foo)
assert(obj::bar !== obj::bar)

Instead of trying to ditch it entirely, here's my proposed fix: create a global per-realm 2D weak map to memoize bound functions. It's harder to transpile, but engine support won't have to make significant changes to accommodate this. Why am I suggesting this?

  • For similar reasons why people expect obj.foo === obj.foo, making the binding equivalent idempotent would also make this easier. It also makes event registration easier
  • Making idempotent binding weak means it is still collectable
  • Memoizing it per-realm avoids potential security holes when you add arbitrary properties to the returned function.
  • Memoizing it per-realm makes it possible to reliably transpile for most cases in Babel and friends.

This retains some of the existing benefits of the existing proposal with its unity, while removing one of the key gotchas people have complained about. In addition, it allows better engine optimization, because in practice, the weak map is merely an implementation detail:

  • Engines can natively check for known constant functions and values (either statically known or via IC), to improve branch prediction.
  • If you associate the maps per-function for constant ones, you only need a 1D map.
    • A transpiler could also do some of this to avoid some indirection, but only for statically provable cases.
  • Engines can dodge map creation when no callee makes use of function binding by generating the map lazily. This will avoid a perf impact (especially at startup and with simple lambdas) with legacy code.
  • Engines love being called with the same function repeatedly, and this allows for them to employ this optimization better for similar reasons to lambda lifting.

So, what do you all think about making it idempotent per-realm?

@bathos
Copy link

bathos commented Feb 23, 2017

This comes to mind as something I think is pretty weird, if I’m understanding the proposal correctly...

module-a

export const foo = { bar() { /*...*/ } };
(::foo.bar).qux = 4;

module-b

import { foo } from './module-a';

assert(foo.bar.qux === undefined);

const boundBar = ::foo.bar;

assert(boundBar.qux === 4);

I don’t think there’s any precedent for being able to pull objects out of the ether without having a prior reference like this, is there?

(Also, if they’re GCable, does that mean they are GCable when both the bound and the original function are no longer reffed, or in general? Because in the latter case, boundBar.qux might be 4 or it might be undefined — nondetermistically.)

@andyearnshaw
Copy link

@bathos that was my first thought too and I think that would be an important blocker for this idea. I don't see how it can be reliably transpiled either, without introducing a common global variable that all transpilers agreed upon (or accept that it's not really per-realm when you transpile it).

I'd rather see it implemented per scope instead of global/per-realm, e.g.:

module-a

import foo from './module-c';
import qux from './module-b';

qux(::foo.bar);
assert(::foo.bar !== ::foo.bar); // false

module-b

import foo from './module-c';

export default function qux (fn) {
    assert(::foo.bar !== fn); // true
};

Essentially, module-a would transpile/desugar to:

import foo from './module-c';
import qux from './module-b';

const __fooBarBound = foo.bar.bind(foo);
qux(__fooBarBound);
assert(__fooBarBound !== __fooBarBound); // false

I think this could work because it covers the biggest footgun and major use case for memoization, unsubscribe functions. It also gets the same GC benefits and is super easy to transpile to engines that don't support WeakMap.

@bathos
Copy link

bathos commented Feb 23, 2017

That makes more sense to me, I think that’s a lot safer and simpler, and realm-level globalness isn’t actually solving anything I think; 99% of the time, the event handler patterns in question would occur within a single function scope.

But my gut still says this is solving a footgun related to obscure semantics with more obscure semantics. I don’t think this is a good idea mainly because it creates such a unique case, a category of behavior which doesn’t fit into any prior pattern in the language. I probably can’t articulate a better position than that, though ... maybe I’m just thinking too conservative.

@Volune
Copy link

Volune commented Feb 23, 2017

Two cases I think may not work with the per-scope implementation:

const obj = { a() {}, b() {} );
const a = ::obj.a;
obj.a = obj.b;
const b = ::obj.a;
assert(a !== b);
const obj = { a() {}, b() {} }, boundObj = {};
Object.keys(obj).forEach((key) => {
  if (typeof obj[key] === 'function') {
    boundObj[key] = ::obj[key];
  }
});
Object.keys(obj).forEach((key) => {
  if (typeof obj[key] === 'function') {
    assert(boundObj[key] === ::obj[key]);
  }
});

@Alxandr
Copy link

Alxandr commented Feb 23, 2017

@Volune that depends on how you do the per-scope implementations though. If you do a weakmap of some-kind, the name won't matter. Something like this maybe (not tested)?

// input:
const foo = ::console.log;
// result
const __boundFn = (() => {
  // helper, maybe even in babel-polyfill or whatever
  const cache = new WeakMap();
  return (obj, fn) => {
    let fns;
    if (cache.has(obj)) {
      fns = cache.get(obj);
    } else {
      fns = new WeakMap();
      cache.set(obj, fns);
    }

    if (fns.contains(fn)) {
      return fns.get(fn);
    } else {
      bound = (...args) => fn.call(obj, args);
      fns.set(fn, bound);
      return bound;
    }
  };
})();

const foo = __boundFn(console, console.log);

@andyearnshaw
Copy link

andyearnshaw commented Feb 23, 2017 via email

@dead-claudia
Copy link
Contributor Author

Note that the per-scope thing is purely an optimization. The spec would be in terms of a per-realm 2D weak map, with entries added based on the realm each bound function is created. So basically, these would all work as expected:

Same module:

const foo = {bar() {}}
assert.equal(::foo.bar, ::foo.bar)

Different modules, same realm:

// ./a.js
export const foo = {bar() {}}
export const bound = ::foo.bar

// ./b.js
import {foo, bound} from "./a"
assert.equal(::foo.bar, bound)

Same module, different realm:

const vm = require("vm")

const foo = {bar() {}}
const bound = vm.runInNewContext("::foo.bar", foo)
assert.notEqual(::foo.bar, bound)

Different modules, different realm:

// ./a.js
const vm = require("vm")

export const foo = {bar() {}}
export const bound = vm.runInNewContext("::foo.bar", foo)

// ./b.js
import {foo, bound} from "./a"
assert.notEqual(::foo.bar, bound)

As for what Babel and friends will do, they'll need this helper function. (If only one version of this helper is used, then you've got nearly 100% spec compliance.)

// Instead of calling `func.bind(inst)`, it'll call `bind(func, inst)` from here
var wm = new WeakMap()
function bind(func, inst) {
    // Don't memoize for non-primitive `this`
    if (inst == null || typeof inst !== "object" && typeof inst !== "function") {
        return func.bind(inst)
    }
    var child = wm.get(func)
    if (child == null) wm.set(func, child = new WeakMap())
    var ref = child.get(inst)
    if (ref == null) child.set(inst, ref = func.bind(inst))
    return ref
}

Hopefully that better clarifies what I mean.

@dead-claudia
Copy link
Contributor Author

In spec language, it'd look like this:

6.1.7.4 Well Known Intrinsic Objects

Add the following values to the list of well known intrinsic objects:

  • %WeakMapPrototypeGet% - The original value of the get property of %WeakMapPrototype%
  • %WeakMapPrototypeSet% - The original value of the set property of %WeakMapPrototype%

8.2 Realms

Add a [[BoundFunctions]] field, a %WeakMap% instance, to the list of realm record fields.

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

8.2.1 CreateRealm()

Add the following right before the last step:

  • Let bound be ! Construct(%WeakMap%).
  • Set realm.[[BoundFunctions]] to bound.

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

Abstract Operation: InitializeBoundFunction(target, thisValue)

  • Let result be undefined.
  • Let child be undefined.
  • If IsCallable(target) is false, throw a TypeError exception.
  • If Type(thisValue) is Object:
    • Let ctx be the running execution context.
    • Let realm be ctx's realm.
    • Let bound be realm.[[BoundFunctions]].
    • Let child be ! Call(%WeakMapPrototypeGet%, bound)
    • If child is undefined:
      • Let child be ! Construct(%WeakMap%).
      • Perform ! Call(%WeakMapPrototypeSet%, bound, «child»)
    • Assert: child is a WeakMap.
    • Set result to ! Call(%WeakMapPrototypeGet%, child)
  • If result is undefined:
    • Let F be ? BoundFunctionCreate(target, thisValue, «»).
    • Set result to ? InitializeBoundFunctionProperties(F, thisValue).
    • If child is not undefined:
      • Assert: child is a WeakMap.
      • Perform ! Call(%WeakMapPrototypeSet%, child, «result»).
  • Assert: IsCallable(result) is true.
  • Return result.

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

@InvictusMB
Copy link

@isiahmeadows
Why not store a reference to a bound instance of function on the original instance of function itself?
Why introduce a global shared state for this?

@dead-claudia
Copy link
Contributor Author

@InvictusMB You could, but it'd still require a single weak map. I used a global WeakMap for the first layer of indirection as a lazy way to mock up the semantics, but you could add a per-function field.

Either way is transparent to the user, so engines could even use a std::unordered_map on the object if they wanted.

@wearhere
Copy link

I'd like to vote for indexing at the top level by object rather than by function, i.e. adopt an implementation closer to #46 (comment) than the snippet at the bottom of #46 (comment); or, if folks didn't want to introduce global state, using per-object fields rather than per-function fields.

I suspect this will reduce the number of maps, because—at least when binding functions for use with React—it's common to bind multiple functions to the same object, but less common to bind the same function to multiple objects.

@wearhere
Copy link

Hi folks, I have a WIP Babel plugin that forks https://github.com/babel/babel/tree/master/packages/babel-plugin-transform-function-bind to roughly implement #46 (comment), you can see here. More commentary on the Gist, feedback welcome!

@dead-claudia
Copy link
Contributor Author

I like that, and it generally looks what I had intended with my suggestion.

@acutmore
Copy link

Perhaps the cached function should be frozen (Object.freeze) so the caller can not observe/introduce side-effects made permissible by the idempotence.

I think idempotence is crucial for this operator to work intuitively. Especially for the add/remove eventListener example:

addEventListener('foo', obj::bar);
removeEventListener('foo', obj::bar); // would leak listeners if not idempotence
// ref: https://github.com/tc39/proposal-bind-operator/issues/44#issue-203206592

On the other hand, the points about this being observable to the caller when the function is modified are also very good.

@zenparsing
Copy link
Member

It would definitely have to be frozen (kind of like template literal call site objects) in order to satisfy some object-security requirements.

On the other hand, the points about this being observable to the caller when the function is modified are also very good.

You would have to use a WeakMap to store data related to the bound function.

@dead-claudia
Copy link
Contributor Author

@zenparsing The weak map requirement was stated even in the initial comment of this issue. Just thought I'd point that out.

@acutmore I didn't initially think of idempotence, but that sounds like a good idea. If we also allow this, it also allows engines to create flyweights that are ===, but don't share the same memory address under the hood, so they don't incur the map overhead. (This has precedent in the form of strings - they could just return exotic auto-frozen callable objects.)

const bound1 = obj::bound
obj.bound = () => ...
const bound2 = obj::bound
assert(bound1 !== bound2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants