New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster implementation for Record #286

Closed
Pauan opened this Issue Jan 16, 2015 · 8 comments

Comments

Projects
None yet
4 participants
@Pauan
Copy link

Pauan commented Jan 16, 2015

Record is currently implemented with a Map. But because it has a fixed size, and because the keys are always strings, it's possible to use a much simpler and faster implementation.

Each Record would have two properties, _keys and _values. _keys is a mutable JavaScript object mapping string keys to indexes, and _values is a mutable JavaScript array that contains the actual values.

To lookup a key in the Record, you first look it up in _keys which gives you an index, and then you lookup that index in _values.

To set a key in a Record, you first look it up in _keys, which gives you an index. You then make a copy of the _values array, mutate the index to be the new value, then return a new Record with the new _values array.

There are a few reasons this is fast:

  • You don't need to mutate or copy _keys at all: you can simply pass it as-is to the new Record, because it never changes.
  • Copying an array is super fast in JavaScript, because of various VM optimizations.
  • Looking up a string key in a JavaScript object is faster than doing the equivalent lookup in a Map, and you get to take advantage of the various VM optimizations for objects.

I have implemented this technique in my library, and you can find the benchmarks here:

https://github.com/Pauan/Immutable/blob/javascript/benchmarks/Record/2015-01-16

Getting is always super fast. Setting is fast until you reach around 1,000 keys or so, and it's extremely rare to have Records with 1,000+ keys in them, so that isn't a problem.

@leebyron

This comment has been minimized.

Copy link
Collaborator

leebyron commented Jan 16, 2015

Yes! Thanks for writing up the whole draft of this, this has been on my wishlist for a while and I appreciate you documenting up the approach.

@copy

This comment has been minimized.

Copy link

copy commented Jan 22, 2015

I'm also interested in this.

Under the hood, when you create many objects with the same properties, v8 makes the same optimization as suggested. I think it would perform best if we just use an normal JavaScript object for storage. I'm not sure about the copying cost though. Here's a good introduction to the topic: http://vimeo.com/43334972

@Pauan

This comment has been minimized.

Copy link

Pauan commented Jan 22, 2015

@copy Making a copy of an object is very slow: #174 (comment)

As shown in these benchmarks, making a copy of an array is much faster than making a copy of an object. This isn't surprising, since arrays are much simpler data structures than objects.

@copy

This comment has been minimized.

Copy link

copy commented Jan 22, 2015

@Pauan The copy function in that benchmark is pretty bad, using for-in instead of a function specific to the structure of the Record. Optimally, a Record with three properties x, y, z would create a function like:

function copy(p)
{
    return {
        x: p.x,
        y: p.y,
        z: p.z,
    };
}

It probably doesn't matter too much, but your solution has an indirection on every get/set.

@Pauan

This comment has been minimized.

Copy link

Pauan commented Jan 22, 2015

@copy Using eval or new Function lets you create a custom function for each Record, however certain systems (like Google Chrome Extensions) forbids eval.

You are right that there is a (very minor) indirection, but you'll note that even with the indirection, it blows the pants off of the other immutable data structures, and is close to mutable object speed for get.

I re-ran the benchmarks, this time adding in a custom function as you suggested:

https://github.com/Pauan/Immutable/blob/javascript/benchmarks/Record/2015-01-22

  • JavaScript Object Copying (eval) is your idea.
  • Immutable-js Record is the current implementation in Immutable-js, and it does not use array copying: it uses trees.
  • Immutable Record is the array copying idea, which is discussed in this bug report.

As you can see, your idea is slower for set than copying an array. Copying an array is really really fast. Objects are fundamentally more complex than arrays.

In any case, thanks for the idea (and the video), even if it didn't pan out in the end. I'm always willing to learn new techniques to make these data structures faster.

@Pauan

This comment has been minimized.

Copy link

Pauan commented Jan 22, 2015

After I posted that, I remembered that V8 has optimizations for constructors, so I added in JavaScript Object Copying (constructor). Unfortunately, it ended up being significantly slower than eval.

@leebyron leebyron modified the milestone: 4.0 Jun 18, 2015

@leebyron leebyron removed the 4.0 label Jun 18, 2015

@ghost

This comment has been minimized.

Copy link

ghost commented Aug 4, 2015

Thank you for reporting this issue and appreciate your patience. We've notified the core team for an update on this issue. We're looking for a response within the next 30 days or the issue may be closed.

leebyron added a commit that referenced this issue Mar 10, 2017

RFC: Refactor Record. No longer a Collection. Faster guts.
This is a breaking change refactor of Record which no longer extends a collection type and therefore does not inherit sequence methods - which was a constant source of confusion and breakage. It also refactors the internals to no longer rely on a Map, but instead a fixed size List which should result in dramatically faster performance. This is also a breaking change as `delete()` (aka `remove()`) and `clear()` are no longer available - previously these methods reverted values to their default value, now that can be done with `set(k, undefined)`.

This adds a predicate function `isRecord()` and also minorly refactors related functionality such as `Seq()` constructors.

Fixes #505
Fixes #286

leebyron added a commit that referenced this issue Mar 10, 2017

RFC: Refactor Record. No longer a Collection. Faster guts.
This is a breaking change refactor of Record which no longer extends a collection type and therefore does not inherit sequence methods - which was a constant source of confusion and breakage. It also refactors the internals to no longer rely on a Map, but instead a fixed size List which should result in dramatically faster performance. This is also a breaking change as `delete()` (aka `remove()`) and `clear()` are no longer available - previously these methods reverted values to their default value, now that can be done with `set(k, undefined)`.

This adds a predicate function `isRecord()` and also minorly refactors related functionality such as `Seq()` constructors.

Fixes #505
Fixes #286

leebyron added a commit that referenced this issue Mar 10, 2017

RFC: Refactor Record. No longer a Collection. Faster guts. (#1135)
* RFC: Refactor Record. No longer a Collection. Faster guts.

This is a breaking change refactor of Record which no longer extends a collection type and therefore does not inherit sequence methods - which was a constant source of confusion and breakage. It also refactors the internals to no longer rely on a Map, but instead a fixed size List which should result in dramatically faster performance. This is also a breaking change as `delete()` (aka `remove()`) and `clear()` are no longer available - previously these methods reverted values to their default value, now that can be done with `set(k, undefined)`.

This adds a predicate function `isRecord()` and also minorly refactors related functionality such as `Seq()` constructors.

Fixes #505
Fixes #286

* Add perf test

leebyron added a commit that referenced this issue Mar 10, 2017

RFC: Refactor Record. No longer a Collection. Faster guts. (#1135)
* RFC: Refactor Record. No longer a Collection. Faster guts.

This is a breaking change refactor of Record which no longer extends a collection type and therefore does not inherit sequence methods - which was a constant source of confusion and breakage. It also refactors the internals to no longer rely on a Map, but instead a fixed size List which should result in dramatically faster performance. This is also a breaking change as `delete()` (aka `remove()`) and `clear()` are no longer available - previously these methods reverted values to their default value, now that can be done with `set(k, undefined)`.

This adds a predicate function `isRecord()` and also minorly refactors related functionality such as `Seq()` constructors.

Fixes #505
Fixes #286

* Add perf test
@ianstormtaylor

This comment has been minimized.

Copy link

ianstormtaylor commented Apr 1, 2017

@leebyron this is in the 4.0 releases now, yeah? Is it something that you've seen big enough performance gains with that it should be added to the release history? Others might be interested if so 😄 thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment