New Array implementation #674

Closed · wants to merge 10 commits into base: master

Skinney (Contributor) commented Jul 24, 2016

The current implementation of Array in Elm is buggy, badly documented, and hard to understand.

This PR introduces a re-implementation of Array that has none of the bugs of the current implementation and is written mostly in Elm.

Implementation details

This implementation adds a new module called JsArray that provides a thin but immutable wrapper over JavaScript arrays. This module is not exposed, so it's only available within core. The module could be inlined into the Array module, but I felt it would be cleaner to have it as its own module, as it deals with its own type.
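
To make the contract concrete, here is a pure-Elm stand-in with roughly the semantics described (the real module wraps a native JS array for O(1) reads; the names and signatures here are inferred from the review comments below, not confirmed against the actual source):

type JsArray a
    = JsArray (List a)

empty : JsArray a
empty =
    JsArray []

length : JsArray a -> Int
length (JsArray xs) =
    List.length xs

-- Out-of-bounds reads yield Nothing rather than undefined.
get : Int -> JsArray a -> Maybe a
get idx (JsArray xs) =
    List.head (List.drop idx xs)

-- Copy-on-write update: the original array is left untouched, and
-- an out-of-bounds index leaves the contents unchanged.
set : Int -> a -> JsArray a -> JsArray a
set idx val (JsArray xs) =
    JsArray (List.indexedMap (\i x -> if i == idx then val else x) xs)

push : a -> JsArray a -> JsArray a
push x (JsArray xs) =
    JsArray (xs ++ [ x ])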

The actual Array implementation is written in Elm and makes use of the JsArray module. It is not a Relaxed Radix Balanced (RRB) tree, which is why append operations are much slower. On the other hand, this implementation "Just Works" with Elm's equality operator, which means that special handling of Arrays is unnecessary when performing equality checks. Checking two arrays for equality can also be an order of magnitude faster with this implementation.
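
For illustration, here is what "Just Works" means in practice (a hypothetical snippet assuming this PR's Array module; it relies on equal contents always producing an identical structure):

import Array exposing (fromList, push)

-- Two arrays built along different paths compare equal with (==),
-- because equal contents yield an identical internal structure.
a = fromList [ 1, 2, 3 ]

b = push 3 (fromList [ 1, 2 ])

-- a == b  evaluates to  True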

initialize and slice operations are slower due to a naive implementation, which could be improved in the future. On Chrome, get runs at about 50% of the current implementation's performance, while in Safari get is on par with it. On both browsers, set and push are faster with this implementation.

Performance could improve with optimizations in the Elm language (like inlining Bitwise operations and wrapper-less function calls). Also, if a pop function is introduced, it could be used to improve performance of slice operations. Such a function would have to be careful to do the exact opposite of push to make sure == still works.

This implementation doesn't have any of the bugs that plague the current Array implementation in core.

The implementation is based on the same idea as Persistent Vectors in Clojure. push pushes elements into a tail. When the tail is of size 32, a new empty tail is created, while the "full" tail is pushed into a HAMT-based tree. Pushing elements onto the array, as well as reading and replacing at the tail, is therefore a constant time operation.
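
As a rough sketch of the shape this describes (names and types here are illustrative, not the PR's actual definitions; in particular, the real code stores nodes in JsArrays and inserts full tails at the correct depth of the tree rather than appending at the root):

type Node a
    = SubTree (List (Node a)) -- interior node of up to 32 children
    | Leaf (List a)           -- a "full" tail of exactly 32 items

type Array a
    = Array Int (List (Node a)) (List a)
      -- length, tree (root level), tail (the 0..31 newest items)

push : a -> Array a -> Array a
push x (Array len tree tail) =
    let
        newTail =
            tail ++ [ x ]
    in
    if List.length newTail == 32 then
        -- Tail overflow: move the full tail into the tree as a leaf
        -- and start over with a fresh, empty tail.
        Array (len + 1) (tree ++ [ Leaf newTail ]) []
    else
        -- Common case: only the small tail buffer is rebuilt.
        Array (len + 1) tree newTail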

To find a value in the tree, we divide the index into groups of five bits. This allows a single index to represent 7 groups of numbers between 0 and 31. We can use these groups to search for a value in the tree, where each group corresponds to an index at each level of the tree.

Which group represents the root/top of the tree depends on the maximum depth of the tree. If an array has fewer than 32 elements, there really is no tree; we only need to look at the tail. If the tail overflows, however, it is pushed into the tree, and we will have a maximum depth of two. The top level of the tree will in this case be represented by the second group of bits, while the first five bits will represent the position in the lowest level of the tree. Should the tree have a maximum depth of three (more than 1024 elements), the third group of bits will represent the position at the root.

This way of ordering the elements ensures that the value corresponding to index 0 is at the bottom left of the tree, while the value corresponding to the highest index not in the tail is at the bottom right. In other words, the values are sorted by their corresponding index.
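
To make the bit-group arithmetic concrete, here is a hypothetical helper equivalent to the masking described above (the real code uses Bitwise shifts and masks; for non-negative indexes, integer division and modulo behave the same way):

-- Which of the 32 slots an index selects at a given level of the
-- tree, where level 0 is the bottom. Bitwise equivalent:
-- (index >>> (5 * level)) & 0x1F
slotAt : Int -> Int -> Int
slotAt level index =
    (index // (32 ^ level)) % 32

-- Example with a depth-two tree and index 70:
--   slotAt 1 70 == 2   (slot index at the root level)
--   slotAt 0 70 == 6   (slot index within that leaf)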

Benchmarks

The benchmarks are available in the native branch on http://www.github.com/Skinney/collections-ng.git.

The results, on a 2012 MacBook Air running the latest version of Chrome, are as follows:

For the current implementation:

Starting Large Array suite.
bench.js:8501 Build x 88.14 ops/sec ±2.12% (48 runs sampled)
bench.js:8501 Repeat x 4,259 ops/sec ±0.86% (60 runs sampled)
bench.js:8501 Set x 833,365 ops/sec ±1.94% (61 runs sampled)
bench.js:8501 Push x 788,039 ops/sec ±1.68% (60 runs sampled)
bench.js:8501 Get x 7,959,420 ops/sec ±1.78% (59 runs sampled)
bench.js:8501 Append x 448,015 ops/sec ±1.84% (59 runs sampled)
bench.js:8501 Slice from end x 706,822 ops/sec ±1.93% (60 runs sampled)
bench.js:8501 Slice from both x 383,147 ops/sec ±1.68% (61 runs sampled)
bench.js:8501 Fold x 7,460 ops/sec ±0.61% (40 runs sampled)
bench.js:8501 Indexed Map x 3,382 ops/sec ±1.69% (60 runs sampled)
bench.js:8501 Indexed List x 517 ops/sec ±3.56% (56 runs sampled)
bench.js:8501 Equality x 5,563 ops/sec ±0.79% (32 runs sampled)
bench.js:8501 Equality fail x 5,535 ops/sec ±0.75% (32 runs sampled)
bench.js:8501 Equality worst case x 3,296 ops/sec ±0.80% (21 runs sampled)
bench.js:8503 Done with Large Array suite.

For the implementation in this PR:

Starting Large Array suite.
bench.js:8501 Build x 101 ops/sec ±3.51% (51 runs sampled)
bench.js:8501 Repeat x 107 ops/sec ±0.95% (53 runs sampled)
bench.js:8501 Set x 1,423,405 ops/sec ±0.67% (61 runs sampled)
bench.js:8501 Push x 1,306,977 ops/sec ±0.75% (60 runs sampled)
bench.js:8501 Get x 3,101,145 ops/sec ±0.74% (61 runs sampled)
bench.js:8501 Append x 107 ops/sec ±2.46% (54 runs sampled)
bench.js:8501 Slice from end x 102 ops/sec ±4.07% (52 runs sampled)
bench.js:8501 Slice from both x 104 ops/sec ±1.42% (52 runs sampled)
bench.js:8501 Fold x 3,425 ops/sec ±0.61% (22 runs sampled)
bench.js:8501 Indexed Map x 101 ops/sec ±2.24% (52 runs sampled)
bench.js:8501 Indexed List x 808 ops/sec ±2.93% (54 runs sampled)
bench.js:8501 Equality x 1,113,068 ops/sec ±1.06% (61 runs sampled)
bench.js:8501 Equality fail x 1,118,502 ops/sec ±0.65% (61 runs sampled)
bench.js:8501 Equality worst case x 292 ops/sec ±0.72% (58 runs sampled)
bench.js:8503 Done with Large Array suite.

As mentioned earlier, results will vary between different browsers.

process-bot commented Jul 24, 2016

Thanks for the pull request! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

Commit: Providing an index to slice which is above array length should imply the rest of the array, not the rest of the array minus one.

shmookey commented Jul 24, 2016

To save others the calculation, for each test this new version is...

Build               15% faster
Repeat              3980% slower
Set                 71% faster
Push                66% faster
Get                 39% slower
Append              420000% slower
Slice from end      690000% slower
Slice from both     370000% slower
Fold                220% slower
Indexed Map         3300% slower
Indexed List        56% faster
Equality            20000% faster
Equality fail       20000% faster
Equality worst case 1100% slower

Skinney (Contributor) commented Jul 24, 2016

Get should be slower, not faster?

shmookey commented Jul 24, 2016

Right, fixed, sorry!

Skinney (Contributor) commented Jul 24, 2016

No worries! I'm glad you added the numbers. That being said, I would appreciate it if you also added numbers for equality, if it's not too much of a bother, as some of the tradeoff has been on equality vs append performance.

Repeat, initialize, slice, map, filter, append, etc. all essentially use a naive foldl-push based solution, while the current implementation uses specialized native code. That explains some of the difference in performance, but also indicates that there is potential for optimization. The "slice from end" test, for instance, could probably see an order-of-magnitude improvement if a pop function were introduced.

The focus for this PR has been on correctness and getting the basic operations (get, set and push) competitive in performance.

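To make that pattern concrete, a naive slice of the kind described above might look like this (hypothetical code, assuming get, push, and empty from the Array API):

-- Rebuild-by-folding: every kept element is read from the source
-- and pushed onto a fresh array, so even dropping one element from
-- the end costs O(n) pushes.
naiveSlice : Int -> Int -> Array a -> Array a
naiveSlice from to arr =
    let
        go i acc =
            if i >= to then
                acc
            else
                case get i arr of
                    Just x ->
                        go (i + 1) (push x acc)

                    Nothing ->
                        acc
    in
    go from empty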

shmookey commented Jul 24, 2016

Sure thing. My thoughts were initially that it doesn't matter if your equality is faster or slower because it's correct and the old one is not - but I suppose it's nice to be able to show that yours is also hundreds of times faster! 😄

Skinney (Contributor) commented Jul 24, 2016

Equality of arrays has been fixed in newer versions of elm-lang/core by special handling of arrays in the eq function (see commit 8cd8606). So it currently works, it's just slower than it has to be.

Review comment on src/Native/JsArray.js:
function get(idx, arr) {
    var val = arr._0[idx];
    if (val) {

eeue56 (Contributor) commented Jul 24, 2016

This is wrong. What if the arr contains vals which are bools, some of which are false? I would use typeof val === "undefined"

Review comment on src/Native/JsArray.js:
}

function set(idx, val, arr) {
    var copy = arr._0.slice();

eeue56 (Contributor) commented Jul 24, 2016

There is no bounds checking here. From the comment:

Sets an element at a given index. Returns an unmodified array if index is out of bounds.

Check the array size, and make sure 0 <= idx < array.length

Review comment on src/Native/JsArray.js:
function set(idx, val, arr) {
    var copy = arr._0.slice();
    copy[idx] = val;

eeue56 (Contributor) commented Jul 24, 2016

This will put elements freely in the array. E.g.:

[screenshot]

Review comment on src/Native/JsArray.js:
function foldl(f, init, arr) {
    var a = init,
        len = length(arr),
        i;

eeue56 (Contributor) commented Jul 24, 2016

I think this style is needlessly risky. Better to just move var i = 0 into the for loop.

Review comment on src/Native/JsArray.js:
function foldr(f, init, arr) {
    var a = init,
        i;

eeue56 (Contributor) commented Jul 24, 2016

ditto above


{-| Pushes an element to the end of the array, increasing its size.
-}
push : a -> JsArray a -> JsArray a
push =

eeue56 (Contributor) commented Jul 24, 2016

A wrapper around Array.prototype.concat might be helpful for performance, but nbd right now.

shmookey commented Jul 25, 2016

Also, if a pop function is introduced, it could be used to improve performance of slice operations.

Could you elaborate on this at all? Not that I find myself slicing arrays very often, but the performance hit here is large, and that seems like as good a reason as any to add a new function. Presumably it needn't even be exposed?

Skinney (Contributor) commented Jul 25, 2016

Right. So currently, the slice operation is essentially implemented as foldl push empty source - that is, folding over the source and pushing each kept element onto an empty array. In the case where you only want to slice from the end, it would be faster to just pop elements off the array, which doesn't require a rebuild. This should also make slice faster when slicing from both ends, as there would be fewer elements to fold over.

The first concern is to have a good foundation in core. Optimizations can come later. I'll be happy to work further on performance if this is merged.

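A sketch of that idea (pop is hypothetical here - it does not exist in this API yet):

-- Slice from the end by repeatedly popping the last element off
-- the tail instead of rebuilding the whole array. When the tail
-- empties, the last leaf of the tree would become the new tail.
-- pop : Array a -> Array a is a hypothetical function.
sliceFromEnd : Int -> Array a -> Array a
sliceFromEnd newLength arr =
    if length arr <= newLength then
        arr
    else
        sliceFromEnd newLength (pop arr)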

evancz (Member) commented Jul 25, 2016

Optimizations can come later. I'll be happy to work further on performance if this is merged.

I don't think this is a good way to approach things. It seems pretty smart to find a way to make slice faster. It's almost 7,000x slower, so I think that's worth exploring. Maybe the result is "We tried A and B and here's why they both do not work." That's better than "Maybe it can be better, but let's merge it into core before thinking it through."

I'm open to giving Skinney/elm-array-draft the ability to use the native wrapper for internal use of 32 element arrays if you need that so people who already have code that uses Array can try it and see more easily. I don't think that's what you were saying though.

Skinney (Contributor) commented Jul 25, 2016

What I'm worried about is spending time optimizing something and then not having this implementation used. But I see your point about slice, and other operations, being so much slower that it warrants further inspection. I'll keep at it and see what I come up with.

evancz (Member) commented Jul 25, 2016

What I'm worried about is spending time optimizing something, and then not have this implementation used.

This is always a possible outcome. I tried to express that here. For example, the List.take improvements started with something that was not viable. At that time, we had no idea if it would become viable. If it didn't, wouldn't it have been a bad outcome to merge it?

Point is, thinking that "contribution" means "merged into core" is a poisonous mindset. Languages cannot work that way. Like I say here, many of my projects don't make the cut. Learning is progress. Prototypes are progress. Performance numbers are progress. Discussion is progress. Doing work doesn't mean core should change.

(This is also why it is very important to talk about projects like this on elm-dev while they are in progress. That I am hearing about this for the first time as a PR frames our discussion as accept-or-reject which is not a reasonable first discussion of a large and complex change IMO.)

evancz (Member) commented Jul 25, 2016

To come back to my initial comments here, I think it is important that we find people using Array in their projects and see what kind of impact this has on them. Like, if someone has a grid-based game, does it still work fine?

That data is important for assessing the way things are now, and will make it clearer what else would be needed. (Having it be a separate package while we assess these things makes it possible to gather data without having an accept-or-reject discussion yet.)

So next steps should be finding Array users.

Skinney (Contributor) commented Jul 25, 2016

I don't disagree with anything you're saying, but I think there are some misconceptions here that are worth clearing up.

  1. This code is already available in a separate library, called CollectionsNg, which also contains code for Dict and Set with working equality. This has been posted to the elm-dev mailing list in the past. The only difference between this PR and CollectionsNg (currently) is the JsArray module (CollectionsNg is available in elm-package, so no native code).

  2. I'm well aware that this code might not make the cut, and that's fine. For those who value correctness over performance, CollectionsNg is already available. The reason I would rather wait to do optimization on other functions in the API, is because you've already signalled that a 40% drop in get performance might be too much. If the people using Array today cannot tolerate a 40% drop, and we are not able to close that gap, spending time on optimizing other functions would seem like a waste of time, unless those optimizations will be available in CollectionsNg (all my ideas for optimizing other functions currently involve more use of native code).

Anyway, I agree that getting more information from users of the Array module is the logical next step, so the question becomes how we go about that. I've advertised CollectionsNg any way I know how, and I don't think the library sees much use.

Should I close this PR until we have more information from users of Array? Either way is fine by me. :)

evancz (Member) commented Jul 25, 2016

Given that equality on Set and Dict is fixed in core, I think it makes sense to create a package focused primarily on an alternate Array implementation. In the spirit of obvious names, I would suggest calling it Skinney/elm-array-exploration and having it only expose Array.Hamt.

I can't find the RRBT version, but I have talked to the person working on it at Elm meetups in SF a lot. Maybe it could be included in this package as Array.Rrbt, making it easy for people to compare.

I'm also down to let a library like this have native access, with the explicit goal of figuring out how best to have a better version in core.

Once such a thing exists, we can be explicit about encouraging people to give it a try and see how it works. Another way to get "user data" is to create projects that naturally want to use arrays and see how they look. Maybe certain domains do not use append often? Maybe others do? This kind of thing can be distributed easily, but we'll have to solicit help if we want it.

Does that route make sense?

Skinney (Contributor) commented Jul 25, 2016

Sounds like a plan to me. I'll let you know when I have a repo up and running.

evancz (Member) commented Jul 25, 2016

Also, that makes sense about get vs slice. Makes sense to focus on the one that's the bigger deal.

On the note of inlining bitwise operations, I know @mgold did that by hand just to see what kind of impact it would have on his Random alternative. It went from slower-than-current to faster, which means there are basically no tradeoffs. Only better. I think it would be worth doing the same hand-modifications here just to see. Max also had created custom infix operators, and taking them out was a big improvement.

evancz (Member) commented Jul 25, 2016

Yeah, that's exactly what he did :) And having two libraries that want it means more folks can make sure the optimization is done correctly.

Skinney (Contributor) commented Jul 25, 2016

Hit the delete button by mistake, now your comment looks weird. Haha. Anyway, cool. I'll have to try that out. :)

mgold (Contributor) commented Jul 26, 2016

I'll chime in quickly to verify that Evan is correct: don't do (&&) = Bitwise.and, and having the compiler optimize bitwise ops will be a nice speedup. Also, I'm totally stealing your benchmarking setup.

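To illustrate the pattern being warned against, here is a minimal sketch (the operator and helper names are made up for this example; that the alias blocks optimization is the claim being discussed, not something verified here):

import Bitwise

-- Slow pattern: aliasing a Bitwise function behind a custom
-- operator adds an extra curried call on every use and hides the
-- call site from future compiler inlining.
(&&&) : Int -> Int -> Int
(&&&) =
    Bitwise.and

maskSlow : Int -> Int
maskSlow index =
    index &&& 0x1F

-- Faster pattern: apply Bitwise.and directly, which the compiler
-- could eventually turn into a plain JavaScript `&`.
maskFast : Int -> Int
maskFast index =
    Bitwise.and index 0x1F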

Skinney (Contributor) commented Jul 26, 2016

I'll take that as a compliment @mgold

Skinney (Contributor) commented Jul 29, 2016

@evancz The repo Skinney/elm-array-exploration is now up. I'm going to try to make some optimizations to slice before making a release, but maybe you could bless the repo now? (no idea how this works).

Skinney (Contributor) commented Jul 30, 2016

@evancz @mgold manually inlining the bitwise operations had a pretty nice effect on performance.

Before inline:

Starting Large Array suite.
bench.js:7347 Build x 128 ops/sec ±2.57% (53 runs sampled)
bench.js:7347 Repeat x 775 ops/sec ±2.36% (51 runs sampled)
bench.js:7347 Set x 1,256,978 ops/sec ±2.36% (59 runs sampled)
bench.js:7347 Push x 1,249,187 ops/sec ±2.13% (59 runs sampled)
bench.js:7347 Get x 3,078,710 ops/sec ±0.76% (56 runs sampled)
bench.js:7347 Append x 109 ops/sec ±2.45% (50 runs sampled)
bench.js:7347 Slice from end x 103 ops/sec ±2.34% (52 runs sampled)
bench.js:7347 Slice from both x 104 ops/sec ±1.65% (52 runs sampled)
bench.js:7347 Fold x 3,455 ops/sec ±0.85% (22 runs sampled)
bench.js:7347 Indexed Map x 98.51 ops/sec ±2.43% (50 runs sampled)
bench.js:7347 Indexed List x 1,507 ops/sec ±0.95% (60 runs sampled)
bench.js:7347 Equality x 1,102,689 ops/sec ±0.90% (59 runs sampled)
bench.js:7347 Equality fail x 1,110,780 ops/sec ±1.62% (60 runs sampled)
bench.js:7347 Equality worst case x 290 ops/sec ±0.95% (56 runs sampled)
bench.js:7349 Done with Large Array suite.

After inline:

bench.js:7345 Starting Large Array suite.
bench.js:7347 Build x 128 ops/sec ±2.67% (53 runs sampled)
bench.js:7347 Repeat x 1,029 ops/sec ±2.44% (46 runs sampled)
bench.js:7347 Set x 1,531,744 ops/sec ±2.24% (58 runs sampled)
bench.js:7347 Push x 1,351,414 ops/sec ±2.31% (57 runs sampled)
bench.js:7347 Get x 5,004,991 ops/sec ±1.32% (58 runs sampled)
bench.js:7347 Append x 120 ops/sec ±2.16% (54 runs sampled)
bench.js:7347 Slice from end x 115 ops/sec ±2.09% (52 runs sampled)
bench.js:7347 Slice from both x 115 ops/sec ±1.86% (53 runs sampled)
bench.js:7347 Fold x 3,491 ops/sec ±0.80% (22 runs sampled)
bench.js:7347 Indexed Map x 110 ops/sec ±3.04% (50 runs sampled)
bench.js:7347 Indexed List x 1,554 ops/sec ±0.64% (12 runs sampled)
bench.js:7347 Equality x 1,111,154 ops/sec ±0.90% (61 runs sampled)
bench.js:7347 Equality fail x 1,118,650 ops/sec ±0.96% (61 runs sampled)
bench.js:7347 Equality worst case x 285 ops/sec ±0.99% (59 runs sampled)
bench.js:7349 Done with Large Array suite.

cobalamin commented Aug 2, 2016

I made something like a zipper for a tree - not a binary tree, but one where every node can have arbitrarily many children. I specifically chose Array for this because I need both get and slice a lot to traverse this zipper upwards or downwards. It shouldn't be too much work for the average computer of course, but when this tree grows big, a roughly 7,000x performance degradation on slice (of which I do two in each step downwards or upwards) sounds pretty bad. I hope this can be optimised by a big margin 😄

Disclaimer: I don't know if my implementation makes any sense, it might be pretty naive, but I'm just giving an example for my usage of Array :)

Skinney (Contributor) commented Aug 2, 2016

@cobalamin Glad to hear how you're using Array :)

If you look at the latest work in the new repo (Skinney/elm-array-exploration), you will see that I've managed to significantly optimize slicing from the end/right, to the degree that it should be competitive with the current implementation (I still have some exploration to do before I release new benchmark results). I also have an idea on how to do this for left slices, but I think my solution will reduce get performance :/

I'm also getting close to understanding how RRBT (the current implementation) works, meaning I could replicate it with my own implementation. I'm pretty certain such an implementation would be slower than the current PR in everything except append (and maybe left slicing), but we won't know until we have benchmarks.

Also worth noting: the benchmark results you see are performed on Arrays of 10,000 elements. Being able to slice an array that big 115 times per second on a MacBook Air from 2012 should not be an issue for the majority of frontend applications. I'm not saying this as an excuse or anything - I'm trying as hard as I can to make this implementation as fast as possible - but it's important to keep in mind what the numbers actually mean.

evancz (Member) commented Aug 3, 2016

Important note: the existing thing is not a proper RRBT implementation. It is something inspired by the paper, but it is not doing the right thing as far as we can tell. Maybe @rtfeldman knows where the community RRBT array implementation lives? I heard it was coming along nicely.

evancz (Member) commented Aug 3, 2016

Also, I'd love to know what strategies you are considering when trying to optimize things. Even just a quick "here are two things I'm considering" on elm-dev would be pretty cool because (1) it'll be easier to tell when an "optimization technique" is really an optimization hack and (2) more folks will hear about the project and come up with examples like the recent commenter.

Skinney (Contributor) commented Aug 3, 2016

Oh, interesting. Sure, I'll write a little status report on elm-dev soon :)

Skinney (Contributor) commented Aug 4, 2016

I'm closing this PR. Work is now being continued in another repo, and I plan to pick up the discussion on the elm-dev mailing list soon.

Skinney closed this Aug 4, 2016
