
## Optimization of the code step by step

I will copy here the output of the benchmark run on my computer after each
optimization change, together with a short explanation.

*__Note__: I did not check whether the real issues were caused by generators,
bound functions, V8 optimization inhibitors, or something else entirely...*

### When the benchmark was added
> d903efb8d9011a44a5ff732c569b2ae33aa5b8f1

What the benchmark code does is, in order:

- build an empty tree T
- add 100000 elements to the beginning of T
- remove those 100000 elements by popping the first element of T 100000 times
- add 100000 elements to the end of T
- split T at all 100000 positions, one after the other
- remove the 100000 elements of T by popping the last element 100000 times

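To make this concrete, a timing harness for the first two steps could look
roughly like the sketch below. The method names follow the benchmark labels;
the `empty` constructor and the counting measure are assumptions for
illustration, not necessarily the library's exact API.

```js
// Hypothetical sketch of the benchmark structure, not the actual
// benchmark/benchmark.js.  The constructor and the measure object are
// assumptions; `cons` and `tail` follow the labels reported below.
var fingertree = require( "aureooms-js-fingertree" ) ;

// counting measure: every element counts for 1
var counting = {
	zero : function ( ) { return 0 ; } ,
	measure : function ( ) { return 1 ; } ,
	plus : function ( a , b ) { return a + b ; }
} ;

var N = 100000 ;
var t = fingertree.empty( counting ) ; // hypothetical constructor name

var start = Date.now( ) ;
for ( var i = 0 ; i < N ; ++i ) t = t.cons( i ) ; // prepend N elements
console.log( "cons: " + ( Date.now( ) - start ) + "ms" ) ;

start = Date.now( ) ;
for ( var j = 0 ; j < N ; ++j ) t = t.tail( ) ; // pop the first element N times
console.log( "tail: " + ( Date.now( ) - start ) + "ms" ) ;
```
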
Here are the first running times I measured for this implementation. It is not
difficult to see that the implementation is not fast at all, and 100000
elements is not that many, right?

```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 8584ms
tail: 6841ms
push: 8581ms
split: 144243ms
init: 6896ms
```

However, most of the motivation for these optimization steps came from applying
the same benchmark to
[another JavaScript implementation](https://github.com/qiao/fingertree.js)
(actually, I copied the benchmark from Joe's repository). This made it obvious
that my implementation was performing badly.

```sh
$ node ../fingertree.js/benchmark/benchmark.js
number of operations: 100000
cons: 155ms
tail: 221ms
push: 134ms
split: 1889ms
init: 141ms
```

### First implementation of lazy evaluation of subtrees
> f0ca58169a32140a8617625466f15489a3549124

A Deep instance possesses a subtree which, in some cases, is not accessed at all
during the whole lifetime of its parent. So why build it? The idea is to delay
the construction of the object, which might be expensive, to a later point in
time using a proxy: a finger tree look-alike object that will construct the
real subtree only if needed.

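A minimal sketch of the idea, assuming the subtree construction can be captured
in a thunk (the actual proxy in the library exposes the full finger tree
interface and is more involved):

```js
// Illustrative sketch only: delay an expensive construction behind a
// thunk and build the real object the first time it is needed.
class LazySubtree {

	constructor ( thunk ) {
		this.thunk = thunk ;
		this.tree = null ;
	}

	// returns the real subtree, constructing it on first access
	force ( ) {
		if ( this.tree === null ) this.tree = this.thunk( ) ;
		return this.tree ;
	}

}
```

A Deep instance can then store such a proxy for its middle subtree and only pay
for the construction when that subtree is actually accessed.
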
Time measurements after applying lazy evaluation concepts:
```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 8585ms
tail: 6812ms
push: 8113ms
split: 139449ms
init: 6777ms
```

Unfortunately, there is no big performance improvement here, but as you'll see,
there are some nastier things to change.

### Unwrap values
> 19c8ca9298c98eef7e3f257257cb4e01adfe0edb

Before this change, each value was first wrapped in a unique-value-node-like
structure defined as:

```js
class Measured {

	constructor ( element , v ) {
		this.element = element ;
		this.v = v ;
	}

	measure ( ) {
		return this.v ;
	}

}

where `v` is the value returned by `Measure.measure( element )`. This wrapping
made it possible to give the same interface to elements, nodes, digits and
trees with respect to computing measures. This approach has at least two
problems:

- It introduces one additional level of indirection by wrapping all values.
- The way it was implemented here caches the measures of all elements. This
  might not be a good thing in settings where measures are big objects.
  Moreover, it contradicts the *lazy* spirit of functional programming.

However, there is a simpler way to solve the *common interface* problem. Indeed,
the elements contained in the digits of the root tree can be measured with the
Measure object left unchanged, and starting from the subtree of the root level
we just need a modified Measure object that uses the interface of Node2 and
Node3 to compute the node measures. Digits never contain objects of different
types, so patching the Measure object this way suffices: there is no need to
give the same interface to nodes and elements.

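A sketch of what such a patched Measure object could look like (illustrative
names; the actual code may differ): the element-level measure `M` stays
untouched, and the deeper levels use a derived measure that asks the nodes to
measure themselves.

```js
// Illustrative sketch: derive a node-level measure from the
// element-level measure M.  Node2 and Node3 instances know how to
// measure themselves, so elements no longer need to be wrapped.
function node_measure ( M ) {
	return {
		zero : function ( ) { return M.zero( ) ; } ,
		plus : function ( x , y ) { return M.plus( x , y ) ; } ,
		measure : function ( node ) { return node.measure( M ) ; }
	} ;
}
```
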
Here are the measurements after the change:

```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 8452ms
tail: 6930ms
push: 7983ms
split: 141338ms
init: 6657ms
```

Mostly noise for the moment. There must be something else to fix...

### Removing iterator loops and bindings for fixed size sequences
> 28f6ee08d33e2cb2b9d0e5cae69948a7013bc0b7

At some point I decided to use a better tool than intuition to make progress:
profiling. This is the first change where I guided the pruning by looking at
the output of `node --prof benchmark/benchmark.js`, processed with
`node-tick-processor` from the [tick](https://www.npmjs.com/package/tick)
package.

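For reference, the profiling workflow is roughly the following (the exact log
file name varies from run to run):

```sh
$ npm install -g tick                  # provides node-tick-processor
$ node --prof benchmark/benchmark.js   # writes an isolate-*-v8.log file
$ node-tick-processor isolate-*-v8.log | less
```
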
Before this change, digit measures were computed using the idiom

```js
reduce( M.plus.bind( M ) , map( M.measure.bind( M ) , values ) , M.zero( ) ) ;
```

However, since all four digit types are hard-coded as classes One, Two, Three
and Four, we might as well give a custom implementation of the measure( ) method
to each digit type. For the Two type, for example, we have:

```js
class Two {

	...

	measure ( M ) {
		return M.plus( M.measure( this.a ) , M.measure( this.b ) ) ;
	}

	...

}
```

Here are the times:

```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 350ms
tail: 493ms
push: 398ms
split: 25851ms
init: 442ms
```

A drastic improvement, right? However, there still seems to be a problem with
the split method...

### Removing loops: part II
> 5b0af102240b3f0cfda7001f6b108a9811594c7a

It turns out I didn't remove all occurrences of this generic digit measurement
method. Here are the running times after removing occurrences of this method
from `Deep.splitTree`.

```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 370ms
tail: 473ms
push: 293ms
split: 7563ms
init: 396ms
```

Better.

### Specialized procedures for small loops: part II
> 88592b2562bea585c868e036a04dd1687e91211c
> 726354a3bd0591d767fb658d206680a6a74d2fbb

These two commits introduce specialized procedures for building trees from
small lists and digits. These are really small trees, i.e. 0 to 4 elements, so
why use the generic `from_iterable` constructor?

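For illustration, a specialized builder can dispatch directly on the input size
instead of iterating. The sketch below does it for digits, assuming the digit
constructors simply store their elements (their full definitions are not shown
above); the same idea applies to building the small trees themselves.

```js
// Illustrative sketch: build a digit directly from 1 to 4 elements
// instead of going through a generic iterable-based construction.
// One, Two, Three and Four are the hard-coded digit classes; their
// constructors are assumed here to simply store the given elements.
function digit_from_small_list ( list ) {
	switch ( list.length ) {
		case 1 : return new One( list[0] ) ;
		case 2 : return new Two( list[0] , list[1] ) ;
		case 3 : return new Three( list[0] , list[1] , list[2] ) ;
		case 4 : return new Four( list[0] , list[1] , list[2] , list[3] ) ;
	}
}
```
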
```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 315ms
tail: 435ms
push: 249ms
split: 4834ms
init: 408ms
```

Here again, there is a big improvement on the split method.

### Dropping some of the es6 syntax
> 295011e3293b3dd2fe1edc80bdf1a14ca6d4dcca

This is a big one. I started writing this library with es6 syntax in mind.
However, looking at the profiler output, you will see:

```sh
$ node-tick-processor isolate-0x2af4cf0-v8.log | grep '3:24'
146 2.3% 2.4% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
278 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
6 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
165 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
97 99.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
127 98.4% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
163 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
146 2.3% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
138 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
58 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
129 100.0% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
3 2.3% LazyCompile: *get /home/aureooms/sandbox/js-fingertree/js/dist/fingertree.js:3:24
```

I didn't analyze this much, but it has to do with the way classes are handled
by babel.

Ok, so I just threw out the es6 classes and replaced them with plain old
prototypes.
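For illustration, the `Two` digit shown earlier becomes something like the
following (a sketch; behaviour is unchanged, only the class syntax goes away):

```js
// Sketch of the es6-class-to-prototype rewrite, using the Two digit
// from earlier as an example.  Only the syntax changes.
function Two ( a , b ) {
	this.a = a ;
	this.b = b ;
}

Two.prototype.measure = function ( M ) {
	return M.plus( M.measure( this.a ) , M.measure( this.b ) ) ;
} ;
```
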
Hereunder are the new running times:

```sh
$ node benchmark/benchmark.js
number of operations: 100000
cons: 83ms
tail: 196ms
push: 64ms
split: 1177ms
init: 148ms
```

Great!

### To be continued...

## References

- Ralf Hinze and Ross Paterson. [Finger Trees: A Simple General-Purpose Data Structure](http://staff.city.ac.uk/~ross/papers/FingerTree.pdf). Journal of Functional Programming, 2006.