Creating observable array should be optimized #68

kmalakoff · 2015-12-06T16:42:11Z

I added one observable array property to an object and the creation cycle is taking up large amount of time.

I create them like (I really like the idea of shallowObservables for arrays to replace asFlat since I'm using it quite a bit as I optimize):

    this.tags = observable(asFlat(options.tags || []));

I think that the creation path for observable arrays should be streamlined rather than using the generic splicing logic...nothing can observe yet, no childObserving logic is needed if the values are flat, etc.

The text was updated successfully, but these errors were encountered:

mweststrate · 2015-12-11T07:24:14Z

Hi @kmalakoff

  findOrCreate(path) {
    let projectNode = this.projectNode(); // this makes sure there is a root node
    let node = projectNode.find(path);
    if (node) return node;

    // this traverses splits the path and traverses the tree to find the node and will generate the full tree down to it if the path does not exist. It will create the default observable array of tags for each new node based on the name of the node and assign the passed tags.
    node = this.projectNode().findOrCreate(path, (attributes) => {
      return new LocalNode(_.extend({project: this.project, path, attributes, tags: [NAME_TO_TYPE[attributes.name] || 'module'])); // this only gets called if the node is not found. It extends properties into the class
    });
    return node;
  }

This part seems quite expensive, how / why is this triggered during the creating of the tags array? (just trying to get the complete image). Because making an object observable in itself should basically side effect free, so how is this one triggered? Does simplifying this code make a difference for the performance?

I have no problem cutting the childObserving logic out, but in general that is so cheap that I doubt whether that makes a big difference. But as said I'm a bit puzzled how adding a tag triggers all this logic?

kmalakoff · 2015-12-11T14:36:38Z

@mweststrate I'm using chokidar to scan and sync with the filesystem and depending on the files it encounters, it adds tags to a folder node (eg. files map to tags and folders map to nodes).

Very asynchronous / event-driven..but it should be cheap because most of the time it finds an existing folder node (the find before findOrCreate so the create should only happen the first time the folder is encountered).

My hypothesis based on comparing using and not using an observable array is that this is just a scale issue that needs an optimization (1000s of nodes being created). I could have a bug somewhere and can investigate further, but the stack trace looked like it confirmed my hypothesis (I could interpreting this incorrectly!)...it looks like there is something slow in there, but I think it is probably just the overhead of doing setup (needed) and tracking (not needed), but it could be something else? (not sure if a computed calculation dependency is somehow slipping in and getting triggered)

Which part looks expensive to you?

mweststrate · 2015-12-11T16:10:25Z

@kmalakoff Eh... wait I was reading the chart upside down, hence I was totally confused and wondering how findOrCreate is a side effect of replace :). I usually use the top down view ;). The call stack is findOrCreate > replace > splice > updateLength so that makes perfect sense. I'll look into it! I was cleaning up the code base of the core algorithm a bit anyway, so I did already spot some additional optimization opportunities (both in performance and memory consumption).

mweststrate · 2015-12-11T21:56:33Z

@kmalakoff I've played around with short-circuiting the array initialization, see:
a6ac0c1

In my tests it was a bit faster (but not extremely significant, < 10%). Turns out the actual expensive part is defining the properties on the object. Now basically they can also be defined on the prototype, which saves in my tests 90% of the initialization time:
94151c2

Yet the down side of the latter optimization is that the indexes of the array are no longer reported as being enumerable. This the point where the abstraction of observable arrays is a bit leaky until ES6 proxies are general available; mobservable arrays tries their best to be normal arrays but are in fact objects. For most things this doesn't matter; index or length assignments, array methods etc all work correctly. However, Array.isArray will still report false on them, and Array.concat or lodash methods might not handle them correctly, so in these cases it is best to slice() them anyway before passing along.

Currently JSON.stringify and Object.keys do still work correctly, and debuggers are ok-ish thanks to the correct enumerability. But maybe this half working arrays are just confusing people and shouldn't I care too much about the correct enumerability, but just make it more clear that these aren't arrays (just like immutablejs lists aren't arrays) despite the fact that the api and syntax the same as for real arrays.

Anyway, would you mind checking both commits above to see whether this makes the performance of arrays acceptable? The first option can be safely applied without breaking anything. This branch is based on the latest transform branch that includes your commits.

EDIT: accidentally swapped the commits

kmalakoff · 2015-12-12T11:49:13Z

Excellent! I was away from a computer yesterday and will run some performance evaluations this first thing in the morning.

As for the tradeoffs:

indexes of the array are no longer reported as being enumerable seems like a rare, edge case that can probably be solved another way in client code if needed
But maybe this half working arrays are just confusing people and shouldn't I care too much about the correct enumerability. I ran into this and read your docs which made me wonder if there is an easier way to do this. I don't have a solution and need to play around with ES6 features more to better understand the syntactic sugar limits, but it is on my list to think about (not sure if there is a way to add this functionality onto the array instance instead of returning a different type? can it be handled like in knockout where dereferencing the array gets the underlying array? - I realize these are a bit naive, but just would need to go through the thought process and see if I end up in the same place as you).

Definitely, rather than always using slice because creates a copy (this is not guaranteed to scale well with large array sizes due to memory management and the corresponding reference copying operations), I would prefer peek or value for read-only use with libraries and I would know that if it needs to be edited, it should use slice and sync it back. Obviously there is some risk to this, but it seems acceptable (perhaps in a debug mode, you try to identify modifications to peeked values if it is reported as a common source of bugs). I was a video games developer in a past life so I'm keenly aware that a slice is not free, but a peek basically is.

kmalakoff · 2015-12-12T18:14:36Z

I've run some tests...

(1) tags are not observable - it still tracks as expensive from other parts of my code

(2) tags are observable - no optimizations yet

(3) tags are observable, short-circuiting the array initialization (a6ac0c1) - like you say, not much improvement

(4) tags are observable, defining the properties on the object (94151c2) - it doesn't track at all!

So I've confirmed like you discovered (4) has removed the expensive part and updateLength / spliceWithArray doesn't even track as significant. I'm going to use this version until it is released and start using transformations today!

Thank you for looking into and optimizing this!

mweststrate · 2015-12-14T20:42:43Z

Merged this back to feature/transform and released it as 1.1.6 (without the transform feature). Api changes:

Introduced mobservable.fastArray(array), in addition to mobservable.observable(array). Which is much faster when adding items but doesn't support enumerability (for (var idx in ar) .. loops). Fast arrays are by default initialized with asFlat.
Introduced observableArray.peek(), for fast access to the array values. Should be used read-only.

kmalakoff · 2015-12-15T14:44:30Z

Excellent! Thank you so much...

kmalakoff mentioned this issue Dec 9, 2015

Reactive graph transformations #67

Closed

mweststrate added a commit that referenced this issue Dec 14, 2015

Implemented fastArray and peek, see #68

a89e8df

mweststrate added a commit that referenced this issue Dec 14, 2015

Cherry-picked fastArray and peek from feature/transform, see #68

3fe8e71

mweststrate closed this as completed Dec 14, 2015

mweststrate mentioned this issue Dec 15, 2015

Roadmap ideas - umbrella issue for future development #74

Closed

17 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Creating observable array should be optimized #68

Creating observable array should be optimized #68

kmalakoff commented Dec 6, 2015

mweststrate commented Dec 11, 2015

kmalakoff commented Dec 11, 2015

mweststrate commented Dec 11, 2015

mweststrate commented Dec 11, 2015

kmalakoff commented Dec 12, 2015

kmalakoff commented Dec 12, 2015

mweststrate commented Dec 14, 2015

kmalakoff commented Dec 15, 2015

Creating observable array should be optimized #68

Creating observable array should be optimized #68

Comments

kmalakoff commented Dec 6, 2015

mweststrate commented Dec 11, 2015

kmalakoff commented Dec 11, 2015

mweststrate commented Dec 11, 2015

mweststrate commented Dec 11, 2015

kmalakoff commented Dec 12, 2015

kmalakoff commented Dec 12, 2015

mweststrate commented Dec 14, 2015

kmalakoff commented Dec 15, 2015