
Optimize h #187

Closed · wants to merge 6 commits

Conversation

@ngryman (Contributor) commented Apr 15, 2017

Let's start from this implementation, which passes the current tests.

To give an idea, here are the benchmarks:

Before

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~315 ms (0 s + 315373353 ns)

# vnode with a single child 1M times
ok ~1.35 s (1 s + 353013098 ns)

# vnode with a single array child 1M times
ok ~1.66 s (1 s + 657042054 ns)

all benchmarks completed
ok ~3.33 s (3 s + 325428505 ns)

After

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~287 ms (0 s + 287287753 ns)

# vnode with a single child 1M times
ok ~692 ms (0 s + 692102600 ns)

# vnode with a single array child 1M times
ok ~822 ms (0 s + 822467710 ns)

all benchmarks completed
ok ~1.8 s (1 s + 801858063 ns)

codecov bot commented Apr 15, 2017

Codecov Report

Merging #187 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff          @@
##           master   #187   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files           4      4           
  Lines         176    176           
  Branches       45     47    +2     
=====================================
  Hits          176    176
Impacted Files Coverage Δ
src/h.js 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00de017...63f2c03.

src/h.js Outdated
var val = arguments[i]
if (Array.isArray(val)) {
  for (var j = 0; j < val.length; j++) {
    children.push(val[j])

jorgebucaran (Owner):

@ngryman val[j] can be an array, as we previously discussed on #185.

@jorgebucaran (Owner) commented Apr 15, 2017

@ngryman Good start! Remember that when using Hyperx, children can be arrays that contain nested arrays too. Your implementation actually works fine with JSX (almost; see my inline comment).
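
For context, a sketch of the kind of nested children this is about (illustrative only; the exact shape Hyperx produces depends on the template):

// an interpolated list inside a template ends up as an array nested
// inside the children array passed to h:
h("ul", {}, [
  [h("li", {}, "a"), h("li", {}, "b")]
])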

@jorgebucaran (Owner) commented Apr 15, 2017

Here is the implementation I eventually arrived at; it works with both Hyperx and JSX and drops the stack:

export default function (tag, data) {
  var children = []

  for (var i = 2; i < arguments.length;) {
    push(arguments[i++], children)
  }

  return typeof tag === "string"
    ? {
      tag: tag,
      data: data || {},
      children: children
    }
    : tag(data, children)

  function push(node) {
    if (Array.isArray(node)) {
      for (var i = 0; i < node.length; i++) {
        push(node[i], children)
      }
    } else if (node != null && node !== true && node !== false) {
      children.push(node)
    }
  }
}

It's small, but performance was bad, and I couldn't optimize it. I hope it gives you some ideas.

src/h.js Outdated
if (typeof node === "number") {
  node = node + ""
}
else if (val && val !== true) {

jorgebucaran (Owner):

@ngryman You want:

} else if (val != null && val !== true && val !== false) {

Otherwise we will not be able to process text nodes with the number 0.
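
A quick illustration of the difference (a standalone sketch, not the PR's exact code):

var val = 0
console.log(Boolean(val && val !== true))                 // false: the text node "0" would be dropped
console.log(val != null && val !== true && val !== false) // true: 0 is kept and later rendered as "0"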

@ngryman (Contributor, Author) commented Apr 15, 2017

@jbucaran Coverage drops because no tests exercise the way Hyperx passes children as the 3rd parameter rather than as splats.


src/h.js Outdated
stack[stack.length] = arguments[i]
}

while (stack.length) {

ngryman (Contributor, Author):

@jbucaran There is no need for an additional stack. As you mentioned in #185, we only handle one level of nested children, so we can effectively concatenate, but without using concat, which allocates a new array.
If it turns out deeper nesting is possible, we can discuss it then.
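
A minimal sketch of that idea, flattening one level by pushing elements instead of allocating a new array with concat (close to the addChild shown in the diffs above, but still a sketch):

function addChild(children, val) {
  if (Array.isArray(val)) {
    // flatten exactly one level without concat, so no intermediate array is allocated
    for (var j = 0; j < val.length; j++) {
      children.push(val[j])
    }
  } else if (val != null && val !== true && val !== false) {
    children.push(typeof val === "number" ? val + "" : val)
  }
}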

src/h.js Outdated
}
else if (val && val !== true) {
  if (typeof val === 'number') {
    val = val + ''

ngryman (Contributor, Author):

@jbucaran Out of curiosity, why?

@ngryman (Contributor, Author) commented Apr 15, 2017

New benchmarks; the code is getting deoptimized, working on it.

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~687 ms (0 s + 686546561 ns)

# vnode with a single child 1M times
Materialization [0x7fff5fbf5e80] <- 0x1704c2e9e771 ;  [removing optimized code for: h]
ok ~981 ms (0 s + 981468009 ns)

# vnode with a single array child 1M times
Materialization [0x7fff5fbf5e80] <- 0x369b85d129 ;  [removing optimized code for: h]
[removing optimized code for: addChild]
ok ~1.04 s (1 s + 39802550 ns)

all benchmarks completed
ok ~2.71 s (2 s + 707817120 ns)

@ngryman (Contributor, Author) commented Apr 15, 2017

New benchmarks; I don't think I can do much better:

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~277 ms (0 s + 277363055 ns)

# vnode with a single child 1M times
ok ~905 ms (0 s + 905257714 ns)

# vnode with a single array child 1M times
ok ~934 ms (0 s + 934230712 ns)

all benchmarks completed
ok ~2.12 s (2 s + 116851481 ns)

But I'm gonna tease you @jbucaran: if we enforce every node to be the exact same object (even for text nodes), here is what we get:

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~329 ms (0 s + 328797485 ns)

# vnode with a single child 1M times
ok ~547 ms (0 s + 546626621 ns)

# vnode with a single array child 1M times
ok ~552 ms (0 s + 552139589 ns)

all benchmarks completed
ok ~1.43 s (1 s + 427563695 ns)

Basically, the current vdom implementation mixes different types of nodes: they can be strings or objects. If we statically type the node object by creating a constructor for it and make sure the tree only manipulates instances of that constructor, then V8 and friends can optimize this heavily, because they always manipulate data of the same shape.

Behind the scenes the engine can use a fast backing store (something close to a C++ class) instead of a dictionary store (like a hash table). This ensures super fast property access and code specialization.

It would be hugely beneficial for h and the diffing algorithm, in terms of both speed and memory.
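
A minimal sketch of that idea (hypothetical names, not this PR's code): give every node, including text nodes, the same constructor so the engine can assign them a single hidden class:

function VNode(tag, data, children, text) {
  this.tag = tag           // element name, or null for a text node
  this.data = data         // attributes / props
  this.children = children // always an array (possibly empty)
  this.text = text         // string content when the node is a text node
}

// every node is created through the same constructor, so all nodes share one shape
var node = new VNode("div", {}, [new VNode(null, null, [], "hello")], null)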

src/h.js Outdated
      children.push(val[j])
    }
  }
  else if (val && val !== true) {

jorgebucaran (Owner):

I left you a comment on the other diff about this.

Basically, you are forgetting about the number 0.

} else if (val != null && val !== true && val !== false) {

jorgebucaran (Owner):

@ngryman See above.

ngryman (Contributor, Author):

We're reviewing an old commit. It has been addressed since then.

@jorgebucaran (Owner):

@ngryman Coverage drops because no tests exercise the way Hyperx passes children as the 3rd parameter rather than as splats.

Then we need to add those tests :)
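
A minimal sketch of such a test (the assertion style and vnode shape are assumptions based on this thread, not the project's actual suite):

test("h accepts children as a single array argument, Hyperx-style", () => {
  expect(h("div", {}, ["foo", "bar"])).toEqual({
    tag: "div",
    data: {},
    children: ["foo", "bar"]
  })
})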

src/h.js Outdated
@@ -4,7 +4,7 @@ function addChild(children, val) {
      children.push(val[j])
    }
  }
-  else if (val && val !== true) {
+  else if (val != null && val !== true && val !== false) {

jorgebucaran (Owner):

👍

@jorgebucaran (Owner) commented Apr 16, 2017

@ngryman Your final implementation is very similar to the recursive one I created, but bigger.

On the other hand, performance is better than that one! 👍

Now, this version is not really faster than the current one on master (mine wasn't either, which is why I didn't merge it).

@jorgebucaran (Owner) commented Apr 16, 2017

@ngryman if we enforce every node to be the exact same object (even for text nodes), here is what we get...

Sounds good, but that does not seem to be what this PR is addressing, no? Sorry, I am terrible at discussing two different things at the same time.

@ngryman (Contributor, Author) commented Apr 16, 2017

Then we need to add those tests :)

They have been added, 100% is back :)

@ngryman Your final implementation is very similar to the recursive one I created, but bigger.

Because it handles the two function signatures in an optimal way:

  • h(tag, data, children)
  • h(tag, data, ...children)

The arguments object must be used very carefully if you don't want to kill performance. I also added some ifs to avoid calling Array.isArray with null values, which deoptimizes the code.
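
A minimal sketch of dispatching on the two signatures up front (illustrative; the PR's actual code differs in the details, and addChild is as sketched earlier in this thread):

export default function h(tag, data, values) {
  var children = []

  if (arguments.length > 2) {
    if (Array.isArray(values)) {
      // h(tag, data, children): Hyperx-style, children already collected in one array
      for (var i = 0; i < values.length; i++) {
        addChild(children, values[i])
      }
    } else {
      // h(tag, data, ...children): JSX-style, children spread across arguments
      for (var j = 2; j < arguments.length; j++) {
        addChild(children, arguments[j])
      }
    }
  }

  return typeof tag === "string"
    ? { tag: tag, data: data || {}, children: children }
    : tag(data, children)
}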

Now, this version is not really faster than the current one on master (mine wasn't either which is why I didn't merge it).

  • master version: ok ~3.34 s (3 s + 338336241 ns)
  • h-optimized version: ok ~2.12 s (2 s + 116851481 ns)

It depends on how you benchmark it. What is your benchmark? I'm curious.

Sounds good, but that does not seem to be what this PR is addressing no? Sorry, I am terrible at discussing two different things at the same time.

Yes, it's just the next step. I've also been giving too much information lately instead of a concise report; it's because it's a complex subject.

@jorgebucaran (Owner) commented on this diff:

if (arguments.length > 2) {
  if (Array.isArray(values)) {
    for (var i = 0; i < values.length; i++) {
      addChild(children, values[i])
    }

@ngryman What's the point of this loop here? Is it an optimization technique?

I replaced it with just addChild(children, values), used addChild(children, child[j]) inside addChild itself, and all tests pass.

@jorgebucaran (Owner):

@ngryman Because it treats the 2 functions signatures in an optimal way:

tag can be either a String or a Function, no? Doesn't that defeat what V8 does?

@jorgebucaran (Owner):

Is it necessary to repeat the for loop 3 times? What about this implementation that does not change bundle size?

export default function(tag, data, values) {
  var children = []

  if (arguments.length > 2) {
    if (Array.isArray(values)) {
      loop(children, values, 0)
    } else {
      loop(children, arguments, 2)
    }
  }

  return typeof tag === "string"
    ? {
        tag: tag,
        data: data || {},
        children: children
      }
    : tag(data, children)
}

function loop(children, child, i) {
  for (; i < child.length; ) {
    addChild(children, child[i++])
  }
}

function addChild(children, child) {
  if (child != null && child !== true && child !== false) {
    if (Array.isArray(child)) {
      loop(children, child, 0)
    } else {
      children.push(typeof child === "number" ? child + "" : child)
    }
  }
}

@ngryman (Contributor, Author) commented Apr 16, 2017

@jbucaran Yes, like @leeoniya says, sometimes we have to trade bytes for performance.

Your proposed solution gives this:

NANOBENCH version 2
> node

# vnode with no child 1M times
ok ~450 ms (0 s + 450218037 ns)

# vnode with a single child 1M times
ok ~1.41 s (1 s + 407109050 ns)

# vnode with a single array child 1M times
ok ~1.5 s (1 s + 502727094 ns)

all benchmarks completed
ok ~3.36 s (3 s + 360054181 ns)

@ngryman (Contributor, Author) commented Apr 16, 2017

I'm not saying there isn't a better implementation than mine, though. Not at all 😆

@ngryman (Contributor, Author) commented Apr 16, 2017

@jbucaran btw, do you test your bundle size gzipped? gzip handles redundancies well, so multiple similar for loops should mostly be absorbed by it.
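
A quick way to check (a sketch using Node's built-in zlib; the bundle path is an assumption, not the project's actual build output):

var fs = require("fs")
var zlib = require("zlib")

var bundle = fs.readFileSync("dist/hyperapp.js")
console.log("minified:", bundle.length, "B")
console.log("gzipped: ", zlib.gzipSync(bundle).length, "B")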

@ngryman (Contributor, Author) commented Apr 16, 2017

Your version gives a gzipped bundle of 1702 B, mine 1710 B.

@jorgebucaran (Owner) commented Apr 16, 2017

@ngryman Bummer. I just ran mine through the benchmark suite and it was definitely slower. 👍

I'm still not getting any perf gains from yours against the one on master though, so I am not compelled to merge this PR right now, at least not until other priorities are sorted out, like improving the patch algorithm, which IMO is where we should focus to make hyperapp faster.

Your version gives a gzipped bundle of 1702 B, mine 1710 B.

Yeah, that's about right. I use zopfli (just because) and it gives 1673 B, hehe.

btw, do you test your bundle size gzipped?

Nope, I just test the minified bundle using js-framework-benchmark.

@ngryman (Contributor, Author) commented Apr 16, 2017

Bummer. I just ran it through the benchmark suite and it was definitely slower. 👍

I'm curious how this version could be slower than the one on master; it does not make sense.
Well, let's define the priorities then 😆

@jorgebucaran (Owner):

@ngryman Read my comment again.

@jorgebucaran (Owner):

Your version is certainly not slower than the current one on master, but it's not faster either. I'm using js-framework-benchmark. And they both score 1.35.

@leeoniya commented:

js-framework-benchmark won't always show improvements from micro-optimizations, since the timings in all its metrics are dominated by DOM ops and relayouts/reflows/repaints. That's why it's good to have an additional benchmark that stresses framework internals without any effect on the DOM.

@jorgebucaran (Owner) commented Apr 17, 2017

@ngryman Well, let's define the priorities then 😆

Which is why I think we can leave this PR open for when the time is right. It's just weird to merge this right now, that's what I mean.

@jorgebucaran (Owner) commented Apr 17, 2017

@developit Can you share your thoughts/feedback here?

Your https://github.com/developit/preact/blob/master/src/h.js is very similar and perhaps could benefit from this discussion too.

@ngryman (Contributor, Author) commented Apr 17, 2017

To make things short:

  • The PR version is faster, but when tested in a global benchmark there is no difference.
  • The true perf gain will come when VNodes all share the same constructor (hidden class: https://gist.github.com/twokul/9501770).
  • It's the start of several optimization passes; imho we have to start somewhere 😄

@jorgebucaran (Owner) commented Apr 17, 2017

@ngryman Thanks for the summary and the short article. 😄
And yes, we have to start somewhere, and that somewhere is possibly patch, which is doing all the heavy lifting around here. 💪😥

EDIT: Emoji.

@developit commented:

That little function has been through probably 100 different implementations. Regarding changing JSX: even if you were to get VNodes extended to support new nodeTypes like Text and Comment, children is still a passthrough - people provide all sorts of things in that Array, like functions. Your best bet is to design the function to perform well without trying to force it into being monomorphic.

I've experimented with making h() a small normalization function and moving the heavy lifting into another function (such as the VNode constructor) called with consistent types/shapes:
https://esbench.com/bench/587567b899634800a0347555

You can see some other contenders for h() benchmarked here:
https://esbench.com/bench/57ee8f8e330ab09900a1a1a0
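
A rough sketch of that split (hypothetical names, not Preact's or hyperapp's actual code): h() only normalizes its arguments, and a single constructor is always called with the same argument types, so its call site stays monomorphic:

function VNode(tag, data, children) {
  this.tag = tag
  this.data = data
  this.children = children
}

export default function h(tag, data) {
  var children = []
  for (var i = 2; i < arguments.length; i++) {
    var child = arguments[i]
    if (child != null && child !== true && child !== false) {
      children.push(typeof child === "number" ? child + "" : child)
    }
  }
  // normalization happens above; the constructor always receives
  // (string | function, object, array), keeping the node shape consistent
  return new VNode(tag, data || {}, children)
}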

@jorgebucaran (Owner):

@ngryman I'd like to revisit this PR and if possible, with your help, merge it.

Can we first double-check that these changes don't break any existing code? JSX users must not suffer from this.

👋😄🙏

@ngryman (Contributor, Author) commented Jul 3, 2017

@jbucaran Sure, I'll do it this week.

@FlorianWendelborn removed their request for review July 3, 2017 16:31
@ngryman (Contributor, Author) commented Jul 7, 2017

I'm going to create a new PR for this one, if that's OK with you. This one is from my fork and has diverged quite a lot from master.

@ngryman mentioned this pull request Jul 7, 2017
@jorgebucaran (Owner):

SGTM

Labels: enhancement (New feature or request)