Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rewrite] vnode structure is difficult to statically optimize #1086

Closed
tivac opened this issue Jun 1, 2016 · 21 comments
Closed

[rewrite] vnode structure is difficult to statically optimize #1086

tivac opened this issue Jun 1, 2016 · 21 comments
Labels
Type: Question For issues that are purely questions about Mithril, not necessarily bug reports or suggestions
Milestone

Comments

@tivac
Copy link
Contributor

tivac commented Jun 1, 2016

There's a few decisions that have been made about rewrite's vnode structure that is making it tricky to update mithril-objectify to work safely & effectively with the rewrite branch.

Current WIP is on the rewrite branch, for reference.

Text

In the rewrite text has a couple of different representations

  1. TextNodes use tag: "#" and set the children property the string of text

    m("div", "a", "b")
    
    {
        tag: "div",
        ...,
        children: [{
            tag: "#",
            ...,
            children: "a"
        }, {
            tag: "#",
            ...,
            children: "b"
        }],
        text: undefined
    }
  2. Other nodes use tag: "<node>" and set the text property to the string of text

    m("div", "fooga")
    
    {
        tag: "div",
        ...,
        children: undefined,
        text: "fooga"
    }

Since it can be difficult to statically determine if an argument is a string both of these patterns are tough to optimize.

An example from the dbmonster example app shows this well.

m("td", ..., db.dbname)

// should become

{
    tag: "td",
    ...,
    text: db.dbname
}

but because it's unknown what type db.dbname is there's no way to optimize it statically.

In mithril@0.2.x this same line could be be optimized to

{ tag: "td", ..., children: [ db.dbname ] }

since the vnode structure for text/nodes was more consistent and bare string children were accepted.

Children and fragments

The change in behavior in rewrite where array children are treated as fragments instead of accepting nested arrays makes it difficult to correctly generate an array of children for various invocations of m().

  1. Arguments convert to an array of children nodes

    m("a", m("b"), m("c"))
    
    {
        tag: "a",
        ...,
        children: [{
            tag: "b",
            ...
        }, {
            tag: "c",
            ...
        }]
    }
  2. A single array argument converts to an array of children

    m("a", [ m("b"), m("c") ])
    
    {
        tag: "a",
        ...,
        children: [{
            tag: "b",
            ...
        }, {
            tag: "c",
            ...
        }]
    }
  3. Mixed array and non-array arguments convert to both fragments/direct children

    m("a", [ m("b") ], m("c"))
    
    {
        tag: "a",
        ...,
        children: [{
            tag: "[",
            ...,
            children: [{
                tag: "b",
                ...
            }]
        }, {
            tag: "c",
            ...
        }]
    }

Not knowing if a particular argument is an array or not makes the creation of the correct vnode structure for these types of invocations tough.

Another simplified example from dbmonster:

m("tr", [
    ...,
    db.lastSample.topFiveQueries.map(function(query) {
        return m("td");
    })
])

// becomes
{
    tag: "tr",
    ...,
    children: [
        ...
    , {
        tag: "[",
        ...,
        children: [ ... ]
    }]
}

Without making a leap of logic that any call to .map() will return an array (something I'm unwilling to do) there's no way to know that the nested array should be a fragment.

In mithril@0.2.x this could be optimized to

{
    tag: "tr",
    ...,
    children: [
        ...,
        [ ... ]
    ]
}

There's probably other instances I'm missing, but these are the two that are most problematic for me right now as I try to do perf testing against the dbmonster application to see if mithril-objectify even deserves a future in our brave new rewrite future.

I'm not necessarily advocating for changing these, especially if doing some would hurt run-time perf. I'm fine w/ accepting that the conversion rate will drop w/ rewrite. I'd just like to bring up these concerns to see if it's something that can potentially be simplified before the rewrite lands in a way that won't affect runtime perf but could improve the ability to statically compile out m() invocations.

CC @lhorie

@tivac tivac added Type: Question For issues that are purely questions about Mithril, not necessarily bug reports or suggestions discussion labels Jun 1, 2016
@lhorie
Copy link
Member

lhorie commented Jun 2, 2016

So generally speaking, the rewrite branch will probably lose performance if you compile to plain objects, most noticeably in Firefox, which can't figure hidden classes as well as Chrome. Instead, I'd strongly recommend emitting calls to the Node class so that hidden classes can kick in unambiguously.

Something we could experiment w/ is to simplify the signature of the Node constructor. Some arguments are only used in one or other edge case and could be populated via dot notation after node creation instead.

We could also experiment w/ refactoring some of the children/text detection logic into Node so that it's always a dynamic check.

Fragments are much harder to deal with. Frankly, I think you need a full-blown type inference system to be able to deal with them properly, but needless to say, that's not trivial at all.

@tivac
Copy link
Contributor Author

tivac commented Jun 2, 2016

A bit worried about figuring out how to inject a require("mithril/render/node") call into 3rd party code, we'll see if that pans out.

Do you think those edge cases are uncommon enough that the speed hit from dynamically modifying the hidden class will be worth it?

I'm certainly not interested in dealing w/ the bugs that I know a type inference system would cause, so I'll probably just need to add more draconian heuristics around not optimizing things that might end up being a fragment.

@lhorie
Copy link
Member

lhorie commented Jun 2, 2016

Here's one example (line 230)

Node("#", undefined, undefined, old.text, undefined, old.dom.firstChild)

Let's say we refactored node signature to Node(tag, attrs, childrenOrText). Then I could change that line to this:

var node = Node("#", undefined, old.text)
node.dom = old.dom.firstChild

This has no impact on hidden classes because the dom field was created by Node. The impact of the extra assignment is also completely negligible (i.e. I did not see an impact even on vdom bench, which is fairly sensitive to changes in the hot path)

The idea here is that Node would be a lot less annoying for you to use, and it would encapsulate the logic to figure out whether children should be an array or a text field

@tivac
Copy link
Contributor Author

tivac commented Jun 2, 2016

Oh, I misunderstood. Node() (still should be lower-cased right?) would create the dom field but just not expose it via the function arguments. Works for me.

I can investigate that in a bit, need to step back from this problem for a few and let my eyes uncross.

@dead-claudia
Copy link
Member

@lhorie @tivac What about vnode.state? That one in particular is exceptionally difficult to optimize, and the only way you're getting that fast in it stands is through the infamous toFastProperties hack for V8 only, or by calling oninit within the state's constructor itself like below:

function State(vnode) {
    vnode.state = this
    vnode.tag.oninit(vnode)
}

// You'll have to call it with `new` each time

@lhorie
Copy link
Member

lhorie commented Jun 2, 2016

@isiahmeadows I'm not sure what you mean. The state property is already part of the object structure from the beginning, so it's not gonna deopt the vnode afaik

@dead-claudia
Copy link
Member

@lhorie

When the vnode's state is mutated in oninit, which is very common for stateful components (like forms and interactive content), the state itself could easily become deopted (I believe the threshold is 4 mutations, which would show almost instantly), and people aren't going to like that for frequently replaced and/or rendered components.

@dead-claudia
Copy link
Member

To clarify, it's not uncommon to see code similar to this:

oninit: function (vnode) {
    vnode.state.counter++
},
view: function (vnode) {
    return [
        m(".counter", vnode.state.counter),
        m("input", onclicjk:  function () {
            vnode.state.counter= 10
        }),
}

@lhorie
Copy link
Member

lhorie commented Jun 2, 2016

Ok, but that's unrelated to the topic of making a vnode tree easier to statically analyze.

Re: fast properties, the cost of the toFastProperties hack would have to offset the cost of hashmap lookups. I think it would be very difficult to come up w/ a representative A/B test to figure out whether to apply the hack or not.

@dead-claudia
Copy link
Member

@lhorie I see it as indirectly related, but you're right in that particular problem belongs elsewhere.

I have another question: would it be possible to make the internal tree carry a different type for its entries than the external API?

By internalizing the representation, you've both simplified the external API (and @tivac may appreciate that), and exposed less of the implementation, which allows much more internal freedom for various changes and optimizations, including after it's released. This would reduce memory usage by a lot, since you're diffing the same data either way, but you're not creating nearly as much after the first redraw to later compare, which is much easier for GCs to handle (they like smaller, more static objects). Also, for the hyperscript API, most of this is waste.

@lhorie
Copy link
Member

lhorie commented Jun 3, 2016

@isiahmeadows Sorry, I don't understand your question. I also don't understand what exactly would reduce memory usage and I don't understand what part of the Node structure is "waste", given that every field is used for one use case or another.

Can you clarify?

@dead-claudia
Copy link
Member

@lhorie

I mean that, in the context of the hyperscript API, all you need are tag, key, attributes, and children. You could include state: null for memory/performance reasons, but the rest like dom, text, and the instance added here and here, are generally useless at first, and are more or less internal implementation details.

@dead-claudia
Copy link
Member

In the case of text nodes and raw HTML nodes, you only need two properties: tag and children. The other two properties even referenced in either, dom and domSize, are implementation details.

@lhorie
Copy link
Member

lhorie commented Jun 3, 2016

As I mentioned in the chat, the omission of fields in the Node structure are oversights. Unless something changed, my understanding is that having the extra fields is faster and more efficient than not having them (because we can take advantage of monomorphism if all the vnodes have the same structure)

@dead-claudia
Copy link
Member

dead-claudia commented Jun 3, 2016

@lhorie

I get the idea of monomorphism. And here's my idea of how it all could be handled:

// Internal representation
interface ComponentState {
  vnode: VirtualNode;

  // For SVG elements - they don't appear to propagate through component
  // boundaries ATM, if I'm reading the source correctly
  svg: boolean;
  dom: Node;
  domSize: number;
  events: {[key: string]: EventCallback};
}

// What the API passes in
type VirtualNode =
  ComponentNode |
  TextNode |
  TrustedNode |
  FragmentNode;

// m(Component, ...)
interface ComponentNode {
  tag: Component;
  attrs: {[key: string]: any};
  children: VirtualNode[];
  key: string | number;
  state: Object | null;
}

// m("div", ...)
interface ComponentNode {
  tag: string;
  attrs: {[key: string]: any};
  children: VirtualNode[];
  key: string | number;
  state: Object | null;
}

// "text" in m("div", "text")
interface TextNode {
  tag: "#";
  children: string;
}

// m.trust("text")
interface TrustedNode {
  tag: "<";
  children: string;
}

// fragments
// m("[", "text")
interface TrustedNode {
  tag: "<";
  children: VirtualNode[];
}

First, monomorphism doesn't simply work by only calling with one specific type. It actually deals with only the objects that can be referenced within the code path. In the engine's eyes, these two objects would be considered monomorphic when passed to value, because their value value is never actually used:

var obj1 = {
  left: true,
  value: {one: 1},
}

var obj2 = {
  left: false,
  value: {two: "two"},
}

function func(value) {
  if (value.left) {
    withLeft(value.value)
  } else {
    withRight(value.value)
  }
}

console.log(func(obj1)) // true
console.log(func(obj2)) // false

Also, this function would be seen as monomorphic, because for each code path, value is only one type. typeof doesn't introduce polymorphism in this case, because the engine is intelligent enough to know to narrow the possible types for value within that code path.

function toInt(value) {
  if (typeof value === "string") {
    return parseInt(value, 10) | 0
  } else {
    return value | 0
  }
}

var a = toInt(1)
var b = toInt("1")

Let's just say I did a recent bout of that with my WIP test framework, to optimize initialization and running the tests. Currently, it has a more complex initialization process than Mocha by necessity, and it's predictably slower, which makes optimization very helpful.

@dead-claudia
Copy link
Member

Oh, and to be fair, I haven't been on Gitter for a while.

@lhorie
Copy link
Member

lhorie commented Jun 3, 2016

I see, that's interesting to know.

For now, though, I'm not sure if this type of optimization is worth the effort. I did a quick benchmark and the cost of having extra properties in an object is in the range of +1ms per ~8 properties per 200,000 iterations, which is well into micro-optimization territory, so it's not really worth a large refactor of the current codebase.

@dead-claudia
Copy link
Member

dead-claudia commented Jun 3, 2016

@lhorie My two main reasons were actually memory and encapsulation of the API, where it would do wonders. Speedwise, it may make marginal difference (mod maybe GC), but half the properties would likely result in two thirds the memory usage, which would be highly beneficial for larger and highly interactive projects, where memory usage and GC speed would both be important.

@tivac
Copy link
Contributor Author

tivac commented Aug 9, 2016

Might be worth reworking to use the Node constructor, might not. Until I have time to investigate this thread isn't moving, so I'm closing it.

@tivac tivac closed this as completed Aug 9, 2016
@pygy
Copy link
Member

pygy commented Aug 9, 2016

A bit late to the party (I hadn't visited that part of the code when this was originally discussed), but I think that the text optimization could occur at diff time rather than at vnode generation time.

You'd have to mutate the current vnode so that the optimization is remembered on the next diff phase, once vnode has become old... It would be slightly more expensive in that you'd either have set the children field to undefined or keep the children array around for the lifetime of the node (it is generated anyway), rather than leaving it available for GC at the end of the m() call.

@futurist
Copy link
Contributor

futurist commented Jan 7, 2017

Not very related to this issue itself, but share a similar situation.

When I write this lib:

https://github.com/cssobj/cssobj-mithril

I'm in the hope of call some internal methods of mithril (1.x have oninit, that's very good!), and decide what to do.

m('li.item.!active', {class:'news !active'}, 'hello')

Want to transform above code into:

m('li.item_suffix_.active', {class:'news_suffix_ active'}, 'hello')

I want it work on both 0.2.x and 1.x, I found several ways doing this:

  1. (runtime) repeat the selectorParser of hyperscript function's work, transform the desired part, and call mithril again.

  2. (statically) Use babel

Both way have more work, if mithril can expose some internal methods, or provide some 'plugin' mechanism (hook into selector parser, etc.), that's easier to do such a thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Question For issues that are purely questions about Mithril, not necessarily bug reports or suggestions
Projects
None yet
Development

No branches or pull requests

5 participants