Skip to content

Data Hashing vs Binding

Brad Peabody edited this page Jul 21, 2019 · 2 revisions

Vugu Data Design: Hashing vs Binding

Although only a few short months old, I have yet been asked a number of times about some sort of data binding system for use in Vugu. Here are my thoughts on the matter:

Reactivity

React, Angular, Vue (and others like RiotJS, et al.) make good use of the idea of "reactivity". The essence of which is that whatever you are looking at on the screen (in your DOM) is a function of some state information. You don't manipulate DOM, instead you describe how to produce DOM from some state info. Conceptually it's just a function, and when the state changes, you call it to produce new DOM. Actions (clicks, taps, key strokes, or external things like HTTP responses for a JSON file) all result in some change to the state, and you call the function again, re-rendering the DOM.

This approach of course has gained great popularity in recent years and for good reason. It's a lot easier to reason about a UI written like this.

The Uh-Oh Part

It does however come at a not-immediately-apparent cost, with UI frameworks hiding the implementation details behind the magic described above.

The cost is that for even just a medium-sized application, this complete process of going from your state info to the DOM of your page can be quite expensive. A large list, with each row itself having a number of little widgets in it, plus whatever navigation, toolbars, etc., all these things together can very quickly wind you up with hundreds or even thousands of DOM elements. Doing a complete re-render becomes impractical (much too slow).

Figuring Out If Stuff Changed

So how do we make it faster? Well the most obvious answer is we only re-create the DOM elements that have changed. And there-in lies the rub: Sure, you can do that, but it's not particularly simple.

The approach Vue takes is to use "data binding" (basically the Observer Pattern), so the appropriate thing can get some sort of event or function call when data changes. So you do mylist.push(item), and then the appropriate part of the DOM is re-rendered, the part corresponding to mylist, hopefully also preserving the DOM pieces therein that didn't change, since only a new item was added to the bottom.

They make this work in JS. It certainly is one reasonable approach to the problem. But I think trying to do this Go is a mistake. Why? Because, in short, Go is not JS. Go does not have Proxy. You can't magically replace the push method and make it do weird and wonderful things automatically. In order to make such a system work in Go, it would require lots of custom types and methods to implement observer-style notifications. Possible? Certainly. But it would add a lot of cruft. Code that would otherwise be a simple struct would end up being a complex mess of functions and callbacks and events being broadcast. That stuff is fun when you're 16 learning to code, but to introduce it into a library when it is not necessary seems... well, unnecessary.

Data Hashing to the Rescue

So the idea of how to do this in Go is that instead of using an event system to send some sort of message when data is changed, instead we use a one-way hash system to make it efficient to check which pieces of this state information have changed. We re-check everything by comparing hashes, and then actually only do the render work for the ones where the has has changed.

Since the core of the issue is indeed being able to efficiently determine "is this data changed" (or "is it the same" if you prefer) for each part of our state, a one-way hash is a tried and true method of doing this. Obviously hashes have the risk of collision. However, one can use a modern hash algorithm (such as xxhash) to create, with good performance, a hash that is effectively, for practical purposes, very unlikely to collide. Seems workable.

Circling back to what was described above, instead of doing mylist.push(item), we are writing in Go so instead we append to a slice. No magic there. When all updates are done, we just need to signal to the UI that it should attempt a re-render. This re-render process walks the tree of its state, comparing hashes along the way, when things don't match, it regenerates that part. This means that your state can be whatever data type you want, the ComputeHash function already knows how to traverse struct fields, and if you want to customize something, you only need implement DataHash() uint64. This also gives a clear way to speed things up if this hashing is being too slow - your DataHash() method can always start doing things like precomputing the hash value when things mutate, etc. if it comes to that.

In Closing

All that is to basically say "no, I will not be making a 'Vuex' for Vugu" - instead I'll be making the above. Let's see how it goes.

I honestly don't know if this will end up working out. There may be unforeseen problems. However, so far I'm convinced it's worth a shot, and that the simplicity it will bring to writing Vugu programs will be well worth it. (It incidentally means that we hopefully can avoid some of the mess that Vue got itself into, i.e Vue.set() and it's corresponding list of edge cases.)

You can’t perform that action at this time.