Upgrading data transforms to 2.0

Jeffrey Heer edited this page Mar 7, 2017 · 2 revisions

This wiki documents Vega version 2. For Vega 3 documentation, see vega.github.io/vega.

NOTE that there's another page that describes how to update v1 specs to work with Vega 2. This page describes changes to Vega internals, particularly those related to data transforms.

Upgrading Vega 1 Transforms

In addition to the minor API changes noted elsewhere, there have been major changes to Vega internals for v2. This means that even a simple transform will require fairly major revisions, as demonstrated by the old (v1) and new (v2) versions of the basic Sort transform. Here are the salient differences in v2 that pertain to data transforms:

  • Most JavaScript has been refactored into CommonJS modules. For a typical transform, this just means adding a few require() statements to assign any dependencies to local variables.

  • Each transform is embodied as a subclass of Transform, then registered by exporting the subclass from the 'transforms' module so that it appears in your JS as, for example, vg.transforms.Sort. (See below for ways to add a new transform without rebuilding all of Vega or modifying its code.)

  • Incoming parameters are defined with Transform.addParameters, and assigned one of several types:

    /* Types used in `Transform.addParameters`; see the Vega 2 transforms to learn more:
     * https://github.com/vega/vega/tree/master/src/transforms
     */
    {type: 'value'}         // a simple value (string, boolean, number)
    {type: 'data'}
    {type: 'expr'}           
    {type: 'field'}
    {type: 'custom'}                 // provide custom getter and setter
    {type: 'field', default: null}   // you can provide a default value
    {type: 'value', default: 0}
    {type: 'value', default: false}
    {type: 'array<value>'}
    {type: 'array<field>', default: ['data']}

    NOTE that there's a different set of "primitive" types used in the formal JSON Schema that more clearly documents the expected parameters, their default values, and any constraints. (Another, more readable intro to JSON Schemas is http://spacetelescope.github.io/understanding-json-schema/)

  • There are several flags available in each transform that help Vega to efficiently traverse the graph of possible transforms. These are not well documented beyond the code shown below, but I've had good results with educated guesswork and examining similar existing transforms, as well as studying the code in vega-dataflow.

    var Flags = Node.Flags = {
      Router:     0x01, // Responsible for propagating tuples, cannot be skipped.
      Collector:  0x02, // Holds a materialized dataset, pulse node to reflow.
      Produces:   0x04, // Produces new tuples. 
      Mutates:    0x08, // Sets properties of incoming tuples.
      Reflows:    0x10, // Forwards a reflow pulse.
      Batch:      0x20  // Performs batch data processing, needs collector.
    };

    These flags are false by default, but can be set to true by calling a series of corresponding functions on the transform before returning from your class's constructor function, for example:

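    // One transform might end its constructor with:
    return this.router(true).produces(true);

    // Another might set a different combination of flags:
    return this.router(true)
               .reflows(true)
               .mutates(true);

  • The heart of your v1 transform will become an inner function, transform, assigned to its subclass as shown here.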

  • Most transforms will manipulate data in a new, streaming form instead of the simple "materialized" data used in v1. The input object passed to the inner transform function carries added tuples (input.add), modified tuples (input.mod) and removed tuples (input.rem) that your code can manipulate and return.
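
The overall shape described in the list above can be sketched without Vega itself. This is only an illustration under stated assumptions: BaseTransform, Scale, and the 'field' and 'factor' parameters are hypothetical stand-ins invented for this example, and the flag setters merely mimic the chainable style shown earlier. In a real transform you would require Vega's own Transform module and declare parameters with Transform.addParameters.

```javascript
// Stand-in for Vega's Transform base class (hypothetical, for illustration only).
var Flags = { Router: 0x01, Mutates: 0x08 };

function BaseTransform() {
  this._flags = 0;   // all flags are off by default
  this._params = {};
}

// Builds a chainable setter that turns one flag bit on or off.
function flagSetter(bit) {
  return function(state) {
    if (state) this._flags |= bit;
    else this._flags &= ~bit;
    return this;
  };
}
BaseTransform.prototype.router = flagSetter(Flags.Router);
BaseTransform.prototype.mutates = flagSetter(Flags.Mutates);

// Simplified getter/setter for parameters (not Vega's real parameter system).
BaseTransform.prototype.param = function(name, value) {
  if (arguments.length < 2) return this._params[name];
  this._params[name] = value;
  return this;
};

// Hypothetical transform: writes field * factor into a new 'scaled' property.
function Scale() {
  BaseTransform.call(this);
  this.param('field', 'value').param('factor', 2);
  return this.mutates(true); // it sets properties on incoming tuples
}
Scale.prototype = Object.create(BaseTransform.prototype);
Scale.prototype.constructor = Scale;

// The inner transform function: receives added, modified and removed tuples.
Scale.prototype.transform = function(input) {
  var field = this.param('field'),
      factor = this.param('factor');
  function update(tuple) { tuple.scaled = tuple[field] * factor; }
  input.add.forEach(update); // new tuples need the derived value
  input.mod.forEach(update); // changed tuples need it recomputed
  // input.rem is passed through untouched: removed tuples need no update
  return input;
};

var scale = new Scale();
var output = scale.transform({
  add: [{value: 1}, {value: 2}],
  mod: [{value: 10}],
  rem: [{value: 99, scaled: 198}]
});
```

Handling add and mod separately from rem is the main difference from v1-style code, which would have looped over one fully materialized array.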

Additional JavaScript Tools Required

If you're planning to make changes to Vega, or to build compatible code like data transforms, there are some tools you'll need to install for bundling and managing its CommonJS modules.

There are lots of tutorials on the web for these tools -- here's one I particularly like -- so we won't go into great detail here. But once you have the general idea, it's useful to see a working package.json file for an app that includes custom transforms. This makes it easy to install all the required dependencies with npm, including "shims" for JS scripts that weren't written as CommonJS modules.

Using Transforms With Prebuilt Vega

There is apparently a plugin system in the works for transforms, but for now the general assumption is that you'd add them to vega's source code and bundle it with browserify.

But if you'd rather not fork Vega to add your own transforms, it's still possible to build and use "third-party" transforms separately, by taking advantage of CommonJS modules. Simply include vega and your transforms using require(), then register your custom transforms with vega.
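
As a rough sketch of that registration step: the snippet below uses a stand-in object in place of require('vega') so that it runs on its own, and MyTransform, the name 'mytransform', and the registerTransforms helper are all hypothetical. It assumes, based on the description earlier on this page, that Vega looks transforms up on its exported transforms object; check the Vega 2 source before relying on that.

```javascript
// Stand-in for `var vg = require('vega');` so the sketch runs without Vega installed.
var vg = { transforms: {} };

// Stand-in for `var MyTransform = require('./MyTransform');` (hypothetical module).
function MyTransform(graph) {
  this.graph = graph;
}

// Hypothetical helper: copies custom transforms onto vega's transforms registry,
// refusing to overwrite a name that is already taken.
function registerTransforms(vega, custom) {
  Object.keys(custom).forEach(function(name) {
    if (vega.transforms[name]) {
      throw new Error('Transform already registered: ' + name);
    }
    vega.transforms[name] = custom[name];
  });
  return vega;
}

registerTransforms(vg, { mytransform: MyTransform });
```

With the real library, the two stand-ins would be replaced by require() calls and the whole app bundled with browserify, leaving Vega's own source untouched.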
