Upgrading to 2.0

Alex Gonzalez edited this page Jul 17, 2020 · 21 revisions

This wiki documents Vega version 2. For current Vega documentation, see vega.github.io/vega.

Vega 2.0 is the first major release since 1.0 was released in April 2013. Specifications created for previous versions of Vega are largely compatible with v2. However, changes to the specification language have been introduced in order to clarify confusing functionality and support significant new features, such as streaming data and interactive behaviours. This document guides you through the process of upgrading your Vega 1.x specifications to 2.0.

If you're working with Vega code or updating data transforms, see this page to read about some of the changes to Vega's internals and additional dev tools required.

Loading Input Data Values

data. prefix removed

Vega 2.0 no longer stores original data values under the data property. As a result, the data. prefix is removed, and raw data values can be accessed directly like so:

"data": [{
  "name": "table",
  "values": [{"x":1, "y": "red"}, {"x":2, "y": "green"}],
  "transform": [{"type": "filter", "test": "datum.x > 1"}]
}],

"marks": [{
  "from": {"data": "table"},
  "properties": {
    "enter": {
      "fill": {"field": "y"}
    }
  }
}]

Raw data values remain sandboxed from the outside environment. For example, if data values are specified at runtime, any additional properties derived within Vega (e.g., using data transforms) will not pollute the original data objects.
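For Vega 1.x specifications that still carry the data. prefix, the rewrite can be scripted. The following is a minimal sketch, not part of Vega: stripDataPrefix and upgradeFieldRefs are hypothetical helper names, and the walk only touches string-valued "field" properties. Other properties that hold field names (such as keys or by) would need the same treatment.

```javascript
// Hypothetical migration helper (not part of Vega): removes the Vega 1
// "data." prefix from a field name, leaving other values untouched.
function stripDataPrefix(field) {
  return typeof field === "string" && field.startsWith("data.")
    ? field.slice("data.".length)
    : field;
}

// Recursively copies a spec, rewriting every string-valued "field" property.
// Assumption: only "field" is handled here; extend the check as needed.
function upgradeFieldRefs(node) {
  if (Array.isArray(node)) return node.map(upgradeFieldRefs);
  if (node === null || typeof node !== "object") return node;
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] = key === "field" && typeof value === "string"
      ? stripDataPrefix(value)
      : upgradeFieldRefs(value);
  }
  return out;
}
```

The helper returns a new object rather than mutating its input, so the original Vega 1 spec stays available for comparison.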

index replaced with _id

The index field has been removed in favor of an _id field, to better support streaming data values. IDs are automatically assigned by Vega to uniquely identify data values, but may not run consecutively.

Data Transforms

stats replaced with aggregate

The old stats transform has been replaced with the new, more powerful aggregate transform:

"data": [{
  "name": "barley",
  "url":  "data/barley.json",
  "transform": [
    {
      "type": "aggregate",
      "groupby": ["variety"],
      "summarize": [{
        "name": "yield",
        "ops": ["min", "max", "median"],
        // Without "as", aggregate statistics are stored in
        // stat_field by default, e.g., min_yield
        "as":  ["ymin", "ymax", "ymed"]
      }]
    },
    {"type": "sort", "by": "-ymed"}
  ]
}]
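The naming rule mentioned in the comments above (stat_field by default, or the names given in as) can be expressed as a small function. This is only an illustration of the convention as described on this page; outputNames is a hypothetical name, and the sketch assumes that as, when present, supplies one name per entry in ops.

```javascript
// Hypothetical helper (not part of Vega): lists the output field names that
// one "summarize" entry is expected to produce, per the rule described above.
function outputNames(entry) {
  if (Array.isArray(entry.as)) {
    // Assumption: "as" gives exactly one output name per op.
    return entry.as.slice();
  }
  // Default convention: <stat>_<field>, e.g. min_yield.
  return entry.ops.map((op) => `${op}_${entry.name}`);
}
```

Under these assumptions, the summarize entry in the example above yields ymin, ymax, and ymed. If the as array were dropped, the names would be min_yield, max_yield, and median_yield, and the sort transform would need to refer to "-median_yield" instead.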

Syntax Changes

Bin

  • The output property has been standardized. By default, the Bin transform returns the input data set, with an additional property, bin, that contains the binned value for the specified field. This property may be renamed by specifying an output parameter like so: "output": {"bin": "b"}.

Facet

  • The keys property has been renamed as groupby.
  • The sort property has been removed.
  • An additional summarize property is available, akin to that found with the aggregate transform. It calculates summary statistics over the values within each facet, and stores the result on the corresponding facet.
  • Any/all transforms can now follow a facet transform, and will operate over data values representing each facet (i.e., one record per facet).
  • A new transform property allows you to specify a pipeline of transformations to be applied to the values within each facet.

Stack

  • A facet transform is no longer required before a stack transform, but it may still be necessary to create series (as in the stacked_area example).
  • For clarity, the point property has been renamed as groupby.
  • Similarly, the height property is now field.
  • The order property has been replaced by the sortby property.
  • The names of the output values have changed (see stack in the next section).

Common stack transformation upgrade

{ type: "facet", keys: [ "data.x" ] },
{ type: "stats", value: "data.y" }

would be converted to

{ type: "aggregate", groupby: [ "x" ], summarize: { y: "sum" }}

Visual Encoding Transforms

For consistency, the value property of the geopath, pie, and treemap layouts has been renamed as field. The output values of visual encoding transforms have been changed to more clearly differentiate them from raw data values.

  • force
    • x → layout_x
    • y → layout_y
  • geo
    • x → layout_x
    • y → layout_y
  • geopath
    • path → layout_path
  • link → linkpath
    • path → layout_path
  • pie
    • startAngle → layout_start
    • endAngle → layout_end
    • midAngle → layout_mid
  • stack
    • y → layout_start
    • y2 → layout_end
    • New layout_mid output value
  • treemap
    • x → layout_x
    • y → layout_y
    • width → layout_width
    • height → layout_height

The names of these output values can still be changed using the output property on each transform.
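When upgrading many specifications, the renames listed above can be kept in a lookup table. The sketch below is an assumption-laden convenience, not a Vega API: OUTPUT_RENAMES and renameOutput are hypothetical names, the table is keyed by the Vega 1 transform type (so link rather than linkpath), and it covers only the default output names, not names customized through the output property.

```javascript
// Hypothetical lookup table (not part of Vega): Vega 1 default output names
// mapped to their Vega 2 equivalents, keyed by Vega 1 transform type.
const OUTPUT_RENAMES = {
  force:   {x: "layout_x", y: "layout_y"},
  geo:     {x: "layout_x", y: "layout_y"},
  geopath: {path: "layout_path"},
  link:    {path: "layout_path"},
  pie:     {startAngle: "layout_start", endAngle: "layout_end", midAngle: "layout_mid"},
  stack:   {y: "layout_start", y2: "layout_end"},
  treemap: {x: "layout_x", y: "layout_y", width: "layout_width", height: "layout_height"}
};

// Returns the Vega 2 name for a Vega 1 output value, or the input name
// unchanged when the table has no entry for it.
function renameOutput(transformType, v1Name) {
  const table = OUTPUT_RENAMES[transformType] || {};
  return Object.prototype.hasOwnProperty.call(table, v1Name) ? table[v1Name] : v1Name;
}
```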

Removed Transforms

The following transforms have been removed: array, copy, flatten, slice, truncate, unique, window, zip.

Use the lookup transform in place of zip:

  • type: "zip" → type: "lookup"
  • key: "data.id" → keys: [ "id" ]
  • with: "datasource" → on: "datasource" (name of the other datasource)
  • withKey: "data.id" → onKey: "id" (key in the other datasource)
  • as: "zipped" → as: [ "zipped" ] (resulting table)
  • default: { data: { field: defaultValue } } → default: { field: defaultValue }
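The zip-to-lookup mapping above is mechanical enough to script. The following sketch is one possible conversion, assuming a keyed zip (both key and withKey present). zipToLookup and stripDataPrefix are hypothetical helper names, not Vega functions.

```javascript
// Hypothetical helper: drops the Vega 1 "data." prefix from a field name.
function stripDataPrefix(field) {
  return typeof field === "string" && field.startsWith("data.")
    ? field.slice("data.".length)
    : field;
}

// Hypothetical converter (not part of Vega): maps a Vega 1 keyed zip
// transform definition to a Vega 2 lookup definition, property by property.
function zipToLookup(zip) {
  const lookup = {
    type: "lookup",
    on: zip.with,                          // name of the other datasource
    onKey: stripDataPrefix(zip.withKey),   // key in the other datasource
    keys: [stripDataPrefix(zip.key)],      // key(s) in this datasource
    as: [zip.as]                           // output field(s)
  };
  // Vega 1 wrapped default values in a "data" object; Vega 2 does not.
  if (zip.default && zip.default.data !== undefined) {
    lookup.default = zip.default.data;
  }
  return lookup;
}
```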

Value References

The group property of a value reference has been removed. Instead, all data and group lookups occur through the field property entirely, including indirect lookups. For example:

  • "field": "price" continues to pull the value of the price field from the current mark's data.
  • It is a shorthand for "field": {"datum": "price"}.
  • To use a property of the enclosing group mark, use "field": {"group": "width"}.
  • To use a property of the enclosing group mark's data, use "field": {"parent": "f"}.
  • To perform indirect lookups, combine these properties. For example, "field": {"datum": {"parent": "f"}} will first retrieve the value of the f field on the group mark's data. This is then used as the property name on the current mark's data object.
  • group and parent properties can be given an optional level to access grandparents and higher ancestors. By default, level = 1 (i.e. parents).
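The lookup rules above can be modeled in a few lines of plain JavaScript. This is an illustrative model only, not Vega's implementation: resolveField and the shape of the ctx object are assumptions made for the example, and the optional level property is left out.

```javascript
// Illustrative model of field resolution (not Vega's implementation).
// Assumed context shape:
//   ctx.datum  - the current mark's data object
//   ctx.group  - properties of the enclosing group mark
//   ctx.parent - the enclosing group mark's data object
function resolveField(ref, ctx) {
  // "field": "price" is shorthand for "field": {"datum": "price"}.
  if (typeof ref === "string") return ctx.datum[ref];

  // A nested reference is resolved first; its value becomes the property name.
  const nameOf = (inner) =>
    typeof inner === "string" ? inner : resolveField(inner, ctx);

  if ("datum" in ref) return ctx.datum[nameOf(ref.datum)];
  if ("group" in ref) return ctx.group[nameOf(ref.group)];
  if ("parent" in ref) return ctx.parent[nameOf(ref.parent)];
  throw new Error("Unsupported field reference");
}

// Example context: the group's data names the field to read from the datum.
const ctx = {
  datum: {price: 10},
  group: {width: 200},
  parent: {f: "price"}
};
const direct = resolveField("price", ctx);                   // datum.price
const indirect = resolveField({datum: {parent: "f"}}, ctx);  // datum[parent.f]
```

In this example both direct and indirect resolve to the same value, because the f field on the group's data holds the string "price".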

Scale lookups follow suit:

  • "scale": "x" uses the scale named x.
  • "scale": {"datum": "s"} uses the value of the current mark's s data field as the scale name.
  • "scale": {"parent": "t"} uses the value of the t data field on the current group's data as the scale name.
  • "scale": {"datum": {"parent": "t"}} first retrieves the t data field on the enclosing group's data, and uses that as the property name on the current mark's data object. The value of this property is used as the scale name.

When using a scale transform, an input (either a field, value, or signal) must also be specified. Vega 1 value references that only specify a scale should be updated to include field: "data" as well. For example, "text": {"scale": "xlabels"} becomes "text": {"scale": "xlabels", "field": "data"}.

Expression Language and Templates

For simplicity and security, Vega's expression language has been updated to be a restricted subset of JavaScript. One of the changes renames the d variable, often used in Formula and Filter transforms, to datum. Signals can be referred to directly by name within an expression. For example, "transform": [{"type": "filter", "test": "datum.x > hover"}] retains tuples that have an x property greater than the value of the hover signal.

The syntax of template value references now mirrors the expression language as well -- data values are available under the datum variable, and signal values can be referenced by name.
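Simple Vega 1 expressions can often be upgraded with a textual rewrite. The sketch below is a rough aid under stated assumptions, not a parser: upgradeExpression is a hypothetical name, it assumes that d appears only as the data variable, and it will also rewrite matches inside string literals. Review the results by hand.

```javascript
// Hypothetical helper (not part of Vega): rewrites Vega 1 expression text so
// that "d.data.x" and "d.x" both become "datum.x". The lookbehind keeps
// identifiers that merely end in "d" (e.g. "abcd.x") and property accesses
// such as "obj.d.x" from being rewritten.
function upgradeExpression(expr) {
  return expr
    .replace(/(?<![\w.$])d\.data\./g, "datum.")
    .replace(/(?<![\w.$])d\./g, "datum.");
}
```

Expressions that already use datum pass through unchanged, so the function can be applied to a partially upgraded spec.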

Runtime View API

The runtime view API has been modified to support streaming data values. The following code block demonstrates how to convert a call to Vega 1's view.data to Vega 2:

var values = [
  {"x": 1,  "y": 28}, {"x": 2,  "y": 55},
  {"x": 3,  "y": 43}, {"x": 4,  "y": 91},
  {"x": 5,  "y": 81}, {"x": 6,  "y": 53},
  {"x": 7,  "y": 19}, {"x": 8,  "y": 87}
];

// Vega 1
view.data({ table: values });

// Vega 2
view.data("table")
  .remove(function(d) { return true; }) // Remove all current values in "table"
  .insert(values);                      // Insert new values

// Vega 2 Batch Alternative
view.data({
  table: function(data) {
    data.remove(function(d) { return true; })
      .insert(values);
  }
});