Upgrading to 2.0
This wiki documents Vega version 2. For current Vega documentation, see vega.github.io/vega.
Vega 2.0 is the first major release since 1.0 was released in April 2013. Specifications created for previous versions of Vega are largely compatible with v2. However, changes to the specification language have been introduced in order to clarify confusing functionality and support significant new features, such as streaming data and interactive behaviours. This document guides you through the process of upgrading your Vega 1.x specifications to 2.0.
If you're working with Vega code or updating data transforms, see this page to read about some of the changes to Vega's internals and additional dev tools required.
Vega 2.0 no longer stores original data values under the `data` property. As a result, the `data.` prefix is removed, and raw data values can be accessed directly like so:
"data": [{
  "name": "table",
  "values": [{"x":1, "y": "red"}, {"x":2, "y": "green"}],
  "transform": [{"type": "filter", "test": "datum.x > 1"}]
}],
"marks": [{
  "from": {"data": "table"},
  "properties": {
    "enter": {
      "fill": {"field": "y"}
    }
  }
}]
Raw data values remain sandboxed from the outside environment. For example, if data values are specified at runtime, any additional properties derived within Vega (e.g., using data transforms) will not pollute the original data objects.
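When migrating a larger specification, removing the `data.` prefix by hand can be tedious. The following is a minimal sketch of a migration helper, not part of Vega itself. The function name `stripDataPrefix` is hypothetical, and the sketch assumes that field references appear only as string values of `field` properties or as string entries of `keys` arrays; other properties are left untouched.

```javascript
// Hypothetical migration helper (not a Vega API): returns a copy of a
// Vega 1 spec fragment with the "data." prefix removed from field references.
// Assumption: only `field` values and `keys` entries hold field references.
const FIELD_KEYS = new Set(["field", "keys"]);

function stripPrefix(ref) {
  // Only strings can carry the "data." prefix; other values pass through.
  return typeof ref === "string" ? ref.replace(/^data\./, "") : ref;
}

function stripDataPrefix(node, key) {
  if (Array.isArray(node)) {
    // Array entries inherit the key of the property that holds the array.
    return node.map((item) => stripDataPrefix(item, key));
  }
  if (node !== null && typeof node === "object") {
    const out = {};
    for (const k of Object.keys(node)) {
      out[k] = stripDataPrefix(node[k], k);
    }
    return out;
  }
  return FIELD_KEYS.has(key) ? stripPrefix(node) : node;
}

// Example: a Vega 1 style mark definition.
const migrated = stripDataPrefix({
  properties: { enter: { fill: { field: "data.y" } } }
});
console.log(JSON.stringify(migrated));
```

The helper copies rather than mutates its input, so the original specification stays available for comparison. Renamed properties (such as `keys` becoming `groupby`) are not handled here.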
The `index` field has been removed in favor of an `_id` field, to better support streaming data values. IDs are automatically assigned by Vega to uniquely identify data values, but may not run consecutively.
The old `stats` transform has been replaced with the new, more powerful `aggregate` transform:
"data": [{
  "name": "barley",
  "url": "data/barley.json",
  "transform": [
    {
      "type": "aggregate",
      "groupby": ["variety"],
      "summarize": [{
        "name": "yield",
        "ops": ["min", "max", "median"], // Aggregate statistics are stored in
        "as": ["ymin", "ymax", "ymed"]   // stat_field by default, e.g., min_yield
      }]
    },
    {"type": "sort", "by": "-ymed"}
  ]
}]
- The `output` property has been standardized. By default, the Bin transform returns the input data set, with an additional property, `bin`, that contains the binned value for the specified `field`. This property may be renamed by specifying an output parameter like so: `"output": {"bin": "b"}`.
- The `keys` property of the `facet` transform has been renamed as `groupby`.
- The `sort` property has been removed.
- An additional `summarize` property is available, akin to that found with the `aggregate` transform. It calculates summary statistics over the values within each facet, and stores the result on the corresponding facet.
- Any/all transforms can now follow a `facet` transform, and will operate over data values representing each facet (i.e., one record per facet).
- A new `transform` property allows you to specify a pipeline of transformations to be applied to the values within each facet.
- A `facet` transform is no longer required before a `stack` transform. However, it may still be necessary to create series (as in the `stacked_area` example).
- For clarity, the `point` property has been renamed as `groupby`.
- Similarly, the `height` property is now `field`.
- The `order` property has been replaced by the `sortby` property.
- Names of output values have changed (see `stack` in the next section).
{ type: "facet", keys: [ "data.x" ] },
{ type: "stats", value: "data.y" }
would be converted to
{ type: "aggregate", groupby: [ "x" ], summarize: { y: "sum" }}
For consistency, the `value` property of the `geopath`, `pie`, and `treemap` layouts has been renamed as `field`. The output values of visual encoding transforms have been changed to more clearly differentiate them from raw data values.
- `force`
  - `x` → `layout_x`
  - `y` → `layout_y`
- `geo`
  - `x` → `layout_x`
  - `y` → `layout_y`
- `geopath`
  - `path` → `layout_path`
- `link` → `linkpath`
  - `path` → `layout_path`
- `pie`
  - `startAngle` → `layout_start`
  - `endAngle` → `layout_end`
  - `midAngle` → `layout_mid`
- `stack`
  - `y` → `layout_start`
  - `y2` → `layout_end`
  - New `layout_mid` output value
- `treemap`
  - `x` → `layout_x`
  - `y` → `layout_y`
  - `width` → `layout_width`
  - `height` → `layout_height`
The names of these output values can still be changed using the `output` property on each transform.
The following transforms have been removed: `array`, `copy`, `flatten`, `slice`, `truncate`, `unique`, `window`, `zip`.
Use the `lookup` transform in place of `zip`:
- `type: "zip"` → `type: "lookup"`
- `key: "data.id"` → `keys: [ "id" ]`
- `with: "datasource"` → `on: "datasource"` (name of the other datasource)
- `withKey: "data.id"` → `onKey: "id"` (key in the other datasource)
- `as: "zipped"` → `as: [ "zipped" ]` (resulting table)
- `default: { data: { field: defaultValue } }` → `default: { field: defaultValue }`
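The property mapping above can be applied mechanically. The sketch below is a hypothetical helper (`zipToLookup` is not a Vega function) that converts a Vega 1 `zip` transform object into the equivalent `lookup` object. It assumes that `key`, `with`, `withKey`, and `as` are all present on the input, and that any `default` value nests its fields under `data` as shown in the list.

```javascript
// Hypothetical helper (not a Vega API): converts a Vega 1 "zip" transform
// definition into a Vega 2 "lookup" definition using the mapping above.
// Assumption: `key`, `with`, `withKey`, and `as` are all specified.
function stripData(field) {
  return field.replace(/^data\./, "");
}

function zipToLookup(zip) {
  if (zip.type !== "zip") {
    throw new Error("expected a transform of type \"zip\"");
  }
  const lookup = {
    type: "lookup",
    on: zip.with,                   // name of the other datasource
    onKey: stripData(zip.withKey),  // key in the other datasource
    keys: [stripData(zip.key)],     // now an array, without the data. prefix
    as: [zip.as]                    // now an array
  };
  if (zip.default !== undefined) {
    // Vega 1 nested default fields under `data`; Vega 2 does not.
    const d = zip.default;
    lookup.default = d && d.data !== undefined ? d.data : d;
  }
  return lookup;
}

const converted = zipToLookup({
  type: "zip",
  key: "data.id",
  with: "datasource",
  withKey: "data.id",
  as: "zipped",
  default: { data: { field: 0 } }
});
console.log(JSON.stringify(converted));
```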
The `group` property of a value reference has been removed. Instead, all data and group lookups occur through the `field` property entirely, including indirect lookups. For example:
- `"field": "price"` continues to pull the value of the `price` field from the current mark's data.
- It is a shorthand for `"field": {"datum": "price"}`.
- To use a property of the enclosing group mark, use `"field": {"group": "width"}`.
- To use a property of the enclosing group mark's data, use `"field": {"parent": "f"}`.
- To perform indirect lookups, combine these properties. For example, `"field": {"datum": {"parent": "f"}}` will first retrieve the value of the `f` field on the group mark's data. This is then used as the property name on the current mark's data object.
- `group` and `parent` properties can be given an optional `level` to access grandparents and higher ancestors. By default, `level = 1` (i.e. parents).
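To make the lookup rules above concrete, here is a small simulation of how a `field` reference might be resolved. This is an illustrative sketch and not Vega's actual implementation: the `resolveField` function and the shape of the `ctx` object (a `datum` plus an `ancestors` array, nearest group first) are assumptions made for this example, as is placing `level` alongside the `group` or `parent` property.

```javascript
// Illustrative model only (not Vega internals).
// ctx.datum     : data object of the current mark
// ctx.ancestors : enclosing groups, nearest first; each entry has
//                 `mark` (group mark properties) and `datum` (group data)
function resolveField(ref, ctx) {
  // A plain string is shorthand for {"datum": <string>}.
  if (typeof ref === "string") return ctx.datum[ref];

  // A nested reference is resolved first; its value becomes the property name.
  const nameOf = (n) => (typeof n === "string" ? n : resolveField(n, ctx));

  if (ref.datum !== undefined) return ctx.datum[nameOf(ref.datum)];

  const level = ref.level === undefined ? 1 : ref.level; // 1 = direct parent
  const ancestor = ctx.ancestors[level - 1];
  if (!ancestor) throw new Error("no ancestor at level " + level);

  if (ref.group !== undefined) return ancestor.mark[nameOf(ref.group)];
  if (ref.parent !== undefined) return ancestor.datum[nameOf(ref.parent)];
  throw new Error("unsupported field reference");
}

const ctx = {
  datum: { price: 10, cost: 7 },
  ancestors: [
    { mark: { width: 200 }, datum: { f: "cost" } },  // level 1
    { mark: { width: 400 }, datum: { f: "price" } }  // level 2
  ]
};

console.log(resolveField("price", ctx));                    // current mark's data
console.log(resolveField({ group: "width" }, ctx));         // enclosing group mark
console.log(resolveField({ datum: { parent: "f" } }, ctx)); // indirect lookup
```

In the indirect case, the inner `parent` reference yields `"cost"`, which is then used as the property name on the current mark's data.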
Scale lookups follow suit:
- `"scale": "x"` uses the scale named `x`.
- `"scale": {"datum": "s"}` uses the value of the current mark's `s` data field as the scale name.
- `"scale": {"parent": "t"}` uses the value of the `t` data field on the current group's data as the scale name.
- `"scale": {"datum": {"parent": "t"}}` first retrieves the `t` data field on the enclosing group's data, and uses that as the property name on the current mark's data object. The value of this property is used as the scale name.
When using a scale transform, an input (either a `field`, `value`, or `signal`) must also be specified. Vega 1 value references that only specify a `scale` should be updated to include `field: "data"` as well. For example, `"text": {"scale": "xlabels"}` becomes `"text": {"scale": "xlabels", "field": "data"}`.
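The update described above can be sketched as a small function over a single value reference. `addScaleInput` is a hypothetical name, not a Vega API. As an added assumption, references that set `band` are left alone, since they take their value from the scale's band width rather than from an input.

```javascript
// Hypothetical helper (not a Vega API): adds field: "data" to a Vega 1
// value reference that specifies only a scale.
// Assumption: references using `band` need no input and are not modified.
function addScaleInput(ref) {
  if (ref === null || typeof ref !== "object") return ref;
  const hasInput = ["field", "value", "signal"].some((k) => k in ref);
  if ("scale" in ref && !hasInput && !("band" in ref)) {
    return Object.assign({}, ref, { field: "data" });
  }
  return ref;
}

console.log(JSON.stringify(addScaleInput({ scale: "xlabels" })));
```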
For simplicity and security, Vega's expression language has been updated to be a restricted subset of JavaScript. One of the changes introduced renames the `d` variable, often used in Formula and Filter transforms, to `datum`. Signals can be referred to directly by name within the expression. For example, `"transform": [{"type": "filter", "test": "datum.x > hover"}]` retains tuples that have an `x` property greater than the value of the `hover` signal.
The syntax of template value references now mirrors the expression language as well: data values are available under the `datum` variable, and signal values can be referenced by name.
The runtime view API has been modified to support streaming data values. The following code block demonstrates how to convert a call to Vega 1's `view.data` to Vega 2:
var values = [
  {"x": 1, "y": 28}, {"x": 2, "y": 55},
  {"x": 3, "y": 43}, {"x": 4, "y": 91},
  {"x": 5, "y": 81}, {"x": 6, "y": 53},
  {"x": 7, "y": 19}, {"x": 8, "y": 87}
];

// Vega 1
view.data({ table: values });

// Vega 2
view.data("table")
  .remove(function(d) { return true; }) // Remove all current values in "table"
  .insert(values);                      // Insert new values

// Vega 2 Batch Alternative
view.data({
  table: function(data) {
    data.remove(function(d) { return true; })
      .insert(values);
  }
});
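If many call sites relied on the Vega 1 replace-everything behaviour, the remove-then-insert pattern shown above can be wrapped in a small function. `replaceData` is a hypothetical helper, not part of the Vega API. It assumes only what the example above shows: that `view.data(name)` returns an object whose `remove` and `insert` methods can be chained.

```javascript
// Hypothetical convenience wrapper (not a Vega API): mimics Vega 1's
// view.data({name: values}) by clearing the data set and inserting new values.
// Assumption: view.data(name) returns a chainable object with
// remove(predicate) and insert(values), as in the example above.
function replaceData(view, name, values) {
  view.data(name)
    .remove(function () { return true; }) // drop every existing value
    .insert(values);                      // add the replacement values
  return view;
}
```

A call such as `replaceData(view, "table", values)` then stands in for the Vega 1 call `view.data({ table: values })`. A subsequent `view.update()` is likely still needed to re-render; check the runtime documentation for your version.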