We have recently proposed visual embedding as an operational model for automatically generating and evaluating visualizations (see also our original proposal presented at Vis'11). In the paper, we provide three examples of visual embedding. This repository contains the data and source code used to generate these examples. To demonstrate discrete visual embedding, we crowdsource perceptual distances using Amazon's Mechanical Turk service. You can access the data and source code for our crowdsourcing experiments in src/mturkExperiments. We also provide the resulting perceptual distance matrix for the shapes shown in Figure 4 as a text file (data/polygonKernel.txt). Note that we extend this idea of learning kernels of perceptual similarity using crowdsourcing in our recent paper.
What is Visual Embedding?
Visual embedding is an operational model for automated visualization design and evaluation. Although researchers have proposed numerous guidelines and heuristics, a formal framework for design and evaluation is still elusive. Instead, conducting a posteriori user studies is still the primary tool for assessing a visualization’s effectiveness. Using theoretical models presents another, albeit less explored, approach. Model-based approaches that integrate perceptual considerations into the design process in a measurable, data-driven form can accelerate visual design and complement the summative nature of user studies.
An Operational Framework for Visualization
Developing a theory of visualization that is both descriptive and generative is, however, difficult. The space of visualizations is large, and the use of visualization spans many issues in human perception and cognition. Additional factors, such as interaction techniques, can significantly affect a visualization’s success. Given our current knowledge, visualization design is an underconstrained problem. So, there is value in developing simpler, constrained models, each addressing certain aspects of visualization while ignoring others, like spotlights on a theater stage. The main advantage of this approach is that it is conducive to developing operational models.
In this context, we introduce visual embedding as a model for visualization construction. We define a visualization as a function that maps from a domain of data points to a range of visual primitives. We claim a visualization is “good” if the embedded visual elements preserve structures present in the data domain. A function meeting this criterion constitutes a visual embedding of the data points (see Figure 1).
Figure 1. Visual embedding is a function that preserves structures in the data (domain) within the embedded perceptual space (range).
Perceptual Painting of Data Relations
Visual embedding is based on the premise that good visualizations should convey structures or relations (e.g., similarities, distances, etc.) in data with corresponding, proportional perceptual relations. For example, if two data values are similar to each other in a user-defined or some other sense, they should be encoded with perceptually similar values of visual variables (color, shape, etc.) and vice versa. The underlying assumption here is, however, that degrees of perceptual affinities between and within visual encoding variables are available to us.
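The proportionality idea above can be made concrete by comparing pairwise distances in the data with pairwise distances among the assigned visual values. The sketch below is illustrative only: the function names, the choice of a Kruskal-style normalized stress as the mismatch measure, and the toy inputs are assumptions, not code from this repository.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def normalized_stress(data_dist, perceptual_dist):
    """Mismatch between two distance matrices after optimal rescaling.

    Returns 0.0 when the perceptual distances are exactly proportional
    to the data distances; larger values mean the structure is
    preserved less well. (Hypothetical measure, chosen for illustration.)
    """
    n = len(data_dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    d = [data_dist[i][j] for i, j in pairs]
    p = [perceptual_dist[i][j] for i, j in pairs]
    # Least-squares scale factor, so that only proportionality matters.
    scale = sum(a * b for a, b in zip(d, p)) / sum(b * b for b in p)
    residual = sum((a - scale * b) ** 2 for a, b in zip(d, p))
    return math.sqrt(residual / sum(a * a for a in d))


# Toy example: three data values and two candidate encodings on a
# one-dimensional visual variable assumed to be perceptually linear.
data = pairwise_distances([(0.0,), (1.0,), (3.0,)])
faithful = pairwise_distances([(0.0,), (2.0,), (6.0,)])   # proportional
scrambled = pairwise_distances([(0.0,), (6.0,), (2.0,)])  # order broken
```

Under these assumptions, `normalized_stress(data, faithful)` is zero because every visual distance is twice the corresponding data distance, while the scrambled encoding scores noticeably worse.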
Visualizations Beyond Visual Encoding
Data visualization is, more than anything else, the production of a user experience and, as such, should eventually embrace and utilize all human sensing capacities. Embedding spaces, as discussed here, need not be restricted to visual stimuli. They could be any perceptual channel or combinations thereof, such as color, texture, shape, icon, tactile, and audio features. For example, we could, in theory, apply our formulation to construct sonifications for people with visual disabilities.
What About Interaction?
Visual embedding can also be useful in modeling an important class of interactions that couple data with its visual representation so that when users interactively change one, they can observe a corresponding change in the other. This class of interactions, which we call dynamic visualization interactions, is important for running what-if analyses on data by directly modifying the data attributes or instances and their visualizations. Visual embedding immediately gives us criteria for which dynamic interactions should be considered effective (Figure 2): 1) a change in data (e.g., induced by the user through direct manipulation) should cause a proportional change in its visual representation and 2) a perceptual change in a visual encoding value (e.g., by dragging nodes in a scatter plot or changing the height of a bar in a bar chart) should be reflected by a change in data value that is proportional to the perceived change. However, to enable a dynamic interaction on a visualization, we need to have access to both the visualization function f and its inverse f⁻¹. The visual embedding model also clearly suggests when implementing the back mapping (f⁻¹) to the data space can be challenging. See the penultimate section of our recent paper for more discussion on this.
Figure 2. Visual embedding can also be useful for modeling dynamic visualization interactions. Dynamic visualization interactions bidirectionally couple data and its visual encoding. The goal is to enable users to "experiment" by dynamically changing the visualizations and attribute values of a dataset (see, e.g., Media for Thinking the Unthinkable, Stop Drawing Dead Fish, DimpVis for further motivation).
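As a minimal sketch of the two-way coupling described above, consider a bar chart whose bar height encodes a value through a linear scale. The class name `LinearScale` and its fields are hypothetical, and the sketch assumes that perceived bar height is proportional to pixel height, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearScale:
    """Hypothetical visualization function f and its inverse for bar height."""

    domain_min: float  # smallest data value
    domain_max: float  # largest data value
    range_min: float   # bar height (pixels) for domain_min
    range_max: float   # bar height (pixels) for domain_max

    def __post_init__(self):
        if self.domain_min == self.domain_max or self.range_min == self.range_max:
            raise ValueError("scale must be non-degenerate to be invertible")

    def forward(self, value: float) -> float:
        """f: data value -> bar height."""
        t = (value - self.domain_min) / (self.domain_max - self.domain_min)
        return self.range_min + t * (self.range_max - self.range_min)

    def inverse(self, height: float) -> float:
        """Back mapping: bar height -> data value."""
        t = (height - self.range_min) / (self.range_max - self.range_min)
        return self.domain_min + t * (self.domain_max - self.domain_min)


scale = LinearScale(domain_min=0.0, domain_max=100.0,
                    range_min=0.0, range_max=400.0)
dragged_height = scale.forward(25.0) + 40.0   # user drags the bar up 40 px
updated_value = scale.inverse(dragged_height)  # data follows proportionally
```

Here the back mapping is easy because the scale is one-to-one. A many-to-one encoding, such as binning values into a handful of shapes, has no such inverse, which is one way the back mapping can become challenging.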
Examples of Visual Embedding
The basic framework proposed above can be used to generate and evaluate visualizations on the basis of both the underlying data and—through the choice of preserved structure—desired perceptual tasks.
Coloring Neural Tracts
Our first example is coloring neural-fiber pathways estimated from a diffusion-imaging brain dataset. Given a set of tracts, we first compute distances (or dissimilarities) between pairs of pathways. To do this, we use a simple measure that quantifies the similarity of two given neural pathways’ trajectories. We then construct the visualization function by embedding the distances in the CIELAB color space using multidimensional scaling. Figure 3 shows the obtained colorings; perceptual variations in color reflect the spatial variations in the tracts.
Figure 3. Coloring neural tracts: (left) the internal capsule and (right) the corpus callosum. We colored them using visual embedding in CIELAB, a perceptually uniform color space. Perceptual variations in color reflect the spatial variations in the tracts.
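The embedding step can be sketched as follows, assuming a precomputed matrix of pairwise tract distances. This is not the repository's implementation: it uses classical (Torgerson) multidimensional scaling as a stand-in for whichever MDS variant was actually used, and the function names, the `radius` and `center` parameters, and the absence of any display-gamut check are all assumptions made for illustration.

```python
import numpy as np


def classical_mds(dist, dims=3):
    """Classical (Torgerson) MDS: coordinates whose Euclidean distances
    approximate `dist`. Assumes at least `dims` items."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]           # largest first
    top_vals = np.clip(eigvals[order], 0.0, None)      # drop negative noise
    return eigvecs[:, order] * np.sqrt(top_vals)


def embed_in_cielab(dist, radius=40.0, center=(50.0, 0.0, 0.0)):
    """Map items to (L*, a*, b*) triples so that color differences are
    proportional to the input distances.

    `radius` and `center` are hypothetical defaults; a real pipeline
    would also need to keep the colors inside the display gamut.
    """
    coords = classical_mds(dist, dims=3)
    coords -= coords.mean(axis=0)
    max_norm = np.linalg.norm(coords, axis=1).max()
    if max_norm > 0:
        coords *= radius / max_norm   # uniform scaling keeps proportions
    return coords + np.asarray(center)
```

Because the rescaling is uniform, Euclidean distances in CIELAB stay proportional to the embedded distances, which is what lets color variation track spatial variation among the tracts.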
Scatter Plotting With Shapes
In the above example, the visual space, CIELAB, is a subspace of the continuous three-dimensional space. Embedding in continuous spaces is relatively well studied, thanks partly to the fact that centuries of mathematical analysis (i.e., calculus) left us with a powerful toolset operating on continuous spaces.
How does visual embedding apply if the range of the visualization function is discrete?
Here, a toy problem demonstrates embedding in a discrete visual space. We want to assign polygonal icons from the discrete polygonal-shape space Vp (Figure 4) to a given set of 2D points so that the points’ spatial proximity is redundantly encoded via the assigned polygons’ perceptual proximity.
Figure 4. A palette of polygonal shapes.
Though simple, this setup is realistic: redundant visual encoding is common in visualization. Alternatively, we could have used icons to convey attributes of other dimensions of the data points. Unlike the coloring example, here we lack a perceptual model for estimating perceived distance. So, we first obtained a crowdsourced estimate of the perceptual distances between the elements of Vp, using Amazon’s Mechanical Turk (Figure 5).
Figure 5. The task interface of our crowdsourcing experiment on Amazon’s Mechanical Turk website.
The study participants saw all possible pairs, including identical ones. We used errant ratings of identical polygon pairs to filter “spammers.” After this initial filtering, we normalized each participant’s ratings and averaged the ratings across the users. Finally, we normalized the averaged ratings and accumulated the results in a distance matrix. Figure 6 shows the resulting perceptual-distance matrix and its two-dimensional projection.
Figure 6. (Left) Estimated perceptual-distance matrix. Darker colors indicate closer distances. (Right) Two-dimensional projection of the shapes based on the distance matrix. (Click on the figure to see an interactive version of the distance matrix and its projection.)
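The aggregation steps described above (filter, normalize per participant, average, normalize again) can be sketched as below. The details are assumptions rather than the repository's exact procedure: ratings are taken to be dissimilarity scores where the lowest value means "identical", per-participant normalization is min-max, the spam rule rejects anyone who rates an identical pair above a threshold, and the averaged matrix is symmetrized. The name `aggregate_ratings` and its parameters are hypothetical.

```python
import numpy as np


def aggregate_ratings(ratings, max_identical_rating=0.0):
    """Combine per-participant dissimilarity ratings into one matrix.

    ratings: array-like of shape (participants, n_shapes, n_shapes).
    Returns (distance_matrix, keep_mask).
    """
    ratings = np.asarray(ratings, dtype=float)

    # 1) Filter "spammers": identical pairs sit on the diagonal and
    #    should receive the lowest dissimilarity rating.
    identical = np.diagonal(ratings, axis1=1, axis2=2)
    keep = (identical <= max_identical_rating).all(axis=1)
    kept = ratings[keep]
    if kept.shape[0] == 0:
        raise ValueError("every participant was filtered out")

    # 2) Min-max normalize each remaining participant to [0, 1].
    lo = kept.min(axis=(1, 2), keepdims=True)
    hi = kept.max(axis=(1, 2), keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    normalized = (kept - lo) / span

    # 3) Average across participants and symmetrize.
    mean = normalized.mean(axis=0)
    mean = 0.5 * (mean + mean.T)

    # 4) Normalize the averaged ratings and zero the diagonal.
    if mean.max() > 0:
        mean = mean / mean.max()
    np.fill_diagonal(mean, 0.0)
    return mean, keep
```

With this rule, two participants who use different rating ranges but agree on the relative ordering contribute identical normalized ratings, while a participant who rates identical shapes as dissimilar is dropped before averaging.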
After estimating the perceptual-distance matrix, we pose the embedding problem as maximum a posteriori estimation in a Markov random field (an undirected graphical model) to find an embedding of a simple 2D point set in Vp. Figure 7 shows the result, where the shape assignment reflects the data points’ clustering, as we desired.
Figure 7. Visual embedding in a discrete visual space. We embed the planar data points in Vp. The shape assignment reflects the data points’ spatial variation and clustering.
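To give a feel for the discrete embedding problem, the sketch below treats each data point as a node whose label is a shape index and penalizes every pair of points by the squared gap between their data distance and the perceptual distance of their assigned shapes. Minimizing this pairwise energy corresponds to MAP estimation in a fully connected pairwise Markov random field. The energy form and the inference method are assumptions: iterated conditional modes (ICM) is used here as a simple stand-in, it only finds a local optimum, and it is not necessarily the solver used for Figure 7. Data distances are assumed to be scaled to the same range as the shape distances.

```python
def embedding_energy(labels, data_dist, shape_dist):
    """Sum of squared gaps between data and perceptual distances."""
    n = len(labels)
    return sum(
        (data_dist[i][j] - shape_dist[labels[i]][labels[j]]) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def icm_embed(data_dist, shape_dist, max_sweeps=50):
    """Assign one shape index per data point with iterated conditional
    modes: repeatedly give each point the shape that minimizes its own
    pairwise costs while all other assignments are held fixed."""
    n, k = len(data_dist), len(shape_dist)
    labels = [0] * n  # start with every point using shape 0
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local_cost(label):
                return sum(
                    (data_dist[i][j] - shape_dist[label][labels[j]]) ** 2
                    for j in range(n)
                    if j != i
                )
            best = min(range(k), key=local_cost)
            if local_cost(best) < local_cost(labels[i]):
                labels[i] = best
                changed = True
        if not changed:  # no point wants to switch: local optimum
            break
    return labels


# Toy input: two tight clusters of two points each, and three shapes
# where shapes 0 and 1 look alike and shape 2 looks different.
data_dist = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
shape_dist = [[0.0, 0.2, 1.0], [0.2, 0.0, 0.9], [1.0, 0.9, 0.0]]
labels = icm_embed(data_dist, shape_dist)
```

On this toy input, points in the same cluster end up with the same shape and the two clusters receive the most dissimilar pair of shapes. Larger problems would likely need better initialization or a stronger inference method.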
Evaluating Tensor Glyphs
Can we evaluate visualizations without running user studies? Models, of any sort, are particularly useful if they have predictive and evaluative power, going beyond being descriptive.
With our model, given suitable data and perceptual metrics, we can assess competing visualization techniques’ structure-preserving qualities. Here, we compare superquadrics and cuboids, two alternative glyphs used in visualizing second-order diffusion tensors (Figure 8).
Figure 8. A superquadric and a cuboid glyph, used for visualizing the same tensor field. The insets represent the diagonal tensor D.
You can think of a second-order diffusion tensor as the covariance matrix of a three-dimensional Gaussian distribution. So, superquadrics and cuboids, together with 3D position, can also be seen as glyphs for visualizing three-dimensional Gaussian distributions.
We conduct a simple experiment: We rotate the diagonal tensor D = [2.1 0 0; 0 2 0; 0 0 1] around the eigenvector (0, 0, 1) associated with its smallest eigenvalue, in five-degree increments. We measure the change in the tensor value as the Euclidean distance between the reference tensor and the rotated tensor. We approximate the perceptual change in the corresponding glyph visualizations with the sum of the magnitudes of the optical flow at each pixel in the image domain. We average the optical-flow distances over nine viewpoints uniformly sampled on a circumscribed sphere under fixed lighting and rendering conditions. The trends in Figure 9 suggest that superquadrics represent the change in the data more faithfully (that is, preserve the structure better) than cuboids. This supports the visualization design choice motivating superquadrics.
Figure 9. Changes in the size of D and its superquadric and cuboid representations with respect to rotations around the tensor’s smallest eigenvector. The tensor size and the superquadric glyph’s appearance follow a similar trend; the cuboid glyph’s appearance differs. This suggests that superquadric glyphs better preserved the structure in the data.
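The data-side half of this experiment is easy to reproduce. The sketch below rotates D about the z-axis and measures how far the rotated tensor moves from the reference. It rests on several assumptions: the Euclidean distance between tensors is read as the Frobenius norm of their difference, the rotation is sampled in five-degree steps, and the 0 to 90 degree range is chosen only for illustration. The function names are hypothetical, and the perceptual half (rendering glyphs and computing optical flow) is not shown.

```python
import numpy as np

# Reference diagonal tensor from the experiment described above.
D = np.diag([2.1, 2.0, 1.0])


def rotate_about_z(tensor, degrees):
    """Rotate a second-order tensor about the z-axis: R T R^T."""
    t = np.radians(degrees)
    c, s = np.cos(t), np.sin(t)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    return rot @ tensor @ rot.T


def tensor_change(tensor, degrees):
    """Frobenius distance between a tensor and its rotated copy
    (an assumed reading of 'Euclidean distance' between tensors)."""
    return float(np.linalg.norm(rotate_about_z(tensor, degrees) - tensor))


# Five-degree steps; the 0..90 range is an illustrative assumption.
angles = np.arange(0, 95, 5)
changes = [tensor_change(D, a) for a in angles]
```

For a rotation about the z-axis, this distance works out to sqrt(2) * |2.1 - 2| * |sin(angle)|, so it grows smoothly from zero and peaks at 90 degrees. A glyph preserves structure well, in the sense used here, if its image-space change follows a similar curve.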
Researchers have proposed general and specific models of visualization. In order to put the visual embedding model in a historical context, we discuss a representative subset of earlier work here.
Jock Mackinlay introduced one of the most influential systems for automatically generating visualizations. Following Jacques Bertin’s aphorism of graphics as a language for the eye, Mackinlay formulated visualizations as sentences in a graphical language. He argued that good visualizations will meet the criteria of expressiveness and effectiveness. A visualization meets the expressiveness criterion if it faithfully presents the data, without implying false inferences. Effectiveness concerns how accurately viewers can decode the chosen visual-encoding variables; it’s informed by prior studies in graphical perception (for example, by William Cleveland and Robert McGill). Mackinlay’s APT (A Presentation Tool) employed a composition algebra over a basis set of graphical primitives derived from Bertin’s encodings to generate alternative visualizations. The system then selected the visualization that best satisfied formal expressiveness and effectiveness criteria. APT didn’t explicitly take into account user tasks or interaction. To this end, Steven Roth and his colleagues extended Mackinlay’s work with new types of interactive presentations. Similarly, Stephen Casner built on APT by incorporating user tasks to guide visualization generation. Some of these ideas now support visualization recommendation in Tableau, a commercial visualization tool.
Donald House and his colleagues’ automatic visualization system integrated user preferences using combinatorial optimization. Genetic algorithms refined a population of visualizations in response to user ratings. In contrast to this empirical approach, Daniel Pineo and Colin Ware used a computational model of the retina and primary visual cortex to automatically evaluate and optimize visualizations. These two works represent two antipodal approaches to automated visualization. The former probably provides a fast-track engineering solution while the latter may lead to a better understanding of how visualizations work perceptually and cognitively.
Jarke van Wijk argued for first modeling a perceptual domain (for example, luminance or shape perception) and then optimizing for some perceptual goal according to that model. Visual embedding can be viewed as a reusable template within van Wijk’s discussion on perceptually optimal visualizations. If we chose a motto for visual embedding, it would be “visualization as a perceptual painting of structure in data.” In this sense, visual embedding’s perceptual-structure preservation criterion closes the cycle, explicitly linking Mackinlay’s expressiveness and effectiveness criteria while providing a recipe to achieve both.
While the above background provides a useful backdrop for visual embedding, this background itself would benefit from a better understanding of the context of data visualizations. What does this larger context constitute? It has essentially three elements: 1) the perceptual stimuli (visual encoding, chart size, chart type, etc.), 2) the data visualized and 3) the cognitive state of the user, of which task is part. To this end, we recommend close reading of the following three classics:
The visual embedding examples above are intended to be only a proof of concept, including our approach for estimating perceptual distance through crowdsourcing. Visualizations live in context; crowdsourcing-based estimated perceptual distances can’t capture all the perceptual interactions of every context. Also, running large-scale crowdsourcing studies can be difficult. Because we used a small discrete space, we could present every pair of embedding-space points to each study participant. Running a similar experiment with thousands of discrete visual primitives will require larger studies and more sophisticated analysis methods for estimating a distance matrix. Similarly, large-scale embedding can be slow; however, many heuristics, such as restricting pairwise distances to local neighborhoods and sampling, can ameliorate the problem.
On the basis of these challenges and insights derived from our examples, we envision the following research directions.
A Standard Library of Visual Spaces
The visualization community could benefit from a standard library of visual spaces with associated perceptual measures. The library would be a practical resource for constructing useful defaults for visualizations. Perceptual kernels is a step in this direction.
Probabilistic Models of Visualizations
Implementing visual embedding with graphical models provides an opportunity to explore probabilistic models of visualization design spaces. This might prove fruitful because several “optimum” visualizations often exist. Using graphical models can also help express high-level structures in data. Such models might also make it easier to incorporate aesthetic or subjective criteria into automatic visualization generation.
To use visual embedding to evaluate visualizations, a primary challenge is to devise and validate appropriate image-space measures (for example, optical flow) to approximate perceptual distances.
Finally, developing tools that facilitate construction of visualizations under our model is crucial. Two challenges stand out. The first is to develop a visualization language that lets users express and create visual embeddings without implementing an optimization algorithm. This language should integrate libraries of visualization defaults for different data and task domains. It might also benefit from crowd-programming ideas to enable automated support for running crowdsourced evaluations. The second challenge is to develop a visualization debugger in the spirit of the tensor glyph example, letting users get runtime feedback about visualization quality. We envision future visualization development environments integrating such languages and debuggers.