# findVIS

The following scripts load the *visualization publication dataset* (http://vispubdata.org) and provide search features within this data.

## Regex Search

Search the abstracts with a regular expression. Matches are highlighted in bold, and each result is followed by its DOI link.

In [2]:
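regexQuery = "treemap(s)?"

import csv, re
from IPython.display import display, Markdown

# Compile once; matching is case-insensitive, so "Treemap" and "treemaps" both hit.
regex = re.compile(regexQuery, re.IGNORECASE)

display(Markdown("## Results"))

count = 0
# Assumption: the CSV is UTF-8 encoded. 'ansi' is a Windows-only codec alias
# and turns non-ASCII characters such as curly quotes into mojibake.
with open('data/vispub.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, delimiter=';')
    # Columns used here: row[0] = title, row[1] = abstract, row[2] = DOI
    for row in reader:
        # Require all three columns, since row[2] is read below
        if len(row) > 2 and regex.search(row[1]):
            count += 1
            display(Markdown("### ["+str(count)+"] "+row[0]))
            # Raw string so \g<0> (the whole match) is not parsed as a string escape
            display(Markdown(regex.sub(r"**\g<0>**", row[1])))
            print("http://dx.doi.org/"+row[2])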
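
The highlighting step can be tried in isolation, without the CSV file. The following is a minimal sketch; the helper name `highlight` and the sample sentence are illustrative assumptions and not part of the notebook.

```python
import re

def highlight(pattern: str, text: str) -> str:
    """Wrap every case-insensitive match of `pattern` in Markdown bold markers."""
    regex = re.compile(pattern, re.IGNORECASE)
    # \g<0> refers to the whole match, so the original casing is preserved
    return regex.sub(r"**\g<0>**", text)

# Hypothetical sample text, not taken from the dataset
sample = "Treemaps can provide such detail; a standard treemap algorithm is adapted."
print(highlight("treemap(s)?", sample))
# prints: **Treemaps** can provide such detail; a standard **treemap** algorithm is adapted.
```

Because the optional group `(s)?` is greedy, the plural form is bolded as a whole rather than leaving a trailing "s" outside the markers.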


## Results

### [1] Visualizing Business Data with Generalized Treemaps

Business data is often presented using simple business graphics. These familiar visualizations are effective for providing overviews, but fall short for the presentation of large amounts of detailed information. **Treemaps** can provide such detail, but are often not easy to understand. We present how standard **treemap** algorithms can be adapted such that the results mimic familiar business graphics. Specifically, we present the use of different layout algorithms per level, a number of variations of the squarified algorithm, the use of variable borders, and the use of non-rectangular shapes. The combined use of these leads to histograms, pie charts and a variety of other styles

http://dx.doi.org/10.1109/TVCG.2006.200


### [2] LOD Map - A Visual Interface for Navigating Multiresolution Volume Visualization

In multiresolution volume visualization, a visual representation of level-of-detail (LOD) quality is important for us to examine, compare, and validate different LOD selection algorithms. While traditional methods rely on ultimate images for quality measurement, we introduce the LOD map - an alternative representation of LOD quality and a visual interface for navigating multiresolution data exploration. Our measure for LOD quality is based on the formulation of entropy from information theory. The measure takes into account the distortion and contribution of multiresolution data blocks. A LOD map is generated through the mapping of key LOD ingredients to a **treemap** representation. The ordered **treemap** layout is used for relative stable update of the LOD map when the view or LOD changes. This visual interface not only indicates the quality of LODs in an intuitive way, but also provides immediate suggestions for possible LOD improvement through visually-striking features. It also allows us to compare different views and perform rendering budget control. A set of interactive techniques is proposed to make the LOD adjustment a simple and easy task. We demonstrate the effectiveness and efficiency of our approach on large scientific and medical data sets

http://dx.doi.org/10.1109/TVCG.2006.159


### [3] Visual Analysis of Network Traffic for Resource Planning, Interactive Monitoring, and Interpretation of Security Threats

The Internet has become a wild place: malicious code is spread on personal computers across the world, deploying botnets ready to attack the network infrastructure. The vast number of security incidents and other anomalies overwhelms attempts at manual analysis, especially when monitoring service provider backbone links. We present an approach to interactive visualization with a case study indicating that interactive visualization can be applied to gain more insight into these large data sets. We superimpose a hierarchy on IP address space, and study the suitability of **Treemap** variants for each hierarchy level. Because viewing the whole IP hierarchy at once is not practical for most tasks, we evaluate layout stability when eliding large parts of the hierarchy, while maintaining the visibility and ordering of the data of interest.

http://dx.doi.org/10.1109/TVCG.2007.70522


### [4] Visualizing Changes of Hierarchical Data using Treemaps

While the **treemap** is a popular method for visualizing hierarchical data, it is often difficult for users to track layout and attribute changes when the data evolve over time. When viewing the **treemaps** side by side or back and forth, there exist several problems that can prevent viewers from performing effective comparisons. Those problems include abrupt layout changes, a lack of prominent visual patterns to represent layouts, and a lack of direct contrast to highlight differences. In this paper, we present strategies to visualize changes of hierarchical data using **treemaps**. A new **treemap** layout algorithm is presented to reduce abrupt layout changes and produce consistent visual patterns. Techniques are proposed to effectively visualize the difference and contrast between two **treemap** snapshots in terms of the map items' colors, sizes, and positions. Experimental data show that our algorithm can achieve a good balance in maintaining a **treemap**'s stability, continuity, readability, and average aspect ratio. A software tool is created to compare **treemaps** and generate the visualizations. User studies show that the users can better understand the changes in the hierarchy and layout, and more quickly notice the color and size differences using our method.

http://dx.doi.org/10.1109/TVCG.2007.70529


### [5] Browsing Zoomable Treemaps: Structure-Aware Multi-Scale Navigation Techniques

**Treemaps** provide an interesting solution for representing hierarchical data. However, most studies have mainly focused on layout algorithms and paid limited attention to the interaction with **treemaps**. This makes it difficult to explore large data sets and to get access to details, especially to those related to the leaves of the trees. We propose the notion of zoomable **treemaps** (ZTMs), an hybridization between **treemaps** and zoomable user interfaces that facilitates the navigation in large hierarchical data sets. By providing a consistent set of interaction techniques, ZTMs make it possible for users to browse through very large data sets (e.g., 700,000 nodes dispatched amongst 13 levels). These techniques use the structure of the displayed data to guide the interaction and provide a way to improve interactive navigation in **treemaps**.

http://dx.doi.org/10.1109/TVCG.2007.70540


### [6] Balloon Focus: a Seamless Multi-Focus+Context Method for Treemaps

The **treemap** is one of the most popular methods for visualizing hierarchical data. When a **treemap** contains a large number of items, inspecting or comparing a few selected items in a greater level of detail becomes very challenging. In this paper, we present a seamless multi-focus and context technique, called Balloon Focus, that allows the user to smoothly enlarge multiple **treemap** items served as the foci, while maintaining a stable **treemap** layout as the context. Our method has several desirable features. First, this method is quite general and can be used with different **treemap** layout algorithms. Second, as the foci are enlarged, the relative positions among all items are preserved. Third, the foci are placed in a way that the remaining space is evenly distributed back to the non-focus **treemap** items. When Balloon Focus enlarges the focus items to a maximum degree, the above features ensure that the **treemap** will maintain a consistent appearance and avoid any abrupt layout changes. In our algorithm, a DAG (Directed Acyclic Graph) is used to maintain the positional constraints, and an elastic model is employed to govern the placement of the **treemap** items. We demonstrate a **treemap** visualization system that integrates data query, manual focus selection, and our novel multi-focus+context technique, Balloon Focus, together. A user study was conducted. Results show that with Balloon Focus, users can better perform the tasks of comparing the values and the distribution of the foci.

http://dx.doi.org/10.1109/TVCG.2008.114


### [7] Spatially Ordered Treemaps

Existing **treemap** layout algorithms suffer to some extent from poor or inconsistent mappings between data order and visual ordering in their representation, reducing their cognitive plausibility. While attempts have been made to quantify this mismatch, and algorithms proposed to minimize inconsistency, solutions provided tend to concentrate on one-dimensional ordering. We propose extensions to the existing squarified layout algorithm that exploit the two-dimensional arrangement of **treemap** nodes more effectively. Our proposed spatial squarified layout algorithm provides a more consistent arrangement of nodes while maintaining low aspect ratios. It is suitable for the arrangement of data with a geographic component and can be used to create tessellated cartograms for geovisualization. Locational consistency is measured and visualized and a number of layout algorithms are compared. CIELab color space and displacement vector overlays are used to assess and emphasize the spatial layout of **treemap** nodes. A case study involving locations of tagged photographs in the Flickr database is described.

http://dx.doi.org/10.1109/TVCG.2008.165


### [8] The Shaping of Information by Visual Metaphors

The nature of an information visualization can be considered to lie in the visual metaphors it uses to structure information. The process of understanding a visualization therefore involves an interaction between these external visual metaphors and the user's internal knowledge representations. To investigate this claim, we conducted an experiment to test the effects of visual metaphor and verbal metaphor on the understanding of tree visualizations. Participants answered simple data comprehension questions while viewing either a **treemap** or a node-link diagram. Questions were worded to reflect a verbal metaphor that was either compatible or incompatible with the visualization a participant was using. The results suggest that the visual metaphor indeed affects how a user derives information from a visualization. Additionally, we found that the degree to which a user is affected by the metaphor is strongly correlated with the user's ability to answer task questions correctly. These findings are a first step towards illuminating how visual metaphors shape user understanding, and have significant implications for the evaluation, application, and theory of visualization.

http://dx.doi.org/10.1109/TVCG.2008.171


### [9] A 3D treemap approach for analyzing the classificatory distribution in patent portfolios

Due to the complexity of the patent domain and the huge amount of data, advanced interactive visual techniques are needed to support the analysis of large patent collections and portfolios. In this paper we present a new approach for visualizing the classificatory distribution of patent collections among the International Patent Classification (IPC) - today's most important internationally agreed patent classification system with about 70.000 categories. Our approach is based on an interactive three-dimensional **treemap** overlaid with adjacency edge bundles.

http://dx.doi.org/10.1109/VAST.2008.4677380


### [10] ResultMaps: Visualization for Search Interfaces

Hierarchical representations are common in digital repositories, yet are not always fully leveraged in their online search interfaces. This work describes ResultMaps, which use hierarchical **treemap** representations with query string-driven digital library search engines. We describe two lab experiments, which find that ResultsMap users yield significantly better results over a control condition on some subjective measures, and we find evidence that ResultMaps have ancillary benefits via increased understanding of some aspects of repository content. The ResultMap system and experiments contribute an understanding of the benefits, direct and indirect, of the ResultMap approach to repository search visualization.

http://dx.doi.org/10.1109/TVCG.2009.176


### [11] Perceptual Guidelines for Creating Rectangular Treemaps

**Treemaps** are space-filling visualizations that make efficient use of limited display space to depict large amounts of hierarchical data. Creating perceptually effective **treemaps** requires carefully managing a number of design parameters including the aspect ratio and luminance of rectangles. Moreover, **treemaps** encode values using area, which has been found to be less accurate than judgments of other visual encodings, such as length. We conduct a series of controlled experiments aimed at producing a set of design guidelines for creating effective rectangular **treemaps**. We find no evidence that luminance affects area judgments, but observe that aspect ratio does have an effect. Specifically, we find that the accuracy of area comparisons suffers when the compared rectangles have extreme aspect ratios or when both are squares. Contrary to common assumptions, the optimal distribution of rectangle aspect ratios within a **treemap** should include non-squares, but should avoid extremes. We then compare **treemaps** with hierarchical bar chart displays to identify the data densities at which length-encoded bar charts become less effective than area-encoded **treemaps**. We report the transition points at which **treemaps** exhibit judgment accuracy on par with bar charts for both leaf and non-leaf tree nodes. We also find that even at relatively low data densities **treemaps** result in faster comparisons than bar charts. Based on these results, we present a set of guidelines for the effective use of **treemaps** and suggest alternate approaches for **treemap** layout.

http://dx.doi.org/10.1109/TVCG.2010.186


### [12] Stacking Graphic Elements to Avoid Over-Plotting

An ongoing challenge for information visualization is how to deal with over-plotting forced by ties or the relatively limited visual field of display devices. A popular solution is to represent local data density with area (bubble plots, **treemaps**), color (heatmaps), or aggregation (histograms, kernel densities, pixel displays). All of these methods have at least one of three deficiencies: 1) magnitude judgments are biased because area and color have convex downward perceptual functions, 2) area, hue, and brightness have relatively restricted ranges of perceptual intensity compared to length representations, and/or 3) it is difficult to brush or link to individual cases when viewing aggregations. In this paper, we introduce a new technique for visualizing and interacting with datasets that preserves density information by stacking overlapping cases. The overlapping data can be points or lines or other geometric elements, depending on the type of plot. We show real-dataset applications of this stacking paradigm and compare them to other techniques that deal with over-plotting in high-dimensional displays.

http://dx.doi.org/10.1109/TVCG.2010.197


### [13] DICON: Interactive Visual Analysis of Multidimensional Clusters

Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a **treemap**-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis.

http://dx.doi.org/10.1109/TVCG.2011.188


### [14] Product Plots

We propose a new framework for visualising tables of counts, proportions and probabilities. We call our framework product plots, alluding to the computation of area as a product of height and width, and the statistical concept of generating a joint distribution from the product of conditional and marginal distributions. The framework, with extensions, is sufficient to encompass over 20 visualisations previously described in fields of statistical graphics and infovis, including bar charts, mosaic plots, **treemaps**, equal area plots and fluctuation diagrams.

http://dx.doi.org/10.1109/TVCG.2011.227


### [15] Capturing the Design Space of Sequential Space-filling Layouts

We characterize the design space of the algorithms that sequentially tile a rectangular area with smaller, fixed-surface, rectangles. This space consist of five independent dimensions: Order, Size, Score, Recurse and Phrase. Each of these dimensions describe a particular aspect of such layout tasks. This class of layouts is interesting, because, beyond encompassing simple grids, tables and trees, it also includes all kinds of **treemaps** involving the placement of rectangles. For instance, Slice and dice, Squarified, Strip and Pivot layouts are various points in this five dimensional space. Many classic statistics visualizations, such as 100% stacked bar charts, mosaic plots and dimensional stacking, are also instances of this class. A few new and potentially interesting points in this space are introduced, such as spiral **treemaps** and variations on the strip layout. The core algorithm is implemented as a JavaScript prototype that can be used as a layout component in a variety of InfoViz toolkits.

http://dx.doi.org/10.1109/TVCG.2012.205


### [16] Organizing Search Results with a Reference Map

We propose a method to highlight query hits in hierarchically clustered collections of interrelated items such as digital libraries or knowledge bases. The method is based on the idea that organizing search results similarly to their arrangement on a fixed reference map facilitates orientation and assessment by preserving a user's mental map. Here, the reference map is built from an MDS layout of the items in a Voronoi **treemap** representing their hierarchical clustering, and we use techniques from dynamic graph layout to align query results with the map. The approach is illustrated on an archive of newspaper articles.

http://dx.doi.org/10.1109/TVCG.2012.250


### [17] Sketchy Rendering for Information Visualization

We present and evaluate a framework for constructing sketchy style information visualizations that mimic data graphics drawn by hand. We provide an alternative renderer for the Processing graphics environment that redefines core drawing primitives including line, polygon and ellipse rendering. These primitives allow higher-level graphical features such as bar charts, line charts, **treemaps** and node-link diagrams to be drawn in a sketchy style with a specified degree of sketchiness. The framework is designed to be easily integrated into existing visualization implementations with minimal programming modification or design effort. We show examples of use for statistical graphics, conveying spatial imprecision and for enhancing aesthetic and narrative qualities of visualization. We evaluate user perception of sketchiness of areal features through a series of stimulus-response tests in order to assess users' ability to place sketchiness on a ratio scale, and to estimate area. Results suggest relative area judgment is compromised by sketchy rendering and that its influence is dependent on the shape being rendered. They show that degree of sketchiness may be judged on an ordinal scale but that its judgement varies strongly between individuals. We evaluate higher-level impacts of sketchiness through user testing of scenarios that encourage user engagement with data visualization and willingness to critique visualization design. Results suggest that where a visualization is clearly sketchy, engagement may be increased and that attitudes to participating in visualization annotation are more positive. The results of our work have implications for effective information visualization design that go beyond the traditional role of sketching as a tool for prototyping or its use for an indication of general uncertainty.

http://dx.doi.org/10.1109/TVCG.2012.262


### [18] Nmap: A Novel Neighborhood Preservation Space-filling Algorithm

Space-filling techniques seek to use as much as possible the visual space to represent a dataset, splitting it into regions that represent the data elements. Amongst those techniques, **Treemaps** have received wide attention due to its simplicity, reduced visual complexity, and compact use of the available space. Several different **Treemap** algorithms have been proposed, however the core idea is the same, to divide the visual space into rectangles with areas proportional to some data attribute or weight. Although pleasant layouts can be effectively produced by the existing techniques, most of them do not take into account relationships that might exist between different data elements when partitioning the visual space. This violates the distance-similarity metaphor, that is, close rectangles do not necessarily represent similar data elements. In this paper, we propose a novel approach, called Neighborhood **Treemap** (Nmap), that seeks to solve this limitation by employing a slice and scale strategy where the visual space is successively bisected on the horizontal or vertical directions and the bisections are scaled until one rectangle is defined per data element. Compared to the current techniques with the same similarity preservation goal, our approach presents the best results while being two to three orders of magnitude faster. The usefulness of Nmap is shown by two applications involving the organization of document collections and the construction of cartograms illustrating its effectiveness on different scenarios.

http://dx.doi.org/10.1109/TVCG.2014.2346276


### [19] Visualizing Statistical Mix Effects and Simpson's Paradox

We discuss how “mix effects” can surprise users of visualizations and potentially lead them to incorrect conclusions. This statistical issue (also known as “omitted variable bias” or, in extreme cases, as “Simpson's paradox”) is widespread and can affect any visualization in which the quantity of interest is an aggregated value such as a weighted sum or average. Our first contribution is to document how mix effects can be a serious issue for visualizations, and we analyze how mix effects can cause problems in a variety of popular visualization techniques, from bar charts to **treemaps**. Our second contribution is a new technique, the “comet chart,” that is meant to ameliorate some of these issues.

http://dx.doi.org/10.1109/TVCG.2014.2346297


### [20] How do People Make Sense of Unfamiliar Visualizations?: A Grounded Model of Novice's Information Visualization Sensemaking

In this paper, we would like to investigate how people make sense of unfamiliar information visualizations. In order to achieve the research goal, we conducted a qualitative study by observing 13 participants when they endeavored to make sense of three unfamiliar visualizations (i.e., a parallel-coordinates plot, a chord diagram, and a **treemap**) that they encountered for the first time. We collected data including audio/video record of think-aloud sessions and semi-structured interview; and analyzed the data using the grounded theory method. The primary result of this study is a grounded model of NOvice's information VIsualization Sensemaking (NOVIS model), which consists of the five major cognitive activities: 1 encountering visualization, 2 constructing a frame, 3 exploring visualization, 4 questioning the frame, and 5 floundering on visualization. We introduce the NOVIS model by explaining the five activities with representative quotes from our participants. We also explore the dynamics in the model. Lastly, we compare with other existing models and share further research directions that arose from our observations.

http://dx.doi.org/10.1109/TVCG.2015.2467195


### [21] PowerSet: A Comprehensive Visualization of Set Intersections

When analyzing a large amount of data, analysts often define groups over data elements that share certain properties. Using these groups as the unit of analysis not only reduces the data volume, but also allows detecting various patterns in the data. This involves analyzing intersection relations between these groups, and how the element attributes vary between these intersections. This kind of set-based analysis has various applications in a variety of domains, due to the generic and powerful notion of sets. However, visualizing intersections relations is challenging because their number grows exponentially with the number of sets. We present a novel technique based on **Treemaps** to provide a comprehensive overview of non-empty intersections in a set system in a scalable way. It enables gaining insight about how elements are distributed across these intersections as well as performing fine-grained analysis to explore and compare their attributes both in overview and in detail. Interaction allows querying and filtering these elements based on their set memberships. We demonstrate how our technique supports various use cases in data exploration and analysis by providing insights into set-based data, beyond the limits of state-of-the-art techniques.

http://dx.doi.org/10.1109/TVCG.2016.2598496


### [22] Optimizing Hierarchical Visualizations with the Minimum Description Length Principle

In this paper we examine how the Minimum Description Length (MDL) principle can be used to efficiently select aggregated views of hierarchical datasets that feature a good balance between clutter and information. We present MDL formulae for generating uneven tree cuts tailored to **treemap** and sunburst diagrams, taking into account the available display space and information content of the data. We present the results of a proof-of-concept implementation. In addition, we demonstrate how such tree cuts can be used to enhance drill-down interaction in hierarchical visualizations by implementing our approach in an existing visualization tool. Validation is done with the feature congestion measure of clutter in views of a subset of the current DMOZ web directory, which contains nearly half million categories. The results show that MDL views achieve near constant clutter level across display resolutions. We also present the results of a crowdsourced user study where participants were asked to find targets in views of DMOZ generated by our approach and a set of baseline aggregation methods. The results suggest that, in some conditions, participants are able to locate targets (in particular, outliers) faster using the proposed approach.

http://dx.doi.org/10.1109/TVCG.2016.2598591


### [23] How ideas flow across multiple social groups

Tracking how correlated ideas flow within and across multiple social groups facilitates the understanding of the transfer of information, opinions, and thoughts on social media. In this paper, we present IdeaFlow, a visual analytics system for analyzing the lead-lag changes within and across pre-defined social groups regarding a specific set of correlated ideas, each of which is described by a set of words. To model idea flows accurately, we develop a random-walk-based correlation model and integrate it with Bayesian conditional cointegration and a tensor-based technique. To convey complex lead-lag relationships over time, IdeaFlow combines the strengths of a bubble tree, a flow map, and a timeline. In particular, we develop a Voronoi-**treemap**-based bubble tree to help users get an overview of a set of ideas quickly. A correlated-clustering-based layout algorithm is used to simultaneously generate multiple flow maps with less ambiguity. We also introduce a focus+context timeline to explore huge amounts of temporal data at different levels of time granularity. Quantitative evaluation and case studies demonstrate the accuracy and effectiveness of IdeaFlow.

http://dx.doi.org/10.1109/VAST.2016.7883511
