Data Visualization Show & Tell @Feedzai

# [Width-Scale Bar Charts for Data with Large Value Range](https://diglib.eg.org/handle/10.2312/evs20201056)

M. Höhn, M. Wunderlich, K. Ballweg, T. von Landesberger @EGEV2020

In [None]:
%load_ext autoreload
%autoreload 2

from utils import load_data
from charts import ssb_chart, omm_chart, wsb_chart, bar_chart, show_population, show_log_error
from theme import set_alt_aesthetic

In [None]:
set_alt_aesthetic()

In [None]:
show_population()

# 💪 Motivation

Datasets with a wide range of values (bar heights ▂▁▇) can be difficult to visualize through linear bar charts.

So, let's try to improve the **readability** of the values with a new design for bar charts!

# 💪 Motivation

Personally, I found the idea of dividing numbers into several encodings (spoiler!) quite interesting and *simple*.

The focus of this presentation will be on the bar chart designs contained in this paper.

# 🎨 Width-Scale Bar Chart

- Each value ($v$) is split into two parts, a mantissa ($m$) and an exponent ($e$), following scientific notation: $v = m \times 10^e$, with $1 \leq m < 10$ and $e \in \mathbb{Z}$ (the set of integers)

- 3 visual variables: height, width, and color

- $m$ is mapped to the height

- The domain of the Y-axis is [0, 10]
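As a quick illustration of the split described above, here is a minimal sketch in plain Python. The function name `decompose` is hypothetical (it is not part of this notebook's `utils` or `charts` modules), and it assumes strictly positive values.

```python
import math
from typing import Tuple


def decompose(v: float) -> Tuple[float, int]:
    """Split a positive value into (mantissa, exponent) so that v = m * 10**e and 1 <= m < 10."""
    if v <= 0:
        raise ValueError("this sketch only handles strictly positive values")
    e = math.floor(math.log10(v))
    m = v / 10 ** e
    # Guard against floating-point rounding near exact powers of ten.
    if m >= 10:
        m, e = m / 10, e + 1
    elif m < 1:
        m, e = m * 10, e - 1
    return m, e


for value in (7, 4200, 0.05):
    mantissa, exponent = decompose(value)
    print(f"{value} = {mantissa:.2f} x 10^{exponent}")
```

The mantissa always lands in $[1, 10)$, which is why a Y-axis domain of [0, 10] is enough for every bar.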

# 🎨 Width-Scale Bar Chart

- $e$ is mapped to both the width and the color (double encoding)

- Larger values of $e$ are represented by darker colors and wider bars
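The double encoding can be sketched as one function that turns an exponent into both a bar width and a color. Everything here is an assumption made for illustration: the name `encode_exponent`, the pixel widths, the linear width interpolation, the default exponent range, and the five light-to-dark hex colors. The paper's actual mapping may differ.

```python
# Hypothetical light-to-dark yellow/orange/red palette (illustrative values only).
PALETTE = ["#ffffb2", "#fecc5c", "#fd8d3c", "#f03b20", "#bd0026"]


def encode_exponent(e, e_min=0, e_max=4, min_width=10.0, max_width=50.0):
    """Map an exponent to (bar width, color): larger e gives a wider, darker bar."""
    if not e_min <= e <= e_max:
        raise ValueError(f"exponent {e} is outside [{e_min}, {e_max}]")
    # Normalize e to [0, 1]; a single-exponent range maps to 0.
    t = (e - e_min) / (e_max - e_min) if e_max > e_min else 0.0
    width = min_width + t * (max_width - min_width)
    color = PALETTE[round(t * (len(PALETTE) - 1))]
    return width, color


for exponent in range(5):
    print(exponent, encode_exponent(exponent))
```

Because the width and the color are driven by the same normalized exponent, they always agree, which is the point of encoding $e$ twice.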

# 🎨 Width-Scale Bar Chart

- For color, a yellow-orange (or orange-red?) sequential multi-hue scheme is used

- The authors pointed out two reasons for this choice:

    1. Our visual system has maximum sensitivity to changes in luminance (perceived brightness) in this type of scheme

    2. This type of scheme is colorblind-friendly (*contrast* is achieved)

- **Off-topic**: The [*colorspace*](http://colorspace.r-forge.r-project.org/index.html) R package is a great resource for learning about color

In [None]:
data = load_data(show_unicode=True)
data

In [None]:
wsb_chart(data)

# 📝 Evaluation

This design was contrasted with the following designs:

- Linear Bar Chart (LIN)
- Log Bar Chart (LOG)
- Order of Magnitude Markers (OMM)
- Scale-Stack Bar Chart (SSB)

In [None]:
bar_chart(data, yscale="linear") | bar_chart(data, yscale="log")

In [None]:
omm_chart(data)

In [None]:
ssb_chart(data)

# 📝 Evaluation

- Data: $0 < e \leq 4$
- Number of participants: 136 (115 after removing 21)
- Between-subject study: 1 design × 4 tasks × 6 repetitions = 24 tasks per participant
- Metrics: log error ($e_{\mathrm{log}} = \log_{10}\left(\frac{\mathrm{response}_{v}}{\mathrm{encoded}_{v}}\right)$) for the _Value_ and _Ratio_ tasks, binary error ($1 - \mathrm{accuracy}$) for the _Sort_ and _Trend_ tasks, and response time
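To make the log error concrete, here is a minimal sketch of the formula as written above. The function name `log_error` is hypothetical and separate from this notebook's `show_log_error` helper. It assumes both values are strictly positive.

```python
import math


def log_error(response: float, encoded: float) -> float:
    """Return log10(response / encoded): 0 is a perfect answer, +/-1 is off by one order of magnitude."""
    if response <= 0 or encoded <= 0:
        raise ValueError("both values must be strictly positive")
    return math.log10(response / encoded)


# A participant reads 1000 from a bar that encodes 100.
print(log_error(1000, 100))
```

The sign shows the direction of the mistake: positive for an overestimate, negative for an underestimate. A response that is ten times too large and one that is ten times too small have the same magnitude of error.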

In [None]:
show_log_error()


# 📝 Evaluation

**Tasks**:

- Value: "read the value"
- Sort: "sort all values in ascending order"
- Ratio: "determine the ratio of two values"
- Trend: "identify the trend in the data: linear, logarithmic, exponential, or none"

# Let's take a look...

<figure>
  <img src="study-results-hohn_et_al_2020.png" alt="Evaluation results" style="width:95%">
  <figcaption>Source: Höhn et al., 2020</figcaption>
</figure>

# 💯 Results

Ranking:

- Value: **WSB**≺LOG≺OMM≼LIN≺SSB
- Sort: SSB≼**WSB**≼LOG≼OMM≺LIN
- Ratio: SSB≼**WSB**≺LOG≼OMM≺LIN
- Trend: LIN≼SSB≺**WSB**≺OMM≼LOG

# Thanks! 📊

The Width-Scale Bar Chart is an alternative to the typical bar chart for visualizing data with a wide range of values (bar heights).

- [Paper](https://diglib.eg.org/handle/10.2312/evs20201056)
- [Presentation @EGEV2020](https://youtu.be/edIPfDIH1p0?t=1064)
- [Order of Magnitude Markers paper](http://cs.swan.ac.uk/~csmark/publications/2014_Order_of_Magnitude_Markers.html)
- [Scale-Stack Bar Chart paper](https://www.researchgate.net/publication/263008873_Scale-Stack_Bar_Charts)