# The Lie Factor$^1$

## Definition

The "Lie Factor" is a value that describes the relation between the size of the effect shown in a graphic and the size of the effect in the underlying data.

Edward Tufte, a professor at Yale University, defined the “Lie Factor” in his book “The Visual Display of Quantitative Information” in 1983. [Tufte, 1991]

He states the principle that

> The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.

This results in the following formula:
$$
\text{Lie Factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}
$$
where
$$
\text{size of effect} = \frac{|\text{second value} - \text{first value}|}{\text{first value}}
$$

In other words, the Lie Factor is “the size of an effect shown in a graph divided by the actual size of the effect in the data on which the graph is based”. [Craven, 2001]

## Description

To ensure the integrity of a graphic, its Lie Factor should have a value between 0.95 and 1.05. A value outside this range indicates a substantial (and often intended) distortion, far beyond minor inaccuracies (e.g. those caused by plotting).

For example, if the Lie Factor had the value 1.02, the graphic would not be described as a “lying” one, since jitter in the graphic or another minor error is the more likely cause of the distortion. On the other hand, if the factor had the value 1.45, we can generally assume that the producer intentionally distorted the graphic. If the Lie Factor is greater than 1, the graphic is said to be “overstating”. If it is less than 1, the graphic is “understating”.
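The thresholds described above can be sketched as a small helper. This is only an illustration: the function name `classify_lie_factor` and the label strings are hypothetical, and the 0.95 to 1.05 band is taken directly from the text.

```python
def classify_lie_factor(value, low=0.95, high=1.05):
    """Label a Lie Factor using the 0.95-1.05 tolerance band from the text.

    The name and the returned labels are illustrative assumptions.
    """
    if value < low:
        return "understating"
    if value > high:
        return "overstating"
    # Inside the band: small deviations are attributed to plotting inaccuracies
    return "acceptable"


print(classify_lie_factor(1.02))  # acceptable
print(classify_lie_factor(1.45))  # overstating
```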





1: https://infovis-wiki.net/wiki/Lie_Factor

# Example
![](https://miro.medium.com/max/700/1*uI-z9s9_URfjzpQc3FCDxw.png)

In this chart from the Argentinian TV channel C5N, it looks like Argentina is testing for COVID-19 at almost the same rate as the United States (EEUU). However, by inspecting the numbers, we immediately observe that Argentina is doing significantly worse than the US. Let's calculate the Lie Factor:

* The Argentinian bar is about 80 pixels tall. The American bar is about 100 pixels tall.
* The size of the effect shown in the graphic is then (100-80)/80 = 0.25, and the size of the effect in the data is (7000-330)/330 ≈ 20.2.
* The Lie Factor is thus 0.25 / 20.2 ≈ 0.0124.

Since the Lie Factor is far below 1, we can conclude that the graphic is understating.
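The same arithmetic can be checked in a few lines of Python. This is a minimal sketch: the helper name `size_of_effect` is an assumption, and the pixel heights and test counts are the approximate values read off the chart above.

```python
def size_of_effect(first, second):
    """Relative change from the first value to the second, per the definition."""
    return abs(second - first) / first


# Approximate bar heights in pixels (Argentina, then US)
graphic_effect = size_of_effect(80, 100)
# Values printed on the chart (Argentina, then US)
data_effect = size_of_effect(330, 7000)

lie_factor_c5n = graphic_effect / data_effect
print(round(lie_factor_c5n, 4))  # 0.0124
```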

# Exercise
Find the Lie Factors (outside the range [0.95 : 1.05]) in the following charts.
To measure pixel coordinates, you can use the screenshot tool (Command-Shift-4) on Mac. On Windows, you can save a screenshot from the Snipping Tool (Win+Shift+S) and look at its properties to see its size in pixels. On Linux, use the screenshot tool of your choice (scrot and flameshot are popular), and use the `file` command to see its dimensions.
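If you would rather read the dimensions of a saved screenshot from Python, the following sketch works for PNG files using only the standard library. It assumes the screenshot was saved as a PNG. The function name `png_size` and the file name in the usage comment are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) in pixels, read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        header = f.read(24)
    # Bytes 0-7: signature, 8-11: chunk length, 12-15: chunk type "IHDR"
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Bytes 16-23: width and height as big-endian unsigned 32-bit integers
    return struct.unpack(">II", header[16:24])


# Usage (hypothetical file name):
# width, height = png_size("bar_screenshot.png")
```

Cropping the screenshot tightly around a single bar makes the returned width or height a direct measurement of that bar.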

![](https://www.callingbullshit.org/tools/img/GTAI_average_hours.gif)

In [7]:
# Bar lengths in pixels: each measured coordinate minus 191,
# the offset shared by all bars (presumably where the bars start).
Romania = 472 - 191
UK = 450 - 191
Germany = 429 - 191
Poland = 401 - 191
Netherlands = 395 - 191
Slovak_Republic = 395 - 191
Czech = 391 - 191
EU_28 = 381 - 191
Spain = 381 - 191
Hungary = 365 - 191
Sweden = 332 - 191
Belgium = 305 - 191
Italy = 300 - 191
France = 268 - 191

In [8]:
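def lie_factor(graphic_1, graphic_2, data_1, data_2):
    '''
    Compares value 1 against value 2 (the baseline).
    A result below 1 means their difference is understated in the graphic,
    a result above 1 means it is overstated.
    '''
    # Relative size of the effect as drawn (e.g. bar lengths in pixels)
    graph_effect = abs(graphic_1 - graphic_2) / graphic_2
    # Relative size of the effect in the underlying data
    data_effect = abs(data_1 - data_2) / data_2

    return graph_effect / data_effect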

In [9]:
lie_factor(Romania, UK, 41.2, 40.8)

8.66409266409254

In [10]:
lie_factor(Romania, France, 41.2, 37.4)

26.075187969924784

In [14]:
lie_factor(Czech, EU_28, 39.7, 39.5)

10.394736842105115

![](https://peltiertech.com/images/2011-11/Ybroken.png)

In [11]:
lie_factor(200, 50, 35, 1)

0.08823529411764706

![](https://pbs.twimg.com/media/CzenChZXAAASXED?format=png&name=small)

In [12]:
'''
The problem with this chart is that the bars have equal length, so their difference is 0.
The Lie Factor is then always 0, whatever the data say, so it is not informative here.
'''
lie_factor(100, 100, 100, 99)

0.0