# Hanging Rootogram

![Hanging rootogram](https://datavizproject.com/wp-content/uploads/2016/06/DVP_1_100-83.png)

Comparing observed data with a theoretical distribution on an ordinary histogram can be difficult: small frequencies are dominated by the larger ones, and it is hard to perceive the pattern of differences between the histogram bars and the curve. To address these problems, John Tukey introduced the hanging rootogram in 1971 (also called Tukey’s hanging rootogram). In this visualization the comparison is made easier by ‘hanging’ the observed bars from the theoretical curve, so that discrepancies are read against the horizontal axis rather than against a sloping curve. As in the ordinary rootogram, the vertical axis shows the square root of the frequencies, which draws attention to discrepancies in the tails of the distribution.
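
The construction described above can be sketched in a few lines. This is a minimal illustration and not a reference implementation: it assumes a normal reference distribution with known parameters, and the function name `hanging_rootogram_bars`, the bin count and the random seed are choices made here for the example. Expected counts per bin come from the normal CDF, each bar's top sits at the square root of the expected count, and its bottom sits lower by the square root of the observed count.

```python
import numpy as np
from scipy.stats import norm


def hanging_rootogram_bars(data, mu, sigma, bins=10):
    """Return bin edges, bar tops and bar bottoms on the square-root scale.

    Hypothetical helper written for this example only.
    """
    observed, edges = np.histogram(data, bins=bins)
    # Expected count per bin under N(mu, sigma): n * P(left < X <= right)
    expected = len(data) * np.diff(norm.cdf(edges, mu, sigma))
    tops = np.sqrt(expected)            # bars hang from the fitted curve
    bottoms = tops - np.sqrt(observed)  # a bottom at 0 means a perfect fit
    return edges, tops, bottoms


rng = np.random.default_rng(0)  # fixed seed, used only for reproducibility
sample = rng.normal(10, 0.3, 200)
edges, tops, bottoms = hanging_rootogram_bars(sample, mu=10, sigma=0.3)
```

A bar whose bottom lies below zero marks a bin with more observations than expected, and a bottom above zero marks a bin with fewer. The bars can be drawn with `ax.bar(edges[:-1], tops - bottoms, width=np.diff(edges), bottom=bottoms, align="edge")`.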

 

It is a variation on histograms and Pareto charts that combines observed and predicted values in a simple way: the bars show the observed data, while the line shows the expected distribution and indicates that the underlying variable changes continuously.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
# scipy.stats.norm.pdf is used in place of matplotlib.mlab.normpdf,
# which is no longer available in recent Matplotlib releases
from scipy.stats import norm

import matplotlib as mpl
mpl.style.use(['fivethirtyeight'])

%matplotlib inline

fig, ax = plt.subplots(1, 2)
mu = 10
sig = 0.3
my_data = np.random.normal(mu, sig, 200)
x = np.linspace(9, 11, 100)

# Plot the data twice: once as a plain histogram for comparison,
# and once as the rootogram.
# The trick is to modify the second histogram so that its bars hang
# from the expected distribution curve:

# Left panel: ordinary density histogram with the expected normal curve
ax[0].hist(my_data, density=True)
ax[0].plot(x, norm.pdf(x, mu, sig))

# Right panel: same histogram, but on a square-root scale.
# Note: square roots are taken of densities here, not of raw counts.
ax[1].hist(my_data, density=True)
ax[1].plot(x, np.sqrt(norm.pdf(x, mu, sig)))

for a in ax:
    a.set_ylim(bottom=-0.2)
    a.set_xlim(9, 11)
    a.hlines(0, 9, 11, linestyle="--")

for rectangle in ax[1].patches:
    # square root of the expected density in the middle of the bar
    mid = rectangle.get_x() + rectangle.get_width() / 2.
    exp = np.sqrt(norm.pdf(mid, mu, sig))

    # rescale the bar and hang its top from the expected curve
    root_height = np.sqrt(rectangle.get_height())
    rectangle.set_height(root_height)
    rectangle.set_y(exp - root_height)

    ax[1].plot(mid, exp, "ro")

ax[0].set_title("histogram")
ax[1].set_title("hanging rootogram")
plt.tight_layout()