# Visualize amounts with dot plots and heatmaps


This is the second installment in a series of blog posts where we reproduce plots from Claus Wilke’s book, *Fundamentals of Data Visualization.*

This page demonstrates how to recreate the dot plots and heatmaps found in the [Visualizing amounts](https://clauswilke.com/dataviz/visualizing-amounts.html#dot-plots-and-heatmaps)
chapter of the book. We will use the Bokeh `circle()` and `rect()` [glyphs](https://docs.bokeh.org/en/latest/docs/user_guide/topics/categorical.html#heatmaps) to create the dot plots and heatmaps.


In [None]:
from bokeh.io import output_notebook

output_notebook()  # render plots inline in notebook

## Dot plots

The plots in this sub-section show the life expectancy of countries in the Americas in 2007.

The `circle()` glyph is used to create the dot plots.

#### Data preparation

In [None]:
# import the relevant libraries
import pandas as pd

In [None]:
file = "../data/csv_files/life_expectancy.csv"
df = pd.read_csv(file)

# select only the relevant columns
df = df.loc[:, ["country", "2007"]]

americas = (
    "Argentina",
    "Bolivia",
    "Brazil",
    "Canada",
    "Chile",
    "Colombia",
    "Costa Rica",
    "Cuba",
    "Dominican Republic",
    "Ecuador",
    "El Salvador",
    "Guatemala",
    "Haiti",
    "Honduras",
    "Jamaica",
    "Mexico",
    "Nicaragua",
    "Panama",
    "Paraguay",
    "Peru",
    "Puerto Rico",
    "Trinidad and Tobago",
    "United States",
    "Uruguay",
    "Venezuela",
)

# keep only the rows for countries in the Americas
df = df[df["country"].isin(americas)].reset_index(drop=True)
df = df.rename(columns={"2007": "years"})
df["years"] = df["years"].round()

df
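
The filtering steps above can be tried without the CSV file. The following is a minimal sketch on a small in-memory frame. The values are made up for illustration and are not taken from `life_expectancy.csv`, and `wanted` is a hypothetical stand-in for the `americas` tuple.

```python
import pandas as pd

# made-up values for illustration only (not the real dataset)
raw = pd.DataFrame(
    {
        "country": ["Canada", "Chile", "France"],
        "2002": [70.2, 60.3, 50.1],
        "2007": [80.4, 78.6, 75.2],
    }
)

wanted = ("Canada", "Chile")  # hypothetical stand-in for the americas tuple

subset = (
    raw.loc[:, ["country", "2007"]]  # keep only the relevant columns
    .loc[lambda d: d["country"].isin(wanted)]  # keep only the wanted countries
    .rename(columns={"2007": "years"})
    .reset_index(drop=True)
)
subset["years"] = subset["years"].round()  # round to whole years

print(subset)
```

Chaining the steps keeps the intermediate frames out of the namespace, but the result is the same as the step-by-step version used in this post.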

#### Plotting

In [None]:
from bokeh.plotting import figure, show

In [None]:
# plot by country in alphabetical order

# sort dataframe by "country" column in descending order
df = df.sort_values("country", ascending=False)

# create figure object
p = figure(
    title="Figure 6.13 Life expectancy",  # plot title
    height=400,  # plot height
    y_range=df.country,  # categorical range on the y-axis
    x_axis_label="life expectancy (years)",
    sizing_mode="stretch_width",  # make plot width responsive to screen size
)

# create dot plot
p.circle(
    x="years",  # x-axis column name
    y="country",  # y-axis column name
    source=df,  # data source for x and y axis
    size=8,  # circle size
)

# plot customization

# remove line color and minor ticks in x-axis
p.xaxis.minor_tick_out = 0
p.xaxis.axis_line_color = None

# remove line color in y-axis
p.yaxis.axis_line_color = None


show(p)  # display plot

In [None]:
# plot by life expectancy in descending order

# sort dataframe by "years" column in ascending order
df = df.sort_values("years")

p = figure(
    title="Figure 6.11 Life expectancy",
    height=400,
    y_range=df.country,
    x_axis_label="life expectancy (years)",
    sizing_mode="stretch_width",
)

p.circle(x="years", y="country", source=df, size=8)

p.xaxis.minor_tick_out = 0
p.xaxis.axis_line_color = None
p.yaxis.axis_line_color = None

show(p)
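
Both plots rely on how Bokeh lays out a categorical axis: the factors passed to `y_range` are placed from the bottom of the axis upward, so a frame sorted in ascending order ends up with its largest value at the top. Here is a minimal sketch of that behaviour, using made-up values and hypothetical country labels:

```python
import pandas as pd
from bokeh.plotting import figure

# made-up values for illustration only
demo = pd.DataFrame({"country": ["A", "B", "C"], "years": [75, 61, 80]})
demo = demo.sort_values("years")  # ascending order of "years"

# the sorted country names become the categorical factors of the y-axis
p = figure(height=200, y_range=list(demo["country"]))

# factors are drawn bottom-to-top, so reverse them to read the axis top-to-bottom
top_to_bottom = list(reversed(p.y_range.factors))
print(top_to_bottom)
```

This is also why the alphabetical plot sorts the country names in descending order: the first name in the alphabet must come last in `y_range` to appear at the top.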

You can further customize your `circle()` plot with additional parameters such as:

- `alpha`
- `color`
- `legend_field`

For more information on the `circle()` glyph, see the reference documentation [here](https://docs.bokeh.org/en/latest/docs/reference/plotting/figure.html#bokeh.plotting.figure.circle).
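
As a rough sketch of how the parameters listed above fit together, the example below colors the dots by a grouping column and adds a legend for it. The `group` column, its values, and the two hex colors are assumptions made for this illustration; the life expectancy data used in this post has no such column.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# made-up values for illustration only
data = {
    "country": ["A", "B", "C"],
    "years": [70, 75, 80],
    "group": ["north", "south", "north"],  # hypothetical grouping column
}
source = ColumnDataSource(data=data)

p = figure(
    height=250,
    y_range=data["country"],
    x_axis_label="life expectancy (years)",
)

renderer = p.circle(
    x="years",
    y="country",
    source=source,
    size=8,
    alpha=0.6,  # make the dots semi-transparent
    # map each value of "group" to a color
    color=factor_cmap(
        "group", palette=["#1f77b4", "#ff7f0e"], factors=["north", "south"]
    ),
    legend_field="group",  # one legend entry per distinct "group" value
)
```

Call `show(p)` to display the result. `color` also accepts a single color name if you do not need a per-group mapping.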

## Heatmap

The plot in this sub-section represents Internet adoption over time for selected countries.

The `rect()` glyph is used to create the heatmap.

#### Data preparation

In [None]:
file = "../data/csv_files/Internet_user.csv"
df = pd.read_csv(file, encoding="ISO-8859-1")

countries = (
    "Iceland",
    "Norway",
    "United Kingdom",
    "Japan",
    "Canada",
    "Germany",
    "New Zealand",
    "France",
    "Israel",
    "United States",
    "Argentina",
    "Chile",
    "Italy",
    "Brazil",
    "Mexico",
    "South Africa",
    "China",
    "Algeria",
    "India",
    "Kenya",
)

# create new dataframe with only the selected countries and columns
df = df[df["country"].isin(countries)].reset_index(drop=True).fillna(0)
df = df.drop(["country_code", "indicator", "indicator_code"], axis=1)

# stack dataframe columns
df = pd.DataFrame(df.set_index("country").stack(), columns=["percentage"])
df = df.reset_index().rename(columns={"level_1": "year"}).fillna(0)

# convert "year" column to integer type
df["year"] = df.year.astype(int)

df
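
The stacking step above turns a wide table (one column per year) into a long one (one row per country and year), which is the shape `rect()` needs. Below is a minimal sketch of the same reshaping on a tiny frame. The values and the country labels `A` and `B` are made up for illustration, and naming the series with `rename("percentage")` is simply one alternative way to label the stacked values.

```python
import pandas as pd

# made-up values for illustration only (not the real dataset)
wide = pd.DataFrame(
    {
        "country": ["A", "B"],
        "2014": [10.0, 30.0],
        "2015": [20.0, 40.0],
    }
)

long = (
    wide.set_index("country")
    .stack()  # year columns become an inner index level
    .rename("percentage")  # name the stacked values
    .reset_index()  # index levels become "country" and "level_1"
    .rename(columns={"level_1": "year"})
)

# year labels were column names (strings), so convert them to integers
long["year"] = long["year"].astype(int)

print(long)
```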

#### Plotting

In [None]:
# import relevant libraries
from bokeh.transform import transform
from bokeh.models import ColorBar, LinearColorMapper, FixedTicker

In [None]:
# plot heatmap

# create figure object
p = figure(
    title="Figure 6.15 Internet adoption over time",  # plot title
    height=400,  # plot height
    toolbar_location=None,  # remove toolbars
    y_axis_location="right",  # display y axis on the right of plot
    y_range=countries[::-1],  # categorical range of y-axis in reverse order
)

# create color mapper object
mapper = LinearColorMapper(
    palette="Magma256", low=min(df["percentage"]), high=max(df["percentage"])
)

# create rectangle glyph
p.rect(
    x="year",  # x-axis column name
    y="country",  # y-axis column name
    width=2,  # rectangle width
    height=1,  # rectangle height
    source=df,  # data source for x and y axis columns
    # map percentage values to color mapper object using transform
    fill_color=transform("percentage", mapper),
    line_color="white",  # rectangle line color
)


# plot customization

# configure x-axis ticks to show only specified tick labels
p.xaxis.ticker = [1995, 2000, 2005, 2010, 2015]

# start and end x-axis at the specified years
p.x_range.start = 1993
p.x_range.end = 2016

# remove x-axis major ticks
p.xaxis.major_tick_line_color = None
p.xaxis.major_tick_out = 0

# remove y-axis lines and ticks
p.yaxis.minor_tick_out = 0
p.yaxis.major_tick_out = 0
p.yaxis.major_tick_line_color = None
p.yaxis.axis_line_color = None

# create color bar object
color_bar = ColorBar(
    color_mapper=mapper,
    location=(0, 0),
    ticker=FixedTicker(ticks=[0, 25, 50, 75, 100]),
    title="internet users / 100 people",
    title_text_font_style="normal",
    major_tick_line_color=None,
    width=300,
    height=20,
)

# add color bar above the plot
p.add_layout(color_bar, "above")


show(p)

The `transform()` function is used to color the rectangles through the `fill_color` parameter. It takes a column name and a transform object (here, the `LinearColorMapper`) and applies that transform to the values in the column. For more information about `transform()`, visit our reference section [here](https://docs.bokeh.org/en/latest/docs/reference/transform.html#module-bokeh.transform).
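
To see the color mapping in isolation, here is a minimal sketch of a two-by-two heatmap. The values and the country labels `A` and `B` are made up for illustration, and the fixed `low=0` and `high=100` bounds are an assumption that suits percentage data rather than something taken from the dataset.

```python
from bokeh.models import ColumnDataSource, LinearColorMapper
from bokeh.plotting import figure
from bokeh.transform import transform

# made-up values for illustration only
source = ColumnDataSource(
    data={
        "year": [2014, 2015, 2014, 2015],
        "country": ["A", "A", "B", "B"],
        "percentage": [0, 25, 50, 100],
    }
)

# map values between low and high linearly onto the palette
mapper = LinearColorMapper(palette="Magma256", low=0, high=100)

p = figure(height=200, y_range=["A", "B"], toolbar_location=None)

renderer = p.rect(
    x="year",
    y="country",
    width=1,
    height=1,
    source=source,
    # look up "percentage" in the source and pass each value through the mapper
    fill_color=transform("percentage", mapper),
    line_color="white",
)
```

Call `show(p)` to display the result. Because the same `mapper` object can also be passed to a `ColorBar`, the color bar and the rectangles stay in sync.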

For more information on the `rect()` glyph, check our user guide [here](https://docs.bokeh.org/en/latest/docs/user_guide/topics/categorical.html#heatmaps).