# Pie Chart

A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportions.

In [1]:
from lets_plot import *
from lets_plot.mapping import *
from lets_plot.geo_data import *

The geodata is provided by © OpenStreetMap contributors and is made available here under the Open Database License (ODbL).


In [2]:
LetsPlot.setup_html()

In [3]:
def get_data():
    import pandas as pd
    df = pd.read_csv('https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv')
    df['explode'] = [0.2 if c == 'pickup' else 0.0 for c in df['class']]
    return df

In [4]:
df = get_data()
print(df.shape)
df.head()

(234, 13)


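| | Unnamed: 0 | manufacturer | model | displ | year | cyl | trans | drv | cty | hwy | fl | class | explode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | audi | a4 | 1.8 | 1999 | 4 | auto(l5) | f | 18 | 29 | p | compact | 0.0 |
| 1 | 2 | audi | a4 | 1.8 | 1999 | 4 | manual(m5) | f | 21 | 29 | p | compact | 0.0 |
| 2 | 3 | audi | a4 | 2.0 | 2008 | 4 | manual(m6) | f | 20 | 31 | p | compact | 0.0 |
| 3 | 4 | audi | a4 | 2.0 | 2008 | 4 | auto(av) | f | 21 | 30 | p | compact | 0.0 |
| 4 | 5 | audi | a4 | 2.8 | 1999 | 6 | auto(l5) | f | 16 | 26 | p | compact | 0.0 |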


## Basic Pie Chart

You can use the `identity` statistical transformation to leave the data unchanged, so that precomputed values are plotted as they are.

To do this, let's first count the cars in each class.

In [5]:
class_counts_df = df['class'].value_counts().reset_index()
class_counts_df.columns = ['class', 'count']
class_counts_df

| | class | count |
|---|---|---|
| 0 | suv | 62 |
| 1 | compact | 47 |
| 2 | midsize | 41 |
| 3 | subcompact | 35 |
| 4 | pickup | 33 |
| 5 | minivan | 11 |
| 6 | 2seater | 5 |


In [6]:
ggplot(class_counts_df) + \
    geom_pie(aes(slice='count', fill='class'), stat='identity') + \
    ggsize(600, 400)
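
To make the link between slice values and the drawn sectors concrete, here is a small sketch in plain Python. It is not part of lets-plot and does not show how the library draws anything; it only converts the class counts from the table above into proportions and central angles. The helper name `slice_angles` is hypothetical.

```python
# Class counts copied from the table above.
class_counts = {
    'suv': 62, 'compact': 47, 'midsize': 41, 'subcompact': 35,
    'pickup': 33, 'minivan': 11, '2seater': 5,
}


def slice_angles(counts):
    """Return {name: (proportion, central angle in degrees)} for each slice."""
    total = sum(counts.values())
    return {
        name: (value / total, 360.0 * value / total)
        for name, value in counts.items()
    }


angles = slice_angles(class_counts)
for name, (prop, angle) in angles.items():
    # One line per slice: name, share of the whole, central angle.
    print(f"{name:<11} {prop:6.1%} {angle:6.1f} deg")
```

The proportions sum to 1 and the angles to 360 degrees, which is why only the relative sizes of the `slice` values matter.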

## `count2d` Statistical Transformation

`geom_pie()` uses the `count2d` stat by default.
It makes the slice sizes proportional to the number of cases in each group (or, if the `weight` aesthetic is supplied, to the sum of the weights).

There is no need to count the quantities manually: the stat does it for you.

In [7]:
ggplot(df) + geom_pie(aes(fill='class')) + ggsize(600, 400)

The `count2d` stat provides the following variables:

- `..count..` and `..sum..` - the number of observations in a given group at a given location, and the total number of observations at that location;
- `..prop..` and `..proppct..` - the proportion of observations belonging to a given group relative to the number of observations at a given location (and the same in percent);
- `..sumprop..` and `..sumpct..` - the proportion of observations at a given location relative to the total number of observations (and the same in percent).
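
The meaning of these variables can be sketched in plain Python. The snippet below is only an illustration of the definitions in the list, not the lets-plot implementation; the toy records and the function name `count2d_like` are hypothetical, and "location" stands for the (x, y) position of one pie.

```python
from collections import Counter

# Hypothetical toy records: (location, group).
records = [
    ("f", "compact"), ("f", "compact"), ("f", "midsize"),
    ("4", "suv"), ("4", "suv"), ("4", "suv"), ("4", "pickup"), ("4", "compact"),
]


def count2d_like(records):
    """Compute count/sum/prop/sumprop-style values for each (location, group)."""
    per_group = Counter(records)                        # observations per group at a location
    per_location = Counter(loc for loc, _ in records)   # observations per location
    total = len(records)                                # all observations
    rows = []
    for (loc, group), count in sorted(per_group.items()):
        loc_sum = per_location[loc]
        rows.append({
            "location": loc,
            "group": group,
            "count": count,                       # like ..count..
            "sum": loc_sum,                       # like ..sum..
            "prop": count / loc_sum,              # like ..prop..
            "proppct": 100 * count / loc_sum,     # like ..proppct..
            "sumprop": loc_sum / total,           # like ..sumprop..
            "sumpct": 100 * loc_sum / total,      # like ..sumpct..
        })
    return rows


stats = count2d_like(records)
for row in stats:
    print(row)
```

When all observations share a single location, as in a plot with one pie, `sum` equals the dataset size and `sumprop` is 1.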

Let's use `layer_tooltips()` to build tooltips that show the variables provided by `count2d`.

In [8]:
tooltip_content = layer_tooltips().line('count|@{..count..} (@{..proppct..})')\
                                  .line('total|@{..sum..}')

In [9]:
ggplot(df) + \
    geom_pie(aes(fill='class'), tooltips=tooltip_content) + \
    ggsize(600, 400)

Let's order the sectors by count. The pie chart uses the following ordering rule: the first slice goes to the left of 12 o'clock and the others follow clockwise.

In [10]:
ggplot(df) + \
    geom_pie(aes(fill=as_discrete('class', order_by='..count..')),
             tooltips=tooltip_content) + \
    ggsize(600, 400)

Let's use the `weight` aesthetic to compute a weighted sum instead of a simple count.

In [11]:
ggplot(df) + \
    geom_pie(aes(fill='class', weight='displ'),
             tooltips=tooltip_content.format('..sum..', '.1f')) + \
    ggsize(600, 400)
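
As a rough illustration of what a weighted sum means here, the sketch below adds up a weight per group instead of counting rows. It is plain Python with hypothetical toy data (not taken from `mpg.csv`) and a hypothetical helper `weighted_totals`; it mirrors the idea, not the library's internals.

```python
from collections import defaultdict

# Hypothetical toy rows: (class, displ).
rows = [
    ('compact', 1.8), ('compact', 2.0),
    ('suv', 5.3), ('suv', 4.7),
    ('pickup', 4.0),
]


def weighted_totals(rows):
    """Sum the weight of every row within its group."""
    totals = defaultdict(float)
    for group, weight in rows:
        totals[group] += weight
    return dict(totals)


totals = weighted_totals(rows)
grand_total = sum(totals.values())
# Each group's share of the pie is its weighted total over the grand total.
shares = {group: value / grand_total for group, value in totals.items()}
print(totals)
print(shares)
```

With a simple count, `compact` and `suv` would get equal slices (two rows each); with weights, the `suv` slice is larger because its rows carry larger `displ` values.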

Let's compare the class proportions across combinations of drive type (`drv`) and `year` by drawing one pie per combination.

In [12]:
ggplot(df, aes('drv', as_discrete('year'))) + \
    geom_pie(aes(fill='class'),
             tooltips=layer_tooltips()\
                      .line('class size|@{..count..} (@{..proppct..})')\
                      .format('@..count..', 'd')\
                      .line('total size|@{..sum..} (@{..sumpct..})')) + \
    scale_y_discrete(format="d") + \
    scale_size(guide='none') + \
    ggsize(600, 400)

## Improve Appearance

We can improve the appearance of the pie chart with additional parameters and a theme:
- make the pie bigger (`size`)
- add a hole to draw a donut-like chart (`hole`)
- use a blank theme (it removes the axes and the grid)

In [13]:
ggplot(df) + \
    geom_pie(aes(fill='class'),
             size=20, hole=.3, tooltips=tooltip_content) + \
    ggsize(600, 400) + \
    theme_void()

## Add Labels to Pie Sectors

Let's label the sectors with their names. You can configure the annotations via the `layer_labels()` function.

In [14]:
ggplot(df) + \
    geom_pie(aes(fill=as_discrete('class', order_by='..count..')),
             size=20, hole=.3,
             labels=layer_labels().line('@class').size(14)) + \
    ggsize(600, 400) + \
    theme_void() + \
    theme(legend_position='none')

## Pie Size Depending on Data

To make the size of the pie chart depend on the data, you can map the total count (the `..sum..` variable) to the `size` aesthetic.

Note that the pie size has its own special representation in the legend.

In [15]:
ggplot(df, aes('drv', as_discrete('year'))) + \
    geom_pie(aes(fill='class', size='..sum..')) + \
    scale_y_discrete(format="d") + \
    guides(fill='none') + \
    ggsize(600, 400)

## Parameter `size_unit`

You can use the `size_unit` parameter to relate the size of the pie to the length of the unit step along one of the axes.

Let's tie the pie diameter to the unit step along the x-axis: with `size_unit='x'`, `size=8` is interpreted in x-axis units rather than in the default size units.

In [16]:
def get_plot(size_unit=None):
    size = 8
    return ggplot(df) + \
        geom_pie(aes(fill='class', weight='displ'),
                 size=size, size_unit=size_unit) + \
        coord_fixed(xlim=[-size/2, size/2], ylim=[-size/2, size/2]) + \
        ggtitle("size={0}, size_unit={1}".format(size, size_unit))

gggrid([get_plot(), get_plot('x')]) + ggsize(800, 300)

## Explode

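You can map the `explode` aesthetic to values between 0 and 1 to move slices away from the center point, detaching them from the main pie. Here the `explode` column created in `get_data()` is 0.2 for the `pickup` class and 0.0 for all other classes.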

In [17]:
ggplot(df) + \
    geom_pie(aes(fill='class', explode='explode'),
             color='black', size=20) + \
    scale_fill_gradient(low='dark_blue', high='light_green') + \
    ggsize(600, 400) + \
    theme_void()
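
The geometry behind an exploded slice can be sketched as follows. This is an assumption-laden illustration, not lets-plot code: it assumes the slice is shifted along its bisector by `explode * radius` (the convention used by some other plotting libraries; lets-plot may compute the offset differently), and the function name `explode_offset` is hypothetical. Angles are measured clockwise from 12 o'clock.

```python
import math


def explode_offset(start_deg, end_deg, explode, radius):
    """Return the (dx, dy) shift of a slice under the stated assumption.

    Assumption: the slice moves along its bisector by explode * radius.
    Angles are in degrees, measured clockwise from 12 o'clock.
    """
    mid = math.radians((start_deg + end_deg) / 2)
    distance = explode * radius
    # Clockwise from 12 o'clock: x follows sin, y follows cos.
    return distance * math.sin(mid), distance * math.cos(mid)


# A slice covering the right half of the pie, exploded by 0.2 of a radius of 100.
dx, dy = explode_offset(0, 180, explode=0.2, radius=100)
print(dx, dy)
```

Under this assumption, an `explode` value of 0 leaves the slice in place, and larger values push it further out in the direction it already points.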

## Pie Chart Stroke and Spacers

The `stroke` and `color` aesthetics set the **line width** and the **line color** of the pie sector arcs, respectively.

The `stroke_side` parameter ('inner', 'outer' or 'both', the default) specifies where to show the arc.

By default `stroke` is 0, so no arc is shown regardless of the value of the `stroke_side` parameter.

The `spacer_width` and `spacer_color` parameters define the lines between sectors. The default is a narrow segment of the same color as the plot background. Spacers are not applied to exploded sectors or to the sides of the sectors adjacent to them.

### `stroke` and `color`

In [18]:
ggplot(df) + \
    geom_pie(aes(fill='class', color='class'),
             size=20, stroke=7, alpha=.3) + \
    ggsize(600, 400) + \
    theme_void()

### `stroke_side`

Note: `stroke=7` was added to make the arcs visible.

In [19]:
p1 = ggplot(df, aes(fill='class', color='class')) + theme_void()

gggrid([
    p1 + geom_pie(hole=0.3, stroke=7, alpha=.3, show_legend=False) + ggtitle('Default'),
    p1 + geom_pie(hole=0.3, stroke=7, alpha=.3, stroke_side='inner', show_legend=False) + ggtitle('Inner stroke'),
    p1 + geom_pie(hole=0.3, stroke=7, alpha=.3, stroke_side='outer', show_legend=False) + ggtitle('Outer stroke'),
    p1 + geom_pie(hole=0.3, stroke=7, alpha=.3, stroke_side='both', show_legend=False) + ggtitle('Inner & outer stroke')
]) + ggsize(1200, 200)

### `spacer_width` and `spacer_color`

A spacer is a thin line separating the pie's slices. You can adjust the width and color of the spacers.

In [20]:
ggplot(df) + \
    geom_pie(aes(fill='class'), 
             size=20, hole=.3, stroke=0,
             spacer_width=4, spacer_color='light-gray') + \
    ggsize(600, 400) + \
    theme_void()

### Spacers with Exploded Sectors

Spacers are not shown for exploded sectors.

In [21]:
ggplot(df) + \
    geom_pie(aes(fill='class', explode='explode'), 
             size=20, hole=.3,
             stroke=2, color='black',
             spacer_width=4, spacer_color='light-gray') + \
    ggsize(600, 400) + \
    theme_void()

## Pie Chart on Map

In [22]:
data = {
    'city': ['New York', 'New York', 'Philadelphia', 'Philadelphia'],
    'est_pop_2020': [4_381_593, 3_997_959, 832_685, 748_846],
    'sex': ['female', 'male', 'female', 'male']
}

centroids = geocode_cities(data['city']).get_centroids()

ggplot() + \
    geom_livemap() + \
    geom_pie(aes(slice='est_pop_2020', fill='sex', size='est_pop_2020'),
             stat='identity', data=data, map=centroids, map_join='city', 
             hole=.2, alpha=.6, color='black', stroke=2,
             spacer_color='black', spacer_width=2,
             tooltips=layer_tooltips().format("@est_pop_2020", ",.3~s")) + \
    scale_size(range=[5, 10], guide='none') + \
    labs(fill='Gender') + \
    ggsize(800, 600)