# Case Study: Donut Chart

In this case study, we will work with a very simple synthetic dataset and generate a reasonable-looking donut chart. Along the way, we will take a closer look at several handy Altair features. Let's get started!

## Data

In [1]:
import pandas as pd
import altair as alt
import numpy as np

# Generate a very simple data frame.
source = pd.DataFrame({"category": [1, 2, 3, 4, 5, 6], "value": [4, 6, 10, 3, 7, 8]})
source.describe()

| | category | value |
|---|---|---|
| count | 6.0 | 6.0 |
| mean | 3.5 | 6.333333 |
| std | 1.870829 | 2.581989 |
| min | 1.0 | 3.0 |
| 25% | 2.25 | 4.5 |
| 50% | 3.5 | 6.5 |
| 75% | 4.75 | 7.75 |
| max | 6.0 | 10.0 |


This is a very simple dataset with 2 columns. The "category" column provides the category index, and the "value" column provides the quantity associated with each category.

## Visualization

Let's first plot this data as a simple scatter plot.

In [2]:
base = alt.Chart(source)
scatter_plot = base.mark_circle()\
                   .encode(x="category:O", y="value:Q")
scatter_plot

In the code above, we use a circle as the mark, "category" as the x coordinate, and "value" as the y coordinate. Notice the suffixes `:O` and `:Q` after the column names. `:O` indicates that the column encodes ordinal (ordered categorical) data, and `:Q` indicates that the column encodes quantitative data. Altair automatically chooses an appropriate scale for each axis based on the data type. Other data types include nominal, temporal, and geojson. You can read more about them [here](https://altair-viz.github.io/user_guide/encoding.html#encoding-data-types).

Now, let's change this chart to a bar chart by replacing `mark_circle` with `mark_bar`.

In [3]:
bar_chart = base.mark_bar()\
                .encode(x="category:O", y="value:Q", color="category:N")
bar_chart

As simple as that! Note that we also colored each bar by encoding the "category" column as color. When setting this encoding, we used the `:N` suffix to indicate that the "category" column should be treated as nominal (unordered) data. This allows Altair to use a discrete color palette. Try changing the suffix to `:O` to see the different color palettes chosen for nominal and ordinal data. Read more about data types and color scales [here](https://altair-viz.github.io/user_guide/encoding.html#effect-of-data-type-on-color-scales).

As a fun experiment, let's try stacking the bars into a single row.

In [4]:
source['name'] = ["category"] * len(source)
bar_chart = alt.Chart(source).mark_bar()
bar_chart = bar_chart.encode(x="value:Q", y="name:N", color="category:N")
bar_chart

In order to stack the bars, we need to change the encoding so that all bars share the same position on the categorical axis. To do this, we add a new column named "name" to our data frame and set its value to the string "category" for all observations. This time, we use the column "value" as the x coordinate and the column "name" as the y coordinate, so that the bars are stacked horizontally. Altair automatically stacks the bars end to end when multiple observations fall into the same category.

Horizontal stacked bars are useful because they use space more effectively than vertically stacked bars. Now, let's get back to the main goal of this case study: generating a donut chart!

In [5]:
base2 = base.encode(theta="value:Q", color="category:N")
donut_chart = base2.mark_arc(innerRadius=100, outerRadius=150)
donut_chart

Here, we replaced `mark_bar` with `mark_arc`. The arc mark is useful for generating pie charts, donut charts, and other visualizations with a radial layout. Note that we set the `innerRadius` property to 100 pixels to create a donut chart. We can also set it to 0 to obtain a pie chart. Feel free to give it a try.

With arcs, we no longer need the `x` and `y` encodings. Instead, we use the `theta` and `radius` encodings. Here, we encode the "value" column as `theta` (the angular extent of each arc) and skip the `radius` encoding, so the arcs take their radii from the mark properties set above (`innerRadius` and `outerRadius`).
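To build some intuition for the `theta` encoding, the sketch below computes by hand the angles that a stacked layout would assign to our six values. This is only a conceptual illustration in plain Python, not Altair's or Vega-Lite's actual implementation; the real arc order and starting direction are decided by the renderer.

```python
import math

values = [4, 6, 10, 3, 7, 8]
total = sum(values)

# Each arc starts where the previous one ended and spans a share of the
# full circle proportional to its value.
arcs = []
start = 0.0
for value in values:
    end = start + 2 * math.pi * value / total
    arcs.append((start, end))
    start = end

for category, (a0, a1) in enumerate(arcs, start=1):
    print(f"category {category}: {math.degrees(a0):6.1f} to {math.degrees(a1):6.1f} degrees")
```

Because the shares add up to the whole, the last arc ends where the first one began, which closes the ring.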

This is nice! Let's make a few minor adjustments to improve the chart.

In [6]:
text_chart = base2.mark_text(radius=100, radiusOffset=25)\
                  .encode(theta=alt.Theta("value:Q", stack=True), 
                          text="category:N", 
                          color=alt.value('white'))
(donut_chart + text_chart).configure_text(fontSize=20).configure_arc(stroke='black', strokeWidth=1.5)

We made two adjustments above: we added a text layer, and we added black strokes around the arcs.

The text layer uses the text mark and has nearly the same encoding as the donut chart layer. Because we are using a different mark now (text instead of arc), Altair no longer stacks the `theta` encoding automatically, so we need to tell it to do so explicitly with `stack=True`. We also manually set the color of our text to white.

We also used "addition" (`donut_chart + text_chart`) to combine the text layer with the donut chart layer. This creates a [layered chart](https://altair-viz.github.io/user_guide/compound_charts.html#layer-chart) with `text_chart` drawn on top of `donut_chart`. You can change the layer order by switching the operands of the addition. Feel free to give it a try!

Lastly, we also adjusted the global configuration. We used `configure_text` to change the font size, and `configure_arc` to change the stroke settings. See [here](https://altair-viz.github.io/user_guide/customization.html) to learn more about these configurations.

## Summary

Great! We have completed this case study. Here is a summary of the key points:
* We have seen four different [marks](https://altair-viz.github.io/user_guide/marks.html) in use: circle, bar, arc, and text.
* We also touched on the concept of [data types](https://altair-viz.github.io/user_guide/encoding.html#encoding-data-types): quantitative, ordinal, nominal, etc.
* We saw a simple example of a [layered chart](https://altair-viz.github.io/user_guide/compound_charts.html#layer-chart).
* We adjusted the chart [configuration](https://altair-viz.github.io/user_guide/customization.html) to make it look better.