# Making a bar graph in Altair

You can see many examples of bar charts [on the Altair examples page](https://altair-viz.github.io/gallery/index.html#bar-charts).


## Basic bar chart

Making a bar graph in Altair is easy.

In [75]:
import pandas as pd
import altair as alt

df = pd.DataFrame([
    { 'species': 'cat', 'num_animals': 8 },
    { 'species': 'dog', 'num_animals': 22 },
    { 'species': 'rabbit', 'num_animals': 2 },
])
df

|   | species | num_animals |
|---|---------|-------------|
| 0 | cat | 8 |
| 1 | dog | 22 |
| 2 | rabbit | 2 |


When we plot, we use `mark_bar` to draw bars.

In [76]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y='species'
)


## Aggregate bar chart (median)

A lot of the time your data isn't set up like you want it to be set up. Say for example we have a bunch of countries with life expectancies. What we want to plot is different, though: we want the median life expectancy per continent.

In [77]:
import pandas as pd
import altair as alt

df = pd.read_csv('countries.csv')
df.head(2)

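|   | country | continent | gdp_per_capita | life_expectancy | population |
|---|---------|-----------|----------------|-----------------|------------|
| 0 | Afghanistan | Asia | 663 | 54.863 | 22856302 |
| 1 | Albania | Europe | 4195 | 74.2 | 3071856 |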


Instead of telling `encode` to plot the life expectancy (the normal actual value), we tell it to plot a **summary statistic**. Which summary statistic? The median!

In [78]:
alt.Chart(df).mark_bar().encode(
    x='median(life_expectancy)',
    y='continent'
)


## Aggregate bar chart (count)

Plotting a bar chart of counts is similar to how we did the median, but we use `count()`, which counts the rows in each group, instead of summarizing a specific column.

In [79]:
df = pd.read_csv('countries.csv')
df.head(2)

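|   | country | continent | gdp_per_capita | life_expectancy | population |
|---|---------|-----------|----------------|-----------------|------------|
| 0 | Afghanistan | Asia | 663 | 54.863 | 22856302 |
| 1 | Albania | Europe | 4195 | 74.2 | 3071856 |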


In [80]:
alt.Chart(df).mark_bar().encode(
    x='count()',
    y='continent'
)


## Vertical column charts

To make a vertical column graph, you just flip the `x` and `y`.

In [81]:
df = pd.DataFrame([
    { 'species': 'cat', 'num_animals': 8 },
    { 'species': 'dog', 'num_animals': 22 },
    { 'species': 'rabbit', 'num_animals': 2 },
])
df

|   | species | num_animals |
|---|---------|-------------|
| 0 | cat | 8 |
| 1 | dog | 22 |
| 2 | rabbit | 2 |


In [82]:
alt.Chart(df).mark_bar().encode(
    x='species',
    y='num_animals'
).properties(width=200, height=150)

## Sorting your bars

By default, Altair sorts bars in alphabetical order. To sort your bars a different way in Altair, you need to start using `alt.X` or `alt.Y` to customize the axis.

In [83]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y='species'
)


In [84]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y=alt.Y('species', sort='x')
)


`sort='x'` means "I want you to sort this axis by the value of `x`."

### Reversing your bar order

Instead of sorting with `x`, you're going to sort with `-x`. The minus sign reverses the order, so the bars are sorted by the value of `x` in descending order.

In [85]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y=alt.Y('species', sort='-x')
)


## Stacked bar graph

To make a stacked bar graph, you use one of your columns to specify `color`, which is the color of each bar segment. Segments that share the same `x` or `y` category get stacked into a single bar.

In [86]:
df = pd.DataFrame([
    { 'species': 'cat', 'num_animals': 8, 'county': 'Kings' },
    { 'species': 'dog', 'num_animals': 22, 'county': 'Kings' },
    { 'species': 'cat', 'num_animals': 3, 'county': 'Queens' },
    { 'species': 'dog', 'num_animals': 2, 'county': 'Queens' },
])
df

|   | species | num_animals | county |
|---|---------|-------------|--------|
| 0 | cat | 8 | Kings |
| 1 | dog | 22 | Kings |
| 2 | cat | 3 | Queens |
| 3 | dog | 2 | Queens |


In [87]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y='county',
    color='species'
)

## Grouped bar graph

To make a grouped bar graph, you'll specify a `row` or `column` to separate each cluster based on a column of your data.

In [89]:
df = pd.DataFrame([
    { 'species': 'cat', 'num_animals': 8, 'county': 'Kings' },
    { 'species': 'dog', 'num_animals': 22, 'county': 'Kings' },
    { 'species': 'cat', 'num_animals': 3, 'county': 'Queens' },
    { 'species': 'dog', 'num_animals': 2, 'county': 'Queens' },
])
df

|   | species | num_animals | county |
|---|---------|-------------|--------|
| 0 | cat | 8 | Kings |
| 1 | dog | 22 | Kings |
| 2 | cat | 3 | Queens |
| 3 | dog | 2 | Queens |


In [90]:
alt.Chart(df).mark_bar().encode(
    x='num_animals',
    y='species',
    color='species',
    row='county'
)

In [93]:
alt.Chart(df).mark_bar().encode(
    x='species',
    y='num_animals',
    color='species',
    column='county'
).properties(width=200, height=150)

## 100% stacked bar graph

For a 100% stacked bar, you'll add `stack="normalize"` to the `x` or `y` that you're trying to make add up to 100%.

In [104]:
alt.Chart(df).mark_bar().encode(
    x=alt.X('num_animals', stack="normalize"),
    y='county',
    color='species',
).properties(width=200, height=150)