# Charts

Python has a number of charting tools that can work hand-in-hand with pandas. While [Altair](https://altair-viz.github.io/) is a relatively new package compared to classics like [matplotlib](https://matplotlib.org/), it has great documentation and is easy to configure. Let's take it for a spin.

## Setup

First, let's prepare our data and import the necessary libraries:

In [None]:
# Setup data for chart examples
import warnings

import pandas as pd

# Silence library warnings so they don't clutter the notebook output
warnings.simplefilter("ignore")

# Load and prepare accident data
accident_list = pd.read_csv("https://raw.githubusercontent.com/palewire/first-python-notebook/main/docs/src/_static/ntsb-accidents.csv")
accident_list["latimes_make_and_model"] = accident_list["latimes_make_and_model"].str.upper()
accident_counts = (
    accident_list.groupby(["latimes_make", "latimes_make_and_model"])
    .size()
    .rename("accidents")
    .reset_index()
)

# Load survey data and merge
survey = pd.read_csv("https://raw.githubusercontent.com/palewire/first-python-notebook/main/docs/src/_static/faa-survey.csv")
survey["latimes_make_and_model"] = survey["latimes_make_and_model"].str.upper()
merged_list = pd.merge(accident_counts, survey, on="latimes_make_and_model")

# Calculate accident rates
merged_list["per_hour"] = merged_list["accidents"] / merged_list["total_hours"]
merged_list["per_100k_hours"] = merged_list["per_hour"] * 100_000

print("Data prepared for charting")
merged_list.head()

In [None]:
import altair as alt
print("Altair imported for data visualization")

```{note}
If the import triggers an error that says your notebook doesn't have Altair, you can install it by running `uv add altair` in the terminal. This will download and install the library using the uv package manager.
```

In a typical analysis, you'd import all of your libraries in one cell at the top of the file. That way, if you need to install or make changes to the packages a notebook uses, you know where to find them and you won't hit errors importing a package midway through running a file.

## Make a basic bar chart

With Altair imported, we can now feed it our DataFrame to make a simple bar chart. Let's take a look at the basic building block of an Altair chart: the `Chart` object. We'll tell it that we want to create a chart from `merged_list` by passing the DataFrame in:

In [None]:
# This will show an error - Altair needs a "mark" to know how to visualize the data
alt.Chart(merged_list)

OK! We got an error, but don't panic. The error says that Altair needs a "mark" — that is to say, it needs to know not only what data we want to visualize, but also _how_ to represent that data visually. There are lots of different marks that Altair can use (you can [check them all out here](https://altair-viz.github.io/user_guide/marks.html)). But let's try out the most versatile mark in our visualization toolbox: the bar.

In [None]:
# This will show another error - Altair needs to know which columns to use
alt.Chart(merged_list).mark_bar()

That's an improvement, but we've got a new error: Altair doesn't know which columns of our DataFrame to look at! At a minimum, we need to say which column goes on the x-axis and which goes on the y-axis. We can do that by chaining in the `encode` method.

In [None]:
# Basic bar chart with accident rates
alt.Chart(merged_list).mark_bar().encode(
    x="latimes_make_and_model",
    y="per_100k_hours"
)

That's more like it!

Here's an idea — maybe we do horizontal bars instead of vertical. How would you rewrite this chart code to reverse those bars?

In [None]:
# Horizontal bar chart
alt.Chart(merged_list).mark_bar().encode(
    x="per_100k_hours",
    y="latimes_make_and_model"
)

This chart is an okay start, but it's sorted alphabetically by y-axis value, which is pretty sloppy and hard to visually parse. Let's fix that.

We want to sort the y-axis values by their corresponding x values. We know how to do that in pandas, but Altair applies its own default ordering to the axis (alphabetical, in this case), so it ignores any sort order on the DataFrame we pass in.

## Sorting charts

To control the order, we need to tell Altair how we want the axis organized by setting `sort` on the `Y` encoding. The value `"-x"` means "sort by the x values, largest first":

In [None]:
# Sorted horizontal bar chart
alt.Chart(merged_list).mark_bar().encode(
    x="per_100k_hours",
    y=alt.Y("latimes_make_and_model").sort("-x")
)

Much better! Now we can easily see which helicopter models have the highest accident rates.

## Adding titles and labels

Let's make this chart more presentation-ready by adding a title and better axis labels:

In [None]:
# Chart with title and labels
alt.Chart(merged_list).mark_bar().encode(
    x=alt.X("per_100k_hours", title="Accidents per 100,000 flight hours"),
    y=alt.Y("latimes_make_and_model", title="Helicopter model").sort("-x")
).properties(
    title="Helicopter accident rates by model",
    width=500,
    height=400
)

Perfect! We now have a professional-looking chart that clearly shows helicopter accident rates by model.

This is just the beginning of what you can do with Altair. The library supports many different chart types, interactive features, and advanced styling options. You can explore more in the [Altair documentation](https://altair-viz.github.io/).