Now that you know how to create your own **line charts**, it's time to learn about more chart types!  In this lesson, you'll learn about **bar charts** and **heatmaps**.

# Set up the notebook

As always, we begin by setting up the coding environment.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
print("Setup Complete")

# Select a dataset

We'll work with a dataset from the US Department of Transportation that tracks 2015 flight delays and cancellations.

Opening the corresponding CSV file in Excel shows that each flight has a different ID number (corresponding to **column A** in the spreadsheet), and the various columns track detailed information about each flight.

<img src="images/tut2_flight_head.png">

In this lesson, we'll work with only three of the columns:
- `'Month'` - the month of the departure date
- `'Airline'` - the airline code (_`AS` stands for Alaska Airlines, `AA` stands for American Airlines, etc_)
- `'Arrival delay'` - how many minutes late the flight arrived (_where negative values denote a flight that arrived early_)

# Load the data

As before, we load the dataset using the `pd.read_csv` command.

In [None]:
# Path of the file to read
flight_filepath = "../input/flights.csv"

# Read the file into a variable flight_data
flight_data = pd.read_csv(flight_filepath, index_col="Id")

You may notice that the code is slightly shorter than what we used before.  Since the row labels (from the `'Id'` column) don't correspond to dates, we don't add `parse_dates=True` inside the parentheses.  We keep the first two arguments as before, to provide both:
- the filepath for the dataset (in this case, `flight_filepath`), and 
- the name of the column that will be used to index the rows (in this case, `index_col="Id"`). 
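
To see what `index_col` does without needing the full dataset, here is a minimal sketch that reads a tiny in-memory CSV.  The three rows and their values are made up for illustration; only the column names follow the lesson.

```python
import io
import pandas as pd

# Hypothetical three-row sample; the values are invented for illustration
csv_text = """Id,Month,Airline,Arrival delay
101,1,AA,12
102,1,AS,-5
103,2,AA,0
"""

# index_col="Id" turns the Id column into the row labels
sample = pd.read_csv(io.StringIO(csv_text), index_col="Id")

print(sample.index.name)            # Id
print(sample.columns.tolist())      # ['Month', 'Airline', 'Arrival delay']
print(sample.loc[102, "Airline"])   # AS
```

Because `'Id'` is now the index, it no longer appears among the regular columns, and rows can be looked up by their ID with `.loc`.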

# Examine the data

We'll use the familiar `head` command to print the first five rows of the data.  Note that due to the large size of the dataset, not all of the columns are shown - but it's still enough information for us to verify that the dataset was correctly loaded.

In [None]:
# Print the first 5 rows of the data
flight_data.head()

# Bar chart, Part 1

We're ready to create our first **bar chart**.  For instance, say we'd like to see the number of flights corresponding to each airline.

As before, this can be accomplished with a single line of code (_where customizing settings like the size and title of the figure involves some additional commands_).

In [None]:
# Set the width and height of the figure
plt.figure(figsize=(10,6))

# Add title
plt.title("Number of Flights, for Each Airline")

# Bar chart showing number of flights for each airline
sns.countplot(y=flight_data['Airline'])

# Remove the default label on the horizontal axis (the title already describes it)
plt.xlabel("")

The commands for setting the size and title of the figure are familiar from the previous lesson.  The code that creates the bar chart is new:
```python
# Bar chart showing number of flights for each airline
sns.countplot(y=flight_data['Airline'])
```
It has two main components:
- `sns.countplot` tells the notebook that we want to create a **_special_** kind of bar plot.  (More on that soon!)
- `y=flight_data['Airline']` selects the data that will be used to create the bar plot (in this case, the `'Airline'` column of `flight_data`).  
  - Using `y=` ensures that the categories will appear along the vertical axis (to yield a **horizontal bar chart**).
  - If you'd prefer to create a **vertical bar chart**, you need only change `y=` to `x=`.

So, why is this a **_special_** bar plot?  To see this, recall that it only uses the `'Airline'` column of `flight_data`.  

<img src="images/tut2_airline_column.png">

Notice that this column doesn't have any numerical values!  And that's why this bar plot is **_special_** -- because the `sns.countplot` command does all of the work of adding up the number of entries for each airline for us.  

And, in the case that our dataset **_did_** explicitly contain a numerical value for the height of each bar, we'd use a different command to create the bar plot.  More on that soon! :)
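
To make the counting step concrete, the sketch below tallies a short, made-up list of airline codes with pandas' `value_counts`.  This is only an illustration of the kind of tally that determines each bar's length, not a description of how seaborn computes it internally.

```python
import pandas as pd

# Hypothetical airline codes; the real column has far more rows
airlines = pd.Series(["AA", "AS", "AA", "NK", "AA", "AS"], name="Airline")

# One count per distinct airline code, sorted from most to least frequent
counts = airlines.value_counts()

print(counts.to_dict())  # {'AA': 3, 'AS': 2, 'NK': 1}
```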

# Load and examine a *slightly* different dataset

For the next charts, we'll work with a different CSV file.  This new file is the result of some quick data munging, where we've taken the relatively "raw", extremely detailed information in `flight_data` and rearranged it into a table with only the information that we need to build the next visualizations.

Opening this new CSV file in Excel shows a row for each month (where `1` = January, `2` = February, etc) and a column for each airline code.

<img src="images/tut2_flight_delay_head.png">

Each entry shows the average arrival delay (in minutes) for a different airline and month.  Negative entries denote flights that (_on average_) tended to arrive early.  For instance, in 2015, the average American Airlines flight (_airline code: **AA**_) in January arrived roughly 7 minutes late, and the average Alaska Airlines flight (_airline code: **AS**_) in April arrived roughly 3 minutes early.
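
The exact munging steps aren't shown in this lesson, but one plausible way to build a month-by-airline table of averages is `pivot_table`.  The sketch below is an assumption about the approach, run on a handful of made-up rows rather than the real data.

```python
import pandas as pd

# Hypothetical raw rows; column names follow the lesson, values are invented
raw = pd.DataFrame({
    "Month": [1, 1, 1, 2, 2],
    "Airline": ["AA", "AA", "AS", "AA", "AS"],
    "Arrival delay": [10, 4, -6, 2, -2],
})

# Rows become months, columns become airline codes, cells hold the mean delay
monthly = raw.pivot_table(index="Month", columns="Airline",
                          values="Arrival delay", aggfunc="mean")

print(monthly)
```

In this toy table, the January entry for `AA` is the mean of 10 and 4, i.e. 7 minutes late, while the negative `AS` entries denote early arrivals.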

Below, we use the `pd.read_csv` command to load the data and the `head` command to check that it loaded properly.

In [None]:
# Path of the file to read
flight_delay_filepath = "../input/flight_delays.csv"

# Read the file into a variable flight_delay_data
flight_delay_data = pd.read_csv(flight_delay_filepath, index_col="Month")

# Print the first five rows of the data
flight_delay_data.head()

# Bar chart, Part 2

In [None]:
# Set the width and height of the figure
plt.figure(figsize=(10,6))

# Add title
plt.title("Average Arrival Delay for Spirit Airlines Flights, by Month")

# Bar chart showing average arrival delay for Spirit Airlines flights by month
sns.barplot(x=flight_delay_data.index, y=flight_delay_data.NK)

# Add label for vertical axis
plt.ylabel("Arrival delay (in minutes)")
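
In the cell above, `x=flight_delay_data.index` supplies the months and `y=flight_delay_data.NK` supplies the Spirit Airlines (`NK`) column.  The dot notation is attribute-style column access, which selects the same data as `flight_delay_data['NK']`.  Here is a small sketch on a made-up stand-in table (the values are hypothetical):

```python
import io
import pandas as pd

# Hypothetical stand-in for flight_delays.csv; the values are invented
csv_text = """Month,AA,NK
1,5.0,10.0
2,6.5,-1.5
3,-2.0,4.0
"""
delays = pd.read_csv(io.StringIO(csv_text), index_col="Month")

# Attribute access and bracket access select the same column
same_column = delays.NK.equals(delays["NK"])
print(same_column)              # True

# The index holds the month numbers; the NK column holds the values to plot
print(delays.index.tolist())    # [1, 2, 3]
print(delays["NK"].tolist())    # [10.0, -1.5, 4.0]
```

Dot notation only works when the column name is a valid Python identifier, so a column such as `'Arrival delay'` would need the bracket form.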

# Heatmap

In [None]:
# Set the width and height of the figure
plt.figure(figsize=(14,7))

# Add title
plt.title("Average Arrival Delay for Each Airline, by Month")

# Heatmap showing average arrival delay for each airline by month
sns.heatmap(data=flight_delay_data, annot=True)

# Add label for horizontal axis
plt.xlabel("Airline")
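
In the heatmap call, `data=flight_delay_data` passes the whole table, so every cell is color-coded by its value, and `annot=True` writes each value inside its cell.  The sketch below checks this on a small made-up table; the off-screen `Agg` backend is an assumption so that it also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical 3x2 table of average delays; the values are invented
delays = pd.DataFrame(
    {"AA": [7.0, 1.5, -2.0], "AS": [-3.0, 0.5, 4.0]},
    index=pd.Index([1, 2, 3], name="Month"),
)

plt.figure(figsize=(6, 4))
ax = sns.heatmap(data=delays, annot=True)

# annot=True adds one text label per cell of the table
print(len(ax.texts))  # 6
```

Leaving out `annot=True` keeps the colors but drops the numbers, which can be easier to read for larger tables.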