**Day 4: Visualize categorical data with a bar chart**

What are we doing today? We're visualizing categorical data with a bar chart, using the Aircraft Wildlife Strikes database.

In [40]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv("../input/database.csv")

That warning looks a little scary, but it just means that some columns contain mixed data types, which Pandas doesn't like. We could rewrite the read_csv call to either silence the warning or omit those columns, but let's ignore it for now and see what our columns actually are.
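
For reference, here is a minimal sketch of those two options. It runs on a tiny made-up CSV held in memory rather than the real file, so the rows and the choice of "Engine Model" as the mixed column are only illustrative assumptions.

```python
import io
import pandas as pd

# Tiny stand-in for database.csv (made-up rows, not real data).
# "Engine Model" mixes numbers and text, like the columns in the warning.
csv_text = "Species Name,Engine Model\nGULL,10\nMOURNING DOVE,A\nGULL,4\n"

# Option 1: tell pandas up front to read the mixed column as text.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"Engine Model": str})

# Option 2: load only the columns we actually plan to use.
only_species = pd.read_csv(io.StringIO(csv_text), usecols=["Species Name"])

print(as_text["Engine Model"].tolist())
print(list(only_species.columns))
```

With the real file, the same `dtype=` or `usecols=` arguments would go into the `read_csv("../input/database.csv")` call.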

In [4]:
print(list(data))

The columns with mixed data types are "Aircraft Model," "Engine Model," "Engine1 Position," and "Engine3 Position." If we were going to work with those columns, I would investigate further, but we're not.
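
If we did want to investigate, one possible first step is to count the Python types stored in a column. This is only a sketch on a hand-built Series; the values are invented, and `mixed` is a hypothetical stand-in for something like `data["Engine Model"]`.

```python
import pandas as pd

# Hypothetical stand-in for a mixed-type column (invented values).
mixed = pd.Series([10, "A", 4], dtype=object)

# Map each value to the name of its Python type, then count the types.
kinds = mixed.map(lambda value: type(value).__name__).value_counts()
print(kinds)
```

More than one type in the result is the kind of mix that the warning is about.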

Instead, let's pull out the column we *are* going to use, "Species Name," and take a look at the values and counts in that column.

In [30]:
species = data["Species Name"]
print(species.value_counts())

A few things are clear: three of the five highest counts belong to some variation of "unknown" (and a fourth, "unknown large bird," appears not much further down), and there are many very specific species with a count of only 1.

For the sake of making a nice and informative bar graph, we're going to chop this up a little. Let's look at only the top five known species. 

In [37]:
top_five = ["MOURNING DOVE", "GULL", "KILLDEER", "AMERICAN KESTREL", "BARN SWALLOW"] #Our top five known species, based on the list above.
top_five_species = species[species.isin(top_five)] #isin() returns a Boolean Series; using it as a mask on the original Series (species) keeps only the matching rows.
print(top_five_species.value_counts()) #The new Series has the same values and counts as the top five known species in the old one.
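
The hard-coded list could also be derived from the counts themselves. The sketch below does that on a small invented Series. It assumes every "unknown" label contains the word "UNKNOWN" in some capitalization, which should be checked against the real value counts before relying on it.

```python
import pandas as pd

# Invented sample standing in for data["Species Name"].
species_sample = pd.Series(
    ["UNKNOWN MEDIUM BIRD"] * 4
    + ["MOURNING DOVE"] * 3
    + ["GULL"] * 2
    + ["KILLDEER"]
)

counts = species_sample.value_counts()  # sorted from most to least common

# Assumption: labels for unidentified birds contain "UNKNOWN" (any case).
known = counts[~counts.index.str.contains("UNKNOWN", case=False)]

# Take the two most common known species (five on the real data).
top_known = known.head(2).index.tolist()
print(top_known)
```

The resulting list can be passed to `isin()` exactly like the hand-written one.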

Let's try a count plot!

In [42]:
species_count = sns.countplot(x=top_five_species) #A countplot() is a type of bar plot specifically for value counts. Passing x= explicitly also works on newer seaborn versions.
plt.title("Top Five Known Species That Impact with Aircraft") #That's kind of a morbid title, isn't it?
plt.xticks(rotation='vertical') #This rotates our x-axis labels so they don't smush together. 
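
By default the bars are not sorted by height. One way to sort them from most to least common is countplot's `order` parameter, sketched here on a small invented Series rather than the real data.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Invented sample standing in for top_five_species.
sample = pd.Series(["GULL", "KILLDEER", "MOURNING DOVE", "GULL",
                    "MOURNING DOVE", "MOURNING DOVE"])

# value_counts() sorts from most to least common, so its index gives the bar order.
order = sample.value_counts().index

ax = sns.countplot(x=sample, order=order)
ax.set_title("Species sorted by count (sample data)")
plt.xticks(rotation='vertical')
```

On the real data, `sample` would be replaced by `top_five_species`.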