# Seaborn Intro
You might need to upgrade Seaborn to 0.9 with `conda update seaborn`.

Seaborn is a plotting library built on top of matplotlib. It simplifies statistical plotting by offering high-level functions that work directly with DataFrames and handle grouping, aggregation, and styling for you.

* Reference the [Seaborn API](http://seaborn.pydata.org/api.html)

In [None]:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

# Main seaborn plotting functions
* countplot - frequency counts of categorical (string) data
* boxplot - summarizing the spread of continuous data and finding outliers
* violinplot - comparing distributions
* barplot - summarizing data with an aggregate (the mean by default)
* lmplot - plotting a scatterplot with a regression line

# Integration with Pandas DataFrames
Seaborn integrates directly with Pandas DataFrames. Pass the DataFrame to the **`data`** parameter, then refer to its columns by name, as strings, in the other parameters (such as `x`, `y`, and `hue`).
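
As a minimal sketch of this pattern, the snippet below builds a small made-up DataFrame and plots from it. The name `toy` and its values are illustrative assumptions, not the employee data loaded later in this notebook; only the column names are borrowed from it.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data; values are invented for illustration only
toy = pd.DataFrame({
    'race': ['White', 'Black', 'White', 'Hispanic', 'Black', 'White'],
    'salary': [52000, 48000, 61000, 45000, 57000, 66000],
})

# The DataFrame goes to `data`; its columns are referenced by name as strings
ax = sns.boxplot(x='race', y='salary', data=toy)

# boxplot returns the matplotlib Axes it drew on, so it can be customized further
ax.set_title('Salary by race (toy data)')
```

Because the axis labels are taken from the column names, descriptive column names in the DataFrame tend to produce readable plots without extra work.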

# All plotting functions look similar
A nice feature of seaborn is that its plotting functions share very similar signatures. You will mostly be using the following parameters:

* **`x`** - variable on x axis
* **`y`** - variable on y axis
* **`data`** - Pandas DataFrame
* **`hue`** - variable to split and color data by
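
To illustrate how similar the calls are, here is a rough sketch that passes the same four parameters to three different functions. The `toy` DataFrame is made up for this example, and the extra `ax` argument (which tells each function which matplotlib Axes to draw on) is not covered elsewhere in this notebook. Note that `lmplot` creates its own figure and does not accept `ax`, so it is left out here.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: two observations for every race/gender combination
toy = pd.DataFrame({
    'race': ['White'] * 4 + ['Black'] * 4,
    'gender': ['Male', 'Male', 'Female', 'Female'] * 2,
    'salary': [50000, 62000, 48000, 58000, 45000, 55000, 47000, 60000],
})

plot_funcs = {
    'boxplot': sns.boxplot,
    'violinplot': sns.violinplot,
    'barplot': sns.barplot,
}

fig, axes = plt.subplots(1, 3, figsize=(14, 4))

# Identical keyword arguments work for each function
for (name, func), ax in zip(plot_funcs.items(), axes):
    func(x='race', y='salary', hue='gender', data=toy, ax=ax)
    ax.set_title(name)
```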

## Univariate Categorical Plots
When first beginning a data analysis, it's usually good to start with simple plots involving one variable (univariate analysis). You can only do a few things with categorical variables - plotting their frequency counts is one of them.
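
The frequency counts that a count plot displays can also be computed directly with pandas. The sketch below uses a made-up `toy` DataFrame (an assumption for illustration) and then reuses the sorted counts to order the bars from most to least frequent via the `order` parameter.

```python
import pandas as pd
import seaborn as sns

# Hypothetical categorical column, invented for illustration
toy = pd.DataFrame({
    'race': ['White', 'Black', 'White', 'Hispanic', 'Black', 'White'],
})

# value_counts returns the frequency of each category, most frequent first
counts = toy['race'].value_counts()
print(counts)

# countplot draws one bar per category; `order` controls the bar order
ax = sns.countplot(x='race', data=toy, order=list(counts.index))
```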

In [None]:
emp = pd.read_csv('data/employee.csv')
emp.head()

In [None]:
sns.countplot(x='race', data=emp)

In [None]:
# horizontal bar plot
sns.countplot(y='race', data=emp)

# Univariate Continuous Variable Plots

In [None]:
sns.boxplot(x='salary', data=emp)

In [None]:
sns.violinplot(y='experience', data=emp)

# Bivariate Plots - Categorical vs Continuous

In [None]:
sns.boxplot(x='race', y='salary', data=emp)

### Aggregate with a barplot

In [None]:
sns.barplot(x='race', y='salary', data=emp)
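
By default, `barplot` draws the mean of `y` for each category, with an error bar showing a bootstrapped confidence interval. The sketch below, using a made-up `toy` DataFrame (an assumption for illustration), computes the same group means with pandas and then shows how the `estimator` parameter can swap in a different aggregate such as the median.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data, invented for illustration
toy = pd.DataFrame({
    'race': ['White', 'White', 'Black', 'Black', 'Black'],
    'salary': [50000, 60000, 40000, 50000, 90000],
})

# The group means that the default barplot displays as bar heights
means = toy.groupby('race')['salary'].mean()
print(means)

fig, (ax_mean, ax_median) = plt.subplots(1, 2, figsize=(10, 4))

# Default aggregate: the mean of each group
sns.barplot(x='race', y='salary', data=toy, ax=ax_mean)
ax_mean.set_title('mean (default)')

# Any function that reduces values to a single number can be the estimator
sns.barplot(x='race', y='salary', data=toy, estimator=np.median, ax=ax_median)
ax_median.set_title('median')
```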

# Bivariate Plots - Continuous vs Continuous

In [None]:
sns.lmplot(x='experience', y='salary', data=emp)

# Multivariate analysis - use `hue` to add extra dimensionality

In [None]:
sns.barplot(x='race', y='salary', hue='gender', data=emp)

In [None]:
sns.violinplot(x='race', y='salary', hue='gender', data=emp)

### Violin plots are most useful when you split the violin to compare two distributions side by side

In [None]:
sns.violinplot(x='race', y='salary', hue='gender', data=emp, split=True)

# Practice using seaborn to do exploratory data analysis with this dataset

In [None]:
# your code here