# Week 7
# Advanced Plotting

Today we will study the `seaborn` library. Seaborn is a powerful visualization library that provides a high-level interface to Matplotlib.

Seaborn’s documentation describes its goal this way:

> If matplotlib “tries to make easy things easy and hard things possible”, seaborn tries to make a well-defined set of hard things easy too.

In [None]:
!pip install seaborn

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.__version__

## 1. Scatter Plot
`sns.scatterplot()`
Parameters:
- x, y: columns for x and y axes.
- data: data frame
- hue: variable that will produce points with different colors
- marker: Matplotlib marker style applied to every point (e.g. "X"); to vary the marker by a variable, use `style` (optionally with `markers`)
- alpha: opacity
- legend: "auto", "brief", "full", or False
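
To make the difference between `marker` and `style` concrete, here is a minimal sketch on synthetic data. The frame `df` and its columns `x`, `y`, and `group` are made up for illustration and are not part of the penguins dataset.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (hypothetical columns, not the penguins dataset)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(size=60),
    "y": rng.normal(size=60),
    "group": np.repeat(["a", "b", "c"], 20),
})

# hue colors the points by group; style gives each group its own marker shape
ax = sns.scatterplot(data=df, x="x", y="y", hue="group", style="group", alpha=0.7)

# The legend should list one entry per group
legend_labels = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_labels)
```

Mapping the same column to both `hue` and `style` keeps the groups distinguishable even when the figure is printed in grayscale.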

In [None]:
# Seaborn comes with several educational datasets
penguins = sns.load_dataset("penguins")
penguins.head()

In [None]:
# sns.scatterplot(data=penguins, x="bill_length_mm", y="bill_depth_mm")

# sns.scatterplot(data=penguins, x="bill_length_mm", y="bill_depth_mm",
#                 hue="species")

# sns.scatterplot(data=penguins, x="bill_length_mm", y="bill_depth_mm",
#                 alpha=0.5)

# sns.scatterplot(data=penguins, x="bill_length_mm", y="bill_depth_mm",
#                 hue="species",
#                 legend=False)

# sns.scatterplot(data=penguins, x="bill_length_mm", y="bill_depth_mm",
#                 hue="species",
#                 marker="X")

## 2. Line Chart
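`sns.lineplot()`
Parameters:
- x, y: columns for x and y axes.
- data: data frame
- hue: variable that will produce lines with different colors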

In [None]:
flights = sns.load_dataset("flights")
flights.head()

In [None]:
may_flights = flights[flights['month'] == 'May']
may_flights.head()

In [None]:
sns.lineplot(data=may_flights, x="year", y="passengers")

In [None]:
# Passing the entire dataset in long-form mode will aggregate over repeated values (each year) 
# to show the mean and 95% confidence interval:
sns.lineplot(data=flights, x="year", y="passengers")

In [None]:
plt.figure(figsize=(6, 6))
sns.lineplot(data=flights, x="year", y="passengers", hue="month")

## 3. Histogram
`sns.histplot()`
Parameters:
- bins: number of bins, or a list of bin edges

In [None]:
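x = np.random.randn(100)
# distplot() is deprecated; histplot() with kde=True draws the histogram plus a density curve
ax = sns.histplot(x, kde=True)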

## 4. Bar Plot
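- `sns.countplot()`: number of observations in each category
- `sns.barplot()`: an estimate (the mean by default) of a numeric variable for each category, with error bars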
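
One way to see the difference between the two functions is to compute the same numbers with pandas. The sketch below uses a small hand-made table (the values are invented, and only the column names mirror the `tips` dataset).

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Small hand-made table (hypothetical values)
df = pd.DataFrame({
    "day": ["Thu", "Thu", "Fri", "Fri", "Sat", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 12.0, 16.0, 30.0, 40.0, 50.0],
})

counts = df["day"].value_counts()               # what countplot draws
means = df.groupby("day")["total_bill"].mean()  # what barplot draws by default

fig, (ax_count, ax_mean) = plt.subplots(1, 2, figsize=(8, 3))
sns.countplot(data=df, x="day", ax=ax_count)
sns.barplot(data=df, x="day", y="total_bill", ax=ax_mean)

# Bar heights read back from the two plots
count_heights = sorted(p.get_height() for p in ax_count.patches if p.get_height() > 0)
mean_heights = sorted(p.get_height() for p in ax_mean.patches if p.get_height() > 0)
```

Here `count_heights` should match the values in `counts`, and `mean_heights` should match the values in `means`.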

In [None]:
titanic = sns.load_dataset("titanic")
titanic.head()

In [None]:
ax = sns.countplot(x="class", data=titanic)

In [None]:
sns.countplot(x="class", hue="who", data=titanic)

In [None]:
tips = sns.load_dataset("tips")
tips.head()

In [None]:
sns.barplot(x="day", y="total_bill", data=tips)

## 5. Box Plot

In [None]:
sns.boxplot(x="day", y="total_bill", data=tips)

Use `swarmplot()` to draw the individual data points on top of the boxes:

In [None]:
sns.boxplot(x="day", y="total_bill", data=tips)
sns.swarmplot(x="day", y="total_bill", data=tips, color=".25")

## 6. Regression Plot
`sns.regplot()`
Plots the data together with a fitted linear regression line.
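
To connect the fitted line to an ordinary least-squares fit, the sketch below compares the slope of the line drawn by `regplot()` with the slope from `np.polyfit`. The data is synthetic (only the column names mirror `tips`), and the code assumes the first line on the Axes is the regression line.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data with a known linear trend plus noise (hypothetical values)
rng = np.random.default_rng(3)
total_bill = rng.uniform(5, 50, size=80)
df = pd.DataFrame({
    "total_bill": total_bill,
    "tip": 1.0 + 0.15 * total_bill + rng.normal(scale=0.8, size=80),
})

# Reference fit: degree-1 polynomial, i.e. a straight line
slope, intercept = np.polyfit(df["total_bill"], df["tip"], deg=1)

fig, ax = plt.subplots()
# ci=None skips the bootstrapped confidence band so only the line is drawn
sns.regplot(data=df, x="total_bill", y="tip", ci=None, ax=ax)

# Assumption: the first Line2D on the Axes is the regression line
line_x, line_y = ax.lines[0].get_data()
line_slope = (line_y[-1] - line_y[0]) / (line_x[-1] - line_x[0])
print(f"polyfit slope: {slope:.3f}, regplot line slope: {line_slope:.3f}")
```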

In [None]:
sns.regplot(x="total_bill", y="tip", data=tips)

In [None]:
sns.scatterplot(x="total_bill", y="tip", data=tips)

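## 7. Categorical Plot
`sns.catplot()`
Draws a categorical plot onto a grid of subplots: `kind` selects the plot type, and `col` creates one subplot per category level.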

In [None]:
exercise = sns.load_dataset("exercise")
exercise.head()

In [None]:
sns.catplot(x="time", y="pulse", hue="kind",
                col="diet", data=exercise,
                height=5, aspect=.8)

In [None]:
sns.catplot(x="class", hue="who", col="survived",
                data=titanic, kind="count",
                height=4, aspect=.7);

In [None]:
sns.catplot(x="alive", col="deck", col_wrap=4,
                data=titanic[titanic.deck.notnull()],
                kind="count", height=2.5, aspect=.8)
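
Unlike the functions in the earlier sections, `catplot()` returns a `FacetGrid` rather than a single Axes. The sketch below, on synthetic data with made-up `deck` and `alive` values, shows that the grid holds one subplot per level of the `col` variable.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: five deck levels, 20 rows each (hypothetical values)
rng = np.random.default_rng(4)
df = pd.DataFrame({
    "deck": np.tile(list("ABCDE"), 20),
    "alive": np.where(rng.random(100) < 0.4, "yes", "no"),
})

# col creates one subplot per deck; col_wrap=3 starts a new row after three subplots
g = sns.catplot(data=df, x="alive", col="deck", col_wrap=3,
                kind="count", height=2.5, aspect=.8)

n_panels = len(g.axes.flat)
print(type(g).__name__, n_panels)
```

Because the result is a grid, figure-level options such as `height` and `aspect` set the size of each subplot rather than of the whole figure.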