# Lesson 2 - seaborn

### References
* Official Website: https://seaborn.pydata.org
* GitHub Dataset: https://github.com/mwaskom/seaborn-data

Seaborn is a library for making **statistical graphics** in Python.

It is built on top of matplotlib and closely integrated with pandas data structures.
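
As a rough illustration of that integration, the sketch below passes a pandas DataFrame straight to a seaborn function and then customises the result with plain matplotlib calls. The `toy` DataFrame and its column names are made up for this example, and the `Agg` backend line is only there so the snippet runs outside a notebook (it can be dropped when `%matplotlib inline` is active).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.axes
import pandas as pd
import seaborn as sns

# Hypothetical toy data, not one of the seaborn example datasets
toy = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.0, 2.9, 4.2, 4.8, 6.1, 7.0],
    "group": ["a", "a", "a", "b", "b", "b"],
})

# Column names are passed as strings; seaborn looks them up in the DataFrame
ax = sns.scatterplot(data=toy, x="x", y="y", hue="group")

# The return value is a regular matplotlib Axes, so matplotlib methods apply
ax.set_title("toy scatter")
ax.set_xlabel("x value")
```

Axes-level functions such as `scatterplot` return a matplotlib `Axes`, while figure-level functions such as `relplot` (used throughout this lesson) return a seaborn `FacetGrid` that wraps one or more axes.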

___

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

### Introductory Example

In [None]:
sns.set()

tips = sns.load_dataset('tips')

sns.relplot(x = 'total_bill', y = 'tip', col = 'time',
            hue = 'smoker', style = 'smoker', size = 'size',
            data = tips)

**List of available datasets**

https://github.com/mwaskom/seaborn-data

In [None]:
sns.get_dataset_names()

**API reference**

https://seaborn.pydata.org/api.html

___

## 1. Relational Plots

### 1-1. Scatter plot

In [None]:
tips = sns.load_dataset("tips")

sns.relplot(x = "total_bill", y = "tip", 
            hue = 'smoker', col = 'time', data = tips)

### 1-2. Line plot

In [None]:
df = pd.DataFrame(dict(time = np.arange(500),
                       value = np.random.randn(500).cumsum()))

sns.relplot(x = "time", y = "value", kind = "line", data = df)

In [None]:
fmri = sns.load_dataset('fmri')
fmri.head()

In [None]:
sns.relplot(x = "timepoint", y = "signal", 
            hue = "event", style = "event",
            kind = "line", data = fmri)

In [None]:
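# ci = None hides the confidence band around the mean line
# (seaborn 0.12 and later deprecate `ci` in favour of errorbar = None)
sns.relplot(x = 'timepoint', y = 'signal', kind = 'line', ci = None, data = fmri)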

In [None]:
# estimator = None turns off aggregation, so every observation is drawn
sns.relplot(x = 'timepoint', y = 'signal', kind = 'line', estimator = None, data = fmri)
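
The three `fmri` calls above differ in how repeated measurements at the same `timepoint` are handled. By default a seaborn line plot draws the mean of `y` at each `x` value together with a confidence band; `ci = None` keeps the mean but drops the band, and `estimator = None` skips the aggregation. The sketch below reproduces only the aggregation step with pandas, on made-up data (`long_df` and its columns are hypothetical stand-ins, not the real `fmri` dataset).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical long-form data: 5 subjects, each measured at 4 timepoints
long_df = pd.DataFrame({
    "subject": np.repeat(np.arange(5), 4),
    "timepoint": np.tile(np.arange(4), 5),
})
long_df["signal"] = (long_df["timepoint"] * 0.5
                     + rng.normal(scale=0.1, size=len(long_df)))

# One mean per timepoint: this corresponds to the central line in the default plot
mean_line = long_df.groupby("timepoint")["signal"].mean()

# Without aggregation every row is kept, so all 20 observations would be drawn
n_raw_points = len(long_df)
```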

___

## 2. Categorical Plots

In [None]:
sns.set(style = 'ticks')

In [None]:
tips = sns.load_dataset('tips')
tips.head()

**Strip plot**

In [None]:
sns.catplot(x = 'day', y = 'total_bill', data = tips)

**Swarm plot**

In [None]:
sns.catplot(x = 'day', y = 'total_bill', hue = 'sex', kind = 'swarm', data = tips)

In [None]:
sns.catplot(x = 'total_bill', y = 'day', hue = 'time', kind = 'swarm', data = tips)

**Box plot**

In [None]:
sns.catplot(x = 'day', y = 'total_bill', hue = 'smoker', kind = 'box', data = tips)

In [None]:
diamonds = sns.load_dataset('diamonds')
diamonds.head()

In [None]:
sns.catplot(x = 'color', y = 'price', kind = 'boxen', 
            data = diamonds.sort_values('color'))

**Violin plot**

In [None]:
sns.catplot(x = 'day', y = 'total_bill', hue = 'sex', 
            kind = 'violin', inner = 'stick', split = True,
            palette = 'pastel', data = tips)

In [None]:
# catplot returns a FacetGrid; keep it so the swarm can be drawn on the same axes
g = sns.catplot(x = "day", y = "total_bill", kind = "violin", inner = None, data = tips)
sns.swarmplot(x = "day", y = "total_bill", color = "k", size = 3, data = tips, ax = g.ax)

**Bar plot**

In [None]:
titanic = sns.load_dataset("titanic")
sns.catplot(x="sex", y="survived", hue="class", kind="bar", data=titanic)

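Error bars: by default the bar height is the mean of `y` in each category, and the error bars show a 95% bootstrap *confidence interval* around that estimate.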
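
To make the error bars less of a black box, here is a minimal NumPy sketch of a percentile bootstrap interval for a mean, which is the general idea behind the default bars. The `outcomes` array is invented for illustration, and seaborn's internal implementation may differ in its details.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical 0/1 outcomes for one group (1 = survived), not the real titanic data
outcomes = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1])

estimate = outcomes.mean()  # this would be the bar height

# Resample with replacement and recompute the mean each time
n_boot = 1000
boot_means = np.array([
    rng.choice(outcomes, size=outcomes.size, replace=True).mean()
    for _ in range(n_boot)
])

# Percentile interval covering the central 95% of the bootstrap means
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
```

A small group gives a wide interval, which is why bars for sparsely populated categories tend to carry long error bars.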

**Count plot**

In [None]:
sns.catplot(x = "deck", hue = "class", kind = "count",
            palette = "pastel", edgecolor = ".6", data = titanic)

___

## 3. Distribution Plots

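### 3-1. `.displot()` - univariate distributions

In [None]:
x = np.random.normal(size = 100)
# distplot() is deprecated since seaborn 0.11; displot() is its figure-level replacement
# (histogram with 20 bins, KDE curve overlaid, rug marks along the x axis)
sns.displot(x = x, bins = 20, kde = True, rug = True)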

### 3-2. `.jointplot()` - bivariate distributions

In [None]:
sns.jointplot(x = 'total_bill', y = 'tip', kind = 'reg', data = tips)

### 3-3. `.pairplot()` - pairwise relationships

In [None]:
iris = sns.load_dataset("iris")
sns.pairplot(iris)

In [None]:
sns.pairplot(tips, hue = 'sex')

___

## 4. Regression Plots

### `.lmplot()`

In [None]:
sns.lmplot(x="total_bill", y="tip", hue="smoker", col="time", data=tips)