## Visualization with Seaborn

**This section highlights why tidy data is useful for exploratory analysis:**

- Visualizing individuals, distributions or aggregations of numerical measures
- Splitting by categorical variables
    - separating subsets spatially along an axis,
    - distinguishing by color,
    - or making separate plots in columns or rows
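The three ways of splitting listed above can be sketched on a small tidy table. The `toy` frame, its column names and its values are invented for illustration only; the plotting calls are the same kind used on the tips data later in this section.

```python
import pandas as pd
import seaborn as sns

# Toy tidy table (made-up values): one row per observation,
# one numerical measure and two categorical variables.
toy = pd.DataFrame({
    "value": [1.0, 2.0, 1.5, 3.0, 2.5, 4.0, 3.5, 5.0],
    "group": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "condition": ["x", "x", "x", "x", "y", "y", "y", "y"],
})

# Subsets separated along the x axis (group) and distinguished by color (condition).
ax = sns.stripplot(x="group", y="value", hue="condition", data=toy)

# Subsets drawn as separate plots, one column per level of condition.
g = sns.catplot(x="group", y="value", col="condition", data=toy, kind="strip")
```

Because every categorical variable is an ordinary column, switching between the three encodings only means moving a column name between `x`, `hue` and `col`.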

Documentation: [https://seaborn.pydata.org/api.html](https://seaborn.pydata.org/api.html)

Tips data set: [https://github.com/mwaskom/seaborn-data/blob/master/tips.csv](https://github.com/mwaskom/seaborn-data/blob/master/tips.csv)

*Bryant, P. G. and Smith, M. (1995) Practical Data Analysis: Case Studies in Business Statistics. Homewood, IL: Richard D. Irwin Publishing*

In [None]:
import seaborn as sns
sns.set_style("whitegrid")

tips = sns.load_dataset("tips")

In [None]:
tips.head(10)

This data set is well suited to exploring how numerical values and their distributions differ across a population described by several categorical variables (`sex`, `smoker`, `day`, `time`).
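As a rough illustration of that split between numerical and categorical columns, the sketch below uses a handful of invented rows in the same column layout as `tips` (the values are not real data) and computes one aggregate per category level with plain pandas.

```python
import pandas as pd

# Invented rows in the same column layout as `tips` (not real data).
sample = pd.DataFrame({
    "total_bill": [20.0, 10.0, 30.0, 40.0],
    "tip": [3.0, 2.0, 5.0, 6.0],
    "sex": ["Female", "Male", "Male", "Female"],
    "smoker": ["No", "No", "Yes", "Yes"],
    "day": ["Sun", "Sun", "Sat", "Sat"],
    "time": ["Dinner", "Dinner", "Dinner", "Dinner"],
    "size": [2, 3, 2, 4],
})

# Columns holding numerical measures vs. columns that label subsets.
numerical = sample.select_dtypes("number").columns.tolist()
categorical = sample.select_dtypes(exclude="number").columns.tolist()

# Mean of one numerical measure per level of a categorical variable.
mean_bill_by_day = sample.groupby("day")["total_bill"].mean()
```

The same `groupby` call should work on the real `tips` frame, where the label columns are stored with pandas' `category` dtype rather than as plain strings.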

### Individual variables & distributions

#### Histogram

The most basic form of exploration is to visualize the distribution of values in a numerical column.

In [None]:
# Histogram of total_bill with a kernel density estimate overlaid
ax = sns.histplot(data=tips, x="total_bill", kde=True)

In [None]:
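# Histogram with 30 bins, plus a rug of tick marks for the individual values
ax = sns.histplot(data=tips, x="total_bill", bins=30)
sns.rugplot(data=tips, x="total_bill", ax=ax)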

#### Swarm plot 

A swarm plot is an alternative to a histogram in which each individual value is drawn as a separate point.

In [None]:
ax = sns.swarmplot(y="total_bill", data=tips)

In [None]:
ax = sns.swarmplot(x="day", y="total_bill", data=tips)

In [None]:
ax = sns.violinplot(x="day", y="total_bill", data=tips)

In [None]:
ax = sns.boxplot(x="day", y="total_bill", data=tips)

In [None]:
ax = sns.swarmplot(y="tip", data=tips)

In [None]:
ax = sns.swarmplot(x="sex", y="tip", hue="sex", palette="Set2", data=tips)

In [None]:
g = sns.catplot(x="sex", y="total_bill",
                hue="smoker", col="time",
                data=tips, kind="swarm")

In [None]:
g = sns.catplot(x="sex", y="total_bill",
                hue="smoker", col="time",
                data=tips, kind="bar")

In [None]:
ax = sns.pointplot(x="time", y="total_bill", hue="smoker",
                    data=tips, dodge=True)

In [None]:
g = sns.jointplot(x="total_bill", y="tip", data=tips)

In [None]:
g = sns.jointplot(x="total_bill", y="tip", data=tips, kind="hex")

In [None]:
ax = sns.regplot(x="total_bill", y="tip", data=tips)

In [None]:
g = sns.lmplot(x="total_bill", y="tip", hue="smoker", data=tips)

In [None]:
g = sns.lmplot(x="total_bill", y="tip", col="day", hue="day",
                data=tips, col_wrap=2, height=4)

In [None]:
ax = sns.regplot(x="size", y="total_bill", data=tips, x_jitter=.1)