# Visualizations - Seaborn
Seaborn is a widely used visualization library (http://seaborn.pydata.org/index.html). It makes it fast and easy to create attractive charts. It also has excellent documentation full of interesting and inspiring examples. The examples in this notebook are taken from its gallery (http://seaborn.pydata.org/examples/index.html).

The %matplotlib inline command tells the notebook to display charts directly below the cell that produces them.

In [None]:
%matplotlib inline
import numpy as np
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd

## Histograms
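The simplest histograms come from distplot. First, draw numbers from the normal distribution, then change the default figure size. Seaborn is built on matplotlib, the basic Python plotting library, so matplotlib settings apply to it as well. Note that distplot has been deprecated since seaborn 0.11 in favor of histplot, kdeplot, and displot; in versions where it is still available it emits a deprecation warning.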

In [None]:
rs = np.random.RandomState(10)
d = rs.normal(size=200)


matplotlib.rcParams['figure.figsize'] = (5.0, 2.5)

sns.distplot(d, kde=False)
plt.show()

sns.distplot(d, hist=False)
plt.show()

sns.distplot(d)
plt.show()

The default colors are readable and pleasant to look at. Several built-in styles are available: if you do not like the dark-grey scheme, you can switch to a light one. In the chart below, the area under the density curve is also filled. despine removes the chart borders (spines), which the "white" style draws by default.

In [None]:
sns.set(style="white", palette="pastel", color_codes=True)
plt.figure(figsize=(5.0, 2.5))
sns.distplot(d, hist=True, color="b", kde_kws={"shade": True})
sns.despine(left=True, bottom=True)
plt.show()

## Charts of two variables
Seaborn makes it easy to chart two variables together. Apart from the usual scatterplot (optionally extended with marginal histograms), you can easily generate a hexbin plot or a KDE (kernel density estimate) plot.

In [None]:
# Recent seaborn versions use height instead of size

data = np.random.multivariate_normal([1,4],[[.5,.3], [.3,.8]],1000).T
sns.set(style="white", palette="muted", color_codes=True)
sns.jointplot(x=data[0], y=data[1], height=4)

sns.jointplot(x=data[0], y=data[1], kind="hex", height=4)
sns.jointplot(x=data[0], y=data[1], kind="kde", height=4)
plt.show()

## Many series on one chart
The example below shows how to draw several series on one chart.

In [None]:
# Load the iris dataset and select two species
iris = sns.load_dataset("iris")
setosa = iris.query("species == 'setosa'")
virginica = iris.query("species == 'virginica'")

sns.set(style="darkgrid")
# Unpack the figure and axes objects; ax is the axes both series are drawn on.
f, ax = plt.subplots(1, 1, figsize=(6, 4))
# Use the same scale on the x and y axes
ax.set_aspect("equal")

# Draw both densities on the same axes. fill replaces the deprecated shade,
# and thresh=0.05 (replacing shade_lowest=False) leaves the lowest level unfilled.
sns.kdeplot(x=setosa.sepal_width, y=setosa.sepal_length,
            cmap="Reds", fill=True, thresh=0.05, ax=ax)
sns.kdeplot(x=virginica.sepal_width, y=virginica.sepal_length,
            cmap="Blues", fill=True, thresh=0.05, ax=ax)

# Add text in chosen places.
red = sns.color_palette("Reds")[-2]
blue = sns.color_palette("Blues")[-2]
ax.text(2.5, 8.2, "virginica", size=12, color=blue)
ax.text(3.8, 4.5, "setosa", size=12, color=red)
plt.show()

## Violin charts and boxplot
Violin charts have become quite popular. Below are two Seaborn examples. They are essentially classic boxplots extended with a kernel density estimate (KDE). The first example shows how to draw multiple distributions in an attractive way. Note that every violin in it is symmetric, so half of each shape carries no extra information. The second example uses the split option, which draws each half of a violin from a different subgroup.

In [None]:
sns.set(style="whitegrid")

# Load the example dataset of brain network correlations
df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)

# Pull out a specific subset of networks
used_networks = [1, 3, 4, 5, 6, 7, 8, 11, 12, 13, 16, 17]
used_columns = (df.columns.get_level_values("network")
                          .astype(int)
                          .isin(used_networks))
df = df.loc[:, used_columns]

# Compute the correlation matrix and average over networks
corr_df = df.corr().groupby(level="network").mean()
corr_df.index = corr_df.index.astype(int)
corr_df = corr_df.sort_index().T

# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(10, 5))

# Draw a violinplot with a narrower bandwidth than the default
sns.violinplot(data=corr_df, palette="viridis", bw=.2, cut=1, linewidth=1, ax=ax)

# Finalize the figure
ax.set(ylim=(-.7, 1.05))
sns.despine(left=True, bottom=True)

In [None]:
sns.set(style="whitegrid", palette="pastel", color_codes=True)

# Load the example tips dataset
tips = sns.load_dataset("tips")
plt.figure(figsize=(8.0, 4))
# Draw a nested violinplot and split the violins for easier comparison
sns.violinplot(x="day", y="total_bill", hue="sex", data=tips, split=True,
               inner="quart", palette={"Male": "b", "Female": "y"})
sns.despine(left=True)
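
The outline of each violin is a kernel density estimate mirrored around its axis. The sketch below computes a Gaussian KDE by hand with NumPy to show the idea. It is not seaborn's implementation; the function name `gaussian_kde_1d` and the bandwidth value are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centered on each sample."""
    samples = np.asarray(samples, dtype=float)
    # Standardized distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

rs = np.random.RandomState(10)
samples = rs.normal(size=200)
grid = np.linspace(-6, 6, 601)
density = gaussian_kde_1d(samples, grid, bandwidth=0.4)

# A density should integrate to roughly 1 over a wide enough grid
area = density.sum() * (grid[1] - grid[0])
```

A smaller bandwidth makes the curve follow the data more closely, which is the kind of effect the bandwidth setting in the first violin example controls.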

Of course, the classic boxplot is also available.

In [None]:
sns.set(style="ticks", palette="muted", color_codes=True)

# Load the example planets dataset
planets = sns.load_dataset("planets")

plt.figure(figsize=(10.0, 6))
# Plot the distance with horizontal boxes
ax = sns.boxplot(x="distance", y="method", data=planets,
                 whis=np.inf, color="c")

# Add in points to show each observation
sns.stripplot(x="distance", y="method", data=planets,
              jitter=True, size=3, color=".3", linewidth=0)


# Make the quantitative axis logarithmic
ax.set_xscale("log")
sns.despine(trim=True)
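
For reference, the numbers behind a box can be computed directly. This is a small NumPy sketch on made-up values, not seaborn's internal code: the box spans the first to the third quartile, the line inside marks the median, and with `whis=np.inf` the whiskers are meant to reach the minimum and maximum of the data.

```python
import numpy as np

# Made-up values spread over several orders of magnitude
values = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0])

# Box edges and the median line
q1, median, q3 = np.percentile(values, [25, 50, 75])
# Whisker ends when the whiskers cover the full range of the data
low, high = values.min(), values.max()
```

On data like these, a logarithmic axis keeps the box readable, which is why the example above calls `ax.set_xscale("log")`.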

## Additional settings
Seaborn is well suited to creating charts quickly, and its defaults are usually good enough. However, you can adjust a chart to your particular needs; see:
* http://seaborn.pydata.org/tutorial/aesthetics.html
* http://seaborn.pydata.org/tutorial/color_palettes.html
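
As a brief sketch of the kind of adjustments those pages describe: `sns.color_palette` returns a list of RGB tuples, and `sns.set_context` scales fonts and line widths for a target medium. The palette name and the contexts below are arbitrary choices.

```python
import matplotlib as mpl
import seaborn as sns

# A palette behaves like a list of (r, g, b) tuples with values between 0 and 1
palette = sns.color_palette("pastel", 4)

# "talk" enlarges fonts and lines compared with the default "notebook" context
sns.set_context("notebook")
notebook_font = mpl.rcParams["font.size"]
sns.set_context("talk")
talk_font = mpl.rcParams["font.size"]

# Restore the default context for later cells
sns.set_context("notebook")
```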