### WHAT IS SEABORN?

_Seaborn is built on top of Matplotlib, Python’s core visualization library. It is meant to complement Matplotlib, not replace it, and it adds several high-level features. Among other things, Seaborn provides:_

1. _Built-in themes for styling Matplotlib graphics_
2. _Functions for visualizing univariate and bivariate data_
3. _Tools for fitting and visualizing linear regression models_
4. _Functions for plotting statistical time series data_
5. _Good integration with NumPy and pandas data structures_


`!pip install seaborn`

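_Seaborn gives access to several example datasets. They are not bundled with the package: load_dataset() downloads the requested dataset from Seaborn's online data repository the first time it is called (so an internet connection is needed) and caches it locally._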


In [0]:
# Importing Seaborn for plotting and styling
import seaborn as sns

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [0]:
sns.get_dataset_names()

In [0]:
df = sns.load_dataset('flights')
df.head()

In [0]:
tips = sns.load_dataset("tips")
tips.head()

_The interface for manipulating the styles is **set_style()**. Using this function you can set the theme of the plot._



In [0]:
sns.set_style("whitegrid")

In [0]:
# axes_style() returns the current style parameters; any of them can be
# overridden by passing a dict as the second argument (rc) of set_style()
sns.axes_style()
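
_As a minimal sketch of how a preset and a custom override combine, the snippet below starts from "whitegrid" and changes one parameter. The choice of `grid.linestyle` is only an illustration, and the Agg backend is selected just so the sketch runs without a display._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import seaborn as sns

# Start from a preset and override a single parameter through the rc dict.
sns.set_style("whitegrid", {"grid.linestyle": "--"})

# With no arguments, axes_style() reports the style parameters now in effect.
current = sns.axes_style()
print(current["axes.grid"])       # the grid is switched on by "whitegrid"
print(current["grid.linestyle"])  # the override applied above

# Go back to the plain preset so later plots are unaffected.
sns.set_style("whitegrid")
```

_Only keys that appear in the dictionary returned by `axes_style()` are treated as style parameters here._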

#### SEABORN PLOTTING FUNCTIONS

##### VISUALIZING STATISTICAL RELATIONSHIPS

_The process of understanding relationships between variables of a dataset and how these relationships, in turn, depend on other variables is known as statistical analysis._

###### relplot()

_This is a figure-level function that draws its plot with one of two axes-level functions for visualizing statistical relationships:_

```
1. scatterplot() (with kind="scatter"; the default)
2. lineplot()  (with kind="line")
```

_The function is selected with the ‘kind’ parameter of relplot(). If this parameter is not given, relplot() uses the default, scatterplot()._

In [0]:
sns.relplot(x="passengers", y="month", data=df);

In [0]:
sns.relplot(x="passengers", y="month",data = df, hue="year")

In [0]:
sns.relplot(x="total_bill", y="tip",hue = 'sex', style ='sex', data=tips, kind = 'scatter')

In [0]:
sns.relplot(x="total_bill", y="tip", data=tips, hue = 'size',size="size", sizes=(15, 200))

In [0]:
sns.relplot(x="size", y="tip", kind="line", data=tips) # kind="line" draws a line plot

In [0]:
sns.relplot(x="size", y="tip", kind="line", data=tips, ci=None) # ci=None disables the confidence band (seaborn 0.12+ spells this errorbar=None)

In [0]:
sns.relplot(x="size", y="tip", kind="line",hue='sex', data=tips, ci=None) # Adding multiple lines by Hue

In [0]:
sns.relplot(x="size", y="tip", kind="line", hue='sex', style='smoker', data=tips, ci=None, markers=True) # Showing Markers

In [0]:
sns.relplot(x="total_bill", y="tip", hue="smoker", col="time", data=tips) # make multiple axes and plot subsets of the data on each of them

In [0]:
sns.relplot(x="size", y="tip", kind="line",hue='sex',col = 'smoker', data=tips, ci=None,markers=True)

In [0]:
sns.relplot(x="size", y="tip", kind="line", hue='sex', col='smoker', row='sex', data=tips, ci=None) # grid of plots: one column per smoker level, one row per sex
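
_The `col` and `row` parameters split the data into subsets and give each subset its own axes. A small sketch on made-up data (`toy` is hypothetical) shows how the shape of the axes grid follows the number of levels in each variable._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import pandas as pd
import seaborn as sns

# Hypothetical data: two levels of "time" and two levels of "smoker".
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 15.0, 30.0, 25.0, 12.0, 18.0, 22.0],
    "tip":        [1.5, 3.0, 2.0, 5.0, 4.0, 2.0, 2.5, 3.5],
    "time":       ["Lunch", "Dinner"] * 4,
    "smoker":     ["Yes", "Yes", "No", "No"] * 2,
})

g_cols = sns.relplot(data=toy, x="total_bill", y="tip", col="time")
g_grid = sns.relplot(data=toy, x="total_bill", y="tip", col="time", row="smoker")

print(g_cols.axes.shape)  # (1, 2): one row, one column per level of "time"
print(g_grid.axes.shape)  # (2, 2): rows follow "smoker", columns follow "time"
```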

##### PLOTTING WITH CATEGORICAL DATA

_These plots are used when one of the main variables is categorical, that is, divided into discrete groups. Seaborn handles this case with the catplot() function._

###### catplot()

_This is a figure-level function like relplot(). It gives access to three families of axes-level functions:_

```
Categorical scatterplots:
1. stripplot() (with kind="strip"; the default)
2. swarmplot() (with kind="swarm")

Categorical distribution plots:
1. boxplot() (with kind="box")
2. violinplot() (with kind="violin")
3. boxenplot() (with kind="boxen")

Categorical estimate plots:
1. pointplot() (with kind="point")
2. barplot() (with kind="bar")
3. countplot() (with kind="count")
```


In [0]:
sns.set(style="ticks", color_codes=True)

In [0]:
# strip plot by default
sns.catplot(x="day", y="total_bill", data=tips)

In [0]:
sns.catplot(x="day", y="total_bill", jitter=False, data=tips) # jitter=False turns off the random horizontal jitter; a number sets its magnitude

In [0]:
# swarm plot

sns.catplot(x="day", y="total_bill", hue="sex", kind="swarm", data=tips)

In [0]:
sns.catplot(x="smoker", y="tip", order=["No", "Yes"], data=tips)

In [0]:
# Box Plot
sns.catplot(x="day", y="total_bill", kind="box", data=tips);

In [0]:
sns.catplot(x="day", y="total_bill", hue="smoker", kind="box", data=tips);

In [0]:
# Violin plot: combines a box plot with a kernel density estimate

sns.catplot(x="total_bill", y="day", hue="sex", kind="violin", data=tips)

In [0]:
sns.catplot(x="day", y="total_bill", hue="sex", kind="violin", split=True, data=tips) # split=True draws the two hue levels as the two halves of each violin

In [0]:
# Bar Plot

titanic = sns.load_dataset("titanic")
sns.catplot(x="sex", y="survived", hue="class", kind="bar", data=titanic)
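
_A bar plot is an estimate plot: each bar shows an aggregate of y within a category (the mean by default), not the raw values. The sketch below compares the bar heights with a pandas groupby on a made-up DataFrame (`toy` is hypothetical, not the titanic dataset)._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: 1 = survived, 0 = did not survive.
toy = pd.DataFrame({
    "sex": ["male"] * 4 + ["female"] * 4,
    "survived": [0, 0, 1, 0, 1, 1, 0, 1],
})

# Survival rate per group, computed by hand.
rates = toy.groupby("sex")["survived"].mean()
print(rates["male"], rates["female"])  # 0.25 0.75

# The axes-level barplot() draws one bar per category at the group mean.
fig, ax = plt.subplots()
sns.barplot(data=toy, x="sex", y="survived", order=["male", "female"], ax=ax)
heights = [patch.get_height() for patch in ax.patches]
print(heights)
```

_With a 0/1 variable such as `survived`, the mean is simply the proportion of ones, which is why the bars in the cell above can be read as survival rates._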

In [0]:
# Count Plot
sns.catplot(x="deck", kind="count", palette="ch:.25", data=titanic)

In [0]:
# Point plot 
sns.catplot(x="sex", y="survived", hue="class", kind="point", data=titanic)

In [0]:
sns.catplot(x="class", y="survived", hue="sex",
            palette={"male": "g", "female": "m"},
            markers=["^", "o"], linestyles=["-", "--"],
            kind="point", data=titanic) # Customizing Point Plot

In [0]:
sns.catplot(x="fare", y="survived", row="class",
            kind="box", orient="h", height=2, aspect=4,
            data=titanic.query("fare > 0"))  # one row of the grid per passenger class

##### VISUALIZING DISTRIBUTION OF DATASET

###### UNIVARIATE DISTRIBUTION

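_In the Seaborn version this notebook was written for, the most convenient way to take a quick look at a univariate distribution is the distplot() function. By default, it draws a histogram and fits a kernel density estimate (KDE). Note that distplot() has been deprecated since Seaborn 0.11 in favour of histplot() and displot()._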

In [0]:
x = np.random.normal(size=100)
sns.distplot(x)

In [0]:
sns.distplot(x, bins=20, kde=False,hist=True,axlabel='x')
plt.title('Histogram')
plt.show()

In [0]:
sns.kdeplot(x, shade=True) # shade=True fills the area under the curve (seaborn 0.11+ spells this fill=True)

In [0]:
sns.distplot(x,hist=False)

###### BIVARIATE DISTRIBUTION

In [0]:
mean, cov = [0, 1], [(1, .5), (.5, 1)]
data = np.random.multivariate_normal(mean, cov, 200)
df = pd.DataFrame(data, columns=["x", "y"])

In [0]:
sns.jointplot(x="x", y="y", data=df);

In [0]:
sns.jointplot(x="x", y="y", data=df, kind="kde")

###### PAIRWISE RELATIONSHIP

_To plot multiple pairwise bivariate distributions in a dataset, you can use the pairplot() function. This creates a matrix of axes and shows the relationship for each pair of columns in a DataFrame. By default, it also draws the univariate distribution of each variable on the diagonal axes._

In [0]:
iris = sns.load_dataset("iris")
sns.pairplot(iris)

In [0]:
sns.pairplot(iris, hue="species")

##### VISUALIZING LINEAR RELATIONSHIP

_Two main functions in seaborn are used to visualize a linear relationship as determined through regression. These functions, regplot() and lmplot() are closely related, and share much of their core functionality._

_In the simplest invocation, both functions draw a scatterplot of two variables, x and y, then fit the regression model y ~ x and plot the resulting regression line with a 95% confidence interval for that regression._

In [0]:
sns.regplot(x="total_bill", y="tip", data=tips)

In [0]:
sns.lmplot(x="total_bill", y="tip", data=tips)
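
_The line in these plots is an ordinary least-squares fit of y on x. As a sketch, the snippet below fits the same model with `np.polyfit` on made-up data and compares it with the line that regplot() draws. The synthetic data and the coefficients 1.0 and 0.1 are assumptions chosen for illustration._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(2)
x = rng.uniform(5, 50, size=60)
y = 1.0 + 0.1 * x + rng.normal(scale=0.5, size=60)  # linear trend plus noise

# Least-squares fit of y ~ x, done by hand.
slope, intercept = np.polyfit(x, y, deg=1)

# ci=None skips the bootstrap for the confidence band; the line is unchanged.
fig, ax = plt.subplots()
sns.regplot(x=x, y=y, ci=None, ax=ax)
line = ax.lines[0]

matches = np.allclose(line.get_ydata(), slope * line.get_xdata() + intercept)
print(matches)
```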

_regplot() accepts the x and y variables in a variety of formats including simple numpy arrays, pandas Series objects, or as references to variables in a pandas DataFrame object passed to data. In contrast, lmplot() has data as a required parameter and the x and y variables must be specified as strings._
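
_The difference in accepted inputs and in return types can be seen in a short sketch (the arrays and the `toy` DataFrame are made up for illustration): regplot() draws onto a single Matplotlib Axes, while lmplot() builds a FacetGrid from a DataFrame._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
toy = pd.DataFrame({"x": x, "y": y})

fig, ax = plt.subplots()
ax_arrays = sns.regplot(x=x, y=y, ci=None, ax=ax)                    # NumPy arrays
ax_series = sns.regplot(x=toy["x"], y=toy["y"], ci=None, ax=ax)      # pandas Series
ax_names = sns.regplot(x="x", y="y", data=toy, ci=None, ax=ax)       # column names

# lmplot() takes column names plus the DataFrame and returns a FacetGrid.
grid = sns.lmplot(x="x", y="y", data=toy, ci=None)

print(type(ax_arrays).__name__, type(grid).__name__)
```

_Because lmplot() returns a FacetGrid, it also accepts the `hue`, `col` and `row` parameters for drawing one fit per subset._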

In [0]:
tips["big_tip"] = (tips.tip / tips.total_bill) > .15
sns.lmplot(x="total_bill", y="big_tip", data=tips, logistic=True) # logistic=True fits a logistic regression; this requires the statsmodels package

In [0]:
sns.lmplot(x="total_bill", y="tip", hue="smoker", data=tips,
           markers=["o", "x"], palette="Set1")

#### MULTI PLOT GRIDS



```
1. FacetGrid()
2. PairGrid()
```



In [0]:
g = sns.FacetGrid(tips, row="sex", col="smoker", margin_titles=True, height=2.5)
g.map(plt.scatter, "total_bill", "tip", color="#334488", edgecolor="white", lw=.5);
g.set_axis_labels("Total bill (US Dollars)", "Tip");
g.set(xticks=[10, 30, 50], yticks=[2, 6, 10]);
g.fig.subplots_adjust(wspace=.02, hspace=.02)

In [0]:
iris = sns.load_dataset("iris")
g = sns.PairGrid(iris)
g.map(plt.scatter)

In [0]:
g = sns.PairGrid(iris, hue="species")
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)
g.add_legend();

#### STYLING

In [0]:
sns.set_style("whitegrid") # sets Style
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2
sns.boxplot(data=data);

In [0]:
sns.set_style("whitegrid")
sns.boxplot(data=data, palette="deep")
sns.despine() # remove the top and right spines
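
_Which spines despine() removes can be checked on the axes afterwards. The sketch below uses random data and calls despine() with its defaults._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(3)
fig, ax = plt.subplots()
sns.boxplot(data=rng.normal(size=(20, 4)), ax=ax)

sns.despine(ax=ax)  # defaults: top=True, right=True, left=False, bottom=False

visible = {name: spine.get_visible() for name, spine in ax.spines.items()}
print(visible)
```

_Passing `left=True` or `bottom=True` removes those spines as well._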

In [0]:
f = plt.figure(figsize=(6, 6))
gs = f.add_gridspec(2, 2)

with sns.axes_style("darkgrid"):
    ax = f.add_subplot(gs[0, 0])
    sns.boxplot(data=data, palette="deep")

with sns.axes_style("white"):
    ax = f.add_subplot(gs[0, 1])
    sns.boxplot(data=data, palette="pastel")

with sns.axes_style("ticks"):
    ax = f.add_subplot(gs[1, 0])
    sns.boxplot(data=data, palette="deep")

with sns.axes_style("whitegrid"):
    ax = f.add_subplot(gs[1, 1])
    sns.boxplot(data=data, palette="pastel")

f.tight_layout()
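
_Used in a `with` statement, axes_style() changes the style only inside the block and restores the previous settings on exit. The sketch below makes this visible by reading one Matplotlib setting before, inside and after the block. The choice of `axes.grid` as the probe is just an illustration._

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside a notebook
import seaborn as sns

sns.set_style("white")  # global style without a grid
before = matplotlib.rcParams["axes.grid"]

with sns.axes_style("darkgrid"):
    inside = matplotlib.rcParams["axes.grid"]  # grid is on only within the block

after = matplotlib.rcParams["axes.grid"]
print(before, inside, after)

sns.set_style("whitegrid")  # return to the style used in this section
```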

In [0]:
sns.boxplot(data=data, palette='GnBu_d') # More color palettes to be selected from https://seaborn.pydata.org/tutorial/color_palettes.html
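
_A palette name such as 'GnBu_d' can also be turned into explicit colours with color_palette(), which helps when the same colours are needed elsewhere. A minimal sketch, in which the number of colours (6) is an arbitrary choice:_

```python
import seaborn as sns

# Six colours sampled from the named palette, as RGB tuples.
palette = sns.color_palette("GnBu_d", 6)
print(len(palette))

# The same colours as hex strings, e.g. for use in other tools.
hex_codes = palette.as_hex()
print(hex_codes)
```

_In a notebook, `sns.palplot(palette)` draws the colours as a strip for a quick visual check._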