# Seaborn data visualization beginner's tutorial
**Seaborn** is a high-level API built on top of matplotlib. It provides an interface for drawing attractive and informative statistical graphics. More details can be found [here](https://seaborn.pydata.org/).
Create your own notebook and experiment with the plot parameters yourself.


![Sea Born](https://mirage.mk.ua/wp-content/uploads/2017/03/X-0-093-%D0%9C%D0%BE%D0%B0%D0%BD%D0%B0.-%D0%9C%D0%B0%D1%83%D0%B8.-%D0%A4%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B-%D0%BC%D1%83%D0%BB%D1%8C%D1%82%D1%84%D0%B8%D0%BB%D1%8C%D0%BC%D1%8B-%D1%84%D0%BE%D1%82%D0%BE%D0%BE%D0%B1%D0%BE%D0%B8-%D0%BD%D0%B0-%D0%B7%D0%B0%D0%BA%D0%B0%D0%B7-%D0%B2-%D0%9D%D0%B8%D0%BA%D0%BE%D0%BB%D0%B0%D0%B5%D0%B2%D0%B5..jpg)

> What can I say except you're welcome

In [None]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'svg'
%matplotlib inline

## Distplot

Shows a histogram together with a curve estimating the distribution density (a kernel density estimate, KDE).



In [None]:
dice = np.random.randint(1,7,200) + np.random.randint(1,7,200)  # roll two six-sided dice (2d6) 200 times and sum them
dice

In [None]:
sns.distplot(dice)

* `kde = False` leaves only the histogram, `bins` sets the number of bars, and `vertical = True` puts the values on the vertical axis, so the bars become horizontal.

In [None]:
sns.distplot(dice, kde = False, bins = 8, vertical = True, color = 'red')

## Jointplot
**The jointplot() function shows a joint distribution over two variables.**

In [None]:
tips = sns.load_dataset("tips") # Load Tips dataset
tips.head()

In [None]:
sns.jointplot(x = tips.total_bill, y = tips.tip, kind='scatter') # kind selects the type of plot

**You can change the color of the plot using a hex code.
For hex colors, you can use online palettes like [www.color-hex.com](https://www.color-hex.com/).**

**`space = 0` removes the gap between the joint plot and the marginal plots. The previous chart has this gap; the next one does not.**

In [None]:
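# pass the variables as keywords: recent seaborn versions no longer accept them positionally
sns.jointplot(x = tips.total_bill, y = tips.tip, color = '#a631e7', kind = 'reg', space = 0) # regression plot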
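
With `kind = 'reg'` the joint plot overlays a fitted regression line on the scatter. As a rough illustration of what such a line represents, here is a minimal sketch that fits an ordinary least-squares line with `np.polyfit`. The data are synthetic (hypothetical bills with tips of about 15% plus noise), not the tips dataset, and this is not a claim about how seaborn computes its fit internally.

```python
import numpy as np

rng = np.random.default_rng(1)
total_bill = rng.uniform(5, 50, 100)              # hypothetical bills
tip = 0.15 * total_bill + rng.normal(0, 1, 100)   # hypothetical tips: ~15% plus noise

# Fit tip = slope * total_bill + intercept by least squares
slope, intercept = np.polyfit(total_bill, tip, deg=1)

# Pearson correlation between the two variables
r = np.corrcoef(total_bill, tip)[0, 1]

print(f"tip ~ {slope:.2f} * total_bill + {intercept:.2f}, r = {r:.2f}")
```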

## Pairplot
Plots pairwise relationships in a dataset. The diagonal axes are treated differently: each one shows the univariate distribution of the variable in that column.

In [None]:
iris = sns.load_dataset("iris") # Load iris dataset
iris.head()

**Below we can see the relationships between the data columns.
`hue` chooses the variable in the data that is mapped to different colors.**

In [None]:
sns.pairplot(iris, hue = 'species')

**markers - lets you choose the marker style. Details can be found [here](https://matplotlib.org/3.1.0/api/markers_api.html).**

**palette - lets you change the colors. Read more [here](https://seaborn.pydata.org/tutorial/color_palettes.html).**

**corner - with `corner = True` the upper triangle of the grid is not drawn, leaving only the lower plots and the diagonal**


In [None]:
# try the same with the tips data
palette = ['#E9473F', '#3FE1E9']
sns.pairplot(tips, hue = 'sex', markers = 'X', corner = True, palette = sns.color_palette(palette))

## Facet Grid
This class maps a dataset onto multiple axes arrayed in a grid of rows and columns that correspond to levels of variables in the dataset. The plots it produces are often called “lattice”, “trellis”, or “small-multiple” graphics.

In [None]:
g = sns.FacetGrid(tips, col="time", row="smoker") # this line alone draws only an empty grid
g = g.map(plt.hist, "total_bill")

In [None]:
g = sns.FacetGrid(tips, col  = 'time', hue = 'day')
g = (g.map(plt.scatter, "total_bill", "tip", edgecolor="w").add_legend())

It seems that people prefer to go out for dinner on Saturday and for lunch on Thursday.

# Categorical data visualization
Here we work with variables such as sex, smoking status, and day of the week, which take values from a fixed set of categories, unlike the tip, which is a numeric amount.

## Barplot
The barplot groups the data by the values of the categorical variable and applies a function to the corresponding values of the numeric variable in each group. By default, this function is the mean.

In [None]:
sns.barplot(x = 'sex', y = 'total_bill', data = tips)
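
The bar heights can be checked with a plain pandas `groupby`. This is a minimal sketch on a small hypothetical DataFrame (made-up values, not the tips dataset) and only illustrates the default aggregation.

```python
import pandas as pd

# Hypothetical data with the same column names as in the plot above
df = pd.DataFrame({
    "sex": ["Male", "Female", "Male", "Female", "Male"],
    "total_bill": [20.0, 10.0, 30.0, 14.0, 25.0],
})

# One mean per category: these are the heights the bars would have
means = df.groupby("sex")["total_bill"].mean()
print(means)
```

To aggregate with something other than the mean, `barplot()` accepts an `estimator` argument, for example `estimator = np.median`.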

In [None]:
sns.barplot(x = tips.time, y = tips.tip, hue = tips.sex)

In [None]:
sns.barplot(x = 'size', y = 'total_bill', data = tips, palette = 'Blues_d')

## Countplot
Similar to the barplot, but the function is fixed: it counts the number of observations in each category.

In [None]:
palette = ['#001eff', '#f000ff']
plt.style.use("dark_background") # set black background
sns.countplot(x = 'sex', data = tips,
              facecolor = (0,0,0,0),  # fully transparent bars
              linewidth = 5,
              edgecolor = sns.color_palette(palette))
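
The counts behind such a plot are the same numbers that pandas `value_counts()` returns. A minimal sketch on a hypothetical Series (made-up values):

```python
import pandas as pd

# Hypothetical categorical column
sex = pd.Series(["Male", "Female", "Male", "Male", "Female"], name="sex")

# Number of observations per category, i.e. the bar heights of a countplot
counts = sex.value_counts()
print(counts)
```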

In [None]:
plt.style.available # All styles

## Boxplot
A box plot (or box-and-whisker plot) shows the distribution of quantitative data in a way that facilitates comparisons between variables or across levels of a categorical variable. The box shows the quartiles of the dataset while the whiskers extend to show the rest of the distribution, except for points that are determined to be “outliers” using a method that is a function of the inter-quartile range.

In [None]:
plt.style.use("Solarize_Light2")
sns.boxplot(x = tips.total_bill)

In [None]:
plt.style.use('grayscale')
sns.boxplot(x="day", y="total_bill", data=tips)
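
The outlier rule mentioned above can be worked through by hand. By default `boxplot()` uses `whis = 1.5`, so points further than 1.5 times the inter-quartile range from the box are drawn as outliers. The sketch below applies that rule with NumPy to a small hypothetical sample. The numbers are made up, and the quartiles come from NumPy's default linear interpolation, which may differ slightly from other quartile conventions.

```python
import numpy as np

# Hypothetical sample with one suspiciously large value
values = np.array([3.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 40.0])

q1, median, q3 = np.percentile(values, [25, 50, 75])  # edges and center line of the box
iqr = q3 - q1                                         # inter-quartile range

low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Points beyond the fences are drawn individually as outliers
outliers = values[(values < low_fence) | (values > high_fence)]

# Whiskers stop at the most extreme observations inside the fences
whisker_low = values[values >= low_fence].min()
whisker_high = values[values <= high_fence].max()

print(q1, median, q3, whisker_low, whisker_high, outliers)
```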

A related function, boxenplot(), draws a plot that is similar to a box plot but optimized for showing more information about the shape of the distribution. It is best suited for larger datasets:

In [None]:
plt.style.use("bmh")
sns.boxenplot(x="day", y="total_bill", hue="time", data=tips, linewidth=2.5)

## Violinplot
A violin plot plays the same role as a box plot, but instead of a box it shows the distribution density of each group, drawn as a kernel density estimate mirrored on both sides of the axis.

In [None]:
sns.reset_orig() # reset style to original
sns.violinplot(x = 'total_bill', y = 'day', hue = 'sex', data = tips, palette = 'rainbow')

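**inner - with `inner = "quartile"` the quartiles are drawn as lines inside the violin instead of a mini box plot**

**split - with `split = True` the densities for male and female customers are drawn on the two halves of the same violin**

**scale - sets how the width of each violin is scaled; `"count"` makes it depend on the number of observations**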


In [None]:
sns.violinplot(x="day", y="total_bill", hue="sex",
                    data=tips, palette="Set1", split=True,
                    scale="count", inner="quartile")

## Swarmplot
Draw a categorical scatterplot with non-overlapping points.

In [None]:
sns.swarmplot(x="time", y="tip", data=tips,
              order=["Dinner", "Lunch"],
              color = 'green').set_title('Christmas trees') # set the title

Swarmplot is a good complement to a box or violin plot in cases where you want to show all observations along with some representation of the underlying distribution.

In [None]:
ax = sns.violinplot(x='day', y = 'total_bill', data = tips, inner = None, color = '#a27250')
sns.swarmplot(x='day', y = 'total_bill',
              data = tips,
              color = 'black',
              marker="h", ax = ax)  # draw the points on the same axes as the violins
ax.set_title('Cacao beans') # I know that my imagination is crazy

## Pointplot
Show point estimates and confidence intervals using scatter plot glyphs.
A point plot represents an estimate of central tendency for a numeric variable by the position of scatter plot points and provides some indication of the uncertainty around that estimate using error bars.

In [None]:
palette = ('#ea7643', '#a91b1b')
sns.pointplot(x="day", y="tip", hue="sex", data=tips,
              capsize=.2, markers=["D", "x"], palette = sns.color_palette(palette),
             linestyles=["-", "--"])
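
By default the error bars in seaborn's point plots show a 95% confidence interval obtained by bootstrapping. The sketch below illustrates the idea with NumPy on a hypothetical sample of tip amounts (synthetic values). It is a plain percentile bootstrap written for illustration, not seaborn's own implementation.

```python
import numpy as np

rng = np.random.default_rng(42)
tips_sample = rng.normal(3.0, 1.0, 60)  # hypothetical tip amounts

# The point: an estimate of central tendency (here, the mean)
estimate = tips_sample.mean()

# Resample with replacement many times and recompute the mean each time
boot_means = np.array([
    rng.choice(tips_sample, size=tips_sample.size, replace=True).mean()
    for _ in range(1000)
])

# The error bar: the middle 95% of the bootstrap means
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"mean = {estimate:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```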

I hope this tutorial came in handy for someone. I hope even more that it encouraged you to go and play with various graphs and their parameters. The topic is too extensive to cover in one lesson, but the basics here should be enough for comfortable use.