# INTRODUCTION
1. [Why use seaborn?](#1)
<br>
<br>
2.  Plot Contents:
    * 2.1 [Relational Plot](#2)
        * 2.1.1 [binary (and more) variable analysis with](#3) relplot
        * 2.1.2 [show line graphs with](#4) relplot
        * 2.1.3 [multi-graphical representation using](#5) relplot
    * 2.2 [Bar Plot](#6)
    * 2.3 [Point Plot](#7)
    * 2.4 [Joint Plot](#8)
         * 2.4.1 [scatter representation with](#9) jointplot
         * 2.4.2 [hexbin representation with](#10) jointplot
         * 2.4.3 [kde representation with](#11) jointplot
    * 2.5 [Heatmap](#12)
    * 2.6 [Box Plot](#13)
        * 2.6.1 [boxenplot](#14)
    * 2.7 [Strip Plot](#15)
    * 2.8 [Swarm Plot](#16)
        * 2.8.1 [Swarm + Box](#17)
    * 2.9 [Count Plot](#18)
    * 2.10 [Pair Plot](#19)
    * 2.11 [Violin Plot](#20)
    * 2.12 [lmplot and regplot](#21)
        * 2.12.1 [Multiple regression with](#22) lmplot
    * 2.13 [KDE plot](#23)
    * 2.14 [Distribution Plot (distplot)](#24)
        * 2.14.1 [plot the histogram](#25)
        * 2.14.2 [plot the kde](#26)
<br>
<br>
3. [CONCLUSION](#27)

<a id="1"></a> 
# Why use seaborn?

* seaborn lets you make beautiful visualizations with very little code.
* If you use pandas for your data analysis, it is a perfect match for you.
* Optimized for statistical analysis.
* It is a well-known and widespread tool among data scientists.


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt


import seaborn as sns
sns.set(style="darkgrid")

from collections import Counter

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
df=pd.read_csv('../input/seaborn-datasets/repository/mwaskom-seaborn-data-23ee2ba/diamonds.csv')

In [None]:
df.head()

In [None]:
df.info()

<a id="2"></a> 
## Relational Plot(relplot)



<a id="3"></a> 
#### Example 1: <font color=#696969>binary (and more) variable analysis with relplot</font> 

In [None]:
sns.relplot(x="carat",y="price",data=df,height=7)

In [None]:
sns.relplot(x="carat", y="price",data=df,height=7, alpha=0.25, edgecolor=None)

Let's add another dimension to the chart. This time, the new dimension is encoded as color rather than position, so let's map a variable to color. This is easy to do with **seaborn**: you just set the **hue** parameter.

In [None]:
# Add "clarity" variable as color
sns.relplot(
            x="carat",
            y="price",
            hue="clarity", # added to color axis
            data=df,
            height=7,
            palette="Set1", # change color palette 
            edgecolor=None)

We used color as an additional axis / dimension.
Can we add other dimensions in a similar way?
Yes! For example, let's also assign variables to **style** and **size**.

In [None]:
sns.relplot(
            x="carat",
            y="price",
            hue="clarity",
            size="depth",   ###
            style="color",  ###
            data=df,
            palette="CMRmap_r",
            edgecolor=None,
            height=7)

<a id="4"></a> 
#### Example 2: <font color=#696969>show line graphs with relplot</font>

In [None]:
df1=df.iloc[:250]



In [None]:
sns.relplot(x="carat",y="depth",data=df1,kind="line",ci=None)

In [None]:
fmri=pd.read_csv('../input/seaborn-datasets/repository/mwaskom-seaborn-data-23ee2ba/fmri.csv')

In [None]:
fmri.head()

In [None]:
fmri.info()

In [None]:
sns.relplot(x="timepoint",y="signal",kind="line",data=fmri,height=7)

The graphic above is nice, but why did **seaborn** behave like this?

> Because **seaborn** detected more than one measurement at each timepoint, its default behavior is to show the mean of these measurements together with a 95% confidence interval.

It is possible to show the **"standard deviation"** instead of the **"confidence interval"**. To do this, we add the **ci = "sd"** parameter.

In [None]:
sns.relplot(x="timepoint", y="signal", kind="line", data=fmri, ci="sd", height=7)

In [None]:
sns.relplot(x="timepoint", y="signal", kind="line", data=fmri, ci=None, height=7)
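
To make the aggregation behind these line plots concrete, here is a minimal sketch of the per-timepoint mean and standard deviation that an `"sd"` band is built from. The `toy` frame is made-up data standing in for `fmri` (only the column names `timepoint` and `signal` are borrowed), and the exact estimator seaborn uses internally may differ between versions.

```python
import pandas as pd

# Hypothetical toy data: three repeated measurements at each of two timepoints
toy = pd.DataFrame({
    "timepoint": [0, 0, 0, 1, 1, 1],
    "signal":    [1.0, 2.0, 3.0, 2.0, 4.0, 6.0],
})

# One row per timepoint: the mean is the line, mean +/- std is the band
summary = toy.groupby("timepoint")["signal"].agg(["mean", "std"])
summary["lower"] = summary["mean"] - summary["std"]
summary["upper"] = summary["mean"] + summary["std"]
print(summary)
```

Setting `ci=None`, as in the cell above, simply drops the band and keeps the mean line.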

So, what happens if we apply **hue**, **size** and **style** parameters here?

In [None]:
sns.relplot(
            x="timepoint",
            y="signal",
            size="event",
            style="region",
            markers=True,
            kind="line",
            data=fmri,
            hue="region",
            height=7
            )

plt.savefig("graph2.png")

<a id="5"></a> 
#### Example 3: <font color=#696969>multi-graphical representation using relplot</font>

A single chart is often not enough; it may be necessary to split the figure into panels and show different subsets of the data in each one. The **matplotlib** library provides a function called **subplots** for this, but with it you have to divide the figure yourself and decide which area each plot is drawn in.

With the **seaborn** library you can do this automatically. How?

In [None]:
sns.relplot(x="timepoint",
            y="signal",
            col="region", # show region in columns
            data=fmri,
            height=7)

Variables can be mapped to rows in the same way as they are mapped to columns.

In [None]:
sns.relplot(x="timepoint",
            y="signal",
            col="region", # show region in columns
            row="event",  # show event in rows
            kind="line",
            data=fmri,
            height=7)

## How to wrap a row or column?

In [None]:
sns.relplot(x="timepoint", y="signal", 
            col="subject", kind="line",
            data=fmri)

We can use the **col_wrap** parameter to wrap the panels onto a new row after a given number of columns.

In [None]:
sns.relplot(x="timepoint", 
            y="signal", 
            col="subject", 
            col_wrap=4,
            kind="line",
            data=fmri)
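
As a rough illustration of what wrapping means for the grid layout, the arithmetic can be sketched as follows. The panel count of 14 is a hypothetical value used only for this example; in practice it is the number of distinct values in the column passed to `col`.

```python
import math

n_panels = 14   # hypothetical number of distinct "subject" values
col_wrap = 4    # maximum number of panels per row

n_rows = math.ceil(n_panels / col_wrap)          # rows needed to hold all panels
last_row = n_panels - (n_rows - 1) * col_wrap    # panels left over for the final row
print(n_rows, last_row)  # → 4 2
```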

In [None]:
data=pd.read_csv('../input/videogamesales/vgsales.csv')

In [None]:
data.head()

In [None]:
data.info()

In [None]:
data.dropna(how="any",inplace = True)
data.info()


In [None]:
data.Year = data.Year.astype(int)

<a id="6"></a> 
## Bar Plot

In [None]:
# count how often each platform occurs and keep the 20 most common

platform_count = Counter(data.Platform)
most_platform=platform_count.most_common(20)
platform_name,count = zip(*most_platform)
platform_name,count = list(platform_name),list(count)

# visualization

plt.figure(figsize=(15,10))
ax=sns.barplot( x = platform_name, y = count, palette = 'rocket')
plt.xlabel('Platform')
plt.ylabel('Frequency')
plt.title('The 20 most common platforms')
plt.show()


In [None]:
# 2013-2016 
first_filter=data.Year>2012
second_filter=data.Year<2017
new_data=data[first_filter&second_filter]

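# catplot is a figure-level function and creates its own figure,
# so a separate plt.figure() call is not needed (it would only add an empty figure)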
sns.catplot(x="Year",y="Global_Sales",kind="bar",
            hue="Platform",
            data=new_data,
            edgecolor=None,
            height=8.27, aspect=11.7/8.27,ci=None)
plt.show()

<a id="7"></a>
## Point Plot

In [None]:
#2010-2016
first_filter=data.Year>2009
second_filter=data.Year<2017
new_data1=data[first_filter&second_filter]


#visualization

sns.catplot(x="Year",y="NA_Sales",kind="point",
            data=new_data1,
            hue = "Platform",
            palette='Set1',
            ci = None,
            edgecolor=None,
            height=8.27, 
            aspect=11.7/8.27)
plt.show()

In [None]:
data1=data[['Year','Genre','Global_Sales']]
data1=data1.set_index('Year')
genres=['Shooter','Sports','Action','Role-Playing']
years=[2010,2011,2012,2013,2014,2015,2016]

# number of games of each genre released in each year
data2010=[[sum(data1.loc[year].Genre==genre) for genre in genres] for year in years]

df=pd.DataFrame(data2010,columns=genres)
df['Year']=years

#visual

f,ax1 = plt.subplots(figsize =(20,10))

sns.pointplot(x='Year',y='Action',data=df,color='lime',alpha=0.7)
sns.pointplot(x='Year',y='Shooter',data=df,color='red',alpha=0.7)
sns.pointplot(x='Year',y='Sports',data=df,color='blue',alpha=0.7)
sns.pointplot(x='Year',y='Role-Playing',data=df,color='orange',alpha=0.7)


plt.xlabel('Years',fontsize = 15,color='blue')
plt.ylabel('Values',fontsize = 15,color='blue')
plt.text(5.7,240,'Action',color='lime',fontsize = 15,style = 'italic')
plt.text(5.7,230,'Shooter',color='red',fontsize = 15,style = 'italic')
plt.text(5.7,220,'Sports',color='blue',fontsize = 15,style = 'italic')
plt.text(5.7,210,'Role-Playing',color='orange',fontsize = 15,style = 'italic')
plt.grid()

<a id="8"></a> 
## Joint Plot

> **jointplot** is used to examine the joint distribution of two variables. The distribution of each variable on its own is drawn in a small plot along the corresponding edge.

<a id="9"></a>
#### Example 1: <font color=#696969>**Scatter representation with jointplot**</font> 

In [None]:
iris=pd.read_csv('../input/iris/Iris.csv')

In [None]:
iris.head()

In [None]:
iris.info()

In [None]:
iris.corr()

In [None]:
sns.jointplot(x="SepalLengthCm",y="PetalLengthCm",data=iris)

<a id="10"></a>
#### Example 2: <font color=#696969>**Hexbin representation with jointplot**</font> 

In [None]:
sns.jointplot(x="SepalLengthCm",y="PetalLengthCm",data=iris, kind="hex", height=8)


<a id="11"></a>
#### Example 3: <font color=#696969>**kde representation with jointplot**</font> 

> *  pearsonr: a value of 1 indicates a perfect positive correlation, and a value of -1 indicates a perfect negative correlation.
> *  If it is zero, there is no linear correlation between the variables.
> *  Show the joint distribution using **kde** (kernel density estimation).

In [None]:
sns.jointplot(x="SepalLengthCm",y="PetalLengthCm",data=iris, kind="kde",height=8)

## You can change the kind parameter of jointplot:
## kind : { "scatter" | "reg" | "resid" | "kde" | "hex" }
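
The Pearson coefficient mentioned in this example can be computed by hand to see where the values 1 and -1 come from. The sketch below uses made-up arrays rather than the iris columns, and `pearson_r` is a helper written for this illustration, not a seaborn function.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = 2.0 * x + 1.0      # increases in a straight line with x
y_down = -3.0 * x + 10.0  # decreases in a straight line with x

def pearson_r(a, b):
    """Pearson correlation: covariance divided by the product of the spreads."""
    a_c = a - a.mean()
    b_c = b - b.mean()
    return float((a_c * b_c).sum() / np.sqrt((a_c ** 2).sum() * (b_c ** 2).sum()))

print(round(pearson_r(x, y_up), 6))    # → 1.0
print(round(pearson_r(x, y_down), 6))  # → -1.0
```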

<a id="12"></a> 
## Heatmap

In [None]:
iriscorr=iris.drop(["Id"],axis=1).corr()
iriscorr

In [None]:
#correlation map

f, ax = plt.subplots(figsize=(7,7))
sns.heatmap(iriscorr, annot=True, linewidths=0.2, fmt='.2f', ax=ax, cmap="rocket_r" )
plt.show()

<a id="13"></a> 
## Box Plot

> Boxplots are a standardized way of displaying the distribution of data based on a five number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”).

![](https://miro.medium.com/max/1838/1*2c21SkzJMf3frPXPAR_gZA.png)
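
The summary behind a box plot can be sketched with NumPy. The values below are made up, and the 1.5 × IQR rule for the whisker limits is the common convention assumed here; it is why "minimum" and "maximum" are written in quotes above.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])  # made-up data; 100 is an outlier

q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Whisker limits under the 1.5 * IQR convention
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are drawn individually as outliers
outliers = values[(values < lower_fence) | (values > upper_fence)]
print(q1, median, q3, outliers)
```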

In [None]:
nRowsRead = 1000 # specify None if you want to read the whole file
cars = pd.read_csv('../input/cars-mini-dataset/cars.csv', delimiter=';', nrows = nRowsRead)
cars.dataframeName = 'cars.csv'
nRow, nCol = cars.shape
print(f'There are {nRow} rows and {nCol} columns')

In [None]:
cars.info()

In [None]:
cars.head()

In [None]:
f, ax = plt.subplots(figsize=(6,8))

sns.boxplot(x="Origin", 
            y="Horsepower", 
            data=cars,
            palette='Set2',
            ax=ax)

> **Tip:** If we pass a categorical variable as the *hue* parameter, a separate box is drawn side by side for each of its categories.

In [None]:
f, ax = plt.subplots(figsize=(8,10))
sns.boxplot(x="Origin", 
            y="Horsepower",
            hue="Cylinders",
            data=cars,
            palette='tab10',
            ax=ax)

## We can draw the same graph with ***catplot***.

In [None]:

sns.catplot(x="Origin", 
            y="Horsepower",
            hue="Cylinders",
            data=cars,
            palette='tab20',
            kind="box",
            height=8
            )

<a id="14"></a> 
## *boxenplot*

In [None]:
sns.catplot(x="Origin", 
            y="Horsepower",
            data=cars,
            palette='inferno',
            kind="boxen",
            height=8
            )
plt.savefig('graph.png')

<a id="15"></a> 
## Strip Plot

In [None]:
fmri.head()

In [None]:
f, ax = plt.subplots(figsize=(10,8))
sns.stripplot(x="subject",
              y="signal",
              data=fmri,
              ax=ax,
              palette="hsv")

> **Tip: When the number of categories is high, you can move the categorical variable to the y-axis, that is, draw the chart horizontally.**

> To do so, just swap x and y; **seaborn** takes care of the rest.

In [None]:
f, ax = plt.subplots(figsize=(10,8))
sns.stripplot(x="signal",
              y="subject",
              data=fmri,
              ax=ax,
              palette="hsv")

### Tip: You can turn off the jitter (the small random offset given to each point) by setting "**jitter = False**".

In [None]:
f, ax = plt.subplots(figsize=(10,8))
sns.stripplot(x="subject",
              y="signal",
              data=fmri,
              jitter=False,
              alpha=0.25,
              ax=ax,
              palette="hsv")
plt.savefig("graph1.png")
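
Conceptually, jitter spreads overlapping points by adding a small random offset along the categorical axis. The sketch below shows the idea with NumPy; the uniform distribution and the width of 0.1 are assumptions made for illustration, not seaborn's exact internals.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable

positions = np.repeat([0, 1, 2], 4)   # category index of each point
width = 0.1                           # hypothetical maximum offset

# With jitter, each point is nudged sideways by a random amount
jittered = positions + rng.uniform(-width, width, size=positions.size)

# Without jitter, points are drawn at `positions` unchanged
print(np.round(jittered, 3))
```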

<a id="16"></a> 
## Swarm Plot

In [None]:
f, ax = plt.subplots(figsize=(8,8))
sns.swarmplot(x="region",
              y="signal",
              data=fmri
              )


### Warning: swarmplot may run very slowly when the dataset is large.
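
One possible way to keep a swarm plot responsive is to draw a random subset of the rows first. This is a sketch with a made-up frame named `big`; the sample size of 500 is an arbitrary choice, and `subset` would then be passed as `data` to `sns.swarmplot`.

```python
import numpy as np
import pandas as pd

# Made-up frame standing in for a large dataset
big = pd.DataFrame({
    "region": np.tile(["parietal", "frontal"], 5000),
    "signal": np.linspace(-1.0, 1.0, 10000),
})

# Keep a reproducible random subset before plotting
subset = big.sample(n=500, random_state=42)
print(len(big), len(subset))  # → 10000 500
```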

If we wish, we can assign any variable as color in this swarm chart. For example, let's assign the variable **"event"** as **hue**.

In [None]:
f, ax = plt.subplots(figsize=(8,8))
sns.swarmplot(x="region",
              y="signal",
              hue="event",
              data=fmri,
              palette="rocket"
              )

<a id="17"></a> 
### Swarm + Box plot

In [None]:
f, ax = plt.subplots(figsize=(8,8))
sns.swarmplot(x="region",
              y="signal",
              data=fmri,
              palette="CMRmap",
              alpha=0.5
              )
sns.boxplot(x="region",
            y="signal",
            data=fmri,
            palette="Set1",
            )

<a id="18"></a> 
## Count Plot

In [None]:
df1.head()

> **countplot** shows the number of observations in each category. We only need to specify x or y. If we want, we can also set the **hue** parameter.

In [None]:
f, ax = plt.subplots(figsize=(10,8))
sns.countplot(x="color",
              hue="cut",
              data=df1,
              edgecolor=None,
              palette="inferno",
              ax=ax)
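
The bar heights in a count plot with `hue` are simply the number of rows in each combination of categories. As a sketch, the same numbers can be tabulated with `pd.crosstab`; the `toy` frame below is made-up data that only reuses the column names `color` and `cut`.

```python
import pandas as pd

# Hypothetical toy data with two categorical columns
toy = pd.DataFrame({
    "color": ["E", "E", "I", "J", "J", "J"],
    "cut":   ["Ideal", "Premium", "Good", "Good", "Ideal", "Ideal"],
})

# Rows: the x variable, columns: the hue variable, cells: number of rows
counts = pd.crosstab(toy["color"], toy["cut"])
print(counts)
```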

<a id="19"></a> 
## Pair Plot

In [None]:

sns.pairplot(iris, kind="reg",
             x_vars=["SepalLengthCm","PetalLengthCm","PetalWidthCm"],
             y_vars=["SepalLengthCm","PetalLengthCm","PetalWidthCm"],
             height=5)

## Tip: If the x and y variables are not the same, the histograms on the diagonal are not drawn.

In [None]:
sns.pairplot(iris, kind="reg",
             x_vars=["SepalLengthCm","PetalLengthCm","PetalWidthCm"],
             y_vars=["SepalWidthCm","PetalLengthCm","PetalWidthCm"],
             height=5)

<a id="20"></a> 
## Violin Plot

> **This is the favorite chart of many people. The violin-like shapes are drawn using kernel density estimation (KDE).**

In [None]:
f, ax = plt.subplots(figsize=(8,8))
sns.violinplot(x="Origin", y="Horsepower", data=cars, ax=ax, inner="points")
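
To show what kernel density estimation does, here is a minimal hand-written Gaussian KDE. The sample values and the fixed bandwidth of 10 are assumptions for illustration; seaborn chooses a bandwidth automatically, and `gaussian_kde_1d` is a helper defined only for this sketch.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on `grid`."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.mean(axis=1) / bandwidth

samples = np.array([90.0, 95.0, 100.0, 150.0, 155.0])  # made-up horsepower values
grid = np.linspace(0.0, 250.0, 501)
density = gaussian_kde_1d(samples, grid, bandwidth=10.0)

# A density should integrate to roughly 1 over a wide enough grid
area = float(density.sum() * (grid[1] - grid[0]))
print(round(area, 3))
```

A violin plot mirrors a curve like this around the category axis, one violin per category.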

### With catplot

In [None]:
sns.catplot(x="Origin", y="Horsepower", data=cars, kind="violin", height=8)

In [None]:
cars["Old"] = cars.Year < 76

In [None]:
sns.catplot(x="Origin",
            y="Horsepower",
            hue = "Old",
            data=cars,
            split=True, # You need to turn on this parameter!
            inner = "stick", 
            kind="violin",
            palette="Blues",
            height=8
           )

<a id="21"></a> 
## lmplot and regplot (linear model plots)

> The **lmplot** and **regplot** functions are used to represent the linear (or nth degree) relationship between two variables.

In [None]:
sns.lmplot(x="PetalLengthCm", y="PetalWidthCm", data=iris, height=7)

In [None]:
sns.regplot(x="PetalLengthCm", y="PetalWidthCm", data=iris,order=3)
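
Setting `order=3` asks for a cubic fit instead of a straight line. As a rough illustration of the idea (not a reproduction of seaborn's internal code), a cubic least-squares fit can be sketched with `numpy.polyfit` on made-up data:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3 - 2.0 * x + 1.0          # made-up data lying exactly on a cubic

coeffs = np.polyfit(x, y, deg=3)    # coefficients, highest power first
fitted = np.polyval(coeffs, x)      # evaluate the fitted polynomial at x
print(np.round(coeffs, 6))
```

Because the toy data is an exact cubic, the fitted coefficients recover it up to floating-point error; real data would leave residuals around the curve.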

<a id="22"></a>
### Multiple regression with lmplot

In [None]:
sns.lmplot(x="SepalLengthCm",
           y="PetalWidthCm",
           hue="Species",
           data=iris,
           height=7)

<a id="23"></a>
## KDE Plot

In [None]:
cylinder8 = cars.loc[cars.Cylinders == 8]
cylinder5 = cars.loc[cars.Cylinders == 5]

f, ax = plt.subplots(figsize=(10,8))

sns.kdeplot(cylinder8.Horsepower, cylinder8.Weight, cmap="Reds", shade=True, shade_lowest=False, alpha=0.9)
sns.kdeplot(cylinder5.Horsepower, cylinder5.Weight, cmap="Blues", shade=True, shade_lowest=False, alpha=1)

<a id="24"></a>
## distplot (distribution plots)

**distplot** is used to plot univariate distributions.

> This function does not take a dataframe as input, so we pass the variable we want to plot directly.

<a id="25"></a>
### Plot the histogram with distplot

In [None]:
f, ax = plt.subplots(figsize=(7,7))
sns.distplot(df1.depth, kde=False)
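
With `kde=False`, only the histogram remains, which is just a count of values per bin. A minimal sketch of that counting step with `numpy.histogram` follows; the `depth` array is made-up data, and the choice of 4 bins is arbitrary.

```python
import numpy as np

# Made-up values standing in for the depth column
depth = np.array([58.0, 59.5, 60.1, 61.0, 61.2, 61.8, 62.0, 62.4, 63.3, 65.0])

# Split the range into 4 equal-width bins and count the values in each
counts, edges = np.histogram(depth, bins=4)
print(counts, edges)
```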

<a id="26"></a>
### Plot the kde (kernel density estimation) with distplot

In [None]:
f, ax = plt.subplots(figsize=(7,7))
sns.distplot(df1.depth, hist=False, kde=True)

> **Tip: The default behavior of the distplot function is to have both the histogram and kde plotted.**

In [None]:
f, ax = plt.subplots(figsize=(7,7))
sns.distplot(df1.depth)

### References

* https://www.kaggle.com/kanncaa1/seaborn-tutorial-for-beginners
 
* https://www.datafloyd.com/tr/seaborn-kutuphanesi-ile-gorsellestirme

<a id="27"></a> 
# CONCLUSION

If you like it, please upvote :)