---   
<h1 align="center">Data Visualization Part 5</h1>

---
<h3><div align="right">Ehtisham Sadiq</div></h3>

## _Data Visualization with Seaborn_

**Read the documentation for details:**
https://seaborn.pydata.org

<img align="left" width="500" height="500"  src="images/intromatlab.png"  >
<img align="right" width="400" height="500"  src="images/matplotlibadvantages.png"  >

## Learning agenda of this notebook
1. Overview of Seaborn Library
2. Download and Install Seaborn
3. Built-in Datasets of Seaborn Library
4. Plotting with Seaborn
    - The `relplot()` method
    - The `catplot()` method
    - The `displot()` method

## 2. Download and Install Seaborn Library

In [None]:
# To install this library in Jupyter notebook
import sys
#!{sys.executable} -m pip install --upgrade pip
!{sys.executable} -m pip install seaborn --quiet

In [None]:
import seaborn as sns
sns.__version__ , sns.__path__

## 3. Built-in Datasets of Seaborn Library

In [None]:
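# Workaround for URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] ... unable to get local issuer certificate>
# Note: this disables HTTPS certificate verification for the whole session, so use it only if load_dataset() fails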
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

print(sns.get_dataset_names())

### a. CAR_CRASHES Dataset

In [None]:
import seaborn as sns
df_cc = sns.load_dataset('car_crashes')
df_cc.head()

In [None]:
df_cc.shape

### b. FLIGHTS Dataset

In [None]:
df_flights = sns.load_dataset('flights')
df_flights

### c. TIPS Dataset

In [None]:
df_tips = sns.load_dataset('tips')
df_tips

### d. IRIS Dataset
<img align="center" width="700" height="500"  src="images/iris.png"  >

In [None]:
df_iris = sns.load_dataset('iris')
df_iris

In [None]:
df_iris['species'].value_counts()

### e. TITANIC Dataset
<img align="center" width="600" height="300"  src="images/titanic_sinking.jpeg"  >

In [None]:
df_titanic = sns.load_dataset('titanic')
df_titanic.head()

In [None]:
df_titanic.shape

## Programming with Seaborn

#### Option 1: (Use Axes-Level Functions)

In [None]:
import seaborn as sns
from matplotlib import pyplot as plt
fig, ax = plt.subplots()
sns.boxplot(x='sex', y='age', data=df_titanic, ax=ax);

#### Option 2: (Use Figure-Level Functions)

In [None]:
import seaborn as sns
sns.catplot(x ='sex', y='age', kind='box', data = df_titanic);
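
The practical difference between the two options is what each call works on and returns. The sketch below uses a tiny hand-made DataFrame (`df_demo`, a hypothetical stand-in for `df_titanic` so that no download is needed) to illustrate it: an axes-level function such as `sns.boxplot()` draws onto the Matplotlib `Axes` you pass in, while a figure-level function such as `sns.catplot()` creates its own figure and returns a `FacetGrid`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside Jupyter
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for df_titanic (made-up values, for illustration only)
df_demo = pd.DataFrame({
    "sex": ["male", "female", "male", "female", "male", "female"],
    "age": [22, 38, 35, 27, 54, 14],
})

# Axes-level: we create the figure and hand one Axes to seaborn
fig, ax = plt.subplots()
ax_out = sns.boxplot(x="sex", y="age", data=df_demo, ax=ax)

# Figure-level: seaborn creates the figure itself and returns a FacetGrid
grid = sns.catplot(x="sex", y="age", kind="box", data=df_demo)

print(type(ax_out).__name__)  # a Matplotlib Axes type
print(type(grid).__name__)    # seaborn's FacetGrid
```

Because a figure-level function manages its own figure, its layout is typically controlled through parameters such as `col`, `row`, `height` and `aspect` rather than through an `ax` argument.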

In [None]:
sns.set_context(context='paper')

## 4. Plotting Graphs with Seaborn
<img align="center" width="600" height="300"  src="images/seaborn_functions.png"  >

In [17]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
#plt.style.use('fivethirtyeight')  
import warnings
warnings.filterwarnings('ignore') 


In [18]:
sns.set_style(style='white') # other options: 'dark', 'darkgrid', 'whitegrid', 'ticks'
sns.set_context(context='paper', font_scale=1.5) # other options: 'notebook', 'talk', 'poster'
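
To see what a context changes, the following sketch calls `sns.plotting_context()`, which returns the parameter dictionary for a named context without applying it globally, and compares the base font size across the four built-in contexts. The exact numbers may depend on the installed seaborn version, so none are quoted here.

```python
import seaborn as sns

# Collect the base font size defined by each built-in context
sizes = {}
for name in ["paper", "notebook", "talk", "poster"]:
    params = sns.plotting_context(name)  # dict of rc parameters for this context
    sizes[name] = params["font.size"]
    print(f"{name:>8}: font.size = {sizes[name]}")
```

The contexts scale plot elements from smallest (`'paper'`) to largest (`'poster'`); `font_scale` then multiplies the font sizes on top of the chosen context.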

### a. The `sns.relplot()` Method
- Line Plot
- Scatter Plot

**Example: Line Plot**

In [None]:
df_iris.head()

In [None]:
df_iris.describe()

In [None]:
sns.relplot(x="sepal_width", y="sepal_length", data=df_iris,  kind='line');

In [None]:
sns.relplot(x="sepal_width", y="sepal_length", data=df_iris,  kind='line', hue='species');

In [None]:
sns.relplot(x="sepal_width", y="sepal_length", data=df_iris,  kind='line', hue='species', style='species');

**Example: Scatter Plot**

In [None]:
df_tips.head()

In [None]:
df_tips.shape

In [None]:
sns.relplot(x='total_bill', y='tip', data=df_tips, kind='scatter');

In [None]:
sns.relplot(x='total_bill', y='tip', data=df_tips, kind='scatter', hue='sex');

In [None]:
sns.relplot(x='total_bill', y='tip', data=df_tips, kind='scatter', hue='sex', style='sex');

In [None]:
sns.relplot(x='total_bill', y='tip', data=df_tips, kind='scatter', hue='sex', style='sex', col='sex');

**Example: Sub-Plots using FacetGrid**

In [None]:
sns.relplot(x='total_bill', y='tip', data=df_tips, kind='scatter', hue='day',col='day', col_wrap=2);

### b. The `sns.catplot()` Method
- Categorical estimate plots:
    - `pointplot` (with ``kind="point"``)
    - `barplot` (with ``kind="bar"``)
    - `countplot` (with ``kind="count"``)
    
- Categorical distribution plots:
    - `boxplot` (with ``kind="box"``)
    - `violinplot` (with ``kind="violin"``)
    - `boxenplot` (with ``kind="boxen"``)

- Categorical scatterplots:
    - `stripplot` (with ``kind="strip"``; the default)
    - `swarmplot` (with ``kind="swarm"``)

**Example: Bar Plot**

In [None]:
df_titanic

In [None]:
sns.catplot(x ='sex', y ='survived',kind='bar', data = df_titanic);

In [None]:
sns.catplot(x ='sex', y ='tip',kind='bar', data = df_tips);

In [None]:
sns.catplot(x ='size', y ='tip',kind='bar', data = df_tips);

In [None]:
sns.catplot(x ='day', y ='tip',kind='bar', data = df_tips);
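
By default, the height of each bar in a `kind='bar'` plot is the mean of `y` within each `x` category, and the error bar shows a confidence interval around that mean. The sketch below reproduces the bar heights with a plain pandas `groupby`; `df_demo` is a small made-up table standing in for `df_tips`.

```python
import pandas as pd

# Made-up data for illustration (not the real tips dataset)
df_demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "tip": [2.0, 3.0, 1.5, 2.5, 4.0, 5.0],
})

# Mean tip per day: the value a default bar plot would draw for each category
bar_heights = df_demo.groupby("day")["tip"].mean()
print(bar_heights)
```

Running the same `groupby` on `df_tips` is a quick way to cross-check the values read off the plot.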

**Example: Count Plot**

In [None]:
sns.catplot(x ='sex',kind='count', data = df_titanic);

In [None]:
sns.catplot(x ='day',kind='count', data = df_tips);

In [None]:
sns.catplot(x ='sex',kind='count', data = df_titanic, hue='survived');

**Example: Box Plot**

In [None]:
sns.catplot(x ='sex', y='age', kind='box', data = df_titanic);

In [None]:
sns.catplot(x ='sex', y='age', kind='box', data = df_titanic, hue='survived');

**Example: Violin Plot**

In [None]:
sns.catplot(x ='sex', y='age', kind='violin', data = df_titanic);

In [None]:
sns.catplot(x ='sex', y='age', kind='violin', data = df_titanic, hue='survived'); 

In [None]:
sns.catplot(x ='sex', y='age', kind='violin', data = df_titanic, hue='survived', col='survived'); 

**Example: Strip Plot**

In [None]:
sns.catplot(y ='age', kind='strip', data = df_titanic);

In [None]:
sns.catplot(x ='sex', y='age', kind='strip', data = df_titanic);

In [None]:
sns.catplot(x ='sex', y='age', kind='strip', data = df_titanic, hue='survived');

**Example: Swarm Plot**

In [None]:
sns.catplot(x ='sex', y='age', kind='swarm', data = df_titanic, hue='survived');

**Example: Sub-Plots using FacetGrid**

In [None]:
sns.catplot(x ='sex', y='age', kind='box', data = df_titanic, hue='survived', col='survived');

In [None]:
sns.catplot(x ='sex', y='age', kind='box', data = df_titanic, col='survived');

### c. The `sns.displot()` Method
- Distribution plots:
    - `histplot` (with ``kind="hist"``)
    - `kdeplot` (with ``kind="kde"``)
    - `ecdfplot` (with ``kind="ecdf"``)
    

**Example: Histogram**

In [None]:
df_tips

In [None]:
df_tips.total_bill.min()

In [None]:
df_tips.total_bill.max()

In [None]:
df_tips.total_bill.mode()

In [None]:
sns.displot(x= 'total_bill', data=df_tips);

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='hist');

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='hist', bins=30);

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='hist', bins=30, hue='day');

**Example: KDE**

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='kde');

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='kde', fill=True)

**Example: Histogram + KDE**

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='hist', kde=True);

In [None]:
sns.displot(x= 'total_bill', data=df_tips, hue='day');

In [None]:
sns.displot(x= 'total_bill', data=df_tips, hue='day', col='day');

**Example: Adding hue**

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='kde', fill=True, hue='day');

In [None]:
df_tips

In [None]:
sns.displot(x= 'tip', data=df_tips, kind='kde', fill=True)

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='kde', fill=True)

**Example: ECDF**

>**Binning Bias** is a pitfall of histograms: the same data looks different as you change the number of bins. Note that the values along the y-axis also change as the number of bins changes. An ECDF plot avoids this because it needs no bins.

In [None]:
# Same data drawn with four different bin counts: both the shape and the y-axis scale change
fig, ax = plt.subplots(2, 2)
ax[0][0].hist(df_tips['total_bill'], bins=5);
ax[0][1].hist(df_tips['total_bill'], bins=25);
ax[1][0].hist(df_tips['total_bill'], bins=50);
ax[1][1].hist(df_tips['total_bill'], bins=100);
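
The effect can also be checked numerically. The sketch below is an illustration on synthetic data (a random right-skewed sample, not the tips dataset): it bins the same values with the four bin counts used above and reports the height of the tallest bar. The total count never changes, but the tallest bar shrinks as the bins get narrower.

```python
import numpy as np

# Synthetic right-skewed sample (an assumption for illustration, not real data)
rng = np.random.default_rng(0)
values = rng.gamma(shape=4.0, scale=5.0, size=200)

peaks, totals = {}, {}
for bins in (5, 25, 50, 100):
    counts, edges = np.histogram(values, bins=bins)  # same data, different binning
    peaks[bins] = int(counts.max())
    totals[bins] = int(counts.sum())
    print(f"bins={bins:>3}  tallest bar={peaks[bins]:>3}  total={totals[bins]}")
```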

In [None]:
sns.displot(x='total_bill', data=df_tips, kind='ecdf');

In [None]:
sns.displot(x='tip', data=df_tips, kind='ecdf');

In [None]:
sns.displot(x='tip', data=df_tips, kind='ecdf', hue='time');

In [None]:
df_tips.tip.value_counts()
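
For intuition about what `kind='ecdf'` draws, an ECDF can be computed by hand: sort the values, then pair each one with the proportion of observations less than or equal to it. The sketch below does this with NumPy on a few made-up numbers (`tips_demo` is hypothetical, not taken from `df_tips`).

```python
import numpy as np

# Made-up values for illustration
tips_demo = np.array([3.0, 1.0, 2.0, 5.0, 2.0])

x = np.sort(tips_demo)                 # observations in ascending order
y = np.arange(1, len(x) + 1) / len(x)  # cumulative proportion after each observation

for xi, yi in zip(x, y):
    print(f"value={xi:.1f}  cumulative proportion={yi:.2f}")
```

Repeated values, such as the two 2.0 entries here, show up as a taller step at that x position, which is why `value_counts()` is a useful companion to the ECDF plot.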

**Example: Bivariate Analysis**

In [None]:
sns.displot(x='total_bill', y='tip', data=df_tips, kind='hist', cbar=True)

In [None]:
sns.displot(x='total_bill', y='tip', data=df_tips, kind='kde')

In [None]:
sns.displot(x='total_bill', y='tip', data=df_tips, kind='hist', hue='day', col='day')

**Example: Sub-Plots using FacetGrid**

In [None]:
sns.displot(x= 'total_bill', data=df_tips, kind='hist', hue='day', col='day');