
## Visualization With Seaborn

- Seaborn is a Python data visualization library based on matplotlib.
- It provides a high-level interface for drawing attractive and informative statistical graphics. It offers choices for plot style and color defaults, defines simple high-level functions for common statistical plot types, and integrates with Pandas DataFrames.
- The main idea of Seaborn is that it provides high-level commands to create a variety of plot types useful for statistical data exploration, and even some statistical model fitting.

### Table of Contents

1. Creating basic plots
    - Line Chart
    - Bar Chart
    - Histogram
    - Box plot
    - Violin plot
    - Scatter plot
    - Hue semantic
    - Bubble plot
    - Pie Chart
2. Advanced categorical plots in Seaborn
3. Density plots
4. Pair plots

In [25]:
# Import the libraries
import seaborn as sns
sns.set(style = 'darkgrid')  # apply seaborn's default theme with the darkgrid style

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

plt.rcParams['figure.figsize']=(8,5)

## Load the dataset

In [None]:
#read the dataset
data_BM = pd.read_csv("/content/bigmart_data.csv")

#drop the null values
data_BM = data_BM.dropna(how = "any")

data_BM.head()

## 1. Creating basic plots

Let's look at how you can create some basic plots in seaborn in a single line, where matplotlib required multiple lines.

#### Line Chart

- With some datasets, you may want to understand changes in one variable as a function of time, or of a similarly continuous variable.
- In seaborn, this can be done with the **lineplot()** function, either directly or through **relplot()** by setting `kind="line"`.

In [None]:
#lineplot
sns.lineplot(x="Item_Weight", y="Item_MRP", data = data_BM[:50])

#### Bar Chart

- In seaborn, you can create a bar chart with the **barplot()** function.
- To achieve the same thing in matplotlib, we had to write extra code just to group the data by category, and then more code to make sure the plot came out correctly.

In [None]:
#bar plot
sns.barplot(x='Item_Type', y='Item_MRP', data = data_BM[:5])
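
The grouping step that matplotlib needed can be sketched as follows. This is an illustration on made-up data (the `demo` frame and its `item_type` and `price` columns are assumptions, not the BigMart columns). It checks that the bar heights drawn by `barplot()` match a manual `groupby` mean, since `barplot()` aggregates each category with the mean by default.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data for illustration only (not the BigMart dataset).
demo = pd.DataFrame({
    "item_type": ["Dairy", "Dairy", "Snacks", "Snacks", "Snacks", "Meat"],
    "price": [100.0, 140.0, 50.0, 70.0, 90.0, 200.0],
})

# Manual aggregation: one mean per category.
group_means = demo.groupby("item_type")["price"].mean()

# barplot performs the same aggregation internally (mean by default).
ax = sns.barplot(x="item_type", y="price", data=demo)

# Collect the height of every bar that was drawn.
bar_heights = sorted(bar.get_height() for bar in ax.patches)
```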

#### Histogram

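- You can create a histogram in seaborn with **histplot()**. Older seaborn releases used **distplot()** for this, but it has been deprecated since version 0.11. There are more options available, which we will see later in the notebook.

In [None]:
# kde=True overlays a kernel density estimate, as distplot() did by default
sns.histplot(data_BM['Item_MRP'], bins = 20, kde = True)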

#### Box plot

- You can use the **boxplot()** for creating boxplots in seaborn.
- Let's try to visualize the distribution of Item_Outlet_Sales of items.

In [None]:
# Passing the column as y draws a vertical box plot in any seaborn version
sns.boxplot(y = 'Item_Outlet_Sales', data = data_BM)

#### Violin plot

- A violin plot plays a similar role to a box-and-whisker plot.
- It shows the distribution of quantitative data across several levels of one (or more) categorical variables, so that those distributions can be compared.
- Unlike a box plot, in which all of the plot components correspond to actual data points, the violin plot features a kernel density estimate of the underlying distribution.
- You can create a violin plot using **violinplot()** in seaborn.

In [None]:
# Passing the column as y draws a vertical violin in any seaborn version
sns.violinplot(y = 'Item_Outlet_Sales', data = data_BM, color = 'magenta')
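
The comparison across levels of a categorical variable described above can be sketched like this. The data is made up (the `demo` frame, its `outlet_size` and `sales` columns, and the gamma-distributed values are all assumptions for illustration). A box plot and a violin plot of the same data are drawn side by side.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up, right-skewed sales for three hypothetical outlet sizes.
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "outlet_size": np.repeat(["Small", "Medium", "High"], 100),
    "sales": np.concatenate([
        rng.gamma(2.0, 400, 100),
        rng.gamma(3.0, 500, 100),
        rng.gamma(4.0, 600, 100),
    ]),
})

# Draw both plot types on a shared y-axis so they can be compared directly.
fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
sns.boxplot(x="outlet_size", y="sales", data=demo, ax=ax_box)
sns.violinplot(x="outlet_size", y="sales", data=demo, ax=ax_violin)
ax_box.set_title("Box plot")
ax_violin.set_title("Violin plot")
```

The box plot summarizes each group by its quartiles, while the violin's width follows the estimated density, which makes the skew of each group easier to see.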

#### Scatter plot

- It depicts the distribution of two variables using a cloud of points, where each point represents an observation in the dataset.
- This depiction allows the eye to infer a substantial amount of information about whether there is any meaningful relationship between them.
- You can use **relplot()** with the option `kind='scatter'` to draw a scatter plot in seaborn.

***NOTE : Here, we are going to use only a subset of the data for the plots.***

In [None]:
sns.relplot(x='Item_MRP', y='Item_Outlet_Sales', kind='scatter', data= data_BM[:200])

#### Hue semantic

*We can also add another dimension to the plot by coloring the points according to a third variable. In seaborn, this is referred to as using a “hue semantic”.*

In [None]:
sns.relplot(x='Item_MRP',y='Item_Outlet_Sales', hue ='Item_Type', data = data_BM[:200])

- Remember the **line chart** that we created earlier? When we use **hue** semantic, we can create more complex line plots in seaborn.
- In the following example, **different line plots for different categories of the Outlet_Size** are made.

In [None]:
sns.lineplot(x='Item_Weight', y='Item_MRP', hue = 'Outlet_Size', data = data_BM[:200])

#### Bubble plot

- We use the **hue** semantic to color the bubbles by their Item_Visibility, and the **size** semantic to scale each bubble by the same variable.

In [None]:
sns.relplot(x='Item_MRP', y='Item_Outlet_Sales', kind ='scatter', data =data_BM[:200],
            hue ='Item_Visibility', size ='Item_Visibility')
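
If the default bubble sizes are hard to tell apart, the `sizes` argument of `relplot()` sets the smallest and largest marker area. Below is a minimal sketch on made-up data. The `demo` frame, its column names, and the `(20, 200)` range are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data for illustration only (not the BigMart dataset).
rng = np.random.default_rng(3)
n = 60
price = rng.uniform(30, 270, size=n)
demo = pd.DataFrame({
    "price": price,
    "sales": price * 15 + rng.normal(0, 300, size=n),
    "visibility": rng.uniform(0.0, 0.2, size=n),
})

# hue colors each bubble and size scales it, both by the same variable.
# sizes=(20, 200) maps the lowest visibility to area 20 and the highest to 200.
g = sns.relplot(x="price", y="sales", kind="scatter", data=demo,
                hue="visibility", size="visibility", sizes=(20, 200))

# Marker areas actually used for the points.
marker_sizes = g.ax.collections[0].get_sizes()
```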

#### Category-wise subplots

- You can also create **one plot per category** in seaborn.
- Here, the `col` argument draws a separate scatter plot for each Outlet_Size.

In [None]:
sns.relplot(x='Item_Weight', y='Item_Visibility', hue = 'Outlet_Size',
            style = 'Outlet_Size',col ='Outlet_Size' ,data = data_BM[:200])
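
To show what `col` produces, here is a minimal sketch on made-up data (the `demo` frame and its `weight`, `visibility`, and `outlet_size` columns are assumptions for illustration). `relplot()` returns a `FacetGrid` whose `axes` array holds one subplot per category.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data for illustration only (not the BigMart dataset).
rng = np.random.default_rng(4)
demo = pd.DataFrame({
    "weight": rng.uniform(5, 20, size=90),
    "visibility": rng.uniform(0.0, 0.2, size=90),
    "outlet_size": np.repeat(["Small", "Medium", "High"], 30),
})

# col splits the data into one subplot (facet) per outlet size.
g = sns.relplot(x="weight", y="visibility", hue="outlet_size",
                col="outlet_size", data=demo, height=3)

# One row of three facets, each titled with the category it shows.
facet_titles = [ax.get_title() for ax in g.axes.flat]
```

With many categories, the facets can be wrapped onto several rows by passing `col_wrap`.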