# __Seaborn Visualization: From Basics to Advanced__

## __Agenda__

In this lesson, we will cover the following concepts with the help of examples:
- Introduction to Seaborn
- Plotting Graphs Using Seaborn
- Violin Plot
- Pair Plot
- Heatmap
- Joint Plot
- Swarm Plot
- Plotting 3D Graphs for Multiple Columns Using Seaborn

## __1. Introduction to Seaborn__
Seaborn is a statistical data visualization library in Python based on Matplotlib.
 
- It provides an interface for creating attractive and informative statistical graphics. 
- It comes with several built-in themes and color palettes to make it easy to create aesthetically pleasing visualizations.
- It is particularly well-suited for exploring complex datasets with multiple variables.
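
As a minimal sketch of the theme and palette support mentioned above, the following applies a built-in theme and inspects a named palette. The style `whitegrid` and the palette `muted` are illustrative choices only, not ones this lesson prescribes.

```python
import seaborn as sns

# Apply a built-in theme globally; "whitegrid" and "muted" are illustrative choices
sns.set_theme(style="whitegrid", palette="muted")

# A named palette behaves like a list of RGB tuples
palette = sns.color_palette("muted")
hex_colors = palette.as_hex()

print(len(palette))     # number of colors in the palette
print(hex_colors[:3])   # first three colors as hex strings
```

Calling `sns.set_theme()` with no arguments restores Seaborn's default theme.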



The code below first walks through several basic Seaborn plots on the built-in `tips` dataset, then uses `relplot` to draw line plots of the `fmri` dataset, with one panel per region and a separate line for each event.
- Seaborn keeps such multi-panel visualizations short to write and offers further options for customizing and exploring patterns in the data.

In [None]:
import seaborn as sns

In [None]:
# General pattern: sns.<plot_function>(data=data_frame, x='col_name', y='col_name', hue='col_name')

In [None]:
tips = sns.load_dataset('tips')

In [None]:
tips

In [None]:
sns.scatterplot(data=tips, x='tip', y='total_bill', hue='sex', size='size')

In [None]:
tips.groupby('size')['total_bill'].mean()

In [None]:
# Line plot of the mean total bill per party size, without an error band
sns.lineplot(data=tips, x='size', y='total_bill', errorbar=None)

In [None]:
sns.barplot(data=tips,x='sex',y='tip')

In [None]:
sns.barplot(data=tips,x='sex',y='tip', hue= 'smoker')

In [None]:
# Distribution plot (histogram by default) of the tip amounts
sns.displot(x='tip', data=tips)

In [None]:
sns.boxplot(x='tip', y='sex',hue='smoker',data=tips)

In [None]:
import seaborn as sns
sns.set_theme()
fmri = sns.load_dataset("fmri")
fmri

In [None]:

sns.relplot(
    data=fmri, kind="line",
    x="timepoint", y="signal", col="region",
    hue="event", style="event",
)

## __2. Plotting Graphs Using Seaborn__

__Note:__ These plot types were covered earlier with Matplotlib. The example below shows how to produce the same kinds of plots with Seaborn, which needs less code and applies its own default styling.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
tips = sns.load_dataset('tips')

# Scatter Plot
sns.scatterplot(x='total_bill', y='tip', data=tips)
plt.title('Scatter plot: Total bill vs tip')
plt.show()

# Line Plot
sns.lineplot(x='day', y='total_bill', data=tips, hue='sex')
plt.title('Line plot: Total bill by day (Differentiated by gender)')
plt.show()

# Histogram
sns.histplot(tips['total_bill'], bins=20, kde=True)
plt.title('Histogram: Distribution of total bill')
plt.show()

# Box Plot
sns.boxplot(x='day', y='total_bill', data=tips)
plt.title('Box plot: Total bill by day')
plt.show()

# Bar Plot
sns.barplot(x='day', y='total_bill', data=tips, hue='sex')
plt.title('Bar plot: Average total bill by day (Differentiated by gender)')
plt.show()

## __3. Violin Plot__
A violin plot combines the features of a kernel density plot and a box plot, showing the distribution of a numerical variable for different categories.
- This plot visualizes the distribution of total bills for each day, highlighting data point densities and key statistical measures such as the median and interquartile range.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

# Load a sample dataset
tips = sns.load_dataset('tips')

# Violin Plot
sns.violinplot(x='day', y='total_bill', data=tips)
plt.title('Violin plot: Total bill by day')
plt.show()
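
To make the median and interquartile range mentioned above concrete, the sketch below computes them per category with pandas and then draws the matching violin plot. It uses a small made-up DataFrame (the `day` and `total_bill` values are invented for illustration) so that it runs without downloading anything; `inner='quartile'` asks Seaborn to draw the quartiles as lines inside each violin.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Small made-up sample standing in for the tips dataset
df = pd.DataFrame({
    "day": ["Thur"] * 5 + ["Fri"] * 5,
    "total_bill": [10.0, 12.0, 14.0, 16.0, 18.0,
                   20.0, 22.0, 24.0, 26.0, 28.0],
})

# Quartiles per day: one row per day, one column per quantile
q = (
    df.groupby("day", sort=False)["total_bill"]
      .quantile([0.25, 0.5, 0.75])
      .unstack()
)
q["iqr"] = q[0.75] - q[0.25]  # interquartile range
print(q)

# Violin plot with the quartiles drawn inside each violin
ax = sns.violinplot(x="day", y="total_bill", data=df, inner="quartile")
ax.set_title("Violin plot with quartile lines (made-up data)")
plt.show()
```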

## __4. Pair Plot__

A pair plot displays the pairwise relationships between numerical variables in a dataset through scatter plots and distributions, differentiating categories using colors.
- It is useful for understanding how different numerical variables relate to each other and how these relationships vary based on gender ('sex' variable in this case).

In [None]:
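# Pair Plot
g = sns.pairplot(tips, hue='sex')
# pairplot returns a grid of subplots, so title the whole figure
# (plt.title would only label the last subplot)
g.figure.suptitle('Pair Plot: Relationships across variables (Differentiated by gender)', y=1.02)
plt.show()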

## __5. Heatmap__

A heatmap renders a matrix as a grid of colored cells. Here it is applied to the correlation matrix of the numerical variables in a dataset, using color gradations to represent the strength and direction of correlations.

- It allows for the quick identification of relationships between variables, with warmer colors indicating stronger positive correlations and cooler colors indicating weaker or negative correlations.

In [None]:
# Heatmap
# Correlate only the numeric columns; tips also contains categorical columns
correlation_matrix = tips.corr(numeric_only=True)
correlation_matrix

In [None]:

sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Heatmap: Correlation matrix')
plt.show()
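
Because correlations always lie between -1 and 1, it can help to pin the color scale to that range so that zero sits at the neutral midpoint of a diverging colormap. A minimal sketch on invented data (the column names `a`, `b`, and `c` are placeholders, not columns from the lesson's datasets):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Invented data with known relationships
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],  # rises with a: perfect positive correlation
    "c": [5, 4, 3, 2, 1],   # falls as a rises: perfect negative correlation
})
corr = df.corr()
print(corr)

# Fix the color limits to the full correlation range
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Heatmap with a symmetric color scale")
plt.show()
```

With the limits fixed, the same color means the same correlation value across different heatmaps, which makes them easier to compare.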

## __6. Joint Plot__

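A joint plot shows the bivariate distribution of two numerical variables together with the marginal distribution of each variable along the axes. With `kind='hex'`, as used below, point density is represented with hexagonal bins.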
- It helps identify patterns and concentrations in the relationship between __total_bill__ and __tip__, with darker hexagons indicating higher point density.

In [None]:
# Create a Joint Plot
joint = sns.jointplot(x='total_bill', y='tip', data=tips, kind='hex')
# Adjust the title position
plt.subplots_adjust(top=0.9)  # Adjust the top space to make room for the title
# Set the title for the figure
joint.fig.suptitle('Joint Plot: Hexbin Scatter Plot')
plt.show()

## __7. Swarm Plot__

A swarm plot is a categorical scatter plot that arranges individual data points without overlapping.
- It is useful for visualizing the distribution of total_bill across different days. Each point represents an individual entry.
- It helps identify patterns and concentrations without losing granularity due to point overlap.

In [None]:
# Swarm Plot
sns.swarmplot(x='day', y='total_bill', data=tips)
plt.title('Swarm plot: Total bill by day')
plt.show()

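## __8. Plotting 3D Graphs for Multiple Columns Using Seaborn__

Seaborn itself does not provide 3D plot functions. The examples in this section draw on Matplotlib's 3D axes (`projection='3d'`) and use Seaborn only to set the figure style.
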
In [None]:
# Import necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

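# Read CSV file into a DataFrame
# Note: parse_dates=True only parses the index, and no index_col is set here,
# so the default integer index is what appears on the x-axis below
df = pd.read_csv('ADANIPORTS.csv', parse_dates=True)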

# Calculate 'High-Low' and '100-Day Moving Average'
df['H-L'] = df.High - df.Low
df['100MA'] = df['Close'].rolling(100).mean()

# Set Seaborn style to 'darkgrid'
sns.set_style('darkgrid')

# Plotting a 3D Graph
ax = plt.axes(projection='3d')
ax.scatter(df.index, df['H-L'], df['100MA'])

# Set labels for each axis
ax.set_xlabel('Index')
ax.set_ylabel('High-Low')
ax.set_zlabel('100-Day Moving Average')

# Display the 3D scatter plot
plt.show()


### __Plot a 3D Spiral Graph__

In [None]:
# Import necessary libraries
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Generate 3D data
z1 = np.linspace(0, 10, 100)
x1 = np.cos(4 * z1)
y1 = np.sin(4 * z1)

# Set Seaborn style to whitegrid
sns.set_style('whitegrid')

# Create a 3D axes
ax = plt.axes(projection='3d')

# Plot the 3D curve
ax.plot3D(x1, y1, z1)

# Display the 3D plot
plt.show()


### __Create 3D Surface Using Seaborn__


In [None]:
# Define a function to compute z values based on x and y
def return_z(x, y):
    return 50 - (x**2 + y**2)

# Set Seaborn style to whitegrid
sns.set_style('whitegrid')

# Generate 2D grid of x and y values
x1, y1 = np.linspace(-5, 5, 50), np.linspace(-5, 5, 50)
x1, y1 = np.meshgrid(x1, y1)

# Compute z values using the defined function
z1 = return_z(x1, y1)

# Create a 3D axes
ax = plt.axes(projection='3d')

# Plot the 3D surface
ax.plot_surface(x1, y1, z1)

# Display the 3D plot
plt.show()


## __Introduction to Plotly__
Plotly is a versatile and interactive data visualization library in Python that enables the creation of interactive and web-based visualizations. 
- It allows users to create a wide range of charts, graphs, and dashboards for exploratory data analysis and presentation purposes.
- It supports both static and dynamic visualizations and is particularly well-suited for creating interactive plots that can be embedded in web applications and notebooks.

In [None]:
import plotly.express as px

# Create a sample DataFrame
df = px.data.iris()

# Create a scatter plot
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species", size="petal_length")

# Show the interactive plot
fig.show()


# __Assisted Practice__

## __Problem Statement:__
Analyze the housing dataset using various types of plots from the Seaborn library to gain insights into the data.

## __Steps to Perform:__
- Create a violin plot for a feature like `price` to visualize its distribution and understand its characteristics.
- Use a pair plot to visualize the relationships between different numerical variables like `sqft_lot`, `yr_built`, and `price`.
- Create a heatmap of the correlation matrix to understand the relationships between different numerical features.
- Use a joint plot to visualize the relationship between two numerical variables and their individual distributions, for example, `sqft_lot` and `price`.
- Create a swarm plot for a categorical variable like Neighborhood against `price` to understand the distribution of prices in each neighborhood.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('HousePrices.csv')
df.head()

In [None]:
sns.violinplot(x='price',data=df)

In [None]:
sns.pairplot(data=df[['sqft_lot','yr_built','price']])

In [None]:
plt.figure(figsize=(15,15))
sns.heatmap(df.corr(numeric_only=True), annot=True)

In [None]:
sns.jointplot(x='price',y='sqft_lot',data=df)
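
The last step in the list (a swarm plot of price per neighborhood) could be sketched as follows. Since the columns of `HousePrices.csv` are not shown here, this uses a synthetic DataFrame: the `neighborhood` column name and all values are hypothetical, and `price` simply mirrors the column used in the cells above.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in for the housing data (hypothetical column and values)
rng = np.random.default_rng(2)
demo = pd.DataFrame({
    "neighborhood": np.repeat(["A", "B", "C"], 30),
    "price": np.concatenate([
        rng.normal(300_000, 40_000, 30),
        rng.normal(450_000, 60_000, 30),
        rng.normal(600_000, 80_000, 30),
    ]),
})

# One point per house, spread sideways so that points do not overlap
ax = sns.swarmplot(x="neighborhood", y="price", data=demo, size=4)
ax.set_title("Swarm plot: price by neighborhood (synthetic data)")
plt.show()
```

On the real dataset, a swarm plot can become slow and crowded when a category holds thousands of rows; plotting a random sample of the rows is one way to keep it readable.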

In [None]:
tips

In [None]:
sns.boxplot(x='sex',y='total_bill',data=tips)