# Seaborn

Seaborn is a high-level Python library built on top of Matplotlib for statistical graphics such as scatter plots, bar charts, box plots, violin plots, heatmaps, and multi-plot grids. It offers a concise, consistent API with polished default styles, and it works directly with Pandas DataFrames and NumPy arrays. That makes it well suited to exploratory data analysis, model evaluation, and feature-engineering visualizations, where data scientists and AI engineers need to produce publication-ready charts quickly.

## Essential Plot Types

### Relational Plots
*Use Case*: Visualize relationships between variables, or trends over time or across a continuous variable.

In [None]:
import seaborn as sns
import pandas as pd

# Sample: model performance vs learning rate
df = pd.DataFrame({
    'lr': [0.001,0.01,0.1,1],
    'accuracy': [0.82, 0.88, 0.91, 0.89]
})
sns.set_theme()
sns.lineplot(data=df, x='lr', y='accuracy', marker='o')

#### Best practices:

- Use a log scale for hyperparameters: ax.set_xscale('log')
- Add markers for clarity
- Use hue for multiple model comparisons

### Categorical Plots

*Use Case*: Compare a numeric variable across categories with box, bar, violin, strip, or swarm plots.

In [None]:
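df = pd.DataFrame(...)  # placeholder: needs 'model' and 'error_rate' columns
# seaborn >= 0.13 deprecates palette without hue, so map hue to the x variable
sns.boxplot(data=df, x='model', y='error_rate', hue='model', palette='Set2', legend=False)
# overlay the individual observations on top of the boxes
sns.stripplot(data=df, x='model', y='error_rate', color='black', alpha=0.5)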
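
A runnable version of the box-plus-strip overlay might look like the sketch below. The error rates are randomly generated stand-ins and the model names are hypothetical. Hiding the box plot's own outlier markers with `showfliers=False` is a design choice made here so that outliers are not drawn twice once the strip plot is overlaid.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Synthetic error rates for three hypothetical models, 40 runs each.
errors = pd.DataFrame({
    "model": np.repeat(["baseline", "tuned", "ensemble"], 40),
    "error_rate": np.concatenate([
        rng.normal(0.20, 0.03, 40),
        rng.normal(0.15, 0.02, 40),
        rng.normal(0.12, 0.02, 40),
    ]),
})

fig, ax = plt.subplots()
# showfliers=False: the strip plot below already shows every observation
sns.boxplot(data=errors, x="model", y="error_rate", showfliers=False, ax=ax)
sns.stripplot(data=errors, x="model", y="error_rate", color="black", alpha=0.5, ax=ax)
```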

### Distribution Plots

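*Use Case*: Inspect the shape of a variable's distribution with histograms (histplot), kernel density estimates (kdeplot), empirical CDFs (ecdfplot), or the figure-level displot.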

In [None]:
sns.displot(data=df, x='feature', kde=True, bins=30)

#### Best practices:

- Always inspect univariate distributions before modeling
- Use stat='density' with common_norm=False to compare the shapes of groups of different sizes

### Matrix Plots

*Use Case*: Use heatmap and clustermap to inspect feature correlations during EDA and to check for multicollinearity.

In [None]:
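# numeric_only=True skips non-numeric columns (these raise an error in pandas >= 2.0)
corr = df.corr(numeric_only=True)
# fix the color range so the diverging colormap is centered on zero correlation
sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)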
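
A correlation matrix is symmetric, so one common refinement is to mask the redundant upper triangle. The following is a minimal sketch on synthetic features; the names `f1` to `f5` are hypothetical, and `f5` is constructed to be nearly collinear with `f1` so the multicollinearity shows up clearly.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
features = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f1", "f2", "f3", "f4"])
# f5 is almost a copy of f1, which should appear as a strong correlation.
features["f5"] = 0.9 * features["f1"] + rng.normal(scale=0.1, size=200)

corr = features.corr()
# True entries are hidden: the diagonal and everything above it.
mask = np.triu(np.ones_like(corr, dtype=bool))

fig, ax = plt.subplots()
sns.heatmap(corr, mask=mask, annot=True, fmt=".2f",
            cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
```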

### Regression Plots

In [None]:
sns.regplot(data=df, x='x', y='y', ci=95)

### Multi-Plot Grids

In [None]:
sns.pairplot(df, hue='target', diag_kind='kde')

## Customization

### Themes & Contexts

In [None]:
sns.set_theme(style='whitegrid', context='talk')

### Color Palettes

In [None]:
sns.color_palette('rocket', as_cmap=True)
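
`color_palette` returns two different kinds of object depending on `as_cmap`. The short sketch below shows both; the variable names are illustrative.

```python
import seaborn as sns
from matplotlib.colors import Colormap

# A discrete palette: a list-like of RGB tuples, suited to palette= arguments.
discrete = sns.color_palette("Set2", n_colors=3)

# A continuous colormap object, suited to cmap= arguments such as in heatmap.
continuous = sns.color_palette("rocket", as_cmap=True)

first_color = discrete[0]  # an (r, g, b) tuple with components in [0, 1]
```

Discrete palettes fit categorical `hue` variables, while colormaps fit numeric data such as correlation or confusion matrices.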

### Axes & Annotations

In [None]:
ax = sns.barplot(...)
ax.set_title('Model Accuracy by Setting', fontsize=16)
ax.set_xlabel('Configuration')
ax.annotate('Best', xy=(2,0.91), xytext=(2.5,0.93), arrowprops=dict(arrowstyle='->'))
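
The annotation coordinates above are in data units. On a categorical axis the bars sit at positions 0, 1, 2, and so on, so the arrow target can be computed rather than hard-coded. A sketch, assuming the learning-rate results from the relational example are drawn as bars:

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

scores = pd.DataFrame({
    "lr": ["0.001", "0.01", "0.1", "1"],  # strings, so the x axis is categorical
    "accuracy": [0.82, 0.88, 0.91, 0.89],
})

fig, ax = plt.subplots()
sns.barplot(data=scores, x="lr", y="accuracy", color="steelblue", ax=ax)
ax.set_title("Model Accuracy by Setting", fontsize=16)
ax.set_xlabel("Learning rate")
ax.set_ylim(0, 1.0)

# The row index of the best score doubles as the bar's x position.
best = int(scores["accuracy"].idxmax())
best_acc = scores.loc[best, "accuracy"]
ax.annotate("Best", xy=(best, best_acc), xytext=(best + 0.5, best_acc + 0.05),
            arrowprops=dict(arrowstyle="->"))
```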

## Statistical Visualization for ML

- Visualize target distributions against features
- Use stratified histograms to expose class imbalance
- Show regression fits with confidence intervals
- Encode additional variables with hue, size, and style

In [None]:
sns.histplot(data=df, x='feature', hue='target', multiple='stack')
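
The multivariate encoding via hue, size, and style can be sketched with a single scatter plot. All data below is randomly generated, and the column names (`feature_a`, `feature_b`, `sample_weight`, `split`) are assumptions made for the example.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
n = 120
obs = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
    "target": rng.choice(["negative", "positive"], size=n),
    "sample_weight": rng.uniform(1, 10, size=n),
    "split": rng.choice(["train", "test"], size=n),
})

fig, ax = plt.subplots()
sns.scatterplot(
    data=obs, x="feature_a", y="feature_b",
    hue="target",          # color encodes the class
    size="sample_weight",  # marker area encodes a numeric variable
    style="split",         # marker shape encodes a second category
    ax=ax,
)
```

Five variables in one panel is close to the limit of what stays readable; faceting is usually the better choice beyond that.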

## Seaborn with Pandas & NumPy

- Direct DataFrame plotting
- melt() for reshaping wide data into long form
- Handling DatetimeIndex

In [None]:
long = df.melt(id_vars='id', var_name='feature', value_name='value')
sns.boxplot(data=long, x='feature', y='value')
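
To make the reshaping concrete, here is a small worked example. The wide table and its column names (`height`, `width`) are invented for illustration.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Wide form: one column per feature.
wide = pd.DataFrame({
    "id": [1, 2, 3],
    "height": [1.0, 2.0, 3.0],
    "width": [0.5, 0.7, 0.9],
})

# Long form: one row per (id, feature) pair, which seaborn can map to x and y.
long = wide.melt(id_vars="id", var_name="feature", value_name="value")

fig, ax = plt.subplots()
sns.boxplot(data=long, x="feature", y="value", ax=ax)
```

Three rows and two feature columns become six rows, each carrying the feature name in its own column.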

## Real-World EDA

- Check missing values
- Univariate distributions: histplot, kdeplot
- Bivariate: scatterplot, boxplot
- Correlation heatmap

In [None]:
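# 'data' and 'column' are placeholders for a dataset name and one of its columns
df = sns.load_dataset('data')
# drop missing values before plotting the distribution
sns.histplot(df['column'].dropna(), kde=True)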

## Multi-Plot Layouts

*Use Case*: Use FacetGrid, catplot, or PairGrid to lay out small multiples, one panel per subset of the data.

In [None]:
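# df needs 'sex', 'survived', and 'age' columns (e.g. sns.load_dataset('titanic'))
g = sns.FacetGrid(df, col='sex', hue='survived')
g.map(sns.histplot, 'age', kde=True)
g.add_legend()  # FacetGrid does not draw the hue legend automatically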
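
For categorical data, `catplot` builds the grid and draws the panels in one call. A minimal sketch on synthetic benchmark results, where the column names and values are assumptions:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
# Synthetic results: 2 datasets x 2 models x 30 runs.
runs = pd.DataFrame({
    "dataset": np.repeat(["dataset_a", "dataset_b"], 60),
    "model": np.tile(np.repeat(["baseline", "tuned"], 30), 2),
    "error_rate": rng.normal(0.15, 0.03, 120),
})

# One box-plot panel per dataset; catplot returns a FacetGrid.
g = sns.catplot(data=runs, x="model", y="error_rate",
                col="dataset", kind="box", height=3, aspect=1)
g.set_axis_labels("Model", "Error rate")
```

Because the result is a FacetGrid, the same grid-level methods apply as in the cell above.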

## Confusion Matrices & Evaluation

In [None]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')

# Plot ROC/PR with Seaborn style:

from sklearn.metrics import roc_curve
fpr, tpr, _ = roc_curve(y_true, y_prob)
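# estimator=None plots every point; by default lineplot averages y over repeated x values
sns.lineplot(x=fpr, y=tpr, estimator=None)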
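
A bare array gives the heatmap no axis labels. The sketch below builds the matrix with `pd.crosstab` instead of `confusion_matrix`, a substitution made here only so that the row and column names carry through to the plot; the labels are toy values.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Toy labels for illustration.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = pd.crosstab(pd.Series(y_true, name="Actual"),
                 pd.Series(y_pred, name="Predicted"))

fig, ax = plt.subplots()
# heatmap picks up "Actual" and "Predicted" as axis labels from the DataFrame
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", cbar=False, ax=ax)
```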

## Time Series Visualization

In [None]:
df_ts = df.set_index('date')
sns.lineplot(data=df_ts['value'].rolling(30).mean())

## Saving Plots Professionally

In [None]:
fig = ax.get_figure()
fig.savefig('plot.png', dpi=300, bbox_inches='tight')

## Automating Plot Workflows

*Use Case*: Wrap plotting code in reusable functions and log the resulting figures with MLflow or custom scripts.

In [None]:
def plot_feature_distribution(df, feature, hue=None):
    ax = sns.histplot(df, x=feature, hue=hue, kde=True)
    return ax