<p style="text-align: center;"><font size="5">**Visualization with Seaborn**</font></p>

<font size="4">**What is Seaborn?**</font>

Seaborn is a Python library for creating plots. It is built on Matplotlib and integrates with Pandas data structures.

<font size="3"> **Advantages:**</font>
- Simple, concise syntax
- Provides several attractive default styles and color palettes
- Automates the creation of multi-panel figures
- Tight integration with Pandas and its DataFrames


Matplotlib is mostly used directly for basic plots. It nevertheless offers more flexibility for customization than Seaborn, and sometimes better performance.
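The two libraries combine well because Seaborn's axes-level functions return an ordinary Matplotlib `Axes`. The sketch below uses made-up values (`brands` and `counts` are illustrative, not taken from this notebook) to show Seaborn drawing the plot and Matplotlib customizing it.

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Illustrative data (assumption: not the notebook's DataFrame)
brands = ["Ford", "BMW", "Honda", "Toyota"]
counts = [12, 9, 15, 14]

# Seaborn draws the bars and returns the Matplotlib Axes it drew on
ax = sns.barplot(x=brands, y=counts, color="skyblue")

# From here on, plain Matplotlib calls customize the plot
ax.set_title("Cars per brand")
ax.set_ylabel("Count")
ax.tick_params(axis="x", rotation=45)
```

Any method of the `Axes` object (limits, annotations, tick formatting) can be used in the same way.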

Choosing colors: http://seaborn.pydata.org/tutorial/color_palettes.html

Choosing a style: http://seaborn.pydata.org/tutorial/aesthetics.html
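As a quick illustration of the two pages linked above, the sketch below inspects a palette and applies a style. The palette name `"Set2"` and the number of colors are arbitrary choices for the example.

```python
import seaborn as sns

# Ask for 4 colors from the "Set2" palette; the result behaves like a list of RGB tuples
palette = sns.color_palette("Set2", n_colors=4)
print(palette.as_hex())  # hex codes, handy for reuse in plain Matplotlib

# Apply a style and the palette to all subsequent plots
sns.set_style("whitegrid")
sns.set_palette(palette)
```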


In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Create a DataFrame of random data
df=pd.DataFrame(np.random.rand(50,1)*100, columns=['vitesse'])
df['poids']=np.random.rand(50)*100  # 1-D array: one value per row
df['Nbre_accident']=np.random.randint(10, size=50)
df['category']=np.random.choice(['Ford','BMW', 'Honda','Toyota'], size=50)
df['color']=np.random.choice(['Bleu','Rouge', 'Noir'], size=50)
df['bool']=np.random.choice([1,2], size=50)
df.head()
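Because the data is drawn at random, the plots in this notebook change on every run. A seeded generator makes them reproducible. In this minimal sketch the helper name `make_cars` and the seed value are hypothetical choices, not part of the original notebook.

```python
import numpy as np
import pandas as pd

def make_cars(seed=0, n=50):
    """Build a random DataFrame with the same columns as above, reproducibly."""
    rng = np.random.default_rng(seed)  # seeded generator: same seed, same data
    return pd.DataFrame({
        "vitesse": rng.random(n) * 100,
        "poids": rng.random(n) * 100,
        "Nbre_accident": rng.integers(0, 10, size=n),
        "category": rng.choice(["Ford", "BMW", "Honda", "Toyota"], size=n),
        "color": rng.choice(["Bleu", "Rouge", "Noir"], size=n),
        "bool": rng.choice([1, 2], size=n),
    })

df_seeded = make_cars(seed=0)
df_seeded.head()
```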

# CATPLOT - Categorical variables


seaborn.catplot(*, x=None, y=None, hue=None, data=None, row=None, col=None, col_wrap=None, estimator=numpy.mean, ci=95, n_boot=1000, units=None, seed=None, order=None, hue_order=None, row_order=None, col_order=None, kind='strip', height=5, aspect=1, orient=None, color=None, palette=None, legend=True, legend_out=True, sharex=True, sharey=True, margin_titles=False, facet_kws=None, **kwargs)

    kind: strip (default) - swarm - box - violin - bar - count - point

https://seaborn.pydata.org/generated/seaborn.catplot.html?highlight=catplot#seaborn.catplot

In [None]:
# catplot is figure-level: it creates its own figure, so size it with height/aspect
sns.catplot(y="category", x="vitesse",
                hue="color",
                col="bool",
                data=df, 
                kind="bar",
                height=7, aspect=0.85,
               )
plt.show()
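`catplot` returns a `FacetGrid` rather than a single `Axes`, which is why labels and titles are set through the grid object. The sketch below uses a small synthetic DataFrame (`demo` is an assumption made so the example runs on its own) and switches to `kind="box"` to show that only the `kind` argument changes.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the notebook's DataFrame (assumption)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "vitesse": rng.random(40) * 100,
    "category": np.tile(["Ford", "BMW", "Honda", "Toyota"], 10),
    "bool": np.repeat([1, 2], 20),
})

# One panel per value of "bool"; height and aspect size each panel
g = sns.catplot(data=demo, x="vitesse", y="category", col="bool",
                kind="box", height=4, aspect=1.2)

# Labels and panel titles are set on the grid, not on a single Axes
g.set_axis_labels("vitesse", "category")
g.set_titles(col_template="bool = {col_name}")
```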

## Barplot

seaborn.barplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=numpy.mean, ci=95, n_boot=1000, units=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26', errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)

    x, y, hue : names of variables in data or vector data
    data : DataFrame, array, or list of arrays, optional
    color : matplotlib color, optional
    palette : palette name, list, or dict, optional

   
 https://seaborn.pydata.org/generated/seaborn.barplot.html#seaborn.barplot

### Example 1

In [None]:
df['category'].value_counts()

In [None]:
sns.barplot(x=df['category'].value_counts().index,
               y=df['category'].value_counts().values,
               palette="Blues_d")
z=list(range(0,20,2))
plt.yticks(z)
plt.xlabel('Category',fontsize=14, color="skyblue")
plt.ylabel('Frequency')
plt.title('Bar Plot')

### Example 2: choosing the style

    style: darkgrid, whitegrid, dark, white, ticks
    
    context: paper, talk, poster
    
    color_palette: https://seaborn.pydata.org/generated/seaborn.color_palette.html#seaborn.color_palette

In [None]:
plt.figure(figsize=(7,4))

sns.set(style='whitegrid')
sns.set_context("paper")
sns.set_palette("Set2")

ax=sns.barplot(x=df['category'].value_counts().index,
               y=df['category'].value_counts().values)

plt.xlabel('Category')
plt.ylabel('Frequency')
plt.title('Bar Plot')


### Example 3: rotating the x-axis labels


In [None]:
df.groupby(['category', 'color'])["vitesse"].mean()

In [None]:
# By default, barplot shows the mean of vitesse per group with a 95% confidence interval (ci=None hides the error bars)
sns.barplot(x = "category", y = "vitesse", hue = "color", data = df) 
plt.xticks(rotation=45)
plt.legend(bbox_to_anchor=(1,1))
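The 95% interval drawn on each bar comes from bootstrapping (the `ci=95` and `n_boot=1000` parameters in the signature). The sketch below is a simplified percentile bootstrap in NumPy that illustrates the idea. It is not Seaborn's internal code, and the `bootstrap_ci` helper and the `speeds` values are made up for the example.

```python
import numpy as np

def bootstrap_ci(values, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement n_boot times and record the mean of each resample
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    # Keep the central `ci` percent of the bootstrap means
    lower = (100 - ci) / 2
    low, high = np.percentile(boot_means, [lower, 100 - lower])
    return values.mean(), low, high

speeds = np.array([42.0, 55.5, 61.2, 38.9, 70.1, 49.3, 58.8, 66.0])
mean, low, high = bootstrap_ci(speeds)
print(f"mean={mean:.1f}, 95% CI=({low:.1f}, {high:.1f})")
```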

## Countplot

Counts the observations in each category.

seaborn.countplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, dodge=True, ax=None, **kwargs)

https://seaborn.pydata.org/generated/seaborn.countplot.html#seaborn.countplot

### Example 1

In [None]:
# Equivalent to Example 1 of the bar plot
sns.countplot(x="category", data=df, palette="Set1")

In [None]:
sns.countplot(x="category", hue="color", palette="Set3", data=df)

## Pointplot

seaborn.pointplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=numpy.mean, ci=95, n_boot=1000, units=None, seed=None, markers='o', linestyles='-', dodge=False, join=True, scale=1, orient=None, color=None, palette=None, errwidth=None, capsize=None, ax=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.pointplot.html?highlight=pointplot#seaborn.pointplot

### Example 1

In [None]:
sns.pointplot(x="category", y="vitesse", hue="bool",
              data=df,
              dodge=True,              # offset the points of each hue level so they do not overlap
              markers=["o", "x"],
              linestyles=["-", "--"],
              ci="sd")                 # error bars show the standard deviation

## Boxplot

seaborn.boxplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8, dodge=True, fliersize=5, linewidth=None, whis=1.5, ax=None, **kwargs)

https://seaborn.pydata.org/generated/seaborn.boxplot.html?highlight=boxplot#seaborn.boxplot

### Example 1: a single quantitative variable

In [None]:
sns.boxplot(y=df['vitesse'])
# ,orient='h'

### Example 2

In [None]:
sns.boxplot(x=df['color'],y=df['vitesse']
           ,hue=df["bool"]
           ,linewidth=2.5 # line width of the box outlines
           ,order=["Noir", "Rouge", "Bleu"] # display order of the categories
           )


## Swarmplot

A kind of categorical scatter plot.

seaborn.swarmplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, dodge=False, orient=None, color=None, palette=None, size=5, edgecolor='gray', linewidth=0, ax=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.swarmplot.html#seaborn.swarmplot

### Example 1

In [None]:
sns.boxplot(x="category", y="vitesse", data=df)
sns.swarmplot(x="category", y="vitesse", data=df
              ,hue="bool"
              ,size=6
             )

## Violinplot

Combination of a boxplot and a KDE (kernel density estimate).

seaborn.violinplot(*, x=None, y=None, hue=None, data=None, order=None, hue_order=None, bw='scott', cut=2, scale='area', scale_hue=True, gridsize=100, width=0.8, inner='box', split=False, dodge=True, orient=None, linewidth=None, color=None, palette=None, saturation=0.75, ax=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.violinplot.html#seaborn.violinplot

### Example 1

In [None]:
sns.violinplot(x="category", y="vitesse", hue="bool",
               data=df
               , palette="muted"
               #, split=True # draw the two hue levels as the two halves of each violin
              )

# DISPLOT - Distributions

seaborn.displot(data=None, *, x=None, y=None, hue=None, row=None, col=None, weights=None, kind='hist', rug=False, rug_kws=None, log_scale=None, legend=True, palette=None, hue_order=None, hue_norm=None, color=None, col_wrap=None, row_order=None, col_order=None, height=5, aspect=1, facet_kws=None, **kwargs)

    kind: hist (default) - kde - ecdf

https://seaborn.pydata.org/generated/seaborn.displot.html?highlight=displot#seaborn.displot

In [None]:
sns.displot(df, x="vitesse", kind="kde")

In [None]:
sns.displot(df, x="poids", kde=True)

## Histplot

Histogram of one or two variables (univariate or bivariate).

seaborn.histplot(data=None, *, x=None, y=None, hue=None, weights=None, stat='count', bins='auto', binwidth=None, binrange=None, discrete=None, cumulative=False, common_bins=True, common_norm=True, multiple='layer', element='bars', fill=True, shrink=1, kde=False, kde_kws=None, line_kws=None, thresh=0, pthresh=None, pmax=None, cbar=False, cbar_ax=None, cbar_kws=None, palette=None, hue_order=None, hue_norm=None, color=None, log_scale=None, legend=True, ax=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.histplot.html#seaborn.histplot

### Example 1

In [None]:
sns.histplot(data=df, x="vitesse"
            #,binwidth=3 # width of each bin
            ,bins=15 # total number of bins
            ,kde=True # overlay a KDE curve
            ,hue="category"
            #, multiple="stack" # stacked bars
            #,log_scale=True # logarithmic x-axis
            #, fill=False # unfilled bars
            )

## Kdeplot

seaborn.kdeplot(x=None, *, y=None, shade=None, vertical=False, kernel=None, bw=None, gridsize=200, cut=3, clip=None, legend=True, cumulative=False, shade_lowest=None, cbar=False, cbar_ax=None, cbar_kws=None, ax=None, weights=None, hue=None, palette=None, hue_order=None, hue_norm=None, multiple='layer', common_norm=True, common_grid=False, levels=10, thresh=0.05, bw_method='scott', bw_adjust=1, log_scale=None, color=None, fill=None, data=None, data2=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.kdeplot.html?highlight=kde#seaborn.kdeplot

### Example 1

In [None]:
sns.kdeplot(data=df, x="vitesse"
           #, fill=True # fill the area under the curve
           #, vertical=True # density along the y-axis (deprecated; assign the variable to y instead)
           )

### Example 2

In [None]:
f, ax = plt.subplots(figsize=(7, 5))
sns.kdeplot(df['vitesse'], fill=True, color='r')
sns.kdeplot(df['poids'], fill=True, color='m')
plt.show()

## Rugplot

Often used in combination with another type of plot.

seaborn.rugplot(x=None, *, height=0.025, axis=None, ax=None, data=None, y=None, hue=None, palette=None, hue_order=None, hue_norm=None, expand_margins=True, legend=True, a=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.rugplot.html#seaborn.rugplot

### Example 1

In [None]:
sns.kdeplot(data=df, x="vitesse")
sns.rugplot(data=df, x="vitesse")

## Density plot
### Example 1

In [None]:
sns.kdeplot(x=df['vitesse'], y=df['poids'])

### Example 2

In [None]:
sns.kdeplot(x=df['vitesse'], y=df['poids'], cmap="Reds", fill=True)

# RELPLOT - Relationships

kind: scatter - line

In [None]:
sns.relplot(
    data=df, x="vitesse", y="poids",
    col="color", hue="category", style="bool",
    kind="scatter"
)

## Scatterplot

### Example 1

In [None]:
plt.figure(figsize=(10,8))
sns.scatterplot(data=df, x="vitesse", y="poids"
               , hue="category"
                #, style="bool"
               )


## Lineplot

seaborn.lineplot(*, x=None, y=None, hue=None, size=None, style=None, data=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, dashes=True, markers=None, style_order=None, units=None, estimator='mean', ci=95, n_boot=1000, seed=None, sort=True, err_style='band', err_kws=None, legend='auto', ax=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.lineplot.html#seaborn.lineplot

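### Example 1

In [None]:
# Illustrative choice of columns: mean vitesse per Nbre_accident, with a confidence band
sns.lineplot(data=df, x="Nbre_accident", y="vitesse")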

# REGPLOT - Regression

By default, draws a scatter plot with a fitted regression line.

seaborn.regplot(*, x=None, y=None, data=None, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, seed=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=True, dropna=True, x_jitter=None, y_jitter=None, label=None, color=None, marker='o', scatter_kws=None, line_kws=None, ax=None)

http://seaborn.pydata.org/generated/seaborn.regplot.html?highlight=regplot#seaborn.regplot


In [None]:
sns.regplot(x='vitesse', y='poids', data=df
           ,ci=68 # 68% confidence interval
            ,color='green', marker='+', scatter_kws={'s': 200}
           )
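With the default `order=1`, the fitted line is a straight-line least-squares fit. The sketch below computes such a fit with `np.polyfit` on synthetic data to show what slope and intercept the line represents. It illustrates the idea only and is not Seaborn's internal code; the true slope of 2 and intercept of 1 are arbitrary values for the example.

```python
import numpy as np

# Synthetic linear relation with noise (assumption: not the notebook's data)
rng = np.random.default_rng(0)
x = rng.random(50) * 100
y = 2.0 * x + 1.0 + rng.normal(0, 5, size=50)

# Degree-1 least-squares fit: returns the slope first, then the intercept
slope, intercept = np.polyfit(x, y, deg=1)

# Fitted values along the regression line
y_hat = slope * x + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```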

## Lmplot

seaborn.lmplot(*, x=None, y=None, data=None, hue=None, col=None, row=None, palette=None, col_wrap=None, height=5, aspect=1, markers='o', sharex=True, sharey=True, hue_order=None, col_order=None, row_order=None, legend=True, legend_out=True, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, seed=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=True, x_jitter=None, y_jitter=None, scatter_kws=None, line_kws=None, size=None)

https://seaborn.pydata.org/generated/seaborn.lmplot.html#seaborn.lmplot

### Example 1

In [None]:
sns.lmplot(x='vitesse', y='poids', hue="category", data=df, palette="Set1")

### Example 2

In [None]:
sns.lmplot(x='vitesse', y='poids', col="bool", data=df)

### Example 3

In [None]:
sns.lmplot(x='vitesse', y='poids', col="category", col_wrap=2,data=df)

# JOINTPLOT - Distribution & Relationship

Distribution of, and relationship between, two quantitative variables.

seaborn.jointplot(*, x=None, y=None, data=None, kind='scatter', color=None, height=6, ratio=5, space=0.2, dropna=False, xlim=None, ylim=None, marginal_ticks=False, joint_kws=None, marginal_kws=None, hue=None, palette=None, hue_order=None, hue_norm=None, **kwargs)

http://seaborn.pydata.org/generated/seaborn.jointplot.html#seaborn.jointplot

### Example 1

In [None]:
sns.jointplot(data=df, x="vitesse", y="poids", hue='category')

### Example 2

    kind= reg - hex - hist - kde

In [None]:
sns.jointplot(data=df, x="vitesse", y="poids", kind="reg", marginal_kws=dict(bins=15))

# HEATMAP - Multi-Relation

seaborn.heatmap(data, *, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', cbar=True, cbar_kws=None, cbar_ax=None, square=False, xticklabels='auto', yticklabels='auto', mask=None, ax=None, **kwargs)

https://seaborn.pydata.org/generated/seaborn.heatmap.html

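### Example 1

In [None]:
# Correlation matrix of the numeric columns only (category and color hold text)
sns.heatmap(df.select_dtypes("number").corr()
           ,annot=True
           ,cmap='YlGnBu')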
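A correlation matrix is symmetric, so half of the heatmap repeats the other half. The `mask` parameter from the signature above can hide the upper triangle. The sketch below uses a synthetic DataFrame (`demo` is an assumption so the example runs on its own).

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the notebook's DataFrame (assumption)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "vitesse": rng.random(50) * 100,
    "poids": rng.random(50) * 100,
    "Nbre_accident": rng.integers(0, 10, size=50),
    "category": rng.choice(["Ford", "BMW"], size=50),  # text column, excluded below
})

# Correlations between the numeric columns only
corr = demo.select_dtypes("number").corr()

# True marks the cells to hide: everything strictly above the diagonal
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

ax = sns.heatmap(corr, mask=mask, annot=True, cmap="YlGnBu", vmin=-1, vmax=1)
```

Fixing `vmin=-1` and `vmax=1` keeps the color scale comparable between datasets.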

# PAIRPLOT - Multi-Relation

Relationships between several variables.

seaborn.pairplot(data, *, hue=None, hue_order=None, palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter', diag_kind='auto', markers=None, height=2.5, aspect=1, corner=False, dropna=False, plot_kws=None, diag_kws=None, grid_kws=None, size=None)

http://seaborn.pydata.org/generated/seaborn.pairplot.html#seaborn.pairplot

### Example 1

In [None]:
sns.pairplot(df
            , hue="category"
            #, diag_kind="hist" # histograms on the diagonal
            #, corner=True # lower triangle only
            )

# FacetGrid Multi-graph

sns.FacetGrid(self, data, *, row=None, col=None, hue=None, col_wrap=None, sharex=True, sharey=True, height=3, aspect=1, palette=None, row_order=None, col_order=None, hue_order=None, hue_kws=None, dropna=False, legend_out=True, despine=True, margin_titles=False, xlim=None, ylim=None, subplot_kws=None, gridspec_kws=None, size=None)

https://seaborn.pydata.org/generated/seaborn.FacetGrid.html

### Example 1

In [None]:
g=sns.FacetGrid(df, col="category", row="color")
g.map_dataframe(sns.scatterplot, "poids", "vitesse")
g.set_axis_labels("Variable poids", "Variable vitesse")

In [None]:
g=sns.FacetGrid(df, col="category")
g.map_dataframe(sns.scatterplot, "poids", "vitesse", hue="color")
g.set_axis_labels("Variable poids", "Variable vitesse")
g.add_legend()

# JointGrid Multi-graph

sns.JointGrid(self, *, x=None, y=None, data=None, height=6, ratio=5, space=0.2, dropna=False, xlim=None, ylim=None, size=None, marginal_ticks=False, hue=None, palette=None, hue_order=None, hue_norm=None)

https://seaborn.pydata.org/generated/seaborn.JointGrid.html#seaborn.JointGrid



### Example 1

In [None]:
g = sns.JointGrid(data=df, x="vitesse", y="poids")
g.plot(sns.regplot, sns.boxplot)

### Example 2

In [None]:
g=sns.JointGrid()
x, y = df["vitesse"], df["poids"]
sns.scatterplot(x=x, y=y, ec="b", s=100, linewidth=1.5, ax=g.ax_joint)
sns.histplot(x=x, linewidth=2, ax=g.ax_marg_x)
sns.kdeplot(y=y, linewidth=2, ax=g.ax_marg_y)

# Combo Plot

In [None]:
# CREATE THE DATAFRAME
# Months
mois = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June', 
         'July', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# Average temperatures
moy_Temp = [35, 45, 55, 65, 75, 85, 95, 100, 85, 65, 45, 35]
# Average precipitation %, expressed as a fraction
moy_Percipitation_Perc = [.90, .75, .55, .10, .35, .05, .05, .08, .20, .45, .65, .80]

df = pd.DataFrame({'Mois': mois, 
                   'Moy_Temp': moy_Temp, 
                   'Moy_Percipitation_Perc': moy_Percipitation_Perc})


df.head()

In [None]:
# Plot 1
sns.barplot(x='Mois', y='Moy_Temp', data=df, palette='YlGnBu')
plt.title('Average temperature per month')

In [None]:
# Plot 2
plt.title('Average precipitation per month')
sns.lineplot(x='Mois', y='Moy_Percipitation_Perc', data=df, sort=False)

In [None]:
# Combo plot
# Figure
fig, ax1 = plt.subplots(figsize=(10,6))

# Bar plot on the left y-axis
sns.barplot(x='Mois', y='Moy_Temp', data=df, color="skyblue", ax=ax1)
ax1.set_title('Average temperature per month', fontsize=16)
ax1.set_xlabel('Month', fontsize=16)
ax1.set_ylabel('Average temperature', color="skyblue", fontsize=14)

# Second y-axis sharing the same x-axis
ax2 = ax1.twinx()

# Line plot on the right y-axis
sns.lineplot(x='Mois', y='Moy_Percipitation_Perc', data=df, sort=False, color="red", ax=ax2)
ax2.set_ylabel('Average precipitation %', color="red", fontsize=14)

# Show the plot
plt.show()

Additional resource: https://moncoachdata.com/blog/guide-visualisations-de-donnees-python/