


<img src="https://seaborn.pydata.org/_images/logo-wide-lightbg.svg" width="30%" height="30%" />

# Seaborn Unit 01 - Introduction

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%201%20-%20Lesson%20Learning%20Outcome.png"> Lesson Learning Outcome

* **The Seaborn lesson consists of 5 units.**
* By the end of this lesson, you should be able to:
  * Load Seaborn datasets to explore its multiple plot types
  * Combine Matplotlib and Seaborn capabilities
  * Manage Seaborn plot styles
  * Create distinct plot types using Seaborn

---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%202%20-%20Unit%20Objective.png"> Unit Objectives

* Get familiar with Seaborn datasets
* Combine Matplotlib and Seaborn capabilities
* Understand Axes-level and Figure-level functions



---

Seaborn is a library for making statistical graphics in Python and is built on top of Matplotlib.

* Seaborn helps resolve two major issues in Matplotlib:
  * Default Matplotlib parameters
  * Working with Pandas DataFrames

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Question%20mark%20icon.png"> **Why do we study Seaborn?**
  * Seaborn has several important qualities:
    * It offers built-in themes for styling Matplotlib graphics
    * It visualizes univariate and bivariate data
    * It fits and visualizes linear regression models
    * It works well with NumPy and Pandas data structures
    * Its syntax is simple: it offers a shorter, more intuitive way to create plots
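
To make the point about Pandas DataFrames concrete, here is a minimal sketch that draws the same grouped scatter plot twice, once with plain Matplotlib and once with Seaborn. The `toy` DataFrame and its column names are invented for illustration, and the `Agg` backend line is only an assumption so the sketch runs without a display; omit it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, invented for this illustration only
toy = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "weight": [2.1, 3.9, 6.2, 8.1, 9.8, 12.2],
    "group": ["a", "a", "a", "b", "b", "b"],
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(10, 4))

# Matplotlib: loop over the groups and label the axes by hand
for name, subset in toy.groupby("group"):
    ax_mpl.scatter(subset["height"], subset["weight"], label=name)
ax_mpl.set_xlabel("height")
ax_mpl.set_ylabel("weight")
ax_mpl.legend(title="group")

# Seaborn: pass the DataFrame and refer to the columns by name
sns.scatterplot(data=toy, x="height", y="weight", hue="group", ax=ax_sns)

# In a notebook, call plt.show() here to display the Figure
```

The Seaborn call takes the axis labels from the column names, which is one reason the code tends to be shorter.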



## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%203%20-%20Additional%20Learning%20Context.png"> Additional Learning Context

* We encourage you to:
  * Add **code cells and try out** other possibilities, e.g., play around with parameter values in a function/method, or consider additional function parameters.
  * Also, **add your own comments** in the cells. It can help you consolidate the learning.

* Parameters in a given function/method
  * As you may expect, a given function in a package may have multiple parameters.
  * Some of them are mandatory; some have predefined default values; and some are optional. We will cover the parameters most commonly used in Data Science for a particular function/method.
  * However, you may seek additional information in the respective package documentation, where you will find instructions on how to use a given function/method. The studied packages are open source, so this documentation is public.
  * **For Seaborn the link is [here](https://seaborn.pydata.org/)**.

---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%204%20-%20Import%20Package%20for%20Learning.png"> Import Package for Learning

By convention, `Seaborn` is imported with the alias `sns`

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png">  Seaborn Introduction

**Matplotlib** has a wide range of plots, but creating non-basic plots or polishing their appearance can be complex.
  * **Seaborn** provides a higher-level interface for Matplotlib plots, with less code and typically with a nicer design.

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Seaborn Datasets oriented API 

Seaborn offers built-in datasets for exploring its plotting capabilities.
* You can get the dataset names using `sns.get_dataset_names()`

sns.get_dataset_names()

Just pass the dataset name to the function and assign the result to a DataFrame variable

df = sns.load_dataset('tips')
df = df.head(50)
df.head(3)

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%205%20-%20Practice.png"> 


**PRACTICE** : Load other datasets from Seaborn, so you can get used to this process

df = sns.load_dataset('glue')
df = df.head(50)
df.head()

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Combine Matplotlib and Seaborn

**Seaborn is built on top of Matplotlib**, therefore they share many aspects. You can combine functions from each library when plotting
* Consider the following dataset
  * It has records for 3 different species of penguins, collected from 3 islands in the Palmer Archipelago, Antarctica


df = sns.load_dataset('penguins').sample(50, random_state=1)
df.head(3)

You can initialize a **Figure with 1 Axes** in Matplotlib and draw a Seaborn plot on it.
  * Then, you can write the title for this Seaborn plot using Matplotlib notation
  * We are not focused on the Seaborn code itself. The idea is to present an example of how both are used together



<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%207-%20Note.png"> The main takeaway: **You can use the range of functions learned in Matplotlib and in Seaborn!**

fig, axes = plt.subplots(figsize=(8,8))

sns.scatterplot(data=df, x='bill_depth_mm', y='bill_length_mm', hue='species')   # Seaborn code to draw a scatter plot
plt.title("Seaborn Plot!!!")
plt.xlabel('X-Axis: bill_depth_mm ')
plt.legend(loc='upper left', title='Legend', frameon=False)
plt.show()
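
The cell above relies on `plt.*` calls acting on the current Axes. As a hedged alternative, the sketch below passes the Axes explicitly through `ax=` and uses the equivalent Axes methods (`set_title`, `set_xlabel`, `legend`). The `toy` DataFrame and its column names are invented for illustration, and the `Agg` backend line is only an assumption so the sketch runs without a display; omit it in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, invented for this illustration only
toy = pd.DataFrame({
    "depth": [14.2, 15.1, 17.8, 18.4, 19.0, 20.3],
    "length": [46.1, 47.5, 38.9, 39.6, 49.2, 50.8],
    "kind": ["a", "a", "b", "b", "c", "c"],
})

fig, ax = plt.subplots(figsize=(6, 4))

# Axes-level Seaborn functions return the Axes they drew on
returned_ax = sns.scatterplot(data=toy, x="depth", y="length", hue="kind", ax=ax)

# Object-oriented Matplotlib calls, equivalent here to plt.title / plt.xlabel / plt.legend
ax.set_title("Seaborn plot, Matplotlib title")
ax.set_xlabel("X-axis: depth")
ax.legend(loc="upper left", title="Legend", frameon=False)

# In a notebook, call plt.show() here to display the Figure
```

Working on the Axes object directly makes it clear which plot each call modifies, which matters once a Figure holds more than one Axes.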

Now suppose you want to initialize a **Figure with 3 Axes** in Matplotlib and draw a Seaborn plot on each one. You can then write the titles for these Seaborn plots using Matplotlib notation
  * Notice that the `ax` parameter of the Seaborn function refers to an Axes from your Matplotlib Figure.

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

sns.scatterplot(data=df,
                x='bill_depth_mm',
                y='bill_length_mm',
                hue="species",
                ax=axes[0]) # draw on the first Axes created by plt.subplots() above

axes[0].set_title('Nice Title')
sns.histplot(data=df, x="flipper_length_mm", ax=axes[1])
sns.histplot(data=df, x="bill_length_mm", ax=axes[2])

plt.tight_layout()
fig.suptitle('Super title for the Figure', fontsize=16, y=1.1)
plt.show()


---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Axes-level functions and Figure-level functions

Now we introduce you to **Axes-level functions** and **Figure-level functions** in Seaborn.

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%207-%20Note.png"> 1: **Axes-level functions** plot data onto a single Matplotlib Axes
  * You can recognize them by the `ax` argument in a given Seaborn function.
  * Axes-level functions include: kdeplot, histplot, scatterplot, boxplot, countplot, heatmap, etc.

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%207-%20Note.png"> 2: **Figure-level functions** can't (easily) be arranged with other Axes/plots
  * You can recognize them when there **is no** `ax` argument in a given Seaborn function.
  * The difference is that a Figure-level function already creates its own subplots behind the scenes.
    * For example, Pairplot is a Figure-level function: it outputs a scatter plot for each pair of numerical variables and a histogram for each numerical variable, all arranged in one Figure. We will study Pairplot soon.
  * Figure-level functions include: lmplot, pairplot, jointplot, etc.




<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%207-%20Note.png"> This Seaborn documentation [link](https://seaborn.pydata.org/tutorial/function_overview.html) also gives an overview of how Axes-level functions and Figure-level functions work in Seaborn.
* It takes practice to become familiar with these functions. As a rule of thumb, you typically won't want to fit the output of a Figure-level function into a single Axes, since that would split the plot; it is better to let it have its own Figure.
* The example below shows a Pairplot, so you can see in practical terms a use case where a single function outputs a set of plots.

sns.pairplot(data=df)
plt.show()

