
## Overview

The Palmer Penguins dataset contains measurements of 344 penguins from three species, collected on three islands. In this tutorial, we will create a pair plot to analyze the relationships among key variables across the three species.

## Importing Necessary Libraries

First, we import the libraries used throughout the tutorial.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```

## Loading the Dataset

Next, we will load the Palmer Penguins dataset by executing the code block below.

```python
url = "https://raw.githubusercontent.com/pic16b-ucla/24W/main/datasets/palmer_penguins.csv"
penguins = pd.read_csv(url)
```
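
Before plotting, it can help to confirm that the columns the tutorial relies on are present and to see how many values are missing. The sketch below is only an illustration: `missing_columns` is a hypothetical helper, and `toy` is a small stand-in DataFrame with made-up values so the example runs without downloading anything. In the tutorial itself you would pass `penguins` instead.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the names in `required` that are not columns of `df`."""
    return [name for name in required if name not in df.columns]

# Column names used later in the tutorial's plotting function.
required = ["Species", "Flipper Length (mm)", "Body Mass (g)",
            "Culmen Length (mm)", "Culmen Depth (mm)"]

# Stand-in data: values are made up for illustration, not real measurements.
toy = pd.DataFrame({
    "Species": ["A", "B"],
    "Flipper Length (mm)": [180.0, None],
    "Body Mass (g)": [3700.0, 5000.0],
})

# Columns that the stand-in frame lacks
print(missing_columns(toy, required))

# Number of missing values in each column
print(toy.isna().sum())
```

An empty list from `missing_columns(penguins, required)` would indicate that every column the pair plot needs is available.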

## Data Visualization

The code below defines a function that creates the pair plot. The function visualizes the relationships among selected variables, such as flipper length, body mass, culmen length, and culmen depth, with the data points colored by penguin species.

```python
def create_penguin_pairplot(data=penguins, hue="Species", variables=None,
                            palette="Set2", title="Pair Plot of Palmer Penguins Variables"):
    """
    Generate a pair plot for the penguin dataset.

    Parameters:
        data (DataFrame): The dataset for the pair plot.
        hue (str): Column name for coloring data points by category.
            Defaults to "Species".
        variables (list): Columns to include in the pair plot.
            Defaults to flipper length, body mass, culmen length,
            and culmen depth.
        palette (str): Color palette for the plot. Defaults to "Set2".
        title (str): Title for the plot.
    """
    
    if variables is None:
        variables = ["Flipper Length (mm)", "Body Mass (g)",
                     "Culmen Length (mm)", "Culmen Depth (mm)"]
    
    # Create the pair plot
    sns.pairplot(
        data=data,        # dataset to visualize
        hue=hue,          # color points by category
        vars=variables,   # variables included in the plot
        palette=palette,  # color palette
    )

    # Add a title above the grid
    plt.suptitle(title, y=1.02, fontsize=16)

    # Display the plot
    plt.show()
```

## Explanation of Data Visualization Code

The code below creates the pair plot.

```python
sns.pairplot(
    data=data,
    hue=hue,
    vars=variables,
    palette=palette
)
```
Each argument does the following:

- **`data=data`**: The DataFrame used to create the plot. In this tutorial, it is the `penguins` dataset.

- **`hue=hue`**: The column used to color the data points. With the default `"Species"`, each penguin species gets its own color.

- **`vars=variables`**: The numerical columns included in the pair plot.

- **`palette=palette`**: The color palette of the plot (`"Set2"` by default).
  

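The code below adds a title above the pair plot grid and then displays the plot. The title text is passed as the first positional argument of `plt.suptitle`; inside the function, this is the `title` parameter.

```python
plt.suptitle("Pair Plot of Palmer Penguins Variables", y=1.02, fontsize=16)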
plt.show()
```

## Output

Now, run the code below:

```python
create_penguin_pairplot(penguins)
```

After executing the code, your output should be the following pair plot figure:

![Pair Plot of Palmer Penguins Variables](pairplot_penguins.png)

## Conclusion

This pair plot gives a clear view of the relationships among key variables across the penguin species. In particular, the species form visibly distinct groups. For instance, they are well separated by flipper length and culmen length, which suggests that these variables can help differentiate among the species.
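
One way to complement the visual impression is to compare per-species summary statistics. The sketch below is a minimal illustration under stated assumptions: `summarize_by_species` is a hypothetical helper, and `toy` holds made-up values rather than real measurements. In the tutorial you would pass `penguins` and the column names used in the plot.

```python
import pandas as pd

def summarize_by_species(df, columns, group_col="Species"):
    """Return the per-group mean and standard deviation of `columns`."""
    return df.groupby(group_col)[columns].agg(["mean", "std"])

# Stand-in data: values are made up for illustration only.
toy = pd.DataFrame({
    "Species": ["A", "A", "B", "B"],
    "Flipper Length (mm)": [180.0, 190.0, 210.0, 220.0],
    "Culmen Length (mm)": [38.0, 40.0, 46.0, 48.0],
})

summary = summarize_by_species(
    toy, ["Flipper Length (mm)", "Culmen Length (mm)"]
)
print(summary)
```

Group means that differ by much more than the within-group standard deviations would be consistent with the separation seen in the plot.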

Additionally, the function's arguments can be modified to select different variables, change the plot's colors, and more. For example, you can call the function with the following arguments:

```python
create_penguin_pairplot(penguins, variables=["Flipper Length (mm)", "Body Mass (g)"], palette="Set1")
```

## Summary

Below is the complete code from the tutorial:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

url = "https://raw.githubusercontent.com/pic16b-ucla/24W/main/datasets/palmer_penguins.csv"
penguins = pd.read_csv(url)

def create_penguin_pairplot(data=penguins, hue="Species", variables=None,
                            palette="Set2", title="Pair Plot of Palmer Penguins Variables"):
    
    if variables is None:
        variables = ["Flipper Length (mm)", "Body Mass (g)",
                     "Culmen Length (mm)", "Culmen Depth (mm)"]
    
    sns.pairplot(
        data=data,
        hue=hue, 
        vars=variables, 
        palette=palette
    )
    
    plt.suptitle(title, y=1.02, fontsize=16)
    plt.show()

create_penguin_pairplot(penguins)
```
