# Where is Seaborn?

Seaborn Website: https://seaborn.pydata.org/index.html

    Contains installation instructions, tutorials on applying seaborn's statistical visualization tools, and an example gallery with reproducible code.

Seaborn Github: https://github.com/mwaskom/seaborn

    Contains a link to the seaborn website and a detailed README that lists what is required to use Seaborn: Python 3.8+ and the library dependencies numpy, pandas, and matplotlib. Notes on citation, testing, and development are also included.
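
A quick way to check whether these dependencies are present in the current environment is to look up their installed versions. The sketch below uses only the standard library; the helper name `installed_versions` is made up for this notebook and is not part of seaborn.

```python
from importlib.metadata import version, PackageNotFoundError


def installed_versions(packages=("seaborn", "numpy", "pandas", "matplotlib")):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Any package reported as not installed can usually be added with `pip install <name>`.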



# Joint Kernel Density Estimate of Bill Length vs Bill Depth (mm)
### Visualizing Continuous Data 

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


penguins = sns.load_dataset("penguins")     # Load the penguins dataset

sns.set_theme(style="ticks")                # "ticks" theme: white background with tick marks on the axes

# Show the joint distribution using kernel density estimation.
# jointplot builds its own square figure, so the size is set with `height`;
# a separate plt.figure() call would only produce an extra, empty figure.
jkde = sns.jointplot(
    data=penguins,          # assigns the data set
    x="bill_length_mm",     # assigns the x variable
    y="bill_depth_mm",      # assigns the y variable
    hue="species",          # assigns a variable to differentiate color by
    kind="kde",             # asks to provide a kernel density estimation
    palette="plasma",       # assigns the color palette to hue
    levels=10,              # assigns the number of density contour levels
    fill=True,              # asks for density contour levels to be filled
    height=8                # side length of the square figure, in inches
)
jkde.set_axis_labels("Bill Length (mm)", "Bill Depth (mm)")                                         # x and y axis labels
jkde.figure.suptitle("Joint Kernel Density Estimate of Bill Length vs Bill Depth (mm)", y=1.02)     # title raised so it does not overlap the plot

plt.show()                  # displays beautiful joint kde 

Each set of filled contours shows the joint distribution of bill length and bill depth for one species. Darker shading marks regions of greater data density, and the extent of each contour level shows how widely the data points at that level are spread. The marginal plots show a kernel density estimate (not a fitted normal distribution) of each species' data for each attribute: Bill Length (mm) along the top axis and Bill Depth (mm) along the right axis.
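
To make the idea behind these density curves concrete, here is a minimal sketch of a one-dimensional Gaussian kernel density estimate written with numpy alone. It is an illustration, not seaborn's implementation: the function name `gaussian_kde_1d`, the hand-picked bandwidth, and the synthetic sample values are all assumptions made for this example (seaborn chooses a bandwidth automatically).

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # One Gaussian bump per sample, evaluated at each grid point
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    # Average the bumps and rescale by the bandwidth so the curve integrates to 1
    return kernels.sum(axis=1) / (samples.size * bandwidth)


rng = np.random.default_rng(0)
# Synthetic stand-in values, NOT the real penguin measurements
samples = np.concatenate([rng.normal(39, 2.5, 150), rng.normal(48, 3.0, 190)])

grid = np.linspace(25, 65, 400)
density = gaussian_kde_1d(samples, grid, bandwidth=1.0)

# A density curve should enclose an area of roughly 1
area = np.sum(density) * (grid[1] - grid[0])
print(f"approximate area under the curve: {area:.3f}")
```

Each sample contributes a small bell-shaped bump, and the estimate is the average of those bumps. A larger bandwidth gives a smoother curve; a smaller one follows the individual samples more closely.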

# Scatterplot Matrix

In [None]:

penguins = sns.load_dataset("penguins")   # Load the penguins dataset
sns.set_theme(style="ticks")              # "ticks" theme: white background with tick marks on the axes

scattermatrix = sns.pairplot(             # asks to generate a grid of plots
            penguins,                     # assigns data set 
            hue="species",                # assigns a variable to differentiate color by
            palette="spring",             # assigns a color palette to hue
            )             
scattermatrix.figure.suptitle("Scatterplot Matrix of Penguin Attributes", y=1.02, fontsize=30)     # title raised so it does not overlap the grid
scattermatrix.legend.set_bbox_to_anchor((1.08, 0.5))                                               # moves the legend to the right so it does not overlap
scattermatrix.legend.set_title("Species")                                                          # assigns title to legend

plt.tight_layout()                        # formats the layout to be tight
plt.show()                                # displays beautiful scatterplot matrix

The scatterplot matrix above shows how the species compare across the continuous variables: body mass (g), bill length (mm), bill depth (mm), and flipper length (mm).
The diagonal panels, where a variable is paired with itself, show a kernel density estimate of that variable for each species to describe the spread of the data. The off-diagonal panels are scatterplots that show the trend between each pair of penguin attributes.
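
The trends visible in the off-diagonal panels can also be summarized numerically. The sketch below computes pairwise correlations within each group using pandas; the helper name `correlations_by_group` and the small synthetic table are assumptions made for illustration and do not contain real penguin measurements.

```python
import numpy as np
import pandas as pd


def correlations_by_group(frame, group_col, value_cols):
    """Pairwise Pearson correlations of `value_cols` within each group."""
    return frame.groupby(group_col)[list(value_cols)].corr()


rng = np.random.default_rng(1)
n = 50
flipper = rng.normal(200, 10, n)

# Synthetic stand-in table with two made-up groups, "A" and "B"
toy = pd.DataFrame({
    "species": ["A"] * (n // 2) + ["B"] * (n - n // 2),
    "flipper_length_mm": flipper,
    # Body mass built to rise with flipper length, plus random noise
    "body_mass_g": 20 * flipper + rng.normal(0, 100, n),
})

corr_by_species = correlations_by_group(
    toy, "species", ["flipper_length_mm", "body_mass_g"]
)
print(corr_by_species)
```

With the notebook's `penguins` frame, the same helper could be called with `"species"` as the grouping column and the four measurement columns (`bill_length_mm`, `bill_depth_mm`, `flipper_length_mm`, `body_mass_g`) as the value columns.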

In [None]:
print(penguins)



# Resources Utilized

https://seaborn.pydata.org/index.html

https://github.com/mwaskom/seaborn

https://matplotlib.org/stable/tutorials/colors/colormaps.html

https://seaborn.pydata.org/examples/joint_kde.html#joint-kernel-density-estimate

https://seaborn.pydata.org/examples/scatterplot_matrix.html

https://chat.openai.com/c/c8f56284-204e-4fdb-aae5-43e3db03e3a0




Seaborn scientific publication citation:

@article{Waskom2021,
    doi = {10.21105/joss.03021},
    url = {https://doi.org/10.21105/joss.03021},
    year = {2021},
    publisher = {The Open Journal},
    volume = {6},
    number = {60},
    pages = {3021},
    author = {Michael L. Waskom},
    title = {seaborn: statistical data visualization},
    journal = {Journal of Open Source Software}
 }


