# Seaborn 

### Pre-Readings:
- Chapter 4
## Learning Objectives
- Learn to visualize data from DataFrames with the seaborn library.
- Explore data through visualization.

# Introduction
&nbsp;  
***Hi Hyosub,***

***Just wanted to say thank you for the recommendation to start using seaborn plotting in python. Wow. That's it.***

***-Jonathan***   
&nbsp;  

This message was copied verbatim from an email sent to me by a former PhD mentee, and it pretty much says it all. [Seaborn](https://seaborn.pydata.org) is a great Python plotting package that is built on top of Matplotlib. That is, it uses Matplotlib "under the hood", but it offers the user a much simpler API ("Application Programming Interface"; that is, a set of commands) that enables us to generate a variety of great-looking plots that are particularly useful in data science. You can check out Seaborn's [examples gallery](https://seaborn.pydata.org/examples/index.html) to see some of the cool stuff you can do. Seaborn was written by [Michael Waskom](https://mwaskom.github.io/).

## Figure-level vs axes-level plots
- Seaborn has two main types of plotting functions: figure-level functions, which create both the figure and its axes, and axes-level functions, which only draw onto one specific axes.
- Figure-level plots are typically easier to use, but provide less flexibility.
- Today we will focus on figure-level plots and how to interpret them.
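
To make the distinction concrete, here is a minimal sketch using a small made-up DataFrame (the `toy` data and its `group` and `score` columns are hypothetical, not part of the adaptation data). It draws the same bar plot once with the figure-level `catplot` and once with the axes-level `barplot`.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "score": [1.0, 3.0, 2.0, 6.0],
})

# Figure-level: catplot creates its own figure and axes
# and returns a FacetGrid object that manages them
g = sns.catplot(data=toy, x="group", y="score", kind="bar")
print(type(g).__name__)

# Axes-level: barplot draws onto an Axes that we create ourselves
# and hands that same Axes back to us
fig, ax = plt.subplots(figsize=(4, 3))
returned_ax = sns.barplot(data=toy, x="group", y="score", ax=ax)
print(returned_ax is ax)
```

The figure-level call needs no Matplotlib setup at all, while the axes-level call lets us decide exactly where the plot goes.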

---
### Task 1: Load the adaptation data
- Load the data named 'adaptation_data' in the data folder
- Look at 5 random rows to familiarize yourself with the data

In [None]:
# Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# your answer here


---
## Figure level plots
<img src="../images/function_overview_8_0.png" width="500" height="300">



---
## Categorical plots
- We can use these to compare a quantity across the levels of one or more categorical variables.
- Let's visualize the adaptation data as a barplot.
- The data we are plotting is the amount of adaptation.
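
As a rough sketch of what a bar plot with only a `y` variable shows, the example below uses a tiny made-up DataFrame (the values are hypothetical stand-ins, not the real adaptation data). With no grouping variable, `catplot` draws a single bar whose height is the mean of the column.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in values; the real adaptation data is not used here
toy = pd.DataFrame({"adaptation": [10.0, 12.0, 14.0, 20.0]})

# With only `y` given, catplot draws one bar at the mean of that column
g = sns.catplot(data=toy, y="adaptation", kind="bar")

# The bar height matches the plain pandas mean
print(toy["adaptation"].mean())  # prints 14.0
```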

In [None]:
# Walk through code example


---
## What does this plot tell us?


**Your answer here**

### Grouping our data with seaborn
- This plot doesn't provide us with much information, except the mean adaptation across our entire dataset. Grouping the data with the `x` argument is where we start to see the real benefit of seaborn.

In [None]:
sns.catplot(data=data, y='adaptation', x='major', kind='bar')
plt.show()

---
## The x-axis is difficult to read now
### Task 2: 
- Change the size of the figure so we can read all the x-axis labels
- Here is the documentation for the `catplot` function:  
https://seaborn.pydata.org/generated/seaborn.catplot.html

In [None]:
# Your answer here


---
### Updating the look of our figure instead of the size
If we don't like how big our figure is now, we can instead change how the x-axis labels look.
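
One way this might look, sketched on made-up data (the `department` and `score` columns are hypothetical): `catplot` returns a `FacetGrid`, and its `set_xticklabels` method passes keyword arguments such as `rotation` on to the tick labels.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data with long category names
toy = pd.DataFrame({
    "department": ["Mechanical Engineering", "Mechanical Engineering",
                   "Cognitive Science", "Cognitive Science"],
    "score": [1.0, 3.0, 2.0, 6.0],
})

# Keep a handle on the FacetGrid returned by catplot
g = sns.catplot(data=toy, x="department", y="score", kind="bar")

# Rotate the x-axis tick labels by 45 degrees; ha="right" anchors
# each label's right end at its tick so the text stays aligned
g.set_xticklabels(rotation=45, ha="right")
```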

In [None]:
# Store the FacetGrid object returned by the catplot function call

# Use its set_xticklabels method to rotate the x-axis tick labels by 45 degrees so we can read them better.


---
## Visualizing more than one relationship
What if we want to visualize more than just one categorical variable?
We have two options here:
1. Add subcategories to our original plot using the `hue` argument
1. Create multiple subplots using the `col` and/or `row` arguments
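
The two options can be compared side by side in a small sketch. The data below is made up for illustration (`group`, `condition`, and `score` are hypothetical column names), so the tasks that follow still need to be worked out on the adaptation data.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data with two categorical variables
toy = pd.DataFrame({
    "group": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "condition": ["pre"] * 4 + ["post"] * 4,
    "score": [1.0, 3.0, 2.0, 6.0, 2.0, 4.0, 5.0, 7.0],
})

# Option 1: hue keeps a single set of axes and colors the bars by condition
g_hue = sns.catplot(data=toy, x="group", y="score", hue="condition", kind="bar")

# Option 2: col draws one subplot per level of condition
g_col = sns.catplot(data=toy, x="group", y="score", col="condition", kind="bar")

# The grid of axes is 1x1 for hue and 1x2 for col
print(g_hue.axes.shape, g_col.axes.shape)
```

The `row` argument works the same way as `col`, but stacks the subplots vertically.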


---
### Task 3:
- Use the `hue` argument to look at adaptation for both the `major` and the `sex` variables

In [None]:
# your answer here


---
### Task 4:
- Use the `col` argument to look at the effects of `sex` and `major` on adaptation.
- Create another figure to look at the effects of `handedness`, `sex`, and `major`. Organize your figure such that the columns represent different levels of `sex` and the rows represent different levels of `handedness`. 

In [None]:
# your answer here


In [None]:
# your answer here


## Are bar plots a good way to visualize data?
- They provide information about the mean and, with error bars, potentially the variance, but that's it.
- Other types of visualizations that seaborn is capable of can give us a much better understanding of our data.
- For categorical plots, seaborn has the following options, which we can use by changing the `kind` argument.

**Categorical scatterplots: catplot**   
- strip plot (with kind="strip"; the default)   
- swarm plot (with kind="swarm")
 
**Categorical distribution plots: catplot**  
- box plot (with kind="box")  
- violin plot (with kind="violin")  
- boxen plot (with kind="boxen")

**Categorical estimate plots: catplot**  
- point plot (with kind="point")  
- bar plot (with kind="bar")  
- count plot (with kind="count")  


### Explore the different types of categorical visualization and make a pros and cons list for each

In [None]:
# your answers here