# Altair Continued
We last looked at the US Employment dataset and created some starter visualizations. This time, we'll be analyzing and visualizing the [palmerpenguins](https://github.com/allisonhorst/palmerpenguins) dataset, which is a great starter dataset! More information, including images of cute penguins, can be found by following the link.

## Palmer Penguins
You will have to run `pip3 install palmerpenguins` in order to have access to the dataset. If you are using Google Colab, you can simply uncomment and run the following code cell.

In [1]:
#!pip install palmerpenguins

In [2]:
import altair as alt
from palmerpenguins import load_penguins

penguins = load_penguins()
penguins.sample(5)

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|---|
| 42 | Adelie | Dream | 36.0 | 18.5 | 186.0 | 3100.0 | female | 2007 |
| 333 | Chinstrap | Dream | 49.3 | 19.9 | 203.0 | 4050.0 | male | 2009 |
| 76 | Adelie | Torgersen | 40.9 | 16.8 | 191.0 | 3700.0 | female | 2008 |
| 149 | Adelie | Dream | 37.8 | 18.1 | 193.0 | 3750.0 | male | 2009 |
| 281 | Chinstrap | Dream | 45.2 | 17.8 | 198.0 | 3950.0 | female | 2007 |


## Data Exploration
Using the methods from the previous worksheet, answer the following questions about the palmerpenguins dataset.

### Question 1: Which species of penguins are represented in this dataset?

In [3]:
# Write your answer to question 1 here

### Question 2: On average, which species of penguin has the longest beak?

In [4]:
# Write your answer to question 2 here

### Question 3: Describe the relationship between flipper length and bill length.

In [5]:
# Write your answer to question 3 here

### Question 4: Create a scatterplot for flipper length vs bill length. Each species should be a different color.

In [6]:
# Write your answer to question 4 here

### Question 5: Compare and describe the distribution of flipper lengths across the different species.

In [7]:
# Write your answer to question 5 here

## More Altair!
Let's start off by visualizing the relationship between bill length and bill depth. Notice the call to `.interactive()` after we've defined the marks and channels. This allows the viewer to zoom and pan through the visualization. Since this is such a common request, Altair provides it as a one-line shortcut!

In [8]:
alt.Chart(penguins).mark_point().encode(
    x="bill_length_mm:Q",
    y="bill_depth_mm:Q"
).interactive()

In [9]:
# Now let's color-code the species
species_color = alt.Color("species:N") # We can extract this out into a separate variable

alt.Chart(penguins).mark_point().encode(
    x="bill_length_mm:Q",
    y="bill_depth_mm:Q",
    color=species_color
).interactive()

In [10]:
# Now to create our own legend (this will come in handy soon!)
species_color = alt.Color("species:N", legend=None) # Remove the default legend

# Create scatterplot of bill length vs bill depth
bills = alt.Chart(penguins).mark_point().encode( # Notice we create a new Chart variable
    x="bill_length_mm:Q",
    y="bill_depth_mm:Q",
    color=species_color
).interactive()

legend = alt.Chart(penguins).mark_rect().encode( # We also create a legend variable (it's a mini viz)
    y=alt.Y("species:N", axis=alt.Axis(orient="right")),
    color=species_color # Reusing the species_color variable -- this is why we created it!
)

bills | legend # It's this easy to mash visualizations together

## Other Interactions
What if you want to do more than just panning and zooming? Then you'll need to understand how Altair represents interactions. More information can be found [at the documentation here](https://altair-viz.github.io/user_guide/interactions.html). The next few examples are based on the documentation.

### Selections and Conditions
You must first identify a `selection`; this allows a viewer to interact with and select specific parts of your visualization.

Then, you have to identify a `condition` that changes depending on what is being selected.

### A Simple Example
Here's an example of a rectangular selection -- the user is allowed to click and drag on the graph (the `selection`), and the color of each dot will change depending on whether or not it is inside the selection (the `condition`).

In [11]:
selection = alt.selection_interval() # Use a rectangular selection

species_color = alt.condition(selection,    # Set the color to change depending on the selection
                              alt.Color("species:N", legend=None),
                              alt.value("lightgray"))

# Create scatterplot of bill length vs bill depth
bills = alt.Chart(penguins).mark_point().encode(
    x=alt.X("bill_length_mm:Q", scale=alt.Scale(zero=False)),
    y=alt.Y("bill_depth_mm:Q", scale=alt.Scale(zero=False)),
    color=species_color
).add_selection( # We have to tell the chart to use the selection we've defined
    selection    # (Altair 5+ deprecates add_selection in favor of add_params)
)

# Create corresponding legend for species
legend = alt.Chart(penguins).mark_rect().encode(
    y=alt.Y("species:N", axis=alt.Axis(orient="right")),
    color=species_color
)

bills | legend

### A More Complicated Example
What if you wanted to allow the viewer to click on a species to see all the corresponding points? Examine the code below while thinking about what the *selection* and *condition* are.

In [12]:
selection = alt.selection_multi(fields=['species']) # A different kind of selection! (Altair 5+ calls this selection_point)

species_color = alt.condition(selection,    # Set the color to change depending on the selection
                              alt.Color("species:N", legend=None),
                              alt.value("lightgray"))

# Create scatterplot of bill length vs bill depth
bills = alt.Chart(penguins).mark_point().encode(
    x=alt.X("bill_length_mm:Q", scale=alt.Scale(zero=False)),
    y=alt.Y("bill_depth_mm:Q", scale=alt.Scale(zero=False)),
    color=species_color
).interactive()

# Create corresponding legend for species
legend = alt.Chart(penguins).mark_rect().encode(
    y=alt.Y("species:N", axis=alt.Axis(orient="right")),
    color=species_color
).add_selection(selection) # We now add it to the legend instead, since that is what the viewer interacts with

bills | legend

## Your Turn to Practice
Look through the above examples and documentation! **Make sure you read carefully through my code!** These will be good references.

### Practice 1: Visualize the relationship between flipper length and body mass. Allow the user to filter by species.

In [13]:
# Write your code for Practice 1 here.

### Practice 2: Visualize the relationship between island and body mass. Choose appropriate marks, channels, and interactions!

In [14]:
# Write your code for Practice 2 here.