# Interactions Continued

Last time we saw brushing and linking as well as some basic interactions. Today we will delve deeper into other possible interactions and some maps. 

Again we will start with the `songs.csv` dataset from last time. 

**We will also be doing an in-class activity, so make sure to follow along!**

In [1]:
import pandas as pd
import altair as alt

Let's filter by artist. I only want to look at specific artists, so instead of writing many "or" (`|`) statements I'll use the `isin` method. 

In [47]:
df = pd.read_csv("songs.csv")

# use only artists who match our list
df_big3 = df[df["artist"].isin(["Kendrick Lamar", "Drake", "J. Cole", "Stromae"])]

df_big3.head()

Unnamed: 0,artist,song,duration_ms,explicit,year,popularity,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,genre
954,Drake,Forever,357706,True,2009,73,0.457,0.906,5,-2.278,0,0.342,0.249,0.0,0.182,0.54,104.02,"hip hop, pop, R&B"
986,Drake,Best I Ever Had,258760,True,2010,54,0.431,0.894,5,-2.673,0,0.33,0.0951,0.0,0.188,0.605,162.161,"hip hop, pop, R&B"
1003,Stromae,Alors on danse - Radio Edit,206066,False,2010,77,0.791,0.59,1,-9.206,0,0.0793,0.0994,0.00203,0.065,0.714,119.951,pop
1071,Drake,Over,233560,True,2010,57,0.325,0.848,7,-5.611,1,0.279,0.0109,0.0,0.124,0.433,100.093,"hip hop, pop, R&B"
1081,Drake,Find Your Love,208946,False,2010,56,0.625,0.613,6,-6.005,0,0.173,0.0209,0.0,0.0286,0.738,96.033,"hip hop, pop, R&B"
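
To see why `isin` saves typing, here is a minimal sketch on a small made-up data frame (the `toy` rows and the variable names are assumptions for illustration, not part of `songs.csv`). It builds the same filter twice, once with chained `|` comparisons and once with `isin`, and checks that both give the same rows.

```python
import pandas as pd

# Made-up rows, only for illustration
toy = pd.DataFrame({
    "artist": ["Drake", "Adele", "J. Cole", "Stromae", "Adele"],
    "song": ["a", "b", "c", "d", "e"],
})

wanted = ["Drake", "J. Cole", "Stromae"]

# One comparison per artist, chained with | (each comparison needs parentheses)
with_or = toy[
    (toy["artist"] == "Drake")
    | (toy["artist"] == "J. Cole")
    | (toy["artist"] == "Stromae")
]

# The same filter written with isin
with_isin = toy[toy["artist"].isin(wanted)]

# Both approaches select the same rows
print(with_or.equals(with_isin))
```

Adding another artist with `isin` only means adding one more name to the list.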


## 1. Recap: brushing and linking

Last time we made a visualization where we brushed on the bar chart and filtered the scatterplot.
Let's revisit the code. 

In [7]:
# Make a brush
brush_bar = alt.selection_interval()

# we add the selection interval to the bars
barChar = alt.Chart(data=df_big3).mark_bar().encode(
    x = alt.X("year:O", 
             scale = alt.Scale(domain=[i for i in range(2010, 2021)])), # fix the scale to 2010-2020 (range() excludes 2021)
    y = alt.Y("count(song)")
).properties(
    width = 300, 
    height = 100
).add_params(brush_bar)

# but filter on the scatter 
scatter = alt.Chart(data = df_big3).mark_circle().encode(
    x = "danceability", 
    y = "valence", 
    color=alt.condition(brush_bar, "artist:N", alt.value("lightgray")), 
    tooltip = ["artist", "song"]
).transform_filter(brush_bar)

# vertical concat
barChar & scatter

### 1.1 Two-way brushing and linking

Now, we want to be able to brush and filter on both of the plots. 

In the last cell we saved our charts as `barChar` and `scatter` and made a brush for the bar chart, `brush_bar`. Now we need to add a second brush for the scatter plot.  

In [8]:
# Make a brush
brush_scatter = alt.selection_interval()

# bind the brush to the scatter and filter on the bars
two_waybar = barChar.transform_filter(brush_scatter)
two_wayscatter = scatter.add_params(brush_scatter)

two_waybar & two_wayscatter

### 1.2 Brushing on the legend

In [9]:
brush_legend = alt.selection_point(
    fields = ["artist"],
    bind="legend" 
)

scatter = alt.Chart(data = df_big3).mark_circle().encode(
    x = "danceability", 
    y = "valence", 
    color=alt.condition(brush_legend, "artist:N", alt.value("lightgray")), 
).add_params(brush_legend)


scatter

Let's layer all of this together!

In [10]:
two_wayscatter = scatter.add_params(brush_scatter)
two_wayscatter = two_wayscatter.encode(
    color=alt.condition(brush_legend, "artist:N", alt.value("lightgray")), 
).transform_filter(brush_bar)


two_waybar & two_wayscatter

## 2. Maps and interactions

We will use geopandas to read the geographic data, which also makes it easy to inspect as a table. 

In [13]:
import geopandas as gpd

Let's use a map of Boston neighborhoods; the file can be found on the Canvas site. 

In [15]:
df_nb = gpd.read_file("BPDA_Neighborhood_Boundaries.geojson")

df_nb.head()

Unnamed: 0,sqmiles,name,neighborhood_id,acres,SHAPE__Length,objectid,SHAPE__Area,geometry
0,2.51,Roslindale,15,1605.568237,53563.912597,53,69938270.0,"MULTIPOLYGON (((-71.12593 42.27201, -71.12611 ..."
1,3.94,Jamaica Plain,11,2519.245394,56349.937161,54,109737900.0,"POLYGON ((-71.10499 42.32610, -71.10503 42.326..."
2,0.55,Mission Hill,13,350.853564,17918.724113,55,15283120.0,"POLYGON ((-71.09043 42.33577, -71.09050 42.335..."
3,0.29,Longwood,28,188.611947,11908.757148,56,8215904.0,"POLYGON ((-71.09811 42.33673, -71.09832 42.337..."
4,0.04,Bay Village,33,26.539839,4650.635493,57,1156071.0,"POLYGON ((-71.06663 42.34878, -71.06663 42.348..."


In [16]:
# make the map
alt.Chart(df_nb).mark_geoshape(
    fill = "#eee", 
    stroke = "black"
).project(
    type='identity',
    reflectY=True
)

Cool, where's the data!?!

In [18]:
# Let's see blue bikes by area
blue_bikes = pd.read_csv("bluebikes_boston.csv")
blue_bikes = blue_bikes[blue_bikes.Total_docks != "null"]

blue_bikes.head()

Unnamed: 0.1,Unnamed: 0,Number,Name,Latitude,Longitude,District,Public_,Total_docks,ObjectId,geometry,neighborhood
0,0,A32040,Honan Library,42.360274,-71.128525,Boston,Yes,15,1,POINT (-71.12852452 42.3602737),Allston
1,1,D32060,Hood Park,42.380045,-71.073046,Boston,Yes,23,2,POINT (-71.07304573 42.38004535),Charlestown
2,2,B32005,Christian Science Plaza - Massachusetts Ave at...,42.343666,-71.085824,Boston,Yes,19,3,POINT (-71.08582377 42.34366582),Back Bay
3,4,C32099,Circuit Drive at American Legion Hwy,42.297041,-71.091719,Boston,Yes,19,5,POINT (-71.09171927 42.29704126),Roxbury
4,6,C32104,Cleary Sq,42.2556,-71.12444,Boston,Yes,16,7,POINT (-71.12444 42.2556),Hyde Park
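
One thing worth checking with a filter like `Total_docks != "null"`: with its default settings, `pd.read_csv` already parses the literal text `null` as a missing value (`NaN`). If that is how missing dock counts are stored in the raw file, comparing against the string removes nothing, and `dropna` is the filter that actually drops those rows. Here is a minimal sketch with made-up CSV text (the station names and values are assumptions for illustration):

```python
import io
import pandas as pd

# Made-up CSV text; the second station has the literal text "null"
csv_text = "Name,Total_docks\nStation A,15\nStation B,null\nStation C,19\n"
toy = pd.read_csv(io.StringIO(csv_text))

# With default settings "null" has already become NaN, so the column is numeric
print(toy["Total_docks"].isna().sum())

# Comparing against the string "null" keeps every row
kept_by_string_test = toy[toy["Total_docks"] != "null"]

# dropna removes the row with the missing dock count
kept_by_dropna = toy.dropna(subset=["Total_docks"])

print(len(kept_by_string_test), len(kept_by_dropna))
```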


In [19]:
points = alt.layer(
    alt.Chart(blue_bikes).mark_point().encode(
        longitude = "Longitude:Q", 
        latitude = "Latitude:Q", 
        tooltip = ["Name"]
    )
).properties(
    width = 500, 
    height = 300
)

points

Let's layer the charts!

In [20]:
alt.Chart(df_nb).mark_geoshape(
    stroke="black",  
).project(
    "identity",
    reflectY=True
).encode(
    color= "name",
    opacity = alt.value(0.3),
    tooltip = ["name"]
).properties(
    width = 500, 
    height = 300, 
    title = "Bike Stations in Boston"
) + points



Let's do some brushing and linking!

Let's make a choropleth linked to a bar chart based on the total number of bike docks per neighborhood!


In [21]:
# sum every numeric column per neighborhood; only Total_docks is used below
counts = blue_bikes.groupby("neighborhood").sum(numeric_only=True).reset_index()

counts.head()

Unnamed: 0.1,neighborhood,Unnamed: 0,Latitude,Longitude,Total_docks,ObjectId
0,Allston,1452,465.925144,-782.407138,187,1463
1,Back Bay,1996,423.485635,-710.792606,190,2006
2,Beacon Hill,1033,169.424441,-284.284371,88,1037
3,Brighton,2046,381.146487,-640.351054,142,2055
4,Charlestown,2470,466.153631,-781.709521,213,2481
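
The table above sums every numeric column, so the `Latitude` and `Longitude` totals carry no meaning. If only the dock totals are needed, one option is to select the column before aggregating. Here is a minimal sketch on made-up station rows (the `stations` frame and its values are assumptions for illustration):

```python
import pandas as pd

# Made-up stations, only for illustration
stations = pd.DataFrame({
    "neighborhood": ["Allston", "Allston", "Back Bay"],
    "Latitude": [42.36, 42.35, 42.34],
    "Total_docks": [15, 19, 23],
})

# Select the column first, then sum it within each neighborhood
dock_counts = (
    stations.groupby("neighborhood")["Total_docks"]
    .sum()
    .reset_index()
)

print(dock_counts)
```

The result has just two columns, `neighborhood` and `Total_docks`, which is all the charts need.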


We will learn about lookups, which let us 'merge' the data from two datasets inside the visualization. 

These are common for maps, as we often need to combine the geospatial data with whatever attributes we want. 

We can specify these lookups with `transform_lookup`.


In [23]:
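# One selection is enough here: we only change opacity (no filtering),
# and transform_lookup adds a `neighborhood` field to the map data,
# so the same field-based selection works in both charts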
click = alt.selection_point(fields=["neighborhood"])

base = alt.Chart(df_nb).mark_geoshape().encode(
    tooltip="name:N",
    color="Total_docks:Q",
    opacity=alt.condition(click, alt.value(1), alt.value(0.2))
).transform_lookup(
    lookup='name',
    from_=alt.LookupData(counts, 'neighborhood', ['neighborhood', 'Total_docks'])
).add_params(click).properties(
    width=300,
    height=300
).project(
    type='identity', reflectY=True
)

bars = alt.Chart(counts).mark_bar().encode(
    x=alt.X("neighborhood:N", sort="-y"),
    y="Total_docks:Q",
    tooltip = "Total_docks:Q", 
    opacity=alt.condition(click, alt.value(1), alt.value(0.2))
).add_params(click)

base | bars
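
Conceptually, the `transform_lookup` above behaves much like a left join in pandas: every row of the map data is kept, and the matching `Total_docks` value is attached where the keys agree. Here is a rough sketch of the same idea with `merge`, using two tiny made-up frames (the choice of neighborhoods, and the assumption that one of them has no stations, are for illustration only):

```python
import pandas as pd

# Stand-in for the neighborhood shapes (geometry left out)
shapes = pd.DataFrame({"name": ["Allston", "Back Bay", "Longwood"]})

# Stand-in for the per-neighborhood dock totals
dock_counts = pd.DataFrame({
    "neighborhood": ["Allston", "Back Bay"],
    "Total_docks": [187, 190],
})

# Keep every shape; attach dock totals where name == neighborhood
joined = shapes.merge(
    dock_counts, how="left", left_on="name", right_on="neighborhood"
)

print(joined)
```

Rows without a match end up with missing values, which is worth keeping in mind when a neighborhood on the map shows no color.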

## Exercise

We will do an in-class exercise. 

+ Make a visualization with the songs dataset that includes brushing and linking. You cannot use a scatterplot or a bar chart. 
+ Use the Chipotle dataset on Canvas to map the Chipotle locations in the Boston neighborhoods.
    + Try to filter for specific neighborhoods if you can; this might require exploring some visualizations and filtering the data. 



In [50]:
# Make a brush
brush_pie = alt.selection_interval()

line = alt.Chart(df_big3).mark_line().encode(
    x='year:O',  # year is stored as an integer; ':T' would read it as a millisecond timestamp
    y='popularity:Q',
    color='artist:N'
)
line

In [29]:
chipotle_df = pd.read_csv("chipotle.csv")
chipotle_df.head()


Unnamed: 0,state,location,address,latitude,longitude
0,Massachusetts,Boston,"101 Summer St Boston, MA 02110 US",42.353401,-71.058092
1,Massachusetts,Boston,"148 Brookline Ave Boston, MA 02215 US",42.344657,-71.100825
2,Massachusetts,Boston,"283 Washington St Boston, MA 02108 US",42.357549,-71.058429
3,Massachusetts,Boston,"51 Boston Wharf Rd Boston, MA 02210 US",42.351048,-71.045724
4,Massachusetts,Boston,"553 Boylston St Boston, MA 02116 US",42.350657,-71.076134


In [30]:
alt.Chart(df_nb).mark_geoshape(
    fill = "#eee", 
    stroke = "black"
).project(
    type='identity',
    reflectY=True
)

In [33]:
chipotle_points = alt.layer(
    alt.Chart(chipotle_df).mark_point().encode(
        longitude = "longitude:Q", 
        latitude = "latitude:Q", 
        tooltip = ["address"]
    )
).properties(
    width = 500, 
    height = 300
)

chipotle_points

In [45]:
alt.Chart(df_nb).mark_geoshape(
    stroke="black",  
).project(
    "identity",
    reflectY=True
).encode(
    opacity = alt.value(0.15)
).properties(
    width = 500, 
    height = 300, 
    title = "Chipotles in Boston"
) + chipotle_points



## Bonus: Extra Interactions:

I wanted to show a couple more ways in which you can add interaction to your plots! My recommendation is to look at the optional parameters from `selection_point` and `selection_interval` in the [docs](https://altair-viz.github.io/user_guide/generated/api/altair.selection_point.html)

First, we saw how useful the `tooltip` is, but it is not the most customizable. Let's learn how to add hover interactions and create our own version of a tooltip!

To do this we'll do the following:
+ explore the mouseover interaction
+ layer plots on top of each other
    - use `alt.layer()` or the `+`
+ add text and other marks
+ filter on mouse over


In [24]:
# Let's make a selection, but instead of selecting on click
# we select on "mouseover"
hover = alt.selection_point(
    on="mouseover")

What we want is to have the text for every point *in* the visualization, but make it visible only for the point we select.

In [25]:
# basic scatter
scatter = alt.Chart(data = df).mark_circle().encode(
    x = "danceability", 
    y = "valence"
)

base = scatter.transform_filter(hover)
text = base.mark_text(dx = 4, dy = -8, align = 'right', stroke= "black", strokeWidth=1).encode(text="song:N")

# + is used to layer
scatter.add_params(hover) + text

OK, that almost works. It looks good once you start hovering, but before any interaction every label is drawn at once, which is very ugly. 

Let's adjust what is selected *before* we interact with the `empty` param.

In [26]:
# adjust the brush
hover = alt.selection_point(
    on="mouseover", 
    empty=False)

# basic scatter
scatter = alt.Chart(data = df).mark_circle().encode(
    x = "danceability", 
    y = "valence"
)

# filter on the new hover
base = scatter.transform_filter(hover)
text = base.mark_text(dx = 4, dy = -8, align = 'right', stroke= "black", strokeWidth=1).encode(text="song:N")

# bind the new hover
scatter.add_params(hover) + text

That looks better! 

You can find other properties of Text [here](https://altair-viz.github.io/user_guide/marks/text.html) to personalize.
We can add highlight marks, overlay text and add more complex things! 

Here are some examples for you to play with. 

In [27]:
# Code from above
hover = alt.selection_point(
    on="mouseover", 
    empty=False)

scatter = alt.Chart(data = df).mark_circle().encode(
    x = "danceability:Q", 
    y = "valence:Q", 
)

base = scatter.transform_filter(hover)
text = base.mark_text(dx = 10, dy = -15, align = 'right', stroke= "black", strokeWidth=1).encode(text="song:N")

# alt.layer is a cleaner way to do "+" when we have many layers
# here I am also defining each chart inside the layer 
alt.layer(scatter.add_params(hover), 
          # Let's make a circle around the point I'm hovering on
         base.mark_point(size=100, stroke='tomato', strokeWidth=5),  
          # I'll add a white background behind the text
         base.mark_text(dx = 10, dy = -15, align = 'right', stroke= "white", strokeWidth=3).encode(text="song:N"),
         text, 
         ).properties(
    title= "Implementing tool tip 2.0", 
    width = 500

)

## Revisiting Multiple Components/Facets

Finally, I want to draw attention to the `resolve` parameter of `selection_interval`.

If many charts are linked by the same brush, this lets us define whether we want the union or the intersection of the intervals drawn in different charts, or a single selection that resets whenever we brush in another chart. 

In [28]:
brush = alt.selection_interval(
    resolve='intersect' # resolve selections - try 'union', 'intersect' or 'global' and brush on different graphs!
)

alt.Chart(df[df.year == 2017]).mark_circle().encode(
    alt.X(alt.repeat('column'), type='quantitative'),
    alt.Y(alt.repeat('row'), type='quantitative'),
    color=alt.condition(brush,alt.value('tomato'),  alt.value('grey')),
    opacity=alt.condition(brush, alt.value(0.8), alt.value(0.1))
).properties(
    width=140,
    height=140
).repeat(
    # repeat is used to automate concatenation
    column=['danceability','energy', 'valence'],
    row=['danceability','energy', 'valence']
).add_params(
    brush)
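
To reason about what `union` and `intersect` highlight, it can help to think of each panel's brush as a true/false test on the data. The sketch below is a simplified pandas analogy, not what Altair runs internally: it pretends there are two one-dimensional brushes (in the chart above each brush is a rectangle over two fields), and the data values and brush ranges are made up for illustration.

```python
import pandas as pd

# Made-up points, only for illustration
toy = pd.DataFrame({
    "danceability": [0.2, 0.5, 0.8, 0.9],
    "energy":       [0.9, 0.6, 0.7, 0.1],
})

# Pretend one panel has a brush covering danceability from 0.5 to 1.0
in_brush_a = toy["danceability"].between(0.5, 1.0)

# and another panel has a brush covering energy from 0.5 to 1.0
in_brush_b = toy["energy"].between(0.5, 1.0)

# resolve='union': a point is highlighted if it lies inside ANY brush
highlight_union = in_brush_a | in_brush_b

# resolve='intersect': a point is highlighted only if it lies inside ALL brushes
highlight_intersect = in_brush_a & in_brush_b

print(toy.assign(union=highlight_union, intersect=highlight_intersect))
```

With `resolve='global'` there is only ever one brush at a time, so starting a brush in a second panel clears the first.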