<img src="img/Brilliant_logo.png" width="20%">

### John Mahoney's Teaching Demo

Sept 8, 2020

mohnjahoney@gmail.com

mohnjahoney@github.io

In [19]:
# Imports
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
#import matplotlib.pyplot as plt
import bqplot as bq
import bqplot.pyplot as plt
import numpy as np
import pandas as pd
import copy
import os
from sklearn.linear_model import LinearRegression

# Simpson's Paradox

Some ideas in statistics have an intuitive feel to them, even if they might take some work to compute precisely.
For example, we can intuitively understand that the **mean** (or the average) of a set of numbers is just "the middle".
Using this intuition, we might guess that the mean of $\{2, 4, 7, 9\}$ is near 5 or 6 and the mean of $\{18, 55, 93\}$ is probably close to 50.
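
To check those guesses, here is a minimal sketch using Python's standard `statistics` module. The variable names are only for illustration.

```python
from statistics import mean

first = [2, 4, 7, 9]
second = [18, 55, 93]

mean_first = mean(first)    # 22 / 4
mean_second = mean(second)  # 166 / 3

print(mean_first)             # 5.5
print(round(mean_second, 2))  # 55.33
```

Both guesses land within a few units of the exact values, which is all the "middle" intuition promises.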

Some other statistical ideas are more subtle.
In this lesson, we'll be exploring an idea known as **Simpson's Paradox**.
This idea involves correlations and the effect of aggregating subsets of data.

<div class="alert alert-info" role="alert">
    <h2> Simpson's paradox in a nutshell: </h2>
    <p style="font-size:1.5em"> Combining subsets of data can result in counter-intuitive reversals of trends. </p>
    
<!--        <p class="lead"> Each of two data subsets displays the same trend, yet the whole dataset displays the opposite trend. </p> -->
</div>


Our goals in this lesson are to:
- notice how even in some simple situations our intuition may be wrong,
- learn why our intuition fails so that the examples seem less paradoxical,
- become aware of the different flavors of this paradox,
- understand how to act when faced with this paradox in real life.

Let's dig in by exploring three different examples.


# Example 1: The shape of fish.

Here we see fish of many different sizes.
We can divide them into a red group and a blue group.
Notice that the red fish are generally "tall" while the blue fish are generally "long".

In [2]:
# Create fish data
fish = pd.DataFrame({'color':['red', 'red', 'red', 'blue', 'blue', 'blue'], 
                     'length':[0.5, 1.0, 1.5, 2.0, 3.0, 4.0], 
                     'height':[2.5, 3.5, 4.5, 1.0, 1.5, 2.0]})
# fish = pd.DataFrame({'color':['red', 'red', 'red', 'blue', 'blue', 'blue'], 
#                      'length':[0.5, 1.0, 1.5, 1.0, 2.5, 4.0], 
#                      'height':[2.5, 3.5, 4.5, 1.0, 1.5, 2.0]})

# We don't want the image xspan to be as big as the length coordinate - otherwise the fish will overlap too much.
x_scale = 0.5
y_scale = 0.5

fish['xspan'] = x_scale * fish['length']
fish['yspan'] = y_scale * fish['height']

# We want the fish to start out somewhere "random".
fish['init_x'] = [4.2, 1.5, 3.9, 0.5, 3.1, 1.3]
fish['init_y'] = [0.6, 3.8, 4.4, 4.5, 2.3, 1]

#fish

In [3]:
# TODO: Do I have to define the scales before defining the images??
x_sc = bq.LinearScale(min=0, max=5)
y_sc = bq.LinearScale(min=0, max=6)

def get_image(color):
    if color == 'red':
        image = bq.Image(image=ipyimageA, scales={'x':x_sc, 'y':y_sc})
    elif color == 'blue':
        image = bq.Image(image=ipyimageB, scales={'x':x_sc, 'y':y_sc})
    else:
        raise ValueError(f"Unknown fish color: {color!r}")
    return image


def scale_and_place_image(row, init=False):

    if init:
        x = row['init_x']
        y = row['init_y']
    else:
        x = row['length']
        y = row['height']

    # NOTE: The extent of the image does not depend on whether it is in "initial" or "normal" position.
    dx = row['xspan']
    dy = row['yspan']
    
    if init:
        image = row['image']
    else:
        image = row['image2']
    image.x = [x - dx/2, x + dx/2]
    image.y = [y - dy/2, y + dy/2]

    return image

In [4]:
# Put images in the dataframe

# Fish images
image_pathA = os.path.abspath('./img/fish_red.png')
image_pathB = os.path.abspath('./img/fish_blue.png')

with open(image_pathA, 'rb') as fA:
    raw_imageA = fA.read()
with open(image_pathB, 'rb') as fB:
    raw_imageB = fB.read()
    
ipyimageA = widgets.Image(value=raw_imageA, format='png')
ipyimageB = widgets.Image(value=raw_imageB, format='png')

# Get the right color fish.
fish['image'] = fish['color'].apply(get_image)
fish['image2'] = fish['color'].apply(get_image)
# Stretch each fish in some way.
fish['image'] = fish.apply(lambda row: scale_and_place_image(row, init=True), axis=1)
fish['image2'] = fish.apply(lambda row: scale_and_place_image(row, init=False), axis=1)

In [5]:
# Make initial "ocean" plot.
    
ocean_xs = np.linspace(0, 5, 400)
ocean_ys = -0.4 * np.abs(np.sin(4*ocean_xs)) + 6.0

ocean_line = plt.plot(ocean_xs, ocean_ys, scales={'x':x_sc, 'y':y_sc})

marks = list(fish['image']) + [ocean_line]

fig_ocean = plt.figure(marks=marks, axes=[], animation_duration=1000, 
                       fig_margin={'top':0, 'bottom':0, 'left':0, 'right':0}, 
                       layout=widgets.Layout(width='auto'))

box_ocean = widgets.GridspecLayout(1, 1, layout=widgets.Layout(border='black solid 1px', width='50%'))
box_ocean[0, 0] = fig_ocean

box_ocean


## Visualize the data.
After measuring the dimensions of each fish, let's organize each group in a graph according to their length and height.

In [6]:
# Move fish into (length, height) position on respective figures.

# TODO: Use separate scales for the two different plots to make the aggregate relation less obvious?
# TODO: Axis labels
# TODO: Fix layout: wider, less margin
# TODO: Dot size

# Move the fish into their "correct" places.
#fish['image2'] = fish.apply(lambda row: scale_and_place_image(row, init=False), axis=1)

# Mark with center dots for clarity.
scatterA = bq.Scatter(x=fish[fish['color']=='red']['length'], y=fish[fish['color']=='red']['height'], 
                      colors=['Red'], scales={'x': x_sc, 'y': y_sc})
scatterB = bq.Scatter(x=fish[fish['color']=='blue']['length'], y=fish[fish['color']=='blue']['height'], 
                      colors=['Blue'], scales={'x': x_sc, 'y': y_sc})

# Put things into figures.
marksA = list(fish[fish['color']=='red']['image2']) + [scatterA]
marksB = list(fish[fish['color']=='blue']['image2']) + [scatterB]

# figA = bq.Figure(marks=marksA, axes=[x_ax, y_ax], animation_duration=1000, 
#                     min_aspect_ratio=1, max_aspect_ratio=1, layout=widgets.Layout(border='red solid 2px', length='auto'))
# figB = bq.Figure(marks=marksB, axes=[x_ax, y_ax], animation_duration=1000, 
#                      min_aspect_ratio=1, max_aspect_ratio=1, layout=widgets.Layout(border='red solid 2px'))

x_ax = bq.Axis(label='length', scale=x_sc)
y_ax = bq.Axis(label='height', scale=y_sc, orientation='vertical')

figA = bq.Figure(title='Red Fish', marks=marksA, axes=[x_ax, y_ax], animation_duration=1000, 
                 layout=widgets.Layout(border='black solid 1px', width='auto'))
figB = bq.Figure(title='Blue Fish', marks=marksB, axes=[x_ax, y_ax], animation_duration=1000, 
                 layout=widgets.Layout(border='black solid 1px', width='auto'))


# Regression
# Note: Even if the data is in a straight line, use regression here for consistency with the aggregate and also to be more flexible.

regA = LinearRegression().fit(fish[['length']][fish['color']=='red'],
                              fish[['height']][fish['color']=='red'])
regB = LinearRegression().fit(fish[['length']][fish['color']=='blue'],
                              fish[['height']][fish['color']=='blue'])

reg_xs = np.linspace(0, 5, 2).reshape(-1, 1)

reg_ysA = regA.predict(reg_xs)
reg_ysB = regB.predict(reg_xs)

line_regA = bq.Lines(x=reg_xs, y=reg_ysA, colors=['Pink'], scales={'x': x_sc, 'y': y_sc}, stroke_width=0)
line_regB = bq.Lines(x=reg_xs, y=reg_ysB, colors=['lightblue'], scales={'x': x_sc, 'y': y_sc}, stroke_width=0)

figA.set_trait('marks', figA.marks + [line_regA])
figB.set_trait('marks', figB.marks + [line_regB])

regression_button = widgets.ToggleButton(value=False, description='Show regression lines')

def on_regression_button(change):
    if change['new'] == True:
        # Show regression
        line_regA.set_trait('stroke_width', 4)
        line_regB.set_trait('stroke_width', 4)
    else:
        # Don't show 
        line_regA.set_trait('stroke_width', 0)
        line_regB.set_trait('stroke_width', 0)
        
regression_button.observe(on_regression_button, 'value')

box = widgets.AppLayout(left_sidebar=figA, right_sidebar=figB, footer=regression_button, 
                        pane_heights=[0, 10, 1], 
                        layout=widgets.Layout(border='white solid 1px', width='100%'))

#box = widgets.GridspecLayout(1, 2, layout=widgets.Layout(border='white solid 1px', width='100%'))
#box[0, 0] = figA
#box[0, 1] = figB

box


## Patterns in the data

Now that the data have been visualized, we can see some strong trends.
First, focus on the red fish: as the fish get longer, they also get taller.
Now focus on the blue fish: we see a similar trend. As the fish get longer, they get taller too.

Consider two quantities $A$ and $B$.
When $A$ and $B$ tend to increase together, we say that $A$ and $B$ are **positively correlated**.
When $A$ and $B$ tend to change in opposite directions, we say that $A$ and $B$ are **negatively correlated**.
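
One common way to put a number on "tend to increase together" is the Pearson correlation coefficient, which is positive for positively correlated quantities and negative for negatively correlated ones. The lesson itself only needs the sign, so treat the following as an optional sketch. The helper `pearson_r` and the contrasting made-up data are assumptions added here, while the fish measurements come from the data cell above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Fish measurements from the lesson.
red_length, red_height = [0.5, 1.0, 1.5], [2.5, 3.5, 4.5]
blue_length, blue_height = [2.0, 3.0, 4.0], [1.0, 1.5, 2.0]

r_red = pearson_r(red_length, red_height)
r_blue = pearson_r(blue_length, blue_height)

# Made-up pair that moves in opposite directions, for contrast.
r_opposite = pearson_r([1, 2, 3, 4], [8, 6, 5, 1])

print(r_red > 0, r_blue > 0, r_opposite < 0)  # True True True
```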

We can visualize these correlations by drawing a line through each group of fish.
(Try the regression button.)
The red line has a positive (upward) slope, which means that the length and height of red fish are positively correlated.
The blue line also slopes upward, so the length and height of blue fish are positively correlated as well.

The slope of each line tells us quantitatively how much the fish height changes for a given change in fish length.

$$\textrm{slope} = \frac{\Delta y}{\Delta x} = \frac{\textrm{difference in fish height}}{\textrm{difference in fish length}}$$
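
As a worked example, take the shortest and longest fish in each group from the data above. Because each group happens to lie exactly on a straight line, any two fish from the same group give the same slope.

$$\textrm{slope}_{\textrm{red}} = \frac{4.5 - 2.5}{1.5 - 0.5} = 2 \qquad \textrm{slope}_{\textrm{blue}} = \frac{2.0 - 1.0}{4.0 - 2.0} = \frac{1}{2}$$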

## Quiz!

<div class="alert alert-success" role="alert">
</div>

- Find the blue fish that is 4 feet long. How tall is it?
(answer: 2 feet)

- Imagine you find a red fish that is 2 feet long. How tall do you expect it to be?
(answer: 5.5 feet)

- Blue fish: When the length increases by one foot, the height (increases / decreases) by ___ feet.
(answer: increases, 1/2)

- Think! Consider the number of fishermen ($A$) at a pier and the number of fish ($B$) in the water below. Are $A$ and $B$ positively or negatively correlated? Argue both sides.

<!-- <div class="alert alert-info" role="alert">
</div> -->

## Now here comes the "paradox"...

We've seen that, within the group of red fish, longer fish are taller.
The same is true for blue fish.

<p style="font-size:1.5em"> <i>Certainly</i> if we were to consider all fish (red and blue) together, we would continue to find that longer fish are taller. <b>Right?</b>... </p>

Let's see what happens! (Try the "Aggregate groups" button.)

In [7]:
# Aggregate the two groups of fish. Draw an aggregate regression line.

# TODO: If we split the figures up front, here we would merge into the same plot.

# TODO Make one figure where we toggle between all fish with subgroup lines and green dots with aggregate line.

regAB = LinearRegression().fit(fish[['length']],
                              fish[['height']])

reg_ysAB = regAB.predict(reg_xs)

line_regAB = bq.Lines(x=reg_xs, y=reg_ysAB, colors=['black'], scales={'x': x_sc, 'y': y_sc}, stroke_width=0)

marksAB = marksA + marksB + [line_regA, line_regB, line_regAB]

figAB = bq.Figure(title='All Fish', marks=marksAB, axes=[x_ax, y_ax], animation_duration=0, 
                 layout=widgets.Layout(border='black solid 1px', width='auto'))

fish_agg_button = widgets.ToggleButton(value=False, description='Aggregate groups')

# TODO: This is an ugly hack, but it's OK for today.
def on_fish_agg_button(change):
    if change['new'] == True:
        line_regAB.set_trait('stroke_width', 4)
        line_regA.set_trait('stroke_width', 0)
        line_regB.set_trait('stroke_width', 0)
        scatterA.set_trait('colors', ['green'])
        scatterB.set_trait('colors', ['green'])
        for i in range(3):
            marksA[i].x = marksA[i].x * 100
            marksB[i].x = marksB[i].x * 100
    else:
        line_regAB.set_trait('stroke_width', 0)
        line_regA.set_trait('stroke_width', 4)
        line_regB.set_trait('stroke_width', 4)
        scatterA.set_trait('colors', ['red'])
        scatterB.set_trait('colors', ['blue'])
        for i in range(3):
            marksA[i].x = marksA[i].x / 100
            marksB[i].x = marksB[i].x / 100
fish_agg_button.observe(on_fish_agg_button, 'value')

boxAB = widgets.GridspecLayout(1, 2)
boxAB[0, 0] = figAB
boxAB[0, 1] = fish_agg_button

boxAB


Surprisingly, when we aggregate the subgroups, the trend is reversed!
The length and height are now **negatively correlated**.

<div class="alert alert-info" role="alert">
    <h2> Take-home Message </h2>
    <p style="font-size:1.5em"> For each subgroup, longer fish are generally taller. However, when we look at all fish together, longer fish are generally less tall.</p>
</div>
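
The reversal can be confirmed without the widgets. The sketch below fits an ordinary least-squares line to each group and to the pooled data. The helper `least_squares_slope` is written out by hand here as an illustration; the notebook itself uses scikit-learn's `LinearRegression`, which should give the same slopes.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Fish measurements from the lesson.
red = {"length": [0.5, 1.0, 1.5], "height": [2.5, 3.5, 4.5]}
blue = {"length": [2.0, 3.0, 4.0], "height": [1.0, 1.5, 2.0]}

slope_red = least_squares_slope(red["length"], red["height"])
slope_blue = least_squares_slope(blue["length"], blue["height"])
slope_all = least_squares_slope(red["length"] + blue["length"],
                                red["height"] + blue["height"])

print(f"red:  {slope_red:+.2f}")   # red:  +2.00
print(f"blue: {slope_blue:+.2f}")  # blue: +0.50
print(f"all:  {slope_all:+.2f}")   # all:  -0.47
```

The red fish sit up and to the left of the blue fish, so the pooled line has to tilt downward to pass near both clusters even though each cluster rises on its own.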

## Interactive challenge!

Let's stretch your understanding a bit.
We just discussed an example with two subgroups (of fish).
Can you create a Simpson's Paradox situation with **three** subgroups?
Move the shapes (circles, squares, and triangles) so that the correlation is **negative within each group**, yet the **overall trend is positive**.
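
If you want to test an arrangement outside the widget, a check along these lines may help. The function `check_arrangement` is a hypothetical helper, not part of the notebook. It is applied here to the starting positions hard-coded in the cell that follows, so it gives nothing away about a solution.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def check_arrangement(groups):
    """Return (solved, group_slopes, pooled_slope) for a list of (xs, ys) groups.

    solved is True when every group slopes downward and the pooled data slopes upward.
    """
    group_slopes = [least_squares_slope(xs, ys) for xs, ys in groups]
    all_xs = [x for xs, _ in groups for x in xs]
    all_ys = [y for _, ys in groups for y in ys]
    pooled_slope = least_squares_slope(all_xs, all_ys)
    solved = all(s < 0 for s in group_slopes) and pooled_slope > 0
    return solved, group_slopes, pooled_slope

# Starting positions of the shapes in the challenge.
start = [
    ([1.0, 4.0, 6.0], [8.0, 6.0, 9.0]),  # circles
    ([4.0, 5.0, 8.0], [2.0, 1.0, 4.0]),  # squares
    ([3.0, 5.0, 9.0], [1.0, 4.0, 3.0]),  # triangles
]

solved, group_slopes, pooled_slope = check_arrangement(start)
print(solved)  # False
print([round(s, 3) for s in group_slopes], round(pooled_slope, 4))
```

The starting layout is not a solution: every group slopes upward while the pooled line slopes downward, which is the mirror image of the target.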

In [8]:
# Play the game!

# TODO: Add Play Again reset button

xmax = 10
ymax = 9

# xs = np.random.uniform(0, xmax, 9)
# ys = np.random.uniform(0, ymax, 9)
xs = np.array([1.0, 4.0, 6.0, 4, 5, 8, 3, 5, 9])
ys = np.array([8.0, 6.0, 9.0, 2, 1, 4, 1, 4, 3])

circle_inds = [0, 1, 2]
square_inds = [3, 4, 5]
triangle_inds = [6, 7, 8]

fig_challenge1 = plt.figure()
# x_sc = bq.LinearScale(min=0, max=7)
# y_sc = bq.LinearScale(min=0, max=8)

reg0 = LinearRegression().fit(xs[circle_inds].reshape(-1, 1), ys[circle_inds])
reg1 = LinearRegression().fit(xs[square_inds].reshape(-1, 1), ys[square_inds])
reg2 = LinearRegression().fit(xs[triangle_inds].reshape(-1, 1), ys[triangle_inds])
reg012 = LinearRegression().fit(xs.reshape(-1, 1), ys)

reg_xs = np.linspace(0, 10, 2)

reg_ys0 = reg0.predict(reg_xs.reshape(-1, 1))
reg_ys1 = reg1.predict(reg_xs.reshape(-1, 1))
reg_ys2 = reg2.predict(reg_xs.reshape(-1, 1))
reg_ys012 = reg012.predict(reg_xs.reshape(-1, 1))

line_reg0 = plt.plot(reg_xs, reg_ys0, colors=['green'], stroke_width=3)
line_reg1 = plt.plot(reg_xs, reg_ys1, colors=['green'], stroke_width=3)
line_reg2 = plt.plot(reg_xs, reg_ys2, colors=['green'], stroke_width=3)
line_reg012 = plt.plot(reg_xs, reg_ys012, colors=['red'], stroke_width=6, line_style='dashed')

scatter0 = plt.scatter(xs[circle_inds], ys[circle_inds], default_size=600, colors=['gray'], 
                       marker='circle', enable_move=True)
scatter1 = plt.scatter(xs[square_inds], ys[square_inds], default_size=600, colors=['gray'], 
                       marker='square', enable_move=True)
scatter2 = plt.scatter(xs[triangle_inds], ys[triangle_inds], default_size=600, colors=['gray'], 
                       marker='triangle-up', enable_move=True)

status_text = plt.label(["Not Yet"], x=[5], y=[11], 
                            align='middle', font_weight='bold', default_size=24, colors=['Black'])

def randomize_positions(button, scatter0, scatter1, scatter2):
    scatter0.x = np.random.uniform(0, xmax, 3)
    scatter1.x = np.random.uniform(0, xmax, 3)
    scatter2.x = np.random.uniform(0, xmax, 3)
    scatter0.y = np.random.uniform(0, ymax, 3)
    scatter1.y = np.random.uniform(0, ymax, 3)
    scatter2.y = np.random.uniform(0, ymax, 3)
    
button_randomize = widgets.Button(
    description='Randomize positions',
    disabled=False,
    button_style='success', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description')
    
button_randomize.on_click(lambda button: randomize_positions(button, scatter0, scatter1, scatter2))

def is_simpson(reg0, reg1, reg2, reg012, status_text):
    # coef_ has shape (1,) or (1, 1) depending on the shape of y passed to fit(),
    # so reduce each comparison to a plain bool.
    conds = [bool(np.ravel(reg0.coef_)[0] < 0),
             bool(np.ravel(reg1.coef_)[0] < 0),
             bool(np.ravel(reg2.coef_)[0] < 0),
             bool(np.ravel(reg012.coef_)[0] > 0)]

    # Highlight each line whose condition is met; fade the others.
    lines = [line_reg0, line_reg1, line_reg2, line_reg012]
    for line, cond in zip(lines, conds):
        line.set_trait('opacities', [1 if cond else 0.2])

    # Report progress: first the three subgroups, then the overall trend.
    n_groups_ok = sum(conds[:-1])
    if n_groups_ok == 0:
        status_text.text = ["Move the shapes"]
    elif n_groups_ok == 1:
        status_text.text = ["You got one :)"]
    elif n_groups_ok == 2:
        status_text.text = ["You got two!!"]
    elif not conds[-1]:
        status_text.text = ["Now for the total!"]
    else:
        status_text.text = ["You designed a Simpson's Paradox!"]
        
def update_reg_line(change):
    reg0 = LinearRegression().fit(scatter0.x.reshape(-1, 1), scatter0.y.reshape(-1, 1))
    reg1 = LinearRegression().fit(scatter1.x.reshape(-1, 1), scatter1.y.reshape(-1, 1))
    reg2 = LinearRegression().fit(scatter2.x.reshape(-1, 1), scatter2.y.reshape(-1, 1))
    
    newxs = np.concatenate([scatter0.x, scatter1.x, scatter2.x])
    newys = np.concatenate([scatter0.y, scatter1.y, scatter2.y])
    reg012 = LinearRegression().fit(newxs.reshape(-1, 1), newys.reshape(-1, 1))
    
    reg_ys0 = reg0.predict(reg_xs.reshape(-1, 1))
    reg_ys1 = reg1.predict(reg_xs.reshape(-1, 1))
    reg_ys2 = reg2.predict(reg_xs.reshape(-1, 1))
    reg_ys012 = reg012.predict(reg_xs.reshape(-1, 1))

    line_reg0.y = reg_ys0
    line_reg1.y = reg_ys1
    line_reg2.y = reg_ys2
    line_reg012.y = reg_ys012
    
    is_simpson(reg0, reg1, reg2, reg012, status_text)
    
scatter0.observe(update_reg_line, names=['x','y'])
scatter1.observe(update_reg_line, names=['x','y'])
scatter2.observe(update_reg_line, names=['x','y'])

is_simpson(reg0, reg1, reg2, reg012, status_text)
    
plt.xlim(0, xmax)
plt.ylim(0, 11)

fig_challenge1.layout = widgets.Layout(width='100%')

fig_challenge1.marks = fig_challenge1.marks + [status_text]
fig_challenge1.axes[0].set_trait('visible', False)
fig_challenge1.axes[1].set_trait('visible', False)

box_challenge1 = widgets.GridspecLayout(1, 1, layout=widgets.Layout(width='50%'))
box_challenge1[0, 0] = fig_challenge1

# box_challenge1 = widgets.GridspecLayout(3, 4, layout=widgets.Layout(width='80%'))
# box_challenge1[:, 0:3] = fig_challenge1
# box_challenge1[1, 3] = button_randomize

box_challenge1


In [9]:
%%HTML
<style>
td {
  font-size: 24px
}
</style>

In [10]:
# Create batting dataframe

# Set up multi index dataframe for batting stats
years = [2018, 2019]
batting_details = ['hits', 'at-bats', 'BA']

mindex = pd.MultiIndex.from_product([years, batting_details],
                           names=['year', 'stats'])

# This will set dtypes as int. Batting ave will upcast to float later.
bat_df = pd.DataFrame(np.zeros(shape=(2, 6), dtype='int'), index=['Jack', 'Arlo'], columns=mindex)

########
# Enter the main data: 'hits' and 'at bats'
bat_df.loc['Jack'] = [1, 20, 0.0, 24, 80, 0.0]
bat_df.loc['Arlo'] = [12, 80, 0.0, 10, 20, 0.0]
########

# TODO: This could be done better.
bat_df.loc[:, ('Total', 'hits')] = bat_df.loc[:, (2018, 'hits')] + bat_df.loc[:, (2019, 'hits')]
bat_df.loc[:, ('Total', 'at-bats')] = bat_df.loc[:, (2018, 'at-bats')] + bat_df.loc[:, (2019, 'at-bats')]

def compute_BA(df):
    for year in df.columns.get_level_values('year').unique():
        df.loc[:, (year,'BA')] = df[year]['hits'] / df[year]['at-bats']
    return df

bat_df = compute_BA(bat_df)

# Example 2: Batting averages (BA)

In [11]:
# Show this
bat_df[bat_df.columns[0:6]]

# Show this with button click
#bat_df[bat_df.columns[6:]]

| player | 2018 hits | 2018 at-bats | 2018 BA | 2019 hits | 2019 at-bats | 2019 BA |
|---|---|---|---|---|---|---|
| Jack | 1 | 20 | 0.05 | 24 | 80 | 0.3 |
| Arlo | 12 | 80 | 0.15 | 10 | 20 | 0.5 |


Jack and Arlo have been playing baseball for the past two years.
Baseball fans argue about which one is the better player.

Some fans say, "Arlo is definitely the better player! He had a better batting average in each of the two seasons."
(Take a minute and confirm that this is true.)

Other fans respond "Yeah, but Jack is clearly the better player *overall*."
(Compute each overall BA for yourself and confirm that Jack's is higher.)

This is another great example of Simpson's paradox!
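
Both claims can be checked directly from the table. The sketch below uses exact fractions to avoid rounding. The function names are illustrative, and the hits and at-bats are the ones listed above.

```python
from fractions import Fraction

# (hits, at-bats) per season, copied from the table.
seasons = {
    "Jack": {2018: (1, 20), 2019: (24, 80)},
    "Arlo": {2018: (12, 80), 2019: (10, 20)},
}

def batting_average(hits, at_bats):
    """Batting average for one season as an exact fraction."""
    return Fraction(hits, at_bats)

def overall_average(player_seasons):
    """Total hits divided by total at-bats across all seasons."""
    total_hits = sum(h for h, _ in player_seasons.values())
    total_at_bats = sum(a for _, a in player_seasons.values())
    return Fraction(total_hits, total_at_bats)

for name, by_year in seasons.items():
    per_season = {year: float(batting_average(*ha)) for year, ha in by_year.items()}
    print(name, per_season, "overall:", float(overall_average(by_year)))
```

Arlo leads in 2018 (0.15 vs 0.05) and in 2019 (0.5 vs 0.3), yet Jack's overall BA is 25/100 = 0.25 against Arlo's 22/100 = 0.22.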

First, let's try to understand more about how this situation can arise.
Then, we'll come back to the fans and try to help settle their debate.

In [12]:
# slice(None) important for slicing a multi index
hits = bat_df.loc[:, (slice(None), 'hits')].to_numpy()
atbats = bat_df.loc[:, (slice(None), 'at-bats')].to_numpy()

# We want to plot the cumulative.
# -1 because we don't want the Total
hitscum = np.cumsum(hits[:, :-1], axis=1)
atbatscum = np.cumsum(atbats[:, :-1], axis=1)

# Prepend 0s for plotting
hitscum = np.hstack((np.zeros(shape=(2,1)), hitscum))
atbatscum = np.hstack((np.zeros(shape=(2,1)), atbatscum))

#print(hitscum)
#print(atbatscum)

In [13]:
# Color choices
jack_colors = ['red', 'red']
arlo_colors = ['blue', 'blue']

# jack_colors = ['red', 'blue']
# arlo_colors = ['red', 'blue']

# jack_colors = ['red', 'orange']
# arlo_colors = ['blue', 'green']

In [14]:
# Make batting average interactive!

x_sc2 = bq.LinearScale(min=0, max=110)
y_sc2 = bq.LinearScale(min=0, max=27)

x_ax2 = bq.Axis(label='at-bats', scale=x_sc2)
y_ax2 = bq.Axis(label='hits', scale=y_sc2, orientation='vertical')

fig_batting = plt.figure(axes=[x_ax2, y_ax2], 
                        layout=widgets.Layout(width='auto', height='auto'), animation_duration=1000, 
                        fig_margin={'top':20, 'bottom':50, 'left':50, 'right':20}, display_legend=True)

jack_lines = [bq.Lines(x=atbatscum[0, [ind, ind+1]], y=hitscum[0, [ind, ind+1]], colors=[jack_colors[ind]], 
                       stroke_width=5, line_style='solid', scales={'x': x_sc2, 'y': y_sc2}) for ind in range(2)]
arlo_lines = [bq.Lines(x=atbatscum[1, [ind, ind+1]], y=hitscum[1, [ind, ind+1]], colors=[arlo_colors[ind]], 
                       stroke_width=5, line_style='solid', scales={'x': x_sc2, 'y': y_sc2}) for ind in range(2)]

jack_lines[1].set_trait('line_style', 'dashed')
arlo_lines[1].set_trait('line_style', 'dashed')
jack_lines[0].set_trait('labels', [''])
arlo_lines[0].set_trait('labels', [''])
jack_lines[1].set_trait('labels', ['Jack'])
arlo_lines[1].set_trait('labels', ['Arlo'])

vert_lines = [bq.Lines(x=2*[atbatscum[ind, 1]], y=[0, 25], colors=['black'], line_style='solid', 
                       stroke_width=1, scales={'x': x_sc2, 'y': y_sc2}) for ind in [0, 1]]

jack_text = plt.label(["Jack"], x=[35], y=[20], align='middle', font_weight='bold', default_size=24, 
                      colors=['Red'], scales={'x': x_sc2, 'y': y_sc2})
arlo_text = plt.label(["Arlo"], x=[65], y=[20], align='middle', font_weight='bold', default_size=24, 
                      colors=['Blue'], scales={'x': x_sc2, 'y': y_sc2})

fig_batting.marks = vert_lines + jack_lines + arlo_lines + [jack_text, arlo_text]

button_bat1 = widgets.ToggleButtons(
    value='season',
    options=['season', 'at-bats'],
    description='compare by:',
    disabled=False,
    button_style='success', # 'success', 'info', 'warning', 'danger' or ''
    tooltip='Description',
    icon='check')
   
def switch_vals(arrA, arrB):
    temp = copy.copy(arrA)
    arrA = arrB
    arrB = temp
    return arrA, arrB

def on_button_bat1(change):
    # TODO: This function is rather terrible, but abstracting seems like a big pain right now.
    # Switch Arlo lines
    dx0 = np.diff(arlo_lines[0].x)[0]
    dy0 = np.diff(arlo_lines[0].y)[0]
    
    dx1 = np.diff(arlo_lines[1].x)[0]
    dy1 = np.diff(arlo_lines[1].y)[0]
    
    if change['new'] == 'at-bats':
        arlo_lines[1].x = [0, dx1]
        arlo_lines[1].y = [0, dy1]

        arlo_lines[0].x = [dx1, dx0 + dx1]
        arlo_lines[0].y = [dy1, dy0 + dy1]
        
    elif change['new'] == 'season':
        arlo_lines[0].x = [0, dx0]
        arlo_lines[0].y = [0, dy0]

        arlo_lines[1].x = [dx0, dx0 + dx1]
        arlo_lines[1].y = [dy0, dy0 + dy1]
    else:
        print(change)
    
    # Move second vert line
    if change['new'] == 'at-bats':
        vert_lines[1].x = 2*[atbatscum[0, 1]]
    elif change['new'] == 'season':
        vert_lines[1].x = 2*[atbatscum[1, 1]]
    else:
        print(change)
    
button_bat1.observe(on_button_bat1, names='value')

box = widgets.GridspecLayout(3, 4, layout=widgets.Layout(width='80%', height='400px'))

#leg = plt.legend() 
box[:, 0:3] = fig_batting
box[1, 3] = button_bat1
box


In this graph we show cumulative hits vs at-bats.
The solid line represents the first season and the dashed line represents the second season.

## Quiz

<div class="alert alert-success" role="alert">
</div>

- The slope of each line represents the ______ for that season. (answer: batting average)

- We can see that Arlo has the better batting average for each season because:
    - Each blue line is higher than the corresponding red one.
    - Each blue line has a greater slope than the corresponding red one. (correct answer)
    - The blue line sequence ends up with a greater final value than the red one.

- Something about how this process differs from adding fractions.

**DO THIS**
Imagine that each player had 100 at-bats in each season with the same batting averages reported above.
Fill in the table so that this is true.
Who has the greater overall BA? (answer: Arlo)

This tells us that the batting averages alone are not enough information.
We need to know the details - the hits and the at-bats.
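
One way to see why the at-bats matter: a player's overall BA is total hits over total at-bats, which works out to a weighted average of the season BAs, with each season weighted by its share of the at-bats. Here $h_i$ and $a_i$ (symbols introduced just for this sketch) are the hits and at-bats in season $i$.

$$\textrm{overall BA} = \frac{h_1 + h_2}{a_1 + a_2} = \frac{a_1}{a_1 + a_2}\cdot\frac{h_1}{a_1} + \frac{a_2}{a_1 + a_2}\cdot\frac{h_2}{a_2}$$

$$\textrm{Jack: } 0.2 \times 0.05 + 0.8 \times 0.30 = 0.25 \qquad \textrm{Arlo: } 0.8 \times 0.15 + 0.2 \times 0.50 = 0.22$$

With 100 at-bats in every season, both weights become 0.5, giving Jack 0.175 and Arlo 0.325.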

**DO THIS**
If Arlo's worst BA is better than Jack's best BA, can we have a Simpson's paradox situation?
(answer: no)
We can create a nice graphical proof of this.
How to incorporate?
A - Notice: here's a graphic that proves this fact.
B - Here's a graphic. Which fact does it prove?
C - Walk through the construction of the graphical proof.
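
Until the graphical proof is in place, a quick numerical experiment can at least make the claim plausible. A player's overall BA always lies between that player's worst and best season, so if Arlo's worst beats Jack's best, no reversal can occur. The sketch below searches random two-season records for a counterexample. The BA ranges, the seed, and the helper `random_season` are arbitrary choices made for this illustration, and a random search is of course not a proof.

```python
import random
from fractions import Fraction

random.seed(0)  # arbitrary seed so the search is reproducible

def random_season(ba_low, ba_high, max_at_bats=200):
    """Return (hits, at_bats) whose batting average lies in [ba_low, ba_high]."""
    while True:
        at_bats = random.randint(1, max_at_bats)
        hits = random.randint(0, at_bats)
        if ba_low <= Fraction(hits, at_bats) <= ba_high:
            return hits, at_bats

def overall(seasons):
    """Total hits over total at-bats."""
    return Fraction(sum(h for h, _ in seasons), sum(a for _, a in seasons))

reversals = 0
for _ in range(1000):
    # Jack never bats above 0.300; Arlo never bats below 0.350.
    jack = [random_season(Fraction(0), Fraction(3, 10)) for _ in range(2)]
    arlo = [random_season(Fraction(7, 20), Fraction(1)) for _ in range(2)]
    if overall(jack) >= overall(arlo):
        reversals += 1

print(reversals)  # 0
```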

When we compare the two players by season, we see that Arlo has a better BA each year (greater slope).
Yet somehow Jack still comes out the overall leader in BA.

Let's try reorganizing this data: instead of sorting by season, let's sort by comparable at-bats.
From this vantage point, Arlo still leads in one group (20 at-bats) but Jack leads in the other (80 at-bats).
Since the leadership is now mixed, it should be no surprise that either player might attain the higher overall BA.

Notice that Arlo has a significantly better BA in the small group (20 at-bats).
However, Jack's advantage in the large group (80 at-bats) represents the dominant piece of this puzzle.
Even though Jack's BA advantage is less dramatic, this advantage is drawn out over a greater extent (number of at-bats).
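
The same comparison can be done by counting hits. The sketch below regroups the table by number of at-bats and tallies Jack's lead in hits within each group. The dictionary layout and variable names are only for illustration.

```python
# (hits, at-bats) grouped by number of at-bats rather than by season.
groups = {
    20: {"Jack": (1, 20), "Arlo": (10, 20)},   # Jack 2018 vs Arlo 2019
    80: {"Jack": (24, 80), "Arlo": (12, 80)},  # Jack 2019 vs Arlo 2018
}

net_hits_for_jack = 0
for at_bats, players in groups.items():
    lead = players["Jack"][0] - players["Arlo"][0]
    net_hits_for_jack += lead
    print(f"{at_bats} at-bats: Jack's lead in hits = {lead:+d}")

print(f"net over 100 at-bats: {net_hits_for_jack:+d}")
```

Arlo gains 9 hits on Jack in the small group, Jack gains 12 on Arlo in the large group, and the net of 3 hits over 100 at-bats is exactly the gap between 0.25 and 0.22.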

<div class="alert alert-info" role="alert">
    <h2> Take Home Message </h2>
    <p style="font-size:1.5em"> Arlo has the better BA for each season, yet Jack has the greater overall BA. This reversal is possible because of the imbalance of at-bats within each season.</p>
</div>

## Interactive Challenge!

Repeat the Jack and Arlo example, but require both players to have the same number of at-bats within each season.
Can you make a Simpson's paradox now?
If so, show it!
If not, can you explain why this is not possible?

## Interactive Challenge!

Now consider two new baseball players, Joe and Kamala.
Imagine they have played for three seasons.
Can you design a scenario where Joe has a better BA for each season, but Kamala has the better overall BA?
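
A scenario can also be tested away from the widget with a small check like the one below. The function `is_batting_paradox` is a hypothetical helper, not part of the notebook. It is run here on the starting values used in the interactive figure (Joe with 10 hits and Kamala with 12 hits per 20 at-bats in every season), which are not a solution.

```python
from fractions import Fraction

def is_batting_paradox(first, second):
    """True when `first` has the better BA in every season but the worse overall BA.

    Each argument is a list of (hits, at_bats) tuples, one per season.
    """
    wins_every_season = all(
        Fraction(h1, a1) > Fraction(h2, a2)
        for (h1, a1), (h2, a2) in zip(first, second)
    )
    overall_first = Fraction(sum(h for h, _ in first), sum(a for _, a in first))
    overall_second = Fraction(sum(h for h, _ in second), sum(a for _, a in second))
    return wins_every_season and overall_first < overall_second

# Starting values from the interactive figure.
joe = [(10, 20), (10, 20), (10, 20)]
kamala = [(12, 20), (12, 20), (12, 20)]

print(is_batting_paradox(joe, kamala))  # False
```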

In [15]:
# TODO add axis labels
# Make interactive batting challenge

# TODO: Make a toggle or second challenge or just quiz question where the at-bats for each section is the same for both players.

fig_interactive_batting = plt.figure(layout=widgets.Layout(width='auto', height='auto'), animation_duration=100, 
                                     fig_margin={'top':20, 'bottom':50, 'left':50, 'right':20}, display_legend=True)

joe_hits_cum = np.array([0, 10, 20, 30])
joe_atbats_cum = np.array([0, 20, 40, 60])

kamala_hits_cum = np.array([0, 12, 24, 36])
kamala_atbats_cum = np.array([0, 20, 40, 60])

joe_lineA = plt.plot(x=joe_atbats_cum[0:2], y=joe_hits_cum[0:2], colors=['blue'], stroke_width=5)
joe_lineB = plt.plot(x=joe_atbats_cum[1:3], y=joe_hits_cum[1:3], colors=['blue'], stroke_width=5, line_style='dashed')
joe_lineC = plt.plot(x=joe_atbats_cum[2:4], y=joe_hits_cum[2:4], colors=['blue'], stroke_width=5, line_style='dotted')
kamala_lineA = plt.plot(x=kamala_atbats_cum[0:2], y=kamala_hits_cum[0:2], colors=['red'], stroke_width=5)
kamala_lineB = plt.plot(x=kamala_atbats_cum[1:3], y=kamala_hits_cum[1:3], colors=['red'], stroke_width=5, line_style='dashed')
kamala_lineC = plt.plot(x=kamala_atbats_cum[2:4], y=kamala_hits_cum[2:4], colors=['red'], stroke_width=5, line_style='dotted')

joe_line_overall = plt.plot(x=joe_atbats_cum[[0, -1]], y=joe_hits_cum[[0, -1]], colors=['lightblue'], stroke_width=3)
kamala_line_overall = plt.plot(x=kamala_atbats_cum[[0, -1]], y=kamala_hits_cum[[0, -1]], colors=['pink'], stroke_width=3)

joe_scatter0 = plt.scatter(x=[joe_atbats_cum[0]], y=[joe_hits_cum[0]], colors=['blue'], default_size=200, enable_move=False)
kamala_scatter0 = plt.scatter(x=[kamala_atbats_cum[0]], y=[kamala_hits_cum[0]], colors=['red'], default_size=200, enable_move=False)

joe_scatter = plt.scatter(x=joe_atbats_cum[1:], y=joe_hits_cum[1:], colors=['blue'], default_size=200, enable_move=True)
kamala_scatter = plt.scatter(x=kamala_atbats_cum[1:], y=kamala_hits_cum[1:], colors=['red'], default_size=200, enable_move=True)

status_text_congratulations = plt.label([""], x=[30], y=[20], 
                            align='middle', font_weight='bold', default_size=24, colors=['Black'])

htmltext = "blah blah"
htmlWidget = widgets.HTML(value = f"<b><font color='red'>{htmltext}</font></b>")

def on_joe_move(change):
    if change['name'] == 'x':
        newxs = change['new']
        allnewxs = np.insert(newxs, 0, joe_scatter0.x[0])
        
        if np.any(np.diff(allnewxs) < 0):
            # Reject the move
            joe_scatter.x = change['old']
            return
        
        joe_lineA.x = allnewxs[0:2]
        joe_lineB.x = allnewxs[1:3]
        joe_lineC.x = allnewxs[2:4]
        joe_line_overall.x = allnewxs[[0, -1]]
        
    if change['name'] == 'y':
        newys = change['new']
        allnewys = np.insert(newys, 0, joe_scatter0.y[0])
        
        if np.any(np.diff(allnewys) < 0):
            # Reject the move
            joe_scatter.y = change['old']
            return
        
        joe_lineA.y = allnewys[0:2]
        joe_lineB.y = allnewys[1:3]
        joe_lineC.y = allnewys[2:4]
        joe_line_overall.y = allnewys[[0, -1]]

    is_simpson_bat(joe_lineA, joe_lineB, joe_lineC, kamala_lineA, kamala_lineB, kamala_lineC)
    
def on_kamala_move(change):
    if change['name'] == 'x':
        newxs = change['new']
        allnewxs = np.insert(newxs, 0, kamala_scatter0.x[0])

        if np.any(np.diff(allnewxs) < 0):
            # Reject the move
            kamala_scatter.x = change['old']
            return
        
        kamala_lineA.x = allnewxs[0:2]
        kamala_lineB.x = allnewxs[1:3]
        kamala_lineC.x = allnewxs[2:4]
        kamala_line_overall.x = allnewxs[[0, -1]]
        
    if change['name'] == 'y':
        newys = change['new']
        allnewys = np.insert(newys, 0, kamala_scatter0.y[0])
        
        if np.any(np.diff(allnewys) < 0):
            # Reject the move
            kamala_scatter.y = change['old']
            return
        
        kamala_lineA.y = allnewys[0:2]
        kamala_lineB.y = allnewys[1:3]
        kamala_lineC.y = allnewys[2:4]
        kamala_line_overall.y = allnewys[[0, -1]]
    
    is_simpson_bat(joe_lineA, joe_lineB, joe_lineC, kamala_lineA, kamala_lineB, kamala_lineC)

def is_slope_greater(lineA, lineB):
    # Each line has two points, so np.diff returns a one-element array.
    slopeA = np.diff(lineA.y) / np.diff(lineA.x)
    slopeB = np.diff(lineB.y) / np.diff(lineB.x)
    return bool(slopeA[0] > slopeB[0])
    
def is_simpson_bat(joe_lineA, joe_lineB, joe_lineC, kamala_lineA, kamala_lineB, kamala_lineC):
    
    conds = [is_slope_greater(joe_lineA, kamala_lineA), 
             is_slope_greater(joe_lineB, kamala_lineB), 
             is_slope_greater(joe_lineC, kamala_lineC), 
             is_slope_greater(joe_line_overall, kamala_line_overall)]
    
    # For each season (and overall), name the player with the higher batting average.
    nameA, nameB, nameC, name_overall = ['Joe' if cond else 'Kamala' for cond in conds]
    
    htmlWidget.value = "<b><font color='black'>" + \
    """
    <table style='width:250px;font-size:2em'>
      <tr>
        <th>Year</th>
        <th>Leader</th>
      </tr>
      <tr>
        <td>2020</td>
        <td>{}</td>
      </tr>
      <tr>
        <td>2021</td>
        <td>{}</td>
      </tr>
      <tr>
        <td>2022</td>
        <td>{}</td>
      </tr>
      <tr>
        <td>Overall</td>
        <td>{}</td>
      </tr>
    </table>
    """.format(nameA, nameB, nameC, name_overall) + "</font></b>"
    
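    # Simpson's paradox: one player leads every season while the other leads overall.
    # Clear the message again when the condition no longer holds.
    seasons = conds[:3]
    is_paradox = (all(seasons) and not conds[3]) or (not any(seasons) and conds[3])
    status_text_congratulations.text = ['Congratulations!'] if is_paradox else ['']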
        
joe_scatter.observe(on_joe_move, names=['x', 'y'])
kamala_scatter.observe(on_kamala_move, names=['x', 'y'])
    
    
fig_interactive_batting.marks = [joe_lineA, joe_lineB, joe_lineC, 
                                 kamala_lineA, kamala_lineB, kamala_lineC, 
                                 joe_line_overall, kamala_line_overall, 
                                 joe_scatter0, kamala_scatter0, 
                                 joe_scatter, kamala_scatter, 
                                 status_text_congratulations]

is_simpson_bat(joe_lineA, joe_lineB, joe_lineC, kamala_lineA, kamala_lineB, kamala_lineC)

box_interactive_batting = widgets.GridspecLayout(3, 4, layout=widgets.Layout(width='80%', height='400px'))

box_interactive_batting[:, 0:3] = fig_interactive_batting
box_interactive_batting[1, 3] = htmlWidget

box_interactive_batting


## QUIZ:

<div class="alert alert-info" role="alert">
</div>



# Example 3: Red Triangles

There are ten objects divided into two groups of five.
Each object has a color (red or blue) and a shape (cross or triangle).
Initially, both the color and shape are unknown to us.

**Your goal is to find the group (top/bottom) with the most red triangles.**

We can observe either the shape or color, but not both at once.
Use the Reveal dropdown to show the shapes - which group has more triangles?
Then use it to show the colors - which group has more red objects?

Now choose the group that has the most red triangles!

In [18]:
def red_triangles_interactive():
    """
    This interactive displays a flavor of Simpson's paradox.
    
    The user is asked whether there are more red triangles on the top or bottom level.
    
    In this interactive, the user can reveal either the color or the shape but not both.
    Based on this information they must make a choice of level.
    After they have chosen, they may reveal both the color and shape.
    If they choose "rationally", they will find that their choice is wrong.
    
    It is based on the example found in the Martin Gardner book.
    Sometimes this example is seen as being about a woman looking for a man who is kind AND rich.
    She examines the groups of bald men and non-bald men.
    The non-bald men are more likely to be kind.
    The non-bald men are also more likely to be rich.
    However, because of the (anti)correlation between these features, the bald men are more likely to be kind AND rich.
    """
    
    # Each object has a position (x, y), a color, and a shape.
    num = 10
    colors = ['red', 'blue']
    shapes = ['triangle-up', 'cross']

    # This `df_init` remains untouched
    df_init = pd.DataFrame({'x':np.arange(num) % 5, 'y': [0 for ind in range(num//2)] + [1 for ind in range(num//2)], 
                       'color':np.arange(num) % len(colors), 
                       'shape':np.arange(num) % len(shapes)})

    df_init['color'] = df_init['color'].map(dict(zip(np.arange(num), colors)))
    df_init['shape'] = df_init['shape'].map(dict(zip(np.arange(num), shapes)))

    # This dataframe `df` will be modified as we go.
    df = copy.copy(df_init)

    def get_shape_dfs(df):
        dfA = df[df['shape']==shapes[0]]
        dfB = df[df['shape']==shapes[1]]
        return dfA, dfB

    def permute_objects_in_df(df):
        n = len(df)
        halfn = n//2
        inds0 = np.random.choice(np.arange(halfn), replace=False, size=halfn)
        inds1 = np.random.choice(np.arange(halfn), replace=False, size=halfn) + halfn
        allinds = np.append(inds0, inds1)

        permuted_xs = df['x'][allinds].to_numpy()
        permuted_ys = df['y'][allinds].to_numpy() # Note that because we are permuting within rows this does nothing.

        df['x'] = permuted_xs
        df['y'] = permuted_ys

    def sync_scatters_position_w_df(scatterA, scatterB, df):
        # This can sync the scatter plots with either the initial or the active df.
    #     print("in sync")
        dfA, dfB = get_shape_dfs(df)

        scatterA.set_trait('x', dfA['x'].to_numpy())
        scatterA.set_trait('y', dfA['y'].to_numpy())

        scatterB.set_trait('x', dfB['x'].to_numpy())
        scatterB.set_trait('y', dfB['y'].to_numpy())
        
    def sync_scatters_w_df(scatterA, scatterB, df):
        # This is intended as a hard reset

        dfA, dfB = get_shape_dfs(df)

        scatterA.set_trait('x', dfA['x'].to_numpy())
        scatterA.set_trait('y', dfA['y'].to_numpy())

        scatterB.set_trait('x', dfB['x'].to_numpy())
        scatterB.set_trait('y', dfB['y'].to_numpy())
        
        show_color(scatterA, scatterB, df)
        show_shape(scatterA, scatterB)
        
    def on_button_reset(scatterA, scatterB, df, button_reveal, button_choice):
        print("reset")
        sync_scatters_w_df(scatterA, scatterB, df)
        
        # FIX: This apparently does not remove the "color and shape" option and I don't know why not.
        button_reveal.options = ['nothing', 'color', 'shape']
        button_reveal.value = 'nothing'
        
        button_choice.value = None

    def permute_scatter_positions(scatterA, scatterB, df):
        # Permute but within each row
        # We have to operate on both shape types at once.
    #     print("permute scatter position")

        permute_objects_in_df(df)
        sync_scatters_position_w_df(scatterA, scatterB, df)

    def hide_color(scatterA, scatterB):
        scatterA.colors = ['gray']
        scatterB.colors = ['gray']

    def show_color(scatterA, scatterB, df):
        dfA, dfB = get_shape_dfs(df)
        scatterA.colors = list(dfA['color'])
        scatterB.colors = list(dfB['color'])

    def hide_shape(scatterA, scatterB):
        scatterA.set_trait('marker', 'square')
        scatterB.set_trait('marker', 'square')

    def show_shape(scatterA, scatterB):
        scatterA.set_trait('marker', 'triangle-up')
        scatterB.set_trait('marker', 'cross')

    def on_button_choice(change, button_reveal):
    #     print("button choice")
        curr_val = button_reveal.value
        button_reveal.set_trait('options', ('nothing', 'color', 'shape', 'color and shape'))
        button_reveal.value = curr_val

    def on_button_reveal(change, scatterA, scatterB, df):
        # Reveal the requested properties.
        # Shape AND color only accessible after a level has been chosen

        state = change['new']

        if state == 'nothing':
            # Hide color and shape.
            hide_color(scatterA, scatterB)
            hide_shape(scatterA, scatterB)
        elif state == 'color':
            # Hide shape and show color.
            show_color(scatterA, scatterB, df)
            hide_shape(scatterA, scatterB)
        elif state == 'shape':
            # Hide color and show shape.
            hide_color(scatterA, scatterB)
            show_shape(scatterA, scatterB)
        elif state == 'color and shape':
            show_color(scatterA, scatterB, df)
            show_shape(scatterA, scatterB)
        else:
            print("not implemented yet")

        # If we have made a decision, don't continue to permute.
        # We want the user to see how their guess corresponds to reality without any extra confusion.
        if button_choice.value is None:
            permute_scatter_positions(scatterA, scatterB, df)

    # Make a figure showing these objects

    fig = plt.figure(ax=[], layout={'height':'300px', 'width':'500px', 'border':'black solid 2px'}, animation_duration=1000)

    dfA, dfB = get_shape_dfs(df)

    scatterA = plt.scatter(dfA['x'], dfA['y'], colors=list(dfA['color']), marker=shapes[0], default_size=1000)
    scatterB = plt.scatter(dfB['x'], dfB['y'], colors=list(dfB['color']), marker=shapes[1], default_size=1000)

    div_line = plt.plot([-0.5, 4.5], [0.5, 0.5], colors=['black'])

    # Create the buttons
    button_choice = widgets.RadioButtons(description="Choose:", options=['top', 'bottom'], value=None, 
                                         layout={'width':'150px'})

    button_reveal = widgets.Dropdown(description="Reveal:", options=['nothing', 'color', 'shape'], value='nothing', 
                                    layout={'border':'white solid 1px', 'width':'225px'})

    button_permute_positions = widgets.Button(description='permute positions')
    button_reset = widgets.Button(description='reset')

    # `observe` passes a change dict to its callback; `on_click` passes the clicked button.
    # The lambdas forward or ignore that argument as needed.
    button_choice.observe(lambda change: on_button_choice(change, button_reveal), 'value')
    button_reveal.observe(lambda change: on_button_reveal(change, scatterA, scatterB, df), 'value')

    button_permute_positions.on_click(lambda change: permute_scatter_positions(scatterA, scatterB, df))
    button_reset.on_click(lambda change: on_button_reset(scatterA, scatterB, df_init, button_reveal, button_choice))

    fig.axes[0].visible = False
    fig.axes[1].visible = False

    hide_color(scatterA, scatterB)
    hide_shape(scatterA, scatterB)

    rightbox = widgets.VBox([button_reveal, button_reset],
                             layout={'border':'white solid 1px', 'height':'100px'})
    rightbox.layout.align_items = 'center'
    rightbox.layout.justify_content = 'space-around'
    
    box = widgets.AppLayout(left_sidebar=button_choice, center=fig, right_sidebar=rightbox, 
                            pane_widths=[1, 3, 1.5], 
                            layout=widgets.Layout(border='black solid 1px', 
                                                  width='66%',
                                                  align_items='center'),
                            grid_gap='0px')
    
    return box

In [None]:
red_triangles_box = red_triangles_interactive()
red_triangles_box

In [16]:
# # make shape data
# shape_df = pd.DataFrame({'row':[0, 0, 0, 0, 0, 
#                               1, 1, 1, 1, 1], 
#                     'column':[0, 1, 2, 3, 4, 
#                               0, 1, 2, 3, 4],
#                      'shape':[1, 1, 0, 0, 0, 
#                               0, 0, 1, 1, 1], 
#                      'color':[0, 0, 0, 1, 1, 
#                               0, 0, 1, 1, 1]})

# shape_df['color'] = shape_df['color'].map({0:'red', 1:'blue'})

# def shuffle_tokens(shape_df):
#     # Shuffle within each row
#     # There are ways to permute within pandas, but they seemed gross.
    
#     print("shuffle_tokens")
    
#     shuffled_inds = list(np.random.permutation(range(0, 5))) + list(np.random.permutation(range(5, 10)))
#     shape_df['shape'] = shape_df['shape'].iloc[shuffled_inds].to_numpy()
#     shape_df['color'] = shape_df['color'].iloc[shuffled_inds].to_numpy()
#     return shape_df

# #shape_df = shuffle_tokens(shape_df)
# #shape_df

In [17]:
# # TODO: force the player to make a decision for top or bottom before we activate the "shape+color" button
# # TODO: fix the shuffle - not sure what's wrong

# # simpson's paradox with tokens
# x_sc2 = bq.LinearScale(min=-0.5, max=4.5)
# y_sc2 = bq.LinearScale(min=0, max=1)

# x_ax2 = bq.Axis(label='length', scale=x_sc2)
# y_ax2 = bq.Axis(label='height', scale=y_sc2, orientation='vertical')

# # shape choices
# sA = 'triangle-up'
# sB = 'cross'
# sC = 'square'
    
# def make_simpsons_shape_box(shape_df):
#     shapesA_df = shape_df[shape_df['shape']==0]
#     shapesB_df = shape_df[shape_df['shape']==1]
#     shapesA = bq.Scatter(x=shapesA_df['column'], y=shapesA_df['row'], colors=list(shapesA_df['color']), 
#                           marker=sA, default_size=500, scales={'x': x_sc2, 'y': y_sc2})
#     shapesB = bq.Scatter(x=shapesB_df['column'], y=shapesB_df['row'], colors=list(shapesB_df['color']), 
#                           marker=sB, default_size=500, scales={'x': x_sc2, 'y': y_sc2})
#     div_line = bq.Lines(x=[-1, 5], y=[0.5, 0.5], colors=['black'], stroke_width=3, scales={'x': x_sc2, 'y': y_sc2})
#     fig_shape = plt.figure(marks=[shapesA, shapesB, div_line], axes=[], animation_duration=1000, 
#                            layout=widgets.Layout(width='auto', height='200px', border='3px solid black'), 
#                           fig_margin={'top':20, 'bottom':20, 'left':20, 'right':20})
    
#     button_show = widgets.ToggleButtons(
#         value='shape',
#         options=['nothing', 'shape', 'color', 'shape AND color'],
#         description='show me:',
#         disabled=False,
#         button_style='success', # 'success', 'info', 'warning', 'danger' or ''
#         tooltip='Description',
#         icon='check')
    
#     button_shuffle = widgets.Button(
#         description='Shuffle now',
#         disabled=False,
#         button_style='', # 'success', 'info', 'warning', 'danger' or ''
#         tooltip='Description')
    
#     button_shuffle_bool = widgets.Checkbox(
#             value=True,
#             description='Shuffle on observation',
#             disabled=False
#         )
    
#     button_choice = widgets.RadioButtons(options=['???', 'upper', 'lower'], description="Choose", 
#                                         layout = {'width':'max-content'})
    
#     def on_button_show(change, shapesA, shapesB):
        
#         if box[3, 4].value: # shuffle check box
#             on_button_shuffle(change, shapesA, shapesB, shape_df)
    
#         shapesA_df = shape_df[shape_df['shape']==0]
#         shapesB_df = shape_df[shape_df['shape']==1]
    
#         if change['new'] == 'nothing':

#             shapesA.marker='square'
#             shapesB.marker='square'
#             shapesA.colors=['gray']
#             shapesB.colors=['gray']
            
#         elif change['new'] == 'shape':

#             shapesA.marker='triangle-up'
#             shapesB.marker='cross'
#             shapesA.colors=['gray']
#             shapesB.colors=['gray']
            
#         elif change['new'] == 'color':

#             shapesA.marker='square'
#             shapesB.marker='square'
#             shapesA.colors=list(shapesA_df['color'])
#             shapesB.colors=list(shapesB_df['color'])
            
#         elif change['new'] == 'shape AND color':

#             shapesA.marker='triangle-up'
#             shapesB.marker='cross'
#             shapesA.colors=list(shapesA_df['color'])
#             shapesB.colors=list(shapesB_df['color'])

#         else:
#             raise

#     def on_button_shuffle(change, shapesA, shapesB, shape_df):
#         print("on_button_shuffle")
        
#         shape_df = shuffle_tokens(shape_df)

#         shapesA_df = shape_df[shape_df['shape']==0]
#         shapesB_df = shape_df[shape_df['shape']==1]
        
#         print("before")
#         print(shapesA.x)
        
#         shapesA.x = shapesA_df['column']
#         shapesA.y = shapesA_df['row']
#         shapesB.x = shapesB_df['column']
#         shapesB.y = shapesB_df['row']
        
#         print("after")
#         print(shapesA.x)
        
#     def on_button_choice(change):
#         print("choice")
#         print(change['new'])
        
#     button_shuffle.on_click(lambda change:on_button_shuffle(change, shapesA, shapesB, shape_df))
#     button_show.observe(lambda change: on_button_show(change, shapesA, shapesB), 'value')
#     button_choice.observe(on_button_choice, 'value')
    
#     box = widgets.GridspecLayout(4, 5, grid_gap='0px', width='100%')
    
#     box[1, 0] = button_choice
#     box[:, 1:4] = fig_shape
#     box[0:2, 4] = button_show
#     box[2, 4] = button_shuffle
#     box[3, 4] = button_shuffle_bool    

#     return shapesA, shapesB, box


# shapesA, shapesB, box = make_simpsons_shape_box(shape_df)
# #box[0,4].value='nothing'
# box


TODO: If we set up the puzzle right, maybe we don't have to walk through these details as much?
Save for quiz?

## Quiz:

Click the Shape button.

- What is the proportion of triangles in the top group? (2/5)
- What is the proportion of triangles in the bottom group? (3/5)
- Which group has the most triangles? (bottom)


<div class="alert alert-info" role="alert">
    <h2> Take Home Message </h2>
    <p style="font-size:1.5em"> Just because the bottom group has more red shapes and more triangles, this doesn't mean that it has more red triangles.</p>
</div>

This is possible because of how the shape and color are correlated.
On the top, redness is highly correlated with triangleness.
On the bottom, redness is actually anti-correlated with triangleness.
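
The counts behind this message can be checked by hand or with a few lines of code. The sketch below is illustrative only: it reuses the token layout from the commented-out `shape_df` draft, and it assumes that shape code 0 is a triangle, color code 0 is red, and row 1 is the top group (which matches the quiz answers above).

```python
from collections import Counter

# Token layout copied from the commented-out `shape_df` draft.
# Assumptions: shape 0 is a triangle, color 0 is red, row 1 is the top group.
rows   = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
shapes = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
colors = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
group_names = {0: 'bottom', 1: 'top'}

triangles = Counter()
reds = Counter()
red_triangles = Counter()
for row, shape, color in zip(rows, shapes, colors):
    group = group_names[row]
    is_triangle = shape == 0
    is_red = color == 0
    # Booleans count as 0 or 1 when added to a Counter entry.
    triangles[group] += is_triangle
    reds[group] += is_red
    red_triangles[group] += is_triangle and is_red

for group in ('top', 'bottom'):
    print(group, triangles[group], reds[group], red_triangles[group])
```

The bottom group wins on triangles (3 vs 2) and on red objects (3 vs 2), yet the top group has more red triangles (2 vs 1).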

## Wrap up

### Fish

In this example, the fish had three properties: length, width, and color.

We divided the fish into two subgroups based on one property - color.
We found a correlation between length and width within each color subgroup.
We also found that this correlation was reversed when we aggregated color subgroups.
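
A minimal sketch of this reversal, using made-up measurements rather than the fish data from the interactive: the `slope` helper and all numbers below are hypothetical and chosen only so that each color trends upward while the combined data trends downward.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical (length, width) measurements, invented for illustration.
red_length, red_width = [1, 2, 3], [6, 7, 8]     # short, "tall" fish
blue_length, blue_width = [6, 7, 8], [1, 2, 3]   # "long", slim fish

slope_red = slope(red_length, red_width)
slope_blue = slope(blue_length, blue_width)
slope_all = slope(red_length + blue_length, red_width + blue_width)

print(slope_red, slope_blue, slope_all)
```

Both within-color slopes are positive, but the slope of the combined data is negative because the red cluster sits up and to the left of the blue cluster.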

### Batting averages

In this example, the data has four properties: player, season, hits, and at-bats.

(Note that we could replace hits and at-bats with one property: batting average, BA.)

We divided the data into subgroups based on two properties - player and season.
We found that the ratio of hits to at-bats favored the same player in each season.
We found that this ordering was reversed when we aggregated over seasons.
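
To make the reversal concrete, here is a small sketch with invented season totals. The numbers are hypothetical and are not the ones used in the interactive; the helper names `batting_average` and `overall_average` are also just for illustration.

```python
from fractions import Fraction

# Hypothetical (hits, at-bats) per season, invented for illustration.
joe = {2020: (4, 10), 2021: (25, 100), 2022: (6, 20)}
kamala = {2020: (35, 100), 2021: (2, 10), 2022: (5, 20)}

def batting_average(hits, at_bats):
    return Fraction(hits, at_bats)

def overall_average(seasons):
    # Aggregate by summing hits and at-bats, not by averaging the seasonal BAs.
    total_hits = sum(h for h, _ in seasons.values())
    total_at_bats = sum(ab for _, ab in seasons.values())
    return batting_average(total_hits, total_at_bats)

joe_leads_each_season = all(
    batting_average(*joe[year]) > batting_average(*kamala[year]) for year in joe
)
joe_leads_overall = overall_average(joe) > overall_average(kamala)

print(joe_leads_each_season, joe_leads_overall)
```

With these numbers Joe has the higher average in every season, but Kamala has the higher overall average (42/130 against 35/130), because most of her at-bats fall in the season where both players hit well.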

### Red triangles

In this example, the data has three properties: level, shape, and color.

We divided the objects into subgroups based on one property - level.
We found a correlation between level and shape.
We found a correlation between level and color.
We found that when we create a composite property (shape AND color), its correlation with level is reversed.


## Complicated example

Show one of the canonical examples, e.g. the effect of a drug in a table.

Lots of numbers.
Somehow, "adding up" small things results in a larger result than adding up large things.
This is Simpson's paradox.

You can check these numbers to confirm the story - a pain, but straightforward.

Clearly this is *important* (we want to know which drugs to take and we want our doctors to know).

- Show the number table
- Compute for them how Simpson's paradox arises (they don't have to do any math, but they can believe it).
- Question 1: Choose the sentence that explains the source of the paradox.
- Question 2: Which drug should you choose? A - #1, B - #2, C - either, D - depends on if you are a man or woman

TODO: can we decouple this example from gender?

## QUIZ: 

<div class="alert alert-success" role="alert">
</div>

Which of the following *best* characterizes Simpson's paradox?

- A dataset may display both a positive and negative trend at the same time.
- A trend changes sign upon addition of just a single datapoint.
- A trend changes sign when data are combined. (YES)

Which statement best summarizes the origin of Simpson's paradox?

Fish:
- Each data subset trends upward, but the two subsets themselves are arranged in a downward fashion. (+)
- Each subgroup falls along a straight line, but the aggregate does not.
- Whenever you combine subsets with similar trends, the aggregate trend will reverse.

Batting averages:
- Batting averages are *ratios*. To compute the overall BA, divide total hits by total at-bats rather than averaging the seasonal BAs.

<!-- <div class="alert alert-info" role="alert">
</div> -->

### Highlight how this notebook addresses the Brilliant Teaching Principles (https://brilliant.org/principles/):

1. Excites: The greatest challenges to education are disinterest and apathy.

Fish are kind of silly looking.
TODO: Maybe move in that direction for the shape example or at least for the story.

2. Cultivates curiosity: Questions and storytelling that cultivate natural curiosity are better than the threat of a test.

No threats here.
Incorporate a range of difficulty including some very beginner.
Also, the interactive "games" have no time limit - this deemphasizes success vs failure, certainly in a test sense.
    
3. Is active: Effective learning is active, not passive. Watching a video is not enough.

Questions interspersed maintain engagement.
Interactive games.

4. Is applicable: Use it or lose it: it is essential to apply what you're learning as you learn it.

Drug example is very applicable.
Making decisions (or even just understanding your doctor) in the face of statistics is a valuable skill.

5. Is community driven: A community that challenges and inspires you is invaluable.

Not sure how to incorporate this.

6. Doesn't discriminate: Your age, country, and gender don't determine what you are capable of learning.

I took care to choose examples that seemed accessible to a wide audience.
Fish are innocuous.
Simple shapes and colors are accessible across grades, languages, cultures.
Simpson examples often include gender - something I avoided for this reason.

7. Allows for failure: The best learners allow themselves to make many mistakes along their journey.

TODO: Offer more opportunity for mistakes (that can be then fixed).

8. Sparks questions: The culmination of a great education isn't knowing all the answers — it's knowing what to ask.

TODO: End with some good food for thought. Maybe one mathy one and one practical.


# Lessons

- Simpson's paradox arises when a trend reverses as data are aggregated.

- Be careful when you add: are you adding **numbers** (stuff) or **ratios** (not stuff)?

Example: Baseball batting averages can switch when we aggregate. However, **runs** will not switch.

- There are limits: you can't combine batting averages {0.1, 0.2, 0.3} and overcome {0.6, 0.7, 0.8} by aggregating; there has to be some overlap.
        
- What if drug A is better than B for women and men, but B is better for people as a group? 

- People seem to be comfortable with the idea that if the 2019 BA was x and the 2020 BA was y, then the aggregate should be between x and y (and this is true).
- People seem (fairly) comfortable with weighted averages. If there were many more at-bats in 2019, then the aggregate should be closer to x than to y.
- If we have two batters and each has the same distribution of at-bats across 2019 and 2020, then there can be no Simpson's paradox.
- If the two batters have overlapping BAs and different yearly at-bats, and the "lower" one gets pulled up while the "higher" one gets pulled down, the result can be a flip.
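
The last four points all follow from writing the aggregate batting average as a weighted average. As a sketch, with symbols introduced here only for illustration ($h_s$ hits, $n_s$ at-bats, and $b_s = h_s / n_s$ the batting average in season $s$):

$$
\mathrm{BA}_{\text{overall}} = \frac{\sum_s h_s}{\sum_s n_s} = \sum_s w_s \, b_s, \qquad w_s = \frac{n_s}{\sum_t n_t}.
$$

The weights $w_s$ are non-negative and sum to 1, so the aggregate lies between the smallest and largest seasonal averages and sits closer to the seasons with more at-bats. If two batters share the same weights and one has the higher $b_s$ in every season, that batter also has the higher aggregate, so a reversal requires the two batters to have different weights.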

- OK, but what should you *do*?

Take the example of males and females under the old and new education plans.
The new plan increases the scores of both groups, but the overall average score is lower.
So what should you do next year: use the new plan or not?