# Quantifying causality using quasi-experiments: interactive figures

Below are companion interactive figures for the static inline figures shown in "Quantifying causality using quasi-experiments."

To run/refresh the interactive widgets, re-run the code cells: `Cell -> Run all`

In [1]:
# Code hider, source: http://chris-said.io/2016/02/13/how-to-make-polished-jupyter-presentations-with-optional-code-visibility/
from IPython.display import HTML

HTML('''
<script>
  function code_toggle() {
    if (code_shown){
      $('div.input').hide('500');
      $('#toggleButton').val('Show Code')
    } else {
      $('div.input').show('500');
      $('#toggleButton').val('Hide Code')
    }
    code_shown = !code_shown
  }

  $( document ).ready(function(){
    code_shown=false;
    $('div.input').hide()
  });
</script>
<form action="javascript:code_toggle()"><input type="submit" id="toggleButton" value="Show Code"></form>
''')



## Imports

In [2]:
import matplotlib.pyplot as plt
from ipywidgets import interact_manual, IntSlider, FloatSlider

from utils import *

%matplotlib inline

## Figure 2: Instrumental Variables

You can use the "x" and "y" dropdowns to plot the relationship between any pair of four variables on the left-hand plot: the instrument $IV$, the treatment $X$, a confounder $Z$ and the outcome $Y$. The right-hand plot uses simulated data to display a density of treatment effect estimates made by both regression and instrumental variable analysis.

You can use the sliders to adjust the effect of the confounder and the size of any violation of the exclusion restriction, that is, any effect of the instrument on the outcome that does not pass through the treatment.
Note that the IV estimates remain unbiased (on average, they recover the correct treatment effect) regardless of the amount of confounding, but become biased whenever the exclusion restriction fails.
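
The contrast between regression and IV estimates can be sketched outside the widget as follows. This is a minimal illustration under assumptions: the linear data-generating process in `simulate_iv` is hypothetical and is not the notebook's `generate_iv_data` (whose internals are not shown here), and the IV estimate uses the simple ratio-of-covariances (Wald) form.

```python
import numpy as np

def simulate_iv(n, treat_effect, confound_effect, instrument_str,
                confounder_str, exclusion_effect=0.0, seed=0):
    """Hypothetical linear model (an assumption, not utils.generate_iv_data)."""
    rng = np.random.default_rng(seed)
    instrument = rng.normal(size=n)
    confounder = rng.normal(size=n)
    # The treatment is driven by the instrument and by the confounder.
    treatment = (instrument_str * instrument
                 + confounder_str * confounder
                 + rng.normal(size=n))
    # A nonzero exclusion_effect lets the instrument reach the outcome directly,
    # which violates the exclusion restriction.
    outcome = (treat_effect * treatment
               + confound_effect * confounder
               + exclusion_effect * instrument
               + rng.normal(size=n))
    return instrument, treatment, outcome

def ols_estimate(treatment, outcome):
    """Slope of a simple regression of outcome on treatment."""
    c = np.cov(treatment, outcome)
    return c[0, 1] / c[0, 0]

def iv_estimate(instrument, treatment, outcome):
    """Wald estimator: cov(instrument, outcome) / cov(instrument, treatment)."""
    return (np.cov(instrument, outcome)[0, 1]
            / np.cov(instrument, treatment)[0, 1])

# Exclusion restriction holds: IV should land near the true effect of 2,
# while the regression estimate is pulled away by the confounder.
iv, x, y = simulate_iv(100_000, 2, 4, 0.8, 0.5)
ols_valid = ols_estimate(x, y)
iv_valid = iv_estimate(iv, x, y)

# Exclusion restriction violated: the IV estimate drifts away from 2.
iv2, x2, y2 = simulate_iv(100_000, 2, 4, 0.8, 0.5, exclusion_effect=0.5)
iv_violated = iv_estimate(iv2, x2, y2)

print(f"OLS: {ols_valid:.2f}  IV (valid): {iv_valid:.2f}  "
      f"IV (violated): {iv_violated:.2f}")
```

Under this assumed model, the IV estimate in the violated case converges to the true effect plus `exclusion_effect / instrument_str`, so a weaker instrument amplifies the bias from the same violation.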

In [3]:
n_samples = 500
n_trials = 50
x = 'treatment'
y = 'outcome'
iv_columns = ['treatment', 'outcome', 'confounder', 'instrument']
treat_effect = 2
confound_effect = 4
instrument_str = 0.8
confounder_str = 0.5

style = {'description_width': 'initial'}
confound_str_slider = FloatSlider(style=style, description="confound effect", min=-1, max=1, value=0.5)
exclusion_str_slider = FloatSlider(style=style, description="exclusion restriction effect", min=-1, max=1, value=0)
@interact_manual(x=iv_columns, y=iv_columns, 
                 confounder_str=confound_str_slider, 
                 exclusion_str=exclusion_str_slider)
def show_iv_widget(x, y,  confounder_str, exclusion_str):
    data_df = generate_iv_data(n_samples, treat_effect, confound_effect, instrument_str, confounder_str, exclusion_effect=exclusion_str)
    fig, (ax1,ax2) = plt.subplots(1,2, figsize=(10,5))
    
    plot_iv_scatter(x, y, data_df, ax1)
    plot_iv_dist(n_trials, n_samples, treat_effect, confound_effect, instrument_str, confounder_str, ax2, exclusion_effect=exclusion_str)
    fig.suptitle("Figure 2: Instrumental Variable Interactive Widget")
    ax2.set_xlim(-2, 6)  # set limits on the estimate-density plot explicitly


---

## Figure 3: Regression Discontinuity Design

The left-hand plot shows the relationship between the running variable $R$ and the outcome $Y$, highlighting in red the data points that fall within the bandwidth and are used to fit the local regressions above and below the threshold. The right-hand plot uses simulated data to display a density of treatment effect estimates made by both regression and regression discontinuity design analysis.

You can use the sliders to adjust the bandwidth size, and also use the dropdown to indicate whether the relationship between $R$ and $Y$ is linear or nonlinear. Note that larger bandwidths increase the precision of the treatment effect estimates, but risk biasing the estimates when the relationship between $R$ and $Y$ is nonlinear.
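
The effect of bandwidth on the estimate can be sketched with a pair of local linear fits. This is an illustration under assumptions: `simulate_rdd` is a hypothetical data-generating process with a threshold at 0 and an optional cubic term, not the notebook's `generate_rdd_data`, and the estimator is the difference between the two fitted intercepts at the threshold.

```python
import numpy as np

def simulate_rdd(n, treat_effect, nonlinear=False, seed=0):
    """Hypothetical model (an assumption, not utils.generate_rdd_data)."""
    rng = np.random.default_rng(seed)
    running = rng.uniform(-10, 10, size=n)
    treated = running >= 0  # treatment switches on at the threshold R = 0
    outcome = treat_effect * treated + running + rng.normal(size=n)
    if nonlinear:
        outcome = outcome + 0.05 * running**3  # hypothetical cubic term
    return running, outcome

def rdd_estimate(running, outcome, bandwidth):
    """Jump at the threshold from separate linear fits on each side."""
    below = (running < 0) & (running >= -bandwidth)
    above = (running >= 0) & (running <= bandwidth)
    # np.polyfit with degree 1 returns (slope, intercept).
    _, intercept_below = np.polyfit(running[below], outcome[below], 1)
    _, intercept_above = np.polyfit(running[above], outcome[above], 1)
    return intercept_above - intercept_below

r_lin, y_lin = simulate_rdd(20_000, 15)
r_non, y_non = simulate_rdd(20_000, 15, nonlinear=True)

wide_linear = rdd_estimate(r_lin, y_lin, bandwidth=5)
narrow_nonlinear = rdd_estimate(r_non, y_non, bandwidth=1)
wide_nonlinear = rdd_estimate(r_non, y_non, bandwidth=5)

print(f"linear, bandwidth 5:    {wide_linear:.2f}")
print(f"nonlinear, bandwidth 1: {narrow_nonlinear:.2f}")
print(f"nonlinear, bandwidth 5: {wide_nonlinear:.2f}")
```

In this sketch the wide bandwidth is harmless when the relationship is linear, but with the cubic term the straight-line fits misjudge the intercepts and the estimate falls below the true effect of 15. The narrow bandwidth stays close to 15 at the cost of using fewer points.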

In [4]:
n_samples = 500
n_trials = 50
treat_effect = 15
confound_effect = 3
C_R = 0.3

style = {'description_width': 'initial'}
rdd_columns = ["treat", "confound", "running", "outcome"]
bandwidth_slider = FloatSlider(style=style, description="bandwidth",min=0.5, max=10, value=1)
@interact_manual(bandwidth=bandwidth_slider,
                 nonlinear=[False,True])
def show_rdd_widget(bandwidth, nonlinear):
    x="running"
    y="outcome"
    rdd_df = generate_rdd_data(n_samples, treat_effect, confound_effect, C_R, nonlinear=nonlinear)
    fig, (ax1,ax2) = plt.subplots(1,2, figsize=(10,5))
    
    plot_rdd_scatter(x, y, rdd_df, bandwidth, treat_effect, ax1)
    plot_rdd_dist(n_trials, n_samples, treat_effect, confound_effect, C_R, bandwidth, ax2, nonlinear=nonlinear)
    fig.suptitle("Figure 3: Regression Discontinuity Design Interactive Widget")
    ax1.set_ylim(-30-treat_effect, 30 + treat_effect)
    ax2.set_xlim(treat_effect-30, 30 + treat_effect)



---

## Figure 4: Difference-in-differences

The left-hand plot shows the outcome $Y$ for both the treated and control groups over time, with the treatment $X$ applied to the treated group to the right of the dotted line. The right-hand plot uses simulated data to display a density of treatment effect estimates made by both a single-difference estimate (the before/after change in the treated group's outcomes) and a difference-in-differences analysis.

You can use the slider to adjust the parallel trends effect, which is the ratio of the control group's time effect to the treated group's time effect. Note that the parallel trends assumption holds when this ratio is 1, and that the difference-in-differences estimates become biased when this assumption is violated.
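
The two estimators can be sketched with a two-period, two-group simulation. This is an illustration under assumptions: `simulate_did` is a hypothetical data-generating process, not the notebook's `generate_did_data`, although its parameter names mirror the ones used in the cell below.

```python
import numpy as np

def simulate_did(n, treat_time_effect, control_time_effect, control_offset,
                 treat_effect, seed=0):
    """Hypothetical two-period model (an assumption, not utils.generate_did_data)."""
    rng = np.random.default_rng(seed)
    return {
        "treated_pre": rng.normal(0.0, 1.0, n),
        # After treatment, the treated group gains its time effect plus the treatment effect.
        "treated_post": rng.normal(treat_time_effect + treat_effect, 1.0, n),
        "control_pre": rng.normal(control_offset, 1.0, n),
        # The control group only experiences its own time effect.
        "control_post": rng.normal(control_offset + control_time_effect, 1.0, n),
    }

def single_difference(groups):
    """Before/after change in the treated group only."""
    return groups["treated_post"].mean() - groups["treated_pre"].mean()

def diff_in_diff(groups):
    """Treated change minus control change."""
    control_change = groups["control_post"].mean() - groups["control_pre"].mean()
    return single_difference(groups) - control_change

time_effect, treat_effect = 2, 2

# Ratio of 1: both groups share the same time effect (parallel trends hold).
parallel = simulate_did(50_000, time_effect, time_effect * 1.0, -2, treat_effect)
# Ratio of 0.2: the control group's trend is flatter (parallel trends violated).
violated = simulate_did(50_000, time_effect, time_effect * 0.2, -2, treat_effect)

print(f"single difference (parallel): {single_difference(parallel):.2f}")
print(f"diff-in-diff (parallel):      {diff_in_diff(parallel):.2f}")
print(f"diff-in-diff (violated):      {diff_in_diff(violated):.2f}")
```

In this sketch the single difference absorbs the time effect and so overstates the treatment effect, while difference-in-differences recovers it only when the ratio is 1. With a ratio of 0.2, the uncancelled part of the treated group's time effect, `time_effect * (1 - 0.2)`, is added to the estimate.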

In [5]:
n_samples = 500
seed = 42
time_effect = 2
time_ratio = 0.2
treat_time_effect = time_effect
control_time_effect = time_effect * time_ratio

control_offset = -2
treat_effect = 2

time_ratio_slider = FloatSlider(style=style, description="parallel trends effect", min=0, max=2, value=1)

@interact_manual(time_ratio=time_ratio_slider)
def show_did_widget(time_ratio):
    treat_time_effect = time_effect
    control_time_effect = time_effect * time_ratio
    did_df = generate_did_data(n_samples, treat_time_effect, control_time_effect, control_offset, treat_effect, seed=seed)
    
    fig, (ax1,ax2) = plt.subplots(1,2, figsize=(10,5))
    plot_did_scatter(did_df, ax1)
    plot_did_dist(n_trials, n_samples, treat_time_effect, control_time_effect, control_offset, treat_effect, ax2)
    fig.suptitle("Figure 4: Difference-in-Differences Interactive Widget")
    ax2.set_xlim(-14, 18)
