## Instrumental Variables Experiment on the [Zika Simulator](https://whynot.readthedocs.io/en/latest/simulators.html#zika-simulator)

In this experiment, we're interested in the effect of reducing the total mosquito population $X$ at time $t=6$ on the subsequent cumulative number of infected (symptomatic) humans $Y$ from times $t=7$ onward.

However, we cannot intervene on $X$ directly. Instead, we rely on an instrument $Z$: the application of mosquito spray at time $t=5$.
In general, $Z$ can take any value in $[0, 1]$, i.e., it is a continuous instrument.

The confounders $C$ in this experiment are the state variables of the system at times $t < 6$. However, to block all backdoor paths from $X$ to $Y$, it suffices to consider only the system state at time $t=5$.

Below, we show how to generate data for this experiment with WhyNot.

In [17]:
%load_ext autoreload
%autoreload 2

In [25]:
import numpy as np
import whynot as wn

from whynot.dynamics import DynamicsExperiment
from whynot.simulators import zika

## Set up the experiment


Extending the experiment API to gracefully handle continuous instruments $Z$ and continuous treatment $X$ requires a bit more work on the backend.
For the time being, we can use the existing API to generate data and ground truth $E[Y \mid \mbox{Do}(X :=x)]$ with a small amount of clunkiness.

We create two experiments. 

- The first generates the observational dataset $(Z, C, X, Y)$ by viewing $Z$ as the "treatment" variable. At present, WhyNot only supports binary treatments, so for simplicity we consider two possible values for $Z$, $Z=0$ and $Z=1$, and suppose each unit is randomly assigned $Z=1$ with probability $0.5$.

- The second experiment will generate the ground truth for a fixed intervention $X=x$.
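With a binary, randomly assigned instrument, one standard estimand is the Wald ratio, shown here as a sketch. It recovers the effect of $X$ on $Y$ only under assumptions that this notebook does not verify for the simulator: $Z$ affects $Y$ only through $X$ (exclusion), $Z$ actually shifts $X$ (relevance), and the effect of $X$ on $Y$ is approximately linear and constant across units. The symbol $\beta_{\text{Wald}}$ is introduced here for illustration.

$$
\beta_{\text{Wald}} = \frac{E[Y \mid Z=1] - E[Y \mid Z=0]}{E[X \mid Z=1] - E[X \mid Z=0]}
$$

Because $Z$ is randomized, the numerator equals $E[Y \mid \mbox{Do}(Z=1)] - E[Y \mid \mbox{Do}(Z=0)]$, the effect of the spray itself, and the denominator rescales it by how much the spray changes the mosquito population.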

In [79]:
def sample_initial_states(rng):
    """Sample initial state by randomly perturbing the default state."""
    state = zika.State().values()
    state *= rng.uniform(low=0.95, high=1.05, size=state.shape)
    return zika.State(*state)

def outcome_extractor(run):
    # Outcome is cumulative symptomatic patients starting 2 time steps after intervention
    return sum(run[time].symptomatic_humans for time in range(7, 20))

def covariate_builder(run):
    """Return the covariates for the IV experiment, in this case X and C."""
    # First covariate is the treatment, the mosquito population at time 6
    # The remaining covariates are the confounders, i.e. the state of the system at time t=5.
    return np.concatenate([[run[6].mosquito_population], run[5].values()])


experiment = DynamicsExperiment(
    name="ZikaExp",
    description="Study effect of indoor spray use on total human infections",
    simulator=zika,
    simulator_config=zika.Config(start_time=0, end_time=20, delta_t=1.0),
    intervention=zika.Intervention(time=5, stop_time=6, indoor_spray_use=1.0),
    state_sampler=sample_initial_states,
    propensity_scorer=0.5,
    outcome_extractor=outcome_extractor,
    covariate_builder=covariate_builder,
)

In [80]:
dataset = experiment.run(num_samples=500, show_progress=True, parallelize=True)

In [72]:
# Construct the observational dataset
Z = dataset.treatments
Y = dataset.outcomes
X = dataset.covariates[:, 0:1]   # The 'treatment' X is the first column of covariates (see the covariate_builder for this experiment)
C = dataset.covariates[:, 1:]    # The remaining columns correspond to the confounders between X and Y

# Compute causal estimates

In [73]:
## TODO: Run your causal estimator here ###
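One candidate estimator to plug in here is two-stage least squares (2SLS). The sketch below is not part of WhyNot: the function name `two_stage_least_squares` and the synthetic data are assumptions made for illustration, and the method presumes a linear, constant effect of $X$ on $Y$, which the Zika simulator need not satisfy. The synthetic check uses a known effect of 2.0 and an unobserved confounder, so the estimate can be compared against a known answer.

```python
import numpy as np


def two_stage_least_squares(Z, X, Y, C):
    """Return the 2SLS coefficient on X, using Z as instrument and C as controls.

    Hypothetical helper for illustration; assumes a linear outcome model.
    """
    n = len(Y)
    Z = np.asarray(Z, dtype=float).reshape(n, -1)
    X = np.asarray(X, dtype=float).reshape(n, -1)
    C = np.asarray(C, dtype=float).reshape(n, -1)
    Y = np.asarray(Y, dtype=float).ravel()
    ones = np.ones((n, 1))

    # Stage 1: regress X on the instrument and the controls.
    first_stage = np.hstack([ones, Z, C])
    gamma, *_ = np.linalg.lstsq(first_stage, X, rcond=None)
    X_hat = first_stage @ gamma

    # Stage 2: regress Y on the fitted X and the same controls.
    second_stage = np.hstack([ones, X_hat, C])
    beta, *_ = np.linalg.lstsq(second_stage, Y, rcond=None)
    return beta[1]  # coefficient on the fitted X


# Synthetic sanity check (not simulator data): the true effect of X on Y is 2.0.
rng = np.random.default_rng(0)
n = 5000
C_sim = rng.normal(size=(n, 2))
U = rng.normal(size=n)  # unobserved confounder of X and Y
Z_sim = rng.binomial(1, 0.5, size=n)  # randomized binary instrument
X_sim = 1.5 * Z_sim + C_sim @ np.array([0.5, -0.3]) + U + rng.normal(size=n)
Y_sim = 2.0 * X_sim + C_sim @ np.array([1.0, 1.0]) + 3.0 * U + rng.normal(size=n)

estimate = two_stage_least_squares(Z_sim, X_sim, Y_sim, C_sim)
print(f"2SLS estimate of the effect of X on Y: {estimate:.3f}")
```

On the notebook's data, the corresponding call would be `two_stage_least_squares(Z, X, Y, C)`. The columns of `C` may be nearly collinear or differ widely in scale; `np.linalg.lstsq` still returns a minimum-norm solution in that case, but standardizing the columns first may be more stable.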

## Computing ground truth effects

Warning: Given how we constructed this experiment, `np.mean(dataset.true_effects)` does *not* correspond to the true treatment effect of $X$ on $Y$. Rather, it corresponds to $E[Y \mid \mbox{Do}(Z=1)] - E[Y \mid \mbox{Do}(Z=0)]$, since we viewed $Z$ as the assigned treatment in the `DynamicsExperiment`.

Since the treatment $X$ is continuous, we do not compute a single treatment effect. Instead, we estimate $E[Y \mid \mbox{Do}(X=x^\ast)]$ using a second `DynamicsExperiment` in which setting $X=x^\ast$ is the binary treatment, applied to every unit.

In [74]:
def compute_true_effect(mosquito_population=1000, num_samples=500):
    """Estimate E[Y | do(X=x)] by Monte Carlo.

    mosquito_population is the treatment: the value x that X is set to at time 6.
    num_samples is the number of Monte Carlo samples used in the estimate.
    """
    
    experiment = DynamicsExperiment(
        name="ZikaRCT",
        description="Study effect of mixed treatment policy on infections in 20 days.",
        simulator=zika,
        simulator_config=zika.Config(start_time=0, end_time=20, delta_t=1.0),
        
        # Shocks are one-time interventions on the simulator state rather than changes in dynamics parameters.
        intervention=zika.Shock(time=6, mosquito_population=mosquito_population),
        
        # Use the same initial state distribution, so expectations are comparable.
        state_sampler=sample_initial_states,
        # Always apply treatment
        propensity_scorer=1.0,
        # Same outcome function
        outcome_extractor=outcome_extractor,
        covariate_builder=covariate_builder,
    )
    
    dataset = experiment.run(num_samples=num_samples, show_progress=True)
    
    # Since treatment is always applied, all outcomes correspond to Y_{do(X=x)}(u).
    return np.mean(dataset.outcomes)
    

In [75]:
compute_true_effect(mosquito_population=0)

2356.9448184418707

In [78]:
compute_true_effect(mosquito_population=8000)

2530.2924523026004
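The two values above can be turned into a rough reference slope for comparison with an IV estimate. The sketch below only does the arithmetic on the reported outputs. It assumes the response $E[Y \mid \mbox{Do}(X=x)]$ is roughly linear between $x=0$ and $x=8000$, which the notebook does not establish, and each value is itself a Monte Carlo estimate from 500 samples, so the result carries sampling noise.

```python
# Monte Carlo estimates of E[Y | do(X=x)] copied from the outputs above.
ground_truth = {0: 2356.9448184418707, 8000: 2530.2924523026004}

x_low, x_high = 0, 8000
difference = ground_truth[x_high] - ground_truth[x_low]

# Average change in Y per unit of mosquito population over [x_low, x_high],
# under the (unverified) assumption of a linear response.
slope = difference / (x_high - x_low)
print(f"difference: {difference:.2f}, slope per mosquito: {slope:.4f}")
```

Evaluating `compute_true_effect` on a finer grid of `mosquito_population` values would show whether the linearity assumption is reasonable.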