# Intuition behind inverse probability weighting

In non-randomized studies, the probability that a person receives a treatment can depend on their covariates. **Inverse probability weighting** (IPW) lets us estimate the causal effect of the treatment in the presence of measured confounders. Each unit is weighted by the inverse of the probability of the treatment it actually received, so that in the weighted sample the treatment is independent of the measured confounders.

The main intuition is that we want to give more weight to units that are underrepresented in a group. For example, suppose we are testing a new drug's effect on pain relief, and older patients are more likely to receive the drug. Young patients are then underrepresented among the treated, so we give the outcomes of treated young patients a higher weight. That way they contribute more to the estimated expected potential outcome under treatment.
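The drug example can be made concrete with a small hand-built population. This is only a sketch: the group sizes, the treatment probabilities (0.8 for older, 0.2 for younger patients), and the pain-relief values are all made-up assumptions, chosen so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical toy population: 10 older and 10 younger patients.
# Assumed treatment probabilities: 0.8 for older, 0.2 for younger patients.
is_old = np.array([1] * 10 + [0] * 10)
treated = np.array([1] * 8 + [0] * 2 + [1] * 2 + [0] * 8)
propensity = np.where(is_old == 1, 0.8, 0.2)

# Assumed pain relief under the drug: 2 points for older, 6 for younger patients.
y_if_treated = np.where(is_old == 1, 2.0, 6.0)
true_mean = y_if_treated.mean()  # mean outcome if everyone were treated: 4.0

mask = treated == 1

# Naive mean over the treated is dominated by older patients: 2.8
naive_mean = y_if_treated[mask].mean()

# Weight each treated patient by 1 / P(treated | age group): 4.0
weights = 1.0 / propensity[mask]
ipw_mean = np.sum(weights * y_if_treated[mask]) / len(treated)

# Share of older patients among the treated, before and after weighting
old_share_raw = is_old[mask].mean()  # 0.8
old_share_weighted = np.sum(weights * is_old[mask]) / np.sum(weights)  # 0.5

print(true_mean, naive_mean, ipw_mean)
print(old_share_raw, old_share_weighted)
```

After weighting, the treated group has the same age mix as the full population, which is what it means for the treatment to be independent of the measured confounder.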

## Limitations

We can't use IPW to adjust for covariates that we didn't measure, and we can never be completely sure that we have accounted for every confounder. If a confounder is left out of the propensity model, exchangeability no longer holds and the estimate can be biased.
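A small simulation can illustrate what an omitted confounder does. This is a sketch under assumed settings, not part of the analysis below: the two binary confounders `x` and `u`, the coefficients, the true effect of 2, and the helper `ipw_ate` are all hypothetical. Propensities are estimated as the treated share within each covariate stratum, which only works because the covariates are binary.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100_000

# Two binary confounders; both affect treatment and outcome (assumed values).
x = rng.binomial(1, 0.5, n)
u = rng.binomial(1, 0.5, n)
t = rng.binomial(1, 0.2 + 0.3 * x + 0.4 * u)
y = 2.0 * t + 1.0 * x + 3.0 * u + rng.normal(size=n)  # true effect is 2
sim = pd.DataFrame({"x": x, "u": u, "t": t, "y": y})


def ipw_ate(frame, covariates):
    """IPW estimate of the ATE, with propensities taken as the
    treated share inside each stratum of `covariates`."""
    e = frame.groupby(covariates)["t"].transform("mean")
    w_treated = frame["t"] / e
    w_control = (1 - frame["t"]) / (1 - e)
    return float(np.mean((w_treated - w_control) * frame["y"]))


ate_all = ipw_ate(sim, ["x", "u"])  # both confounders included
ate_omit = ipw_ate(sim, ["x"])      # u left out of the propensity model
print(ate_all, ate_omit)
```

With both confounders the estimate should land close to 2, while leaving out `u` pushes it well above 2, because treated units are more likely to have `u = 1` and `u` also raises the outcome.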

In [16]:
from dowhy import datasets
import pandas as pd
import numpy as np
from causalinference import CausalModel

# Step 1: Create synthetic data

In [None]:
np.random.seed(42)

data = datasets.linear_dataset(
    beta = 10, # causal effect
    num_common_causes=4, # number of confounders
    num_samples = 10000,
    treatment_is_binary=True,
    outcome_is_binary=False
)

df = data['df']

df.head()

|   | W0 | W1 | W2 | W3 | v0 | y |
|---|---|---|---|---|---|---|
| 0 | -0.485073 | 0.667292 | 2.043201 | 0.964752 | True | 17.107666 |
| 1 | -0.720394 | 1.443989 | 0.00057 | -0.268413 | True | 9.638488 |
| 2 | -0.008957 | -1.011852 | -1.26093 | -0.364971 | False | -5.821974 |
| 3 | -1.263751 | 1.215676 | -0.444036 | -1.214987 | True | 3.508742 |
| 4 | 1.214729 | 0.675652 | 0.531516 | -1.227431 | True | 12.36793 |


The $W_i$ columns are the confounders, `v0` is the treatment, and `y` is the outcome.

In [7]:
# Convert the boolean treatment column v0 to 0/1
df['v0'] = df['v0'].astype(int)

# Rename v0 to treatment and y to outcome
df = df.rename(columns={'v0': 'treatment', 'y': 'outcome'})
df.head()

|   | W0 | W1 | W2 | W3 | treatment | outcome |
|---|---|---|---|---|---|---|
| 0 | -0.485073 | 0.667292 | 2.043201 | 0.964752 | 1 | 17.107666 |
| 1 | -0.720394 | 1.443989 | 0.00057 | -0.268413 | 1 | 9.638488 |
| 2 | -0.008957 | -1.011852 | -1.26093 | -0.364971 | 0 | -5.821974 |
| 3 | -1.263751 | 1.215676 | -0.444036 | -1.214987 | 1 | 3.508742 |
| 4 | 1.214729 | 0.675652 | 0.531516 | -1.227431 | 1 | 12.36793 |


# Step 2: Get raw summary statistics from data

In [8]:
# Build the causal model and print summary statistics
causal = CausalModel(
    Y=df['outcome'].values,
    D=df['treatment'].values,
    X=df[['W0', 'W1', 'W2', 'W3']].values,
)
print(causal.summary_stats)


Summary Statistics

                      Controls (N_c=2269)        Treated (N_t=7731)             
       Variable         Mean         S.d.         Mean         S.d.     Raw-diff
--------------------------------------------------------------------------------
              Y       -2.191        4.217       13.940        4.842       16.132

                      Controls (N_c=2269)        Treated (N_t=7731)             
       Variable         Mean         S.d.         Mean         S.d.     Nor-diff
--------------------------------------------------------------------------------
             X0       -0.779        0.939       -0.088        0.971        0.724
             X1       -0.181        0.685        1.207        0.829        1.825
             X2        0.122        0.997        0.555        0.991        0.435
             X3        0.127        0.969        0.221        1.005        0.095



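From this output:
- There are 2269 units in the control group and 7731 in the treatment group.
- The average outcome is -2.191 for the control group and 13.940 for the treatment group. The raw difference of 16.132 is well above the true effect of 10 set by `beta`, because the comparison is confounded.
- Nor-diff is the normalized (standardized) difference in covariate means between the two groups. It measures how similar the treatment and control groups are with respect to a covariate.
    - As a rule of thumb, if |Nor-diff| < 0.1, the groups are considered balanced with respect to that covariate.
- Here only X3 (0.095) falls below that threshold. X0, X1, and X2 are clearly imbalanced, so the groups are not balanced overall.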
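The Nor-diff column can be approximately reproduced by hand. The sketch below assumes the usual normalized-difference formula, the difference in means divided by the square root of the average of the two group variances. That formula is an assumption about what the library computes, but it matches the printed values up to rounding of the inputs. The helper name `normalized_difference` is ours.

```python
import numpy as np


def normalized_difference(mean_c, sd_c, mean_t, sd_t):
    """Difference in means scaled by the root of the average group variance."""
    pooled_sd = np.sqrt((sd_c ** 2 + sd_t ** 2) / 2)
    return (mean_t - mean_c) / pooled_sd


# (control mean, control s.d., treated mean, treated s.d.) from the summary table
summary = {
    "X0": (-0.779, 0.939, -0.088, 0.971),
    "X1": (-0.181, 0.685, 1.207, 0.829),
    "X2": (0.122, 0.997, 0.555, 0.991),
    "X3": (0.127, 0.969, 0.221, 1.005),
}

nor_diff = {name: normalized_difference(*vals) for name, vals in summary.items()}

# Covariates that fail the |Nor-diff| < 0.1 rule of thumb
imbalanced = [name for name, d in nor_diff.items() if abs(d) >= 0.1]

for name, d in nor_diff.items():
    print(f"{name}: {d:.3f}")
print("Imbalanced:", imbalanced)
```

Because the inputs are the rounded values from the table, the results can differ from the printed Nor-diff in the last digit.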

# Step 3: Estimate the propensity score of each observation

A propensity score is the probability that a unit receives the treatment, given its covariates. `est_propensity()` estimates it with a logistic regression.

In [13]:
# Estimate propensity scores
causal.est_propensity()
print(causal.propensity)
print(causal.propensity['fitted'])


Estimated Parameters of Propensity Score

                    Coef.       S.e.          z      P>|z|      [95% Conf. int.]
--------------------------------------------------------------------------------
     Intercept     -0.052      0.053     -0.982      0.326     -0.156      0.052
            X0      2.171      0.067     32.537      0.000      2.040      2.302
            X1      4.525      0.117     38.816      0.000      4.297      4.754
            X2      1.350      0.054     25.134      0.000      1.244      1.455
            X3      0.286      0.044      6.500      0.000      0.200      0.373

[0.99295272 0.99217314 0.00156753 ... 0.69143426 0.99983862 0.99943713]


The table shows the estimated coefficients of the logistic regression, and `causal.propensity['fitted']` gives the fitted propensity score of each observation.
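To see how a fitted score follows from the coefficients, the sketch below plugs the first row of the data into the logistic function by hand. It assumes the model is linear in the four covariates, as the coefficient table suggests, and uses the rounded coefficients printed above, so it agrees with the first fitted value only approximately.

```python
import numpy as np

# Rounded coefficients from the table: intercept, X0, X1, X2, X3
coef = np.array([-0.052, 2.171, 4.525, 1.350, 0.286])

# Covariates W0..W3 of the first row shown by df.head()
row0 = np.array([-0.485073, 0.667292, 2.043201, 0.964752])

# Linear predictor (log-odds), then the logistic function
logit = coef[0] + coef[1:] @ row0
score = 1.0 / (1.0 + np.exp(-logit))

print(f"{score:.4f}")  # close to the first fitted value, 0.9930
```

Scores this close to 0 or 1 mean that some units were almost certain to be treated or untreated, and the few units that went the other way receive very large weights.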

# Step 4: Inverse probability weighting

After estimating the propensity scores in Step 3, we can perform the inverse probability weighting.

In [15]:
# Inverse probability weighting
causal.est_via_weighting()
print(causal.estimates)


Treatment Effect Estimates: Weighting

                     Est.       S.e.          z      P>|z|      [95% Conf. int.]
--------------------------------------------------------------------------------
           ATE     10.000      0.002   6455.889      0.000      9.997     10.003



This estimate of the average treatment effect, 10.000, matches the causal effect (`beta = 10`) that we specified in the data-generating process.
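For reference, a weighting estimator can also be written by hand. The sketch below is a normalized (Hajek-style) IPW estimator and is not the routine that `causalinference` runs internally. To our understanding, its documentation describes `est_via_weighting()` as a doubly-robust version of the Horvitz-Thompson estimator, so the two need not agree exactly. The function name `hajek_ipw_ate` and the toy arrays are illustrative assumptions.

```python
import numpy as np


def hajek_ipw_ate(y, t, e):
    """Normalized IPW estimate of the average treatment effect.

    y: outcomes, t: 0/1 treatment indicators, e: propensity scores in (0, 1).
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    w_treated = t / e              # 1 / e for treated units, 0 otherwise
    w_control = (1 - t) / (1 - e)  # 1 / (1 - e) for control units, 0 otherwise
    mean_treated = np.sum(w_treated * y) / np.sum(w_treated)
    mean_control = np.sum(w_control * y) / np.sum(w_control)
    return mean_treated - mean_control


# Toy check with made-up numbers
y_toy = [3.0, 5.0, 1.0, 2.0]
t_toy = [1, 1, 0, 0]
e_toy = [0.8, 0.2, 0.8, 0.2]
print(hajek_ipw_ate(y_toy, t_toy, e_toy))

# In this notebook the call would look like:
# hajek_ipw_ate(df['outcome'].values, df['treatment'].values,
#               causal.propensity['fitted'])
```

Dividing by the sum of the weights rather than the sample size keeps the estimate stable when a few propensity scores are very close to 0 or 1.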