# Linear balancing weights

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import empirical_calibration as ec # compare implementation - QBAL should be similar modulo simplex restriction

# can also balance on nystrom-approx or kitchen-sink features
import sklearn.kernel_approximation as ska


In [2]:
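def lbw(X0: np.ndarray,
        X1: np.ndarray,
        y: np.ndarray = None,
        w: np.ndarray = None,
        return_weights: bool = True):
    """Linear balancing weights

    Args:
        X0 (np.ndarray): (n0, p) covariate matrix for source population
        X1 (np.ndarray): (p,) moments of target distribution
        y (np.ndarray): outcome vector (required if return_weights is False)
        w (np.ndarray): binary treatment vector (required if return_weights is False)
        return_weights (bool): return the weights instead of the ATT estimate

    Returns:
        if return_weights: np.ndarray: (n0,) weights
        else: float: ATT estimate
    """
    # (X0'X0)^+ X0': maps control outcomes to OLS coefficients
    H00 = np.linalg.pinv(X0.T @ X0) @ X0.T
    # one weight per source row; X0.T @ wt reproduces X1 when X0 has full column rank
    wt = X1 @ H00
    if return_weights:
        return wt
    if y is None or w is None:
        raise ValueError("y and w are required when return_weights is False")
    # ATT: treated mean minus reweighted control mean
    return y[w == 1].mean() - np.average(y[w == 0], weights=wt)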

This formulation makes it clear that the balancing weights depend on the target group only through summary data: the treatment-group covariate means for the ATT, or the target-population covariate means under covariate shift. No unit-level data from the target group is needed.
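As a quick sanity check of this point, the sketch below builds the weights from a synthetic control matrix and a vector of target means, then verifies the balance conditions. The data and the target means are made up for illustration, and the check assumes `X0` contains an intercept column and has full column rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n0, p = 200, 3
# synthetic control covariates with an intercept column (illustrative data only)
X0 = np.c_[np.ones(n0), rng.normal(size=(n0, p))]
# target moments: intercept plus hypothetical covariate means
X1 = np.array([1.0, 0.5, -0.25, 0.1])

# same formula as lbw: wt = X1' (X0'X0)^+ X0'
wt = X1 @ np.linalg.pinv(X0.T @ X0) @ X0.T

# reweighted control moments should match the target moments
balance_gap = np.abs(X0.T @ wt - X1).max()
# the intercept column forces the weights to sum to one
weight_sum = wt.sum()
print("max balance gap:", balance_gap)
print("sum of weights:", weight_sum)
```

Only `X1` enters from the target side, so the same weights could be computed from a published table of means.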

## Kline's examples

The 'OLS is doubly robust' result: the estimator is consistent for the ATT if either the untreated outcome $Y^0$ or the selection odds $\frac{\pi(X)}{1-\pi(X)}$ is linear in $X$.
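The outcome-model half of this result follows from an algebraic identity: the reweighted control mean equals the OLS prediction, fit on controls, evaluated at the target means. The sketch below checks that identity on synthetic data; the coefficients `beta` and target means `X1` are hypothetical values chosen for illustration, and `X0` is assumed to have full column rank.

```python
import numpy as np

rng = np.random.default_rng(1)
n0 = 300
# synthetic control covariates with an intercept column (illustrative data only)
X0 = np.c_[np.ones(n0), rng.normal(size=(n0, 2))]
X1 = np.array([1.0, 0.8, -0.4])        # hypothetical treated-group means
beta = np.array([2.0, 1.5, -3.0])      # hypothetical outcome coefficients
y0 = X0 @ beta + rng.normal(size=n0)   # noisy control outcomes, linear in X

# linear balancing weights, as in lbw
wt = X1 @ np.linalg.pinv(X0.T @ X0) @ X0.T
# OLS fit on controls only
beta_hat = np.linalg.lstsq(X0, y0, rcond=None)[0]

reweighted_mean = wt @ y0       # weighting estimate of E[Y^0 | treated]
imputed_mean = X1 @ beta_hat    # regression-imputation estimate of the same quantity
print(reweighted_mean, imputed_mean)

# with a noiseless linear outcome, the weights recover X1 @ beta exactly
exact_gap = abs(wt @ (X0 @ beta) - X1 @ beta)
print("gap with noiseless outcome:", exact_gap)
```

The two estimates agree to floating-point precision, so the weighting and regression views describe the same estimator.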

Replicating examples from [Kline (2011)](https://eml.berkeley.edu/~pkline/papers/OB_reweighting.pdf). Data and code [here](https://eml.berkeley.edu/~pkline/papers/Oaxaca_web.zip).

In [18]:
df = pd.read_stata("nswre74.dta")
yn, wn = "re78", "treat"
xn = df.columns.drop([yn, wn]).tolist()
n = df.shape[0]
y, w, X = df[yn].values, df[wn].values, np.c_[np.ones(n), df[xn].values]
print("DiM: =", y[w == 1].mean() - y[w == 0].mean())
# control covariate matrix, and target moments
X0, X1 = X[w == 0], X[w == 1].mean(axis = 0)
# estimate ATT
lbw(X0, X1, y, w, return_weights = False)

DiM: = 1794.3433


1784.7850412154503

Applying this to experimental data makes little difference, since covariate imbalances are small and arise only by chance.

In [17]:
df = pd.read_stata("cps3re74.dta")
yn, wn = "re78", "treat"
xn = df.columns.drop([yn, wn]).tolist()
n = df.shape[0]
y, w, X = df[yn].values, df[wn].values, np.c_[np.ones(n), df[xn].values]
print("DiM: =", y[w == 1].mean() - y[w == 0].mean())
# control covariate matrix, and target moments
X0, X1 = X[w == 0], X[w == 1].mean(axis = 0)
# estimate ATT
lbw(X0, X1, y, w, return_weights = False)

DiM: = -635.0259


1701.172900724143

CPS3 has 'mild' selection bias [Smith and Todd (2005)], so we can get close to experimental estimates with reweighting alone.

In [21]:
df = pd.read_csv("lalonde_psid.csv")
yn, wn = "re78", "treat"
xn = df.columns.drop([yn, wn]).tolist()
n = df.shape[0]
y, w, X = df[yn].values, df[wn].values, np.c_[np.ones(n), df[xn].values]
print("DiM: =", y[w == 1].mean() - y[w == 0].mean())
# control covariate matrix, and target moments
X0, X1 = X[w == 0], X[w == 1].mean(axis = 0)
# estimate ATT
lbw(X0, X1, y, w, return_weights = False)

DiM: = -15204.775555988717


687.8220530909257

PSID has worse selection bias, so it is harder to undo with reweighting alone.
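One way to see why stronger selection is harder to undo: these weights are not restricted to the simplex, so a target that lies far from the control covariates pushes some weights negative and shrinks the effective sample size. The sketch below illustrates this on a deterministic toy example rather than the PSID data. The helpers `linear_weights` and `kish_ess` and all numbers are illustrative assumptions, not part of the notebook above.

```python
import numpy as np

def linear_weights(X0, X1):
    """Same formula as lbw: X1' (X0'X0)^+ X0'."""
    return X1 @ np.linalg.pinv(X0.T @ X0) @ X0.T

def kish_ess(wt):
    """Kish effective sample size: (sum of weights)^2 / sum of squared weights."""
    return wt.sum() ** 2 / (wt ** 2).sum()

# toy controls: intercept plus one covariate on an even grid over [0, 1]
x = np.linspace(0.0, 1.0, 50)
X0 = np.c_[np.ones(50), x]

# target mean equal to the control mean: no reweighting needed
wt_near = linear_weights(X0, np.array([1.0, 0.5]))
# target mean far outside the control support: weights must extrapolate
wt_far = linear_weights(X0, np.array([1.0, 2.0]))

for name, wt in [("near", wt_near), ("far", wt_far)]:
    print(name, "ESS:", kish_ess(wt), "share negative:", (wt < 0).mean())
```

In the near case the weights are uniform and the effective sample size equals the number of controls. In the far case a large share of the weights is negative, which a simplex-restricted method would not allow.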