In [1]:
import time

import jax.numpy as jnp
import numpy as np
import pandas as pd
from src.mcnnm.utils import generate_data
from src.mcnnm.wrappers import estimate

In [2]:
import causaltensor.cauest.MCNNM as MC

# Comparison of Causaltensor, Fect, and Lightweight-MCNNM
This notebook compares the performance of Causaltensor, Fect, and Lightweight-MCNNM in estimating the average treatment effect in a panel data setting. The comparison is based on the following metrics: treatment effect estimate, execution time, and MAE and MSE of the untreated counterfactual outcome matrix. The comparison is based on a simulated dataset with 50 units and 50 periods without any covariates. The dataset is generated using the `generate_data` function from the `util` module. The true treatment effect is set to 5. The untreated counterfactual outcome matrix is also generated using the true parameters. The three estimators are then run on the generated data, and the results are compared. The reason why these estimators are compared without covariates is that they handle covariates differently: lightweight-mcnnm exactly follows the description in section 8.1 of [the paper](https://www.tandfonline.com/doi/full/10.1080/01621459.2021.1891924) and regularizes covariates separately, while causaltensor and fect do not handle covariates in the same way (fect seems to regularize covariates as well, but neither causaltensor nor fect allow for unit-time specific covariates). Colab can not be used to run this notebook because it requires a local R installation. All results were obtained on a 2021 10-core Apple M1 Pro CPU.

## Generate Data

In [3]:
nobs, nperiods = 50, 50

Y, W, X, Z, V, true_params = generate_data(
    nobs=50,
    nperiods=20,
    unit_fe=True,
    time_fe=True,
    X_cov=False,
    Z_cov=False,
    V_cov=False,
    seed=2024,
    assignment_mechanism="staggered",
    treatment_probability=0.1,
)

tau = true_params["treatment_effect"]
Y_0 = jnp.array(true_params["Y(0)"])

In [4]:
# Define a function to compute the MSE of two matrices
def mse(A, B):
    return jnp.mean((A - B) ** 2)

## Run all three estimators

### Causaltensor

In [5]:
# Code adapted from Causaltensor's Matrix Completion Example: https://colab.research.google.com/github/TianyiPeng/causaltensor/blob/main/tests/MCNNM_test.ipynb#scrollTo=LSYGyn4cl9Bd (last cell)
# Causaltensor nomenclature: observation matrix O and treatment pattern Z
# so O is Y and Z is W
# Causaltensor by default uses 6 candidate lambdas
# input arrays have to be numpy
Y_np = np.array(Y)
W_np = np.array(W)

causaltensor_start_time = time.time()
solver = MC.MCNNMPanelSolver(Z=W_np, O=Y_np)
ct_res = solver.solve_with_cross_validation(K=5)
causaltensor_exec_time = time.time() - causaltensor_start_time

  res.tau = np.sum((self.O - res.baseline_model)*self.Z) / np.sum(self.Z)


### Lightweight-MCNNM

In [11]:
mcnnm_start_time = time.time()
results = estimate(Y=Y, Mask=W, K=5, n_lambda=6)
mcnnm_exec_time = time.time() - mcnnm_start_time

### Fect

In [7]:
# Convert Y and W to long format
y_df = pd.DataFrame(Y).reset_index()
y_long = pd.melt(y_df, id_vars=["index"], var_name="period", value_name="Y")
y_long = y_long.rename(columns={"index": "unit"})

w_df = pd.DataFrame(W).reset_index()
w_long = pd.melt(w_df, id_vars=["index"], var_name="period", value_name="D")
w_long = w_long.rename(columns={"index": "unit"})

# Combine Y and W data
data = pd.merge(y_long, w_long[["unit", "period", "D"]], on=["unit", "period"])

# Rename columns
data = data.rename(columns={"unit": "id", "period": "time"})

# Sort the data
data = data.sort_values(["id", "time"]).reset_index(drop=True)
data.to_csv("fect_data.csv", index=False)  # Save the long format DataFrame to a CSV file

After manually running the code contained in fect_test.R in Rstudio, fect version 0.1.0, we can load the results::

In [8]:
fect_results = pd.read_csv("fect_results.csv")  # Read the results

# Access the values
fect_tau = fect_results["att_avg"].values[0]
fect_lam = fect_results["lambda_cv"].values[0]
fect_Y_0 = fect_results.filter(regex="^Y_ct_").values.T
fect_exec_time = fect_results["elapsed_time"].values[0]

## Results Comparison:

In [12]:
print("Causaltensor:")
print(f"true effect: {tau}, estimated effect: {ct_res.tau:.4f}")
print(f"Execution time: {causaltensor_exec_time:.2f} s, Cross-validated lambda: Not made available by Causaltensor")
print("-" * 100)
print("Fect:")
print(f"true effect: {tau}, estimated effect: {fect_tau:.4f}")
print(f"Execution time: {fect_exec_time:.2f} s, Cross-validated lambda: {fect_lam:.6f}")
print(
    f"MSE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): {mse(Y_0, fect_Y_0):.4f}"
)
print(
    f"MAE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): {jnp.mean(jnp.abs(Y_0 - fect_Y_0)):.4f}"
)
print("-" * 100)
print("Lightweight-MCNNM:")
print(f"true effect: {tau}, estimated effect: {results.tau:.4f}")
print(f"Execution time: {mcnnm_exec_time:.2f} s, Cross-validated lambda: {results.lambda_L:.6f}")
print(
    f"MSE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): {mse(Y_0, results.Y_completed):.4f}"
)
print(
    f"MAE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): {jnp.mean(jnp.abs(Y_0 - results.Y_completed)):.4f}"
)

Causaltensor:
true effect: 5.0, estimated effect: 5.0559
Execution time: 11.90 s, Cross-validated lambda: Not made available by Causaltensor
----------------------------------------------------------------------------------------------------
Fect:
true effect: 5.0, estimated effect: 4.9551
Execution time: 0.61 s, Cross-validated lambda: 0.006636
MSE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): 2.4865
MAE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): 1.1319
----------------------------------------------------------------------------------------------------
Lightweight-MCNNM:
true effect: 5.0, estimated effect: 4.9338
Execution time: 0.66 s, Cross-validated lambda: 0.000000
MSE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): 2.4205
MAE of Y(0) (The untreated counterfactual outcome matrix completed by these estimators ): 0.7748


We can see that all three estimators provide similar estimates of the treatment effect, with lightweight-mcnnm being the least accurate. Causaltensor takes the longest to run, while fect is the fastest. The MSE of the untreated counterfactual outcome matrix is lowest for fect, followed by causaltensor and lightweight-mcnnm. The MAE and MSE of the imputed outcome matrix are also lowest for lightweight-mcnnm, followed by fect and causaltensor. The loss in treatment effect accuracy is likely explained by discrepancies in coefficient initialization and optimization convergence. 