# Feature Importance in Causal Forests

Feature importance in causal forests measures how much each feature contributes to the estimation of heterogeneous treatment effects (HTEs). This allows researchers to identify which features drive variations in the causal effect across the population. It extends the concept of feature importance from standard random forests to a causal setting.

For example, in a study analyzing a marketing intervention, feature importance can reveal which user characteristics (e.g., age, location, or past behavior) most influence how the treatment (e.g., a coupon) affects purchasing behavior.
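
To make the idea of a feature "driving" the treatment effect concrete, here is a minimal sketch on made-up coupon data. It is not a causal forest: it uses an ordinary least-squares fit with treatment-by-feature interaction terms, which is the simplest way to see which feature moderates an effect. The variable names (`age`, `past_purchases`, `coupon`, `spend`) and the data-generating process are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical, standardized user characteristics
age = rng.normal(size=n)
past_purchases = rng.normal(size=n)

# Randomized coupon assignment (0 = no coupon, 1 = coupon)
coupon = rng.integers(0, 2, size=n)

# Assumed data-generating process: the coupon effect grows with past purchases,
# while age only shifts the baseline spend and does not change the effect.
true_effect = 1.0 + 0.8 * past_purchases
spend = 0.5 * age + true_effect * coupon + rng.normal(size=n)

# Linear model with treatment-by-feature interactions
design = np.column_stack([
    np.ones(n),               # intercept
    age,                      # baseline effect of age
    past_purchases,           # baseline effect of past purchases
    coupon,                   # average coupon effect
    coupon * age,             # does age moderate the coupon effect?
    coupon * past_purchases,  # do past purchases moderate the coupon effect?
])
coef, *_ = np.linalg.lstsq(design, spend, rcond=None)
age_interaction, past_interaction = coef[4], coef[5]

print(f"coupon x age interaction:            {age_interaction:.3f}")
print(f"coupon x past_purchases interaction: {past_interaction:.3f}")
```

Under these assumptions the interaction with `past_purchases` should come out clearly non-zero and the interaction with `age` close to zero. A causal forest aims to answer the same kind of question without assuming that the effect is linear in the features.
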
## Example Code (Using econml for Causal Forests)

Below is an example using the econml library, which supports causal forests via its CausalForestDML implementation:


In [None]:
!pip install econml



In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from econml.dml import CausalForestDML

# Simulated data
np.random.seed(42)
n = 1000
X = np.random.normal(0, 1, size=(n, 5))  # Features
T = np.random.binomial(1, 0.5, size=n)   # Treatment assignment
y = 1 + 2 * T + np.dot(X, [0.5, -0.2, 0.1, 0, 0]) + np.random.normal(0, 1, size=n)  # Outcome

# Train/test split
X_train, X_test, T_train, T_test, y_train, y_test = train_test_split(X, T, y, test_size=0.2, random_state=42)

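# Define the causal forest.
# Note: T is binary, so a classifier (e.g. sklearn's RandomForestClassifier)
# is the more appropriate choice for model_t. Passing a regressor still runs,
# but econml then emits the "not a classifier" warning seen in the output.
causal_forest = CausalForestDML(
    model_t=RandomForestRegressor(n_estimators=100, max_depth=5, random_state=42),
    model_y=RandomForestRegressor(n_estimators=100, max_depth=5, random_state=42),
    discrete_treatment=True,
    random_state=42
)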

# Fit the model
causal_forest.fit(y_train, T_train, X=X_train)

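# Compute feature importance: how much each feature contributes to the
# forest's splits on treatment-effect heterogeneity (not to predicting y)
feature_importances = causal_forest.feature_importances_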

# Display feature importance
for i, importance in enumerate(feature_importances):
    print(f"Feature {i + 1}: Importance = {importance:.4f}")


First stage model has discrete target but model is not a classifier!
First stage model has discrete target but model is not a classifier!


Feature 1: Importance = 0.2551
Feature 2: Importance = 0.2239
Feature 3: Importance = 0.1316
Feature 4: Importance = 0.1889
Feature 5: Importance = 0.2005
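
One caveat when reading these numbers: the simulated outcome above has a constant treatment effect (the coefficient 2 on T) that does not depend on any feature, so no feature truly drives heterogeneity and the fairly even importances are best read as noise. The following is a minimal sanity-check sketch, using plain NumPy rather than econml, that regenerates the same data and compares the difference-in-means effect above and below the median of each feature. The helper `diff_in_means` is a hypothetical name introduced here for illustration.

```python
import numpy as np

# Regenerate the simulated data with the same seed and draw order as above
np.random.seed(42)
n = 1000
X = np.random.normal(0, 1, size=(n, 5))
T = np.random.binomial(1, 0.5, size=n)
y = 1 + 2 * T + np.dot(X, [0.5, -0.2, 0.1, 0, 0]) + np.random.normal(0, 1, size=n)


def diff_in_means(y, T, mask):
    """Treated-minus-control mean outcome within the rows selected by mask."""
    return y[mask & (T == 1)].mean() - y[mask & (T == 0)].mean()


# Overall effect estimate (treatment is randomized, so this is unbiased)
overall_effect = diff_in_means(y, T, np.ones(n, dtype=bool))
print(f"Overall effect estimate: {overall_effect:.3f}")

# For each feature, compare the effect in the upper and lower half
effect_gaps = []
for j in range(X.shape[1]):
    high = X[:, j] > np.median(X[:, j])
    gap = diff_in_means(y, T, high) - diff_in_means(y, T, ~high)
    effect_gaps.append(gap)
    print(f"Feature {j + 1}: effect gap (high - low) = {gap:.3f}")
```

If the gaps are all small relative to the overall effect, that is consistent with a homogeneous effect. To see importances that single out a feature, the simulation would need an effect that varies with it, for example replacing `2 * T` with `(2 + X[:, 0]) * T`.
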
