LBCF: A Large-Scale Budget-Constrained Causal Forest Algorithm

This repo contains all the data, code files, and images mentioned in the Supplementary Materials of the paper "LBCF: A Large-Scale Budget-Constrained Causal Forest Algorithm".

Reproduction Instructions

Section 5.1 Simulation Analysis

Steps to reproduce the results:

1. Set "homePath" in generateSimulationData.R (under the Code/Data_generation folder) to your working directory.
2. Run generateSimulationData.R.
3. Follow the instructions in each model's folder for training and prediction, then run budget_allocation.py for each model.
4. Run Simulation_analysis.py under the Code/Evaluation folder.

Section 5.2 Offline Test

Note: The real-world offline test dataset is protected by copyright, so we cannot disclose the complete data used in this test until we obtain authorization. Readers can still use the provided code to run the algorithm on other datasets. A small subset (about 2000 samples) of the complete data is provided in the Data folder solely for testing the code, NOT FOR REPRODUCING THE RESULTS in Section 5.2, and ALL 2000 SAMPLES HAVE BEEN ENCRYPTED.

Steps to run the code:

1. Follow the instructions under each model's folder for training and prediction.
2. Run budget_allocation-RCT.py for each model.
3. Run Offline_test.py under Code/Evaluation folder.

Implementation Details

The proposed CATE estimation method, UDCF (Unified Discriminative Causal Forest), is implemented by directly modifying the C++ source code of GRF (Generalized Random Forests) to add a new splitting criterion. UDCF therefore requires the same C++ environment as GRF. For details on environment configuration, refer to the GRF GitHub page.
The proposed distributed MCKP (Multiple-Choice Knapsack Problem) algorithm, DGB (Dual Gradient Bisection), is written in Python and runs on PyTorch.
The baseline uplift random forests based on ED (Euclidean Distance), Chi (Chi-Square), and CTS (Contextual Treatment Selection) are imported directly from the CausalML package (see its GitHub page).
The baseline methods CT.ST (Causal Tree with Stochastic Optimization) and CF.DT (Causal Forest with Deterministic Optimization), proposed by Tu et al. [1], are implemented by following the algorithms described in [1]. An R environment is required.
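
To give a rough feel for the budget allocation step, here is a minimal single-machine sketch of a generic Lagrangian-dual bisection for a multiple-choice knapsack problem. It is not the DGB implementation from this repo (which is distributed and runs on PyTorch), and the function names, the tie-breaking rule, and the toy data are all assumptions made for illustration. The idea: for a fixed multiplier lam, each user independently picks the treatment that maximizes value - lam * cost; total cost is non-increasing in lam, so bisection can find the smallest lam whose allocation fits the budget.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def allocate(values: Matrix, costs: Matrix, lam: float) -> List[int]:
    """For a fixed multiplier, pick per user the option maximizing value - lam * cost.

    Ties are broken in favor of the cheaper option (an illustrative choice).
    """
    choices = []
    for v_row, c_row in zip(values, costs):
        best = max(
            range(len(v_row)),
            key=lambda j: (v_row[j] - lam * c_row[j], -c_row[j]),
        )
        choices.append(best)
    return choices


def total_cost(costs: Matrix, choices: List[int]) -> float:
    """Sum the cost of the chosen option for every user."""
    return sum(c_row[j] for c_row, j in zip(costs, choices))


def dual_bisection(
    values: Matrix, costs: Matrix, budget: float, iters: int = 60
) -> Tuple[float, List[int]]:
    """Bisect on the multiplier until the induced allocation fits the budget."""
    if sum(min(c_row) for c_row in costs) > budget:
        raise ValueError("budget is below the cost of the cheapest allocation")

    # lam = 0 is the unconstrained optimum; keep it if it is already feasible.
    choices = allocate(values, costs, 0.0)
    if total_cost(costs, choices) <= budget:
        return 0.0, choices

    # Grow the upper bound until the allocation becomes feasible.
    lam_lo, lam_hi = 0.0, 1.0
    while total_cost(costs, allocate(values, costs, lam_hi)) > budget:
        lam_hi *= 2.0

    # Invariant: lam_lo is infeasible, lam_hi is feasible.
    for _ in range(iters):
        mid = (lam_lo + lam_hi) / 2.0
        if total_cost(costs, allocate(values, costs, mid)) > budget:
            lam_lo = mid
        else:
            lam_hi = mid
    return lam_hi, allocate(values, costs, lam_hi)


if __name__ == "__main__":
    # Toy data: 3 users, option 0 is "no treatment" with zero value and cost.
    values = [[0.0, 3.0, 5.0], [0.0, 2.0, 2.5], [0.0, 4.0, 4.5]]
    costs = [[0.0, 1.0, 3.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
    lam, choices = dual_bisection(values, costs, budget=5.0)
    print(lam, choices, total_cost(costs, choices))
```

Because the relaxation decouples users once lam is fixed, the per-user argmax is the part that parallelizes naturally. The returned allocation always respects the budget but may leave some of it unspent, since the sketch does not include any step to close that gap.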

Reference

[1] Ye Tu, Kinjal Basu, Cyrus DiCiccio, Romil Bansal, Preetam Nandy, Padmini Jaikumar, and Shaunak Chatterjee. 2021. Personalized Treatment Selection using Causal Heterogeneity. In Proceedings of the Web Conference 2021. 1574–1585.
