Code for "Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation" (ICLR 2023) (https://arxiv.org/abs/2106.03907)
Update from NeurIPS camera-ready version
As reported by Kompa et al. (2022), our old code may suffer from numerical instability in the Demand design experiment. This occurs because we applied a ReLU activation at the final feature layer, so all features can become zero during training. We updated our code to remove this final activation, which not only resolves the numerical instability but also slightly improves performance.
To ensure reproducibility, we keep the old neural net structure in src/models/DFPV/nn_structure/nn_structure_for_demand_deprecated.py. You can replace src/models/DFPV/nn_structure/nn_structure_for_demand.py with it to reproduce the results in the ICLR paper.
We thank Olawale Salaudeen for alerting us to this issue.
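A minimal sketch of the change described above (the class and layer sizes here are hypothetical illustrations, not the repo's actual network, which lives in src/models/DFPV/nn_structure/nn_structure_for_demand.py): a ReLU on the last feature layer clamps every output to be non-negative and can drive all features to exactly zero during training, whereas dropping that final activation leaves the raw linear output.

```python
import torch
import torch.nn as nn

class FeatureNetDeprecated(nn.Module):
    """Hypothetical sketch of the old structure: final ReLU on the features."""

    def __init__(self, in_dim: int = 1, feat_dim: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),  # final ReLU: features can all collapse to zero
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

class FeatureNet(nn.Module):
    """Hypothetical sketch of the updated structure: no activation on the last layer."""

    def __init__(self, in_dim: int = 1, feat_dim: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim),  # raw linear output, avoiding the dead-feature failure mode
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)
```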
In the proceedings version of this document, we employed a different experimental setting for dSprite. We used the structural function
where each element of the matrix
However, this had the following limitations. Sprite images
To address this limitation, we introduced a new structural function given as
where each element of the matrix
which reflects the position of the sprite images dsprite_org in the config files.
The implementation now uses the more stable formulation for both methods reported in this paper. The original code is kept as deprecated.
- Install all dependencies

  ```
  pip install -r requirements.txt
  ```

- Create empty directories for logging

  ```
  mkdir logs
  mkdir dumps
  ```

- Run codes

  ```
  python main.py <path-to-configs> <problem_setting>
  ```

  `<problem_setting>` can be selected from `ate` and `ope`, which correspond to the ATE experiments and the policy evaluation experiments in the paper. Make sure to input the corresponding config file for each setting. The results can be found in the `dumps` folder. You can run in parallel by specifying the `-t` option.
