Youmi Suk¹ and Chan Park²
¹ Department of Human Development, Teachers College, Columbia University
² Department of Statistics and Data Science, The Wharton School, University of Pennsylvania
Optimal treatment regimes (OTRs) have been widely employed in computer science and personalized medicine to provide data-driven, optimal recommendations to individuals. However, previous research on OTRs has primarily focused on settings that are independent and identically distributed, with little attention given to the unique characteristics of educational settings, where students are nested within schools and there are hierarchical dependencies. The goal of this study is to design OTRs from multisite randomized trials, a commonly used experimental design in education and psychology to evaluate educational programs. We investigate modifications to popular OTR methods, specifically Q-learning and weighting methods, in order to improve their performance in multisite randomized trials. A total of 12 modifications, 6 for Q-learning and 6 for weighting, are proposed by utilizing different multilevel models, moderators, and augmentations. Simulation studies reveal that all Q-learning modifications improve performance in multisite randomized trials and the modifications that incorporate random treatment effects show the most promise in handling cluster-level moderators. Among weighting methods, the modification that incorporates cluster dummies into moderator variables and augmentation terms performs best across simulation conditions. The proposed modifications are demonstrated through an application to estimate an OTR of conditional cash transfer programs using a multisite randomized trial in Colombia to maximize educational attainment.
For more details of our proposed methods, see our paper.
Here, we provide R code to reproduce our simulation study and replicate our data analysis using data on conditional cash transfer (CCT) programs.
- DataGeneratingModels.R
  This R file contains the data-generating code for a multisite randomized trial with a cluster-level unmeasured covariate.
- Qlearn.R
  This R file includes a function named Qlearn that implements our proposed modifications for Q-learning as well as the baseline Q-learning method.
- SimulationCodes.R
  This R file includes simulation code with our proposed modifications for Q-learning and weighting methods, where the parameter beta1 represents the coefficient of a cross-level interaction effect between treatment status and a cluster-level unmeasured covariate. For more information on simulation conditions, please refer to our paper.
- Data on conditional cash transfer (CCT) programs
  For our empirical analysis, we used data collected by Barrera-Osorio et al. (2011). The data can be downloaded from openICPSR. For more information on the data, please refer to the detailed report by Barrera-Osorio et al. (2011) and the codebook provided here.
- CCTdataAnalysisCodes.R
  This R file can be used to replicate our data analysis.
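To give a rough sense of what the files above do, the following is a minimal, self-contained sketch in base R. It is not the code in DataGeneratingModels.R or Qlearn.R. It assumes a simple data-generating model of our own choosing, in which beta1 plays the role described above (a cross-level interaction between treatment and a cluster-level unmeasured covariate), and it uses site dummies interacted with treatment as a stand-in for the multilevel models discussed in the paper. The names J, n, tau, q_base, q_site, and recommend are hypothetical and are introduced only for illustration.

```r
# Illustrative sketch only; not the repository's implementation.
set.seed(1)

J     <- 50   # number of sites (hypothetical value)
n     <- 40   # students per site (hypothetical value)
beta1 <- 1    # cross-level interaction: treatment x unmeasured cluster covariate

site <- rep(seq_len(J), each = n)
U    <- rnorm(J)[site]          # cluster-level covariate, unmeasured by the analyst
X    <- rnorm(J * n)            # student-level moderator
A    <- rbinom(J * n, 1, 0.5)   # treatment randomized at the student level

# The treatment effect depends on X and on the unmeasured U (assumed form).
tau <- 0.5 * X + beta1 * U
Y   <- 1 + X + U + A * tau + rnorm(J * n)

# U is deliberately left out of the analysis data set.
dat <- data.frame(Y = Y, A = A, X = X, site = factor(site))

# Baseline Q-learning: a single regression that ignores the clustering.
q_base <- lm(Y ~ A * X, data = dat)

# One possible modification: site dummies interacted with treatment,
# so that site-specific treatment effects can absorb the effect of U.
q_site <- lm(Y ~ A * X + A * site, data = dat)

# Recommend treatment when the predicted outcome under A = 1 exceeds that under A = 0.
recommend <- function(fit, data) {
  d1 <- transform(data, A = 1)
  d0 <- transform(data, A = 0)
  as.integer(predict(fit, d1) > predict(fit, d0))
}

rule_base <- recommend(q_base, dat)
rule_site <- recommend(q_site, dat)
rule_true <- as.integer(tau > 0)   # known only because the data are simulated

# Share of students whose recommendation matches the true optimal rule.
acc_base <- mean(rule_base == rule_true)
acc_site <- mean(rule_site == rule_true)
```

Comparing acc_base and acc_site shows how much a rule can gain from letting treatment effects vary by site when a cluster-level moderator is unobserved. The random-treatment-effect models examined in the paper would require a mixed-model package and are not reproduced here.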
Please note that these supplemental materials are provided for the purpose of reproducibility and should be used in accordance with academic ethical guidelines. Any reference to these materials in your work should properly cite the original sources.