
Parameter tuning for RDD / lm_forest #1259

Open
corydeburd opened this issue Feb 5, 2023 · 5 comments
@corydeburd

I wanted to check whether the solution for parameter tuning proposed here (#1195) would be valid for the regression discontinuity case, paired with lm_forest() as in the example below. As with the previous link, this method does not currently have a setting to automatically tune parameters.
https://grf-labs.github.io/grf/reference/lm_forest.html

Does the code / intuition in the original post still apply here? That is, with lm_forest() and non-binary "treatments" (i.e., the RD running variable slopes), is this MSE still the object to consider?

@erikcs
Member

erikcs commented Mar 22, 2023

Hi @corydeburd,

That MSE is a reasonable object to consider. Another approach is to treat the RDD forest purely as a data-driven algorithm for finding heterogeneous subgroups, i.e.:

Split the data into training/evaluation sets, then:

  1. On the training data, fit an RDD forest.
  2. On the evaluation data, predict treatment effects and form groups, for example based on which quantile of the RDD CATEs each unit belongs to. Then, in each of these groups, fit an RDD coefficient using your favorite RDD method. If step 1 was successful, you'd ideally see different RDD estimates in the low and high groups, corresponding to low and high effects.

Doing many runs of step 1 with different tuning parameters is fair in the sense that inference should still be valid for the RDD coefficients you estimate in step 2, since the tuned algorithm discovers the subgroups on a held-out data set.
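The split/fit/evaluate recipe above could be sketched roughly as follows. This is only an illustrative sketch on simulated data: the local-linear RDD specification passed as `W` (cutoff indicator, centered running variable, and their interaction), the 0.25 bandwidth, and the plain OLS fit within each group are all my assumptions, not code from this thread.

```r
library(grf)

# Simulated data: X = covariates, Z = running variable, cutoff at 0.
set.seed(1)
n <- 2000
X <- matrix(rnorm(n * 5), n, 5)
Z <- runif(n, -1, 1)
D <- as.numeric(Z >= 0)            # above-cutoff indicator
tau <- 1 + X[, 1]                  # heterogeneous effect (known only in simulation)
Y <- tau * D + Z + 0.5 * rnorm(n)

# Step 0: split into training and evaluation halves.
train <- sample(n, n / 2)
eval.idx <- setdiff(seq_len(n), train)

# Step 1: on the training data, fit an "RDD forest" via lm_forest(),
# with the local-linear RDD regressors as W (an assumed specification).
W <- cbind(D, Z, D * Z)
forest <- lm_forest(X[train, ], Y[train], W[train, ], num.trees = 500)

# Step 2: on the evaluation data, predict the CATE (the coefficient on D)
# and form low/high groups by the median of the predicted effects.
tau.hat <- predict(forest, X[eval.idx, ])$predictions[, 1, ]
group <- cut(tau.hat, quantile(tau.hat, c(0, 0.5, 1)),
             include.lowest = TRUE, labels = c("low", "high"))

# Within each group, estimate the RDD coefficient with your favorite RDD
# method; a crude local-linear OLS near the cutoff is used as a stand-in.
for (g in levels(group)) {
  idx <- eval.idx[group == g & abs(Z[eval.idx]) < 0.25]
  fit <- lm(Y[idx] ~ D[idx] * Z[idx])
  cat(g, "group RDD estimate:", coef(fit)["D[idx]"], "\n")
}
```

If step 1 worked, the "high" group's RDD estimate should exceed the "low" group's, matching the ideal outcome described above.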

@corydeburd
Author

Thanks, this is a great suggestion. I had actually adopted something like this approach, so it's good to know it comes recommended! Holding out is very important, as I think it's very easy to overfit in my situation (it's an RD, so observations near the cutoff have a lot of weight).

@yusukematsuyama

Dear @erikcs and @corydeburd ,

This thread is very helpful. Thank you.
I am also trying to tune parameters in lm_forest() for RDD analysis. Could I have your advice?

  1. Should I split the data into training/evaluation sets of equal size?
  2. Are there recommended parameter ranges (min/max) that I should try?

@erikcs
Member

erikcs commented Dec 20, 2023

Hi @yusukematsuyama, there's no fixed rule for the train/test split; 50/50 and 70/30 are just some common choices. Forests are usually robust with respect to tuning parameters, so it's hard to say which range of parameters is reasonable.
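As a concrete starting point, one common pattern is a small grid over a few of the forest's standard tuning parameters. The argument names below (min.node.size, sample.fraction) are standard across grf forests, but the specific values are only illustrative guesses on toy data, not recommendations from this thread; each fitted forest would then be scored on the held-out split as described in the earlier comment.

```r
library(grf)

# Toy training data standing in for the RDD setup discussed above.
set.seed(1)
n <- 1000
X <- matrix(rnorm(n * 3), n, 3)
W <- cbind(rbinom(n, 1, 0.5))
Y <- W[, 1] * X[, 1] + rnorm(n)

# A small, illustrative tuning grid.
grid <- expand.grid(
  min.node.size   = c(5, 20, 50),
  sample.fraction = c(0.35, 0.5)
)

# Fit one lm_forest() per grid point; keep num.trees modest for speed.
forests <- lapply(seq_len(nrow(grid)), function(i) {
  lm_forest(X, Y, W,
            min.node.size   = grid$min.node.size[i],
            sample.fraction = grid$sample.fraction[i],
            num.trees       = 200)
})
```

Each element of `forests` can then be evaluated on the held-out data, e.g. via the grouped RDD estimates from the train/evaluation procedure above.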

@yusukematsuyama

Dear @erikcs,

Thank you for your advice. I will try that!

3 participants