Add support for factor variables. #109

jtibshirani · 2017-07-23T06:28:17Z

We should likely implement the approach suggested in ESL (The Elements of Statistical Learning): in each node, order the levels of each factor variable by their mean outcome, then split as if the variable were ordered. This would need to be properly generalized to handle non-regression forests.
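To make the idea concrete, here is a minimal sketch of the mean-outcome ordering trick for a single regression node. It is only an illustration under stated assumptions, not grf's implementation: the function names (`order_levels_by_mean_outcome`, `best_ordered_split`) are hypothetical, and squared error is assumed as the split criterion.

```python
from collections import defaultdict


def _sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def order_levels_by_mean_outcome(levels, outcomes):
    """Return the distinct factor levels sorted by their mean outcome in this node."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, outcomes):
        sums[level] += y
        counts[level] += 1
    return sorted(sums, key=lambda lv: sums[lv] / counts[lv])


def best_ordered_split(levels, outcomes):
    """Scan the q - 1 cut points along the ordered levels (hypothetical helper).

    Returns (ordered_levels, left_levels), where left_levels is the set of
    levels sent to the left child, or None if the node has a single level.
    """
    order = order_levels_by_mean_outcome(levels, outcomes)
    rank = {lv: r for r, lv in enumerate(order)}
    best_sse, best_left = None, None
    for cut in range(1, len(order)):
        left = [y for lv, y in zip(levels, outcomes) if rank[lv] < cut]
        right = [y for lv, y in zip(levels, outcomes) if rank[lv] >= cut]
        sse = _sse(left) + _sse(right)
        if best_sse is None or sse < best_sse:
            best_sse, best_left = sse, set(order[:cut])
    return order, best_left


levels = ["a", "b", "c", "a", "b", "c"]
outcomes = [5.0, 1.0, 3.0, 5.2, 0.8, 3.1]
ordered, left_levels = best_ordered_split(levels, outcomes)
```

The ordering means only q - 1 cut points are scanned for a factor with q levels, instead of every possible subset of levels. How to define the "mean outcome" for non-regression forests is exactly the open part of this issue, and the sketch does not address it.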

lwu9 · 2018-03-26T14:29:52Z

When X contains a categorical variable X1, a child node can end up with observations from only one category, i.e. all values of X1 in that node are the same. In this case, how can we get the pseudo-outcomes? They are obtained through the inverse of Ap, which may be singular, right?

swager · 2018-03-28T08:01:52Z

The X-values aren't used to compute the pseudo-outcomes in the leaf in the standard GRF formulation; rather, only the "outcomes" matter (e.g., W and Y for causal_forest). The features X enter into the problem by determining which leaf an observation falls into.
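As a rough illustration of this point, the sketch below computes causal-forest-style pseudo-outcomes for the observations in one parent node from W and Y alone. It follows the form given in the GRF paper as I understand it (a scalar treatment, with Ap taken as the within-node variance of W); the function name is hypothetical, sign and scaling conventions may differ from the actual code, and this is not copied from the grf source.

```python
def causal_forest_pseudo_outcomes(w, y):
    """Pseudo-outcomes for one parent node, computed from W and Y only.

    Assumed form (scalar treatment):
        A_p   = mean((W_i - W_bar)^2)
        beta  = least-squares slope of Y on W within the node
        rho_i = (W_i - W_bar) * (Y_i - Y_bar - (W_i - W_bar) * beta) / A_p
    """
    n = len(w)
    w_bar = sum(w) / n
    y_bar = sum(y) / n
    wc = [wi - w_bar for wi in w]  # centered treatment
    yc = [yi - y_bar for yi in y]  # centered outcome
    a_p = sum(c * c for c in wc) / n
    if a_p == 0.0:
        # A_p degenerates only if W has no variation in the node,
        # regardless of what the features X look like.
        raise ValueError("no treatment variation in this node")
    beta = sum(a * b for a, b in zip(wc, yc)) / (n * a_p)
    return [a * (b - a * beta) / a_p for a, b in zip(wc, yc)]


w = [0.0, 1.0, 0.0, 1.0]
y = [1.0, 3.0, 2.0, 5.0]
rho = causal_forest_pseudo_outcomes(w, y)
```

Note that the function takes no X argument at all: under this formulation, a categorical feature that is constant within the node cannot make Ap singular.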

lwu9 · 2018-03-28T14:57:37Z

Thanks for your reply @swager! I understand that in your causal_forest, the pseudo-outcomes only involve W and Y. But if we want to do local linear regression, our psi should be:
psi(Y_i) = Y_i - theta * X_i (where theta depends on the query point x), shouldn't it? So when we calculate Ap, we need to take the derivative of psi w.r.t. theta, and the result will include the features X. Is there anything wrong with my understanding?
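The concern can be illustrated with a small sketch. It assumes the usual least-squares score X_i * (Y_i - X_i' theta), whose derivative with respect to theta is -X_i X_i', so that Ap is (up to sign) the within-node Gram matrix of the features. That is an assumption made for illustration, slightly different from the psi written above, and it says nothing about how grf actually handles this case. All function names are hypothetical.

```python
def one_hot(values, levels):
    """One-hot encode a categorical column against a fixed list of levels."""
    return [[1.0 if v == lv else 0.0 for lv in levels] for v in values]


def gram_matrix(rows):
    """Average outer product (1/n) * sum_i x_i x_i^T of the feature rows."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[j] * r[k] for r in rows) / n for k in range(p)]
            for j in range(p)]


def determinant(matrix, tol=1e-12):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    p = len(a)
    det = 1.0
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < tol:
            return 0.0  # no usable pivot: the matrix is singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, p):
            factor = a[r][col] / a[col][col]
            for c in range(col, p):
                a[r][c] -= factor * a[col][c]
    return det


levels = ["a", "b"]
mixed_node = one_hot(["a", "b", "a", "b"], levels)  # both categories present
single_node = one_hot(["a", "a", "a"], levels)      # only one category left

det_mixed = determinant(gram_matrix(mixed_node))
det_single = determinant(gram_matrix(single_node))
```

Under these assumptions, the Gram matrix is invertible while both categories are present and becomes singular once the node contains a single category, which is the situation the question describes.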

jtibshirani · 2020-02-23T22:59:33Z

We added the sufrep package, which contains a collection of methods for handling categorical variables, along with a tutorial on how to use sufrep with grf.

lwu9 · 2020-02-23T23:10:20Z

Thanks

jtibshirani changed the title ~~Add support for factor variables~~ Add support for factor variables. Jul 23, 2017

jtibshirani added the high priority label Jul 25, 2017

jtibshirani added the help wanted label May 28, 2018

jtibshirani mentioned this issue Jul 10, 2018: varImpPlot for causal trees #130 (Closed)

jtibshirani added 1.0 release and removed high priority labels Aug 4, 2018

jtibshirani added requires research feature and removed help wanted labels Dec 9, 2018

jtibshirani assigned halflearned Jun 14, 2019

halflearned mentioned this issue Oct 10, 2019: Documentation and vignette for categorical variables #537 (Merged)

jtibshirani closed this as completed Feb 23, 2020
