Xpred.rpart Returns Levels Not Treatment Effects #20

Open

jonathandroth opened this issue Jun 5, 2017 · 2 comments

@jonathandroth

Hi there,

The rpart function xpred.rpart is supposed to return cross-validated predictions from a tree. (I am trying to use it with causalTree, since I'd like to choose my complexity parameter using a customized cross-validation criterion.)
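
For reference, with a plain rpart tree the workflow I have in mind looks roughly like this (mtcars and the squared-error criterion are just placeholders for my actual data and loss):

library(rpart)

# Fit a tree and get cross-validated predictions, one column per cp value
fit <- rpart(mpg ~ ., data = mtcars, cp = 0, xval = 10)
xp <- xpred.rpart(fit, xval = 10)

# A customized CV criterion would go here; mean squared error is a placeholder
cv.err <- colMeans((mtcars$mpg - xp)^2)
cv.err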

However, when I use it with the output of a causalTree, it seems to predict the level of the y variable, rather than the treatment effect. An example is below:

library(rpart)
library(causalTree)

# Add 100 to all the y-values so that levels and treatment effects are very different
simulation.1$y <- simulation.1$y + 100 

tree <- causalTree(y ~ x1 + x2 + x3 + x4, data = simulation.1, treatment = simulation.1$treatment,
                   split.Rule = "CT", cv.option = "CT", split.Honest = TRUE, cv.Honest = TRUE,
                   split.Bucket = FALSE, xval = 5, cp = 0, minsize = 20, propensity = 0.5)

# Pick the cp value (column 1) that minimizes the cross-validated error (column 4)
opcp <- tree$cptable[, 1][which.min(tree$cptable[, 4])]
opfit <- prune(tree, opcp)


#Predicting using the tree gives treatment effects
mean( predict(opfit) )
[1] 0.9670799

#Using xpred.rpart gives levels
mean( xpred.rpart(tree, cp = opcp) )
[1] 100.3836


Any help on this (or another way of implementing custom cross-validation criteria) would be appreciated! Thanks!

@susanathey
Owner

@jonathandroth apologies for the slow response. Did you find a solution, and are you still interested in this?

@jonathandroth
Author

@susanathey I worked around this by doing the cross-validation manually, i.e. constructing the folds myself, training the tree on K-1 of the folds, and then predicting and calculating the loss on the Kth fold.
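
Sketched out, the workaround looks roughly like this (the fold construction, the candidate cp grid, and the transformed-outcome loss are all illustrative choices rather than anything from the package; any customized criterion could be swapped in):

library(causalTree)

set.seed(123)
K <- 5
folds <- sample(rep(1:K, length.out = nrow(simulation.1)))

cps <- c(0, 0.001, 0.01, 0.1)    # candidate cp values (illustrative)
cv.loss <- matrix(NA, nrow = K, ncol = length(cps))

for (k in 1:K) {
  train <- simulation.1[folds != k, ]
  test  <- simulation.1[folds == k, ]

  fit <- causalTree(y ~ x1 + x2 + x3 + x4, data = train,
                    treatment = train$treatment,
                    split.Rule = "CT", cv.option = "CT",
                    split.Honest = TRUE, cv.Honest = TRUE,
                    split.Bucket = FALSE, xval = 5,
                    cp = 0, minsize = 20, propensity = 0.5)

  # Transformed outcome: its conditional mean equals the treatment effect
  # when the propensity is 0.5, so MSE against it is one possible CV loss
  p <- 0.5
  y.star <- test$y * (test$treatment / p - (1 - test$treatment) / (1 - p))

  for (j in seq_along(cps)) {
    pruned  <- prune(fit, cps[j])
    tau.hat <- predict(pruned, newdata = test)   # treatment-effect predictions
    cv.loss[k, j] <- mean((y.star - tau.hat)^2)
  }
}

colMeans(cv.loss)    # pick the cp with the smallest cross-validated loss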

I think it would be nice if this could be automated better, but I don't need an immediate fix for my current purposes.
