Custom plot function for causalTree rpart.object #27

Open
wants to merge 5 commits into
base: master

@roboton roboton commented Oct 6, 2017

To address printing of p-values and number of observations at each node in #19

  • To add the number of observations, pass the argument extra=101 (this is an option of the prp plotting function for CART).
  • p-values are added automatically for leaf nodes when the model data (x) and response data (y) are stored in the rpart.object. This requires running causalTree with the parameters x=TRUE, y=TRUE.
  • Commented out some debug print statements.
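A minimal sketch of how a leaf-level p-value could be computed, for readers who want to see the idea in isolation. This is not the PR's add.pvals code: `leaf_pvalues` is a hypothetical helper, the data are simulated, and the choice of a Welch t-test between treated and control outcomes within each leaf is an assumption. The `leaf` vector stands in for the `where` component of an rpart-style fit.

```r
# Hypothetical helper (not the PR's add.pvals): Welch t-test of treated vs.
# control outcomes within each leaf. Returns one p-value per leaf id.
leaf_pvalues <- function(y, treatment, leaf) {
  stopifnot(length(y) == length(treatment), length(y) == length(leaf))
  sapply(split(seq_along(y), leaf), function(idx) {
    treated <- y[idx][treatment[idx] == 1]
    control <- y[idx][treatment[idx] == 0]
    # A t-test needs at least two observations in each arm.
    if (length(treated) < 2 || length(control) < 2) return(NA_real_)
    t.test(treated, control)$p.value
  })
}

# Simulated data (assumption): leaf 2 has a true effect of 2, leaf 3 has none.
set.seed(1)
n <- 400
leaf <- sample(c(2, 3), n, replace = TRUE)  # stand-in for fit$where
treatment <- rbinom(n, 1, 0.5)
y <- rnorm(n) + ifelse(leaf == 2, 2, 0) * treatment

pvals <- leaf_pvalues(y, treatment, leaf)
print(round(pvals, 4))
```

Note that the treatment vector is passed explicitly rather than read from a column of x, which keeps the helper independent of how the model matrix is laid out.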

@roboton
Author

roboton commented Oct 6, 2017

Potential TODOs: (1) make p-value annotation optional if that makes sense, (2) multiple testing corrections.
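For TODO (2), base R's `p.adjust` may already be enough. The sketch below applies the Holm and Benjamini-Hochberg corrections to a set of leaf p-values; the values and leaf names are made up for illustration, and which correction is appropriate for causal trees is left open here.

```r
# Illustrative (made-up) raw p-values, one per leaf.
raw_p <- c(leaf_a = 0.004, leaf_b = 0.020, leaf_c = 0.030, leaf_d = 0.400)

# Holm controls the family-wise error rate; BH controls the false discovery rate.
holm <- p.adjust(raw_p, method = "holm")
bh <- p.adjust(raw_p, method = "BH")

print(rbind(raw = raw_p, holm = holm, BH = bh))
```

The adjusted values could then be annotated on the leaves in place of, or alongside, the raw ones.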

@arjunkunna

Hi, I was trying to test your code and had a couple of questions:

1. Do you intend for the first column of X to be the treatment variable?

When calling the function using a pruned tree defined by the example code in the documentation of causalTree, (line 20: tree_with_p = plot.causalTree(opfit)), one runs into the error '$ operator is invalid for atomic vectors'. I'm using the data from 'simulation.1'.

Stepping into the functions, the error appears to come from add.pvals: the data frame 'merged' is empty. I might be misunderstanding, but it seems that you create a new data frame 'dat' using the first column of X as the 'treatment' variable, and then extract all the rows where treatment == 1 or treatment == 0.

However, as far as I can tell, in causalTree the first column of X is not intended to be the treatment vector. The first column of X here is not a binary vector (i.e. opfit$x[,1] is not all zeros or ones). Thus, when you run the merge, it returns nothing.

Could you clarify if you intend x[,1] to be the treatment vector?
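For illustration, a guard along these lines would surface the mismatch before any merge is attempted. `is_binary_treatment` is a hypothetical helper, not part of causalTree or this PR, and the data are simulated stand-ins for opfit$x and the treatment vector.

```r
# Hypothetical check: TRUE only if every value is 0 or 1 and both values occur.
is_binary_treatment <- function(v) {
  is.numeric(v) && !anyNA(v) && all(v %in% c(0, 1)) && length(unique(v)) == 2
}

set.seed(42)
x <- cbind(x1 = rnorm(10), x2 = rnorm(10))  # covariates only, like opfit$x
treatment <- rep(c(0, 1), 5)                # treatment kept as its own vector

covariate_ok <- is_binary_treatment(x[, 1])    # continuous covariate
treatment_ok <- is_binary_treatment(treatment) # genuine 0/1 vector

if (!covariate_ok) {
  message("x[, 1] is not a 0/1 vector; pass the treatment explicitly instead.")
}
```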

2. Do you intend for us to pass in a pruned tree?

In your comments for plot.causalTree, you mention that it "takes the optimally pruned causal tree and adds pvalues." However, in the code, it appears that you prune the tree inside the function itself. Do you intend for someone to pass in a pruned tree, or does it not matter?

For reference, this is the code I used to build the tree:

tree <- causalTree(y ~ x1 + x2 + x3 + x4, data = simulation.1, treatment = simulation.1$treatment,
                   split.Rule = "CT", cv.option = "CT", split.Honest = TRUE, cv.Honest = TRUE, x = TRUE, y = TRUE, split.Bucket = FALSE,
                   xval = 5, cp = 0, minsize = 20, propensity = 0.5)

opcp <- tree$cptable[,1][which.min(tree$cptable[,4])]
opfit <- prune(tree, opcp)

tree_with_p = plot.causalTree(opfit)

Let me know if I misunderstood anything about the code - thank you very much!

@roboton
Author

roboton commented Nov 20, 2017

Thank you for your comments and apologies for the long delay.

  1. Yes, I intended the first column to be the treatment vector. This may be because, when using causalTree, I incorrectly assumed that I needed to include the treatment variable as the first independent variable in the model. Judging from the example data set, I guess this isn't the case?

  2. You're right, pruning doesn't matter.

Apologies in advance if this wasn't very well thought out. When estimating causal effects, significance matters quite a lot, and I was hoping this would get us started on how best to calculate and present p-values. I'm also concerned about the robustness of the standard errors under different correlation structures.

@arjunkunna

Great, thank you very much for clarifying that! Do you think you could edit the code to correct (1), so that it works on the example data set? It would be good to have some consistency. Other than that, it looks good. Thank you for the addition!

2 participants