Output tree$nodes[[i]]$samples #258
Comments
You're right, I've kept this issue open and tagged it with 'documentation', so we remember to add an explanation to |
It would be better to keep track of which is which (J1 and J2) for the use case of using the results from a single tree; it may matter for different methods of calculating standard errors as well.
@susanathey to clarify the exchange above, because you have access to both the leaf samples of a tree, and the overall 'drawn samples' for that tree, both My intuition is that unless accessing both |
@jtibshirani Sorry, I misunderstood. Maybe we can post a code sample and/or add it to our testing or demo code for users who might want to access them.
I've updated the documentation in #268.
Hello @jtibshirani
Quick question: for a given terminal node "i" of a tree (i.e. a leaf), does the output tree$nodes[[i]]$samples correspond to the observations from the training subsample used to build the tree (i.e. J1 in the paper) that fall in that leaf, or to the observations from the other subsample (J2) that fall in that leaf?
Thanks!
@predt I'm sorry I missed your question earlier! Would you be able to open a new issue with this question, and I will add a detailed answer there? Keeping each issue scoped to one topic helps ensure that other users with the same question will be able to find the answer as well. To answer briefly, that vector only contains examples from the second subsample (J2).
Thanks, @jtibshirani. Since tree$nodes[[i]]$samples corresponds to J2, the complement in "drawn_samples" should give me the set of samples in J1. Is that correct?
I'm working on the appendix of an application of GRF. I'm using an example tree figure to make the explanation of how a tree is built more pedagogical. I wanted to add the theta.hat.P values that result after splitting a node (theta.hat.P is the notation in the paper) to illustrate how splits favor heterogeneity in the context of a generalized causal forest. That is why I'm looking for the J1 samples. Thanks.
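For readers who want to try the set-difference idea from the question above, here is a minimal base-R sketch. It is not a confirmed answer from the maintainers. It assumes that `drawn_samples` holds the full subsample drawn for the tree and that, with honesty enabled, the leaf `samples` vectors together make up J2. The `tree` object is a hand-built, hypothetical stand-in that only mirrors the two fields discussed in this thread; with grf the object would typically come from `get_tree(forest, index)`.

```r
# Hypothetical stand-in for a single tree; the indices are made up for illustration.
tree <- list(
  drawn_samples = c(2, 5, 7, 8, 11, 14, 17, 20),  # assumed: every observation drawn for this tree
  nodes = list(
    list(),                                        # internal node: no samples stored
    list(samples = c(5, 11)),                      # leaf: J2 observations falling in it
    list(samples = c(8, 20))                       # leaf: J2 observations falling in it
  )
)

# J2: union of the sample indices stored across all leaves.
# Nodes without a samples field return NULL, which unlist() drops.
leaf_samples <- lapply(tree$nodes, function(node) node$samples)
j2 <- sort(unique(unlist(leaf_samples)))

# J1 (under the assumption above): drawn for the tree but absent from every leaf.
j1 <- setdiff(tree$drawn_samples, j2)

print(j2)
print(j1)
```

If the forest was grown without honesty, the same subsample would be used for both splitting and leaf estimates, so the complement computed this way would not be meaningful.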