Replies: 6 comments 13 replies
-
|
In splitCustom.c we describe the feature matrix as follows. feature is a matrix of user-specified features that can be sent into the split rule and acted on as desired; it has dimension [featureCount] x [n]. featureCount is the count of features in that matrix, specifically the number of rows; it is zero when features are absent. So adjust the data frame to include the feature matrix, and specify your features as y-variables with zero weight. Within a node, the features belonging to that node can be used in the split rule, and they change from node to node according to membership: if record i is in the node, then the features in record i are also in that node and available to the split rule. Just as response[i] represents the response for record [i], feature[j][i] is the double value of the [j]th feature for record [i]. |
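To make the feature[j][i] layout concrete, here is a minimal sketch of a helper that averages one feature over the records on one side of a candidate split. The function name, the 0-based indexing, and the convention that membership[i] == 1 marks the left daughter are assumptions made for illustration; check splitCustom.c for the conventions the package actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of splitCustom.c.
 * feature is laid out as [featureCount] x [n]: row j is one feature,
 * column i is one record in the node.
 * Assumes 0-based indexing and membership[i] == side for the daughter
 * of interest (both are illustration-only conventions). */
double daughter_feature_mean(unsigned int n, const char *membership,
                             double **feature, unsigned int featureCount,
                             unsigned int j, char side)
{
    assert(feature != NULL && j < featureCount);

    double sum = 0.0;
    unsigned int count = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (membership[i] == side) {
            sum += feature[j][i];   /* j-th feature of record i */
            count++;
        }
    }
    /* An empty daughter has no mean; return 0 by convention. */
    return count > 0 ? sum / count : 0.0;
}
```

Because the rows are indexed by feature and the columns by record, the same loop index i can be used for response[i] and feature[j][i].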
-
|
Yes, this would be perfectly fine. It would be a way to grow different models using different constant vectors while using the same compiled custom split rule. |
-
|
Let's say you have a vector of user-defined constants alpha = c(alpha1, alpha2, alpha3) that you want to make available to your custom split rule. Again, let's take mtcars and splitrule = "custom1" in the package as our starting point. First, create an auxiliary data frame whose columns are named after the constants: aux.response = data.frame(alpha1 = rep(alpha1, nrow(mtcars)), alpha2 = rep(alpha2, nrow(mtcars)), alpha3 = rep(alpha3, nrow(mtcars))) Then create an augmented data set: mtcars.aux = cbind(aux.response, mtcars) Then a statement like mtcars.aux.mreg = rfsrc(Multivar(mpg, alpha1, alpha2, alpha3) ~ ., data = mtcars.aux, splitrule = "custom1", yvar.wt = c(1, 0, 0, 0)) would make the features available as a [3] x [n] array in each node in the split rule. It's a bit redundant, because each row of that array is a constant (alpha1, alpha2, or alpha3 repeated n times), but you can use these constants as you wish. This is what I meant by saying that you would only have to compile once and introduce the constants via an augmented data frame. |
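On the C side, a split rule could then read a constant from any column of its row, since every record carries the same value. The sketch below is only an illustration: the function name, the 0-based indexing, the membership[i] == 1 convention for the left daughter, and the balance penalty itself are all assumptions, not the package's custom1 rule.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical split statistic, not the package's "custom1" rule.
 * Row 0 of feature is assumed to hold the constant alpha1 repeated n times.
 * Statistic: sumL^2/nL + sumR^2/nR, minus alpha1 times the relative
 * imbalance |nL - nR| / n (an invented penalty, for illustration only). */
double constant_penalized_split_stat(unsigned int n, const char *membership,
                                     const double *response,
                                     double **feature, unsigned int featureCount)
{
    assert(n > 0);

    /* Every column of a constant row is identical, so read the first one. */
    double alpha1 = (feature != NULL && featureCount > 0) ? feature[0][0] : 0.0;

    double sumL = 0.0, sumR = 0.0;
    unsigned int nL = 0, nR = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (membership[i] == 1) { sumL += response[i]; nL++; }
        else                    { sumR += response[i]; nR++; }
    }
    if (nL == 0 || nR == 0) {
        return 0.0;             /* degenerate split: nothing to score */
    }

    unsigned int diff = nL > nR ? nL - nR : nR - nL;
    double imbalance = (double) diff / (double) n;

    return sumL * sumL / nL + sumR * sumR / nR - alpha1 * imbalance;
}
```

Changing alpha1 in the R data frame changes the behavior of this rule without recompiling, which is the point of passing constants as zero-weight y-variables.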
-
|
Hi Drs. Ishwaran and Kogalur, As far as I can follow from this discussion, it is not possible to send features to custom splitting rules for the survival and competing risks families. I'm writing here because, for the specific case of random survival forests, I need a custom splitting rule that uses a matrix of features. To calculate it I need to access some individual variables, such as sex and age. However, this cannot be done using only the arguments passed to CustomSplitStatisticSurvival, since the feature matrix is not among them. Given the membership, time, and event vectors (the function inputs associated with a node), is it possible to access the other variables associated with each of these individuals (age, sex, etc.), so that I can build my feature matrix inside the CustomSplitStatisticSurvival function? Thank you in advance for the answer. |
-
|
Custom features are passed and used only in the univariate classification and regression families and in the multivariate families. Unfortunately, allowing this in survival and competing risks would require substantial changes to the way we pass these custom features on the R side and to how we access them on the C side. At this time, we do not have a timeline for when this work will be completed. We apologize for not being more informative at this point, but it will be possible to do it in the future. |
-
|
Hi Mr. Kogalur, My interest is in using an appropriate splitting rule to build a forest for a particular survival measure. For that, I will need some features that depend on demographic covariates, which I would like to pass as features to the custom splitting rule function. Since my interest at this moment is only in building the forest, is it possible to consider an adaptation using the multivariate specification, with a formula that passes (i) the time-to-event as the response and (ii) the event indicator and the constant matrix that I need as features (using the yvar.wt argument that you mention)? Then, inside the custom function, I would compute the quantities of interest for the survival context (for example, a log-rank type statistic). Is that type of adaptation OK, or is there anything conceptually wrong with that practice (considering the structure of the functions implemented in the randomForestSRC package)? |
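For concreteness, the adaptation asked about here could look roughly like the sketch below: the time-to-event arrives as the response, the event indicator arrives as row 0 of the feature matrix, and a two-sample log-rank chi-square is computed for the candidate split. This only illustrates the question; it is not a method endorsed in this thread. The function name, the 0-based indexing, the membership[i] == 1 convention for the left daughter, and the placement of the event indicator in row 0 are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: log-rank chi-square for a candidate split, with
 * time passed as the response and the event indicator (1 = event,
 * 0 = censored) assumed to sit in row 0 of feature.
 * Returns (O - E)^2 / V for the left daughter, or 0 if V is 0. */
double logrank_chisq_from_features(unsigned int n, const char *membership,
                                   const double *response,
                                   double **feature, unsigned int featureCount)
{
    assert(feature != NULL && featureCount >= 1);

    const double *time  = response;
    const double *event = feature[0];
    double num = 0.0, var = 0.0;

    for (unsigned int k = 0; k < n; k++) {
        if (event[k] != 1.0) continue;

        /* Process each distinct event time once: skip repeats. */
        int seen = 0;
        for (unsigned int m = 0; m < k; m++) {
            if (event[m] == 1.0 && time[m] == time[k]) { seen = 1; break; }
        }
        if (seen) continue;

        double atRisk = 0.0, atRiskL = 0.0, d = 0.0, dL = 0.0;
        for (unsigned int i = 0; i < n; i++) {
            if (time[i] >= time[k]) {
                atRisk += 1.0;
                if (membership[i] == 1) atRiskL += 1.0;
            }
            if (event[i] == 1.0 && time[i] == time[k]) {
                d += 1.0;
                if (membership[i] == 1) dL += 1.0;
            }
        }

        /* Observed minus expected events in the left daughter. */
        num += dL - atRiskL * d / atRisk;

        /* Hypergeometric variance; undefined when one record is at risk. */
        if (atRisk > 1.0) {
            double p = atRiskL / atRisk;
            var += p * (1.0 - p) * ((atRisk - d) / (atRisk - 1.0)) * d;
        }
    }
    return var > 0.0 ? num * num / var : 0.0;
}
```

The double loop is quadratic in the node size, which keeps the sketch short; sorting by time first would be the usual way to speed it up.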
-
Hi Dr. Ishwaran,
I have some questions about customizing the splitting rule.
**Feature in splitCustom.c? Later, use those features for building the splitting statistics.** Feature changing dynamically in each node? Thanks