
Conditional Inference Trees #56

Closed
andrewnolanhall opened this issue Mar 21, 2020 · 3 comments

@andrewnolanhall

I enjoyed the code you have here, as well as the descriptions in the accompanying slideshow. I have two suggestions.

  1. Would you consider adding a section on conditional inference trees? These are trees that use statistical tests of association between the features and the outcome variable to make splits, rather than the information-gain (or impurity/error-reduction) criterion used here. Conditional inference trees reduce the variable-selection bias present in CART methods (the decision-tree framework used here), in which variables offering more candidate splits are preferentially selected over variables offering fewer. They are also less risky to interpret, since each split is backed by a statistical association rather than an absolute difference in some information metric. My one worry is obfuscating your very clear presentation of the decision tree algorithm, but it might reduce confusion down the line when people over-interpret CART trees. Note that apart from the segmentation criterion, conditional inference trees are identical to CART trees.
  2. In the Module 5-Decision Tree slideshow, on slide 53 (the slide right before "Model Performance"), you state that the RMSE algorithm proceeds by "Calculating the variance for each node" and "calculating the variance for each split as weighted average of each node variance." Unless I'm misunderstanding, I believe this is incorrect: you are calculating the root mean squared error (RMSE), not the variance. The equations are similar, but for the variance you subtract the average value from each observation in the summation, whereas for RMSE you subtract the predicted value.
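To make the distinction in point 2 concrete, here is a small numeric sketch (the target values are invented for illustration). The two quantities coincide only when the prediction is the node mean, which is exactly what a CART regression leaf predicts:

```python
import numpy as np

# Hypothetical target values falling in one regression-tree node
# (illustrative numbers, not from the course materials)
y = np.array([3.0, 5.0, 7.0, 9.0])

# Variance: squared deviations from the node's own mean
variance = np.mean((y - y.mean()) ** 2)           # 5.0

# (R)MSE: squared deviations from the model's *prediction*.
# A CART regression leaf predicts the node mean, so there
# MSE == variance; with any other prediction they differ.
mse_at_mean = np.mean((y - y.mean()) ** 2)        # same as variance
rmse_at_mean = np.sqrt(mse_at_mean)               # sqrt(5)

other_prediction = 5.0                            # some non-mean prediction
mse_other = np.mean((y - other_prediction) ** 2)  # 6.0 > variance

print(variance, rmse_at_mean, mse_other)
```

So the slide's recipe happens to give the right numbers inside a leaf that predicts the mean, but calling the quantity "variance" is only accidentally correct there.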

Thank you for your consideration.
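For point 1, the core of the conditional-inference splitting rule can be sketched in a few lines. This is a simplified illustration, not the full ctree algorithm of Hothorn, Hornik & Zeileis (2006): it swaps in a permutation test on the absolute correlation as the association test, applies a Bonferroni correction over features, and `permutation_p_value` / `select_split_variable` are names I made up for the sketch:

```python
import numpy as np

def permutation_p_value(x, y, n_perm=999, rng=None):
    """P-value for the association between one feature x and the
    outcome y, via a permutation test on the absolute correlation."""
    if rng is None:
        rng = np.random.default_rng(0)
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = sum(
        abs(np.corrcoef(x, rng.permutation(y))[0, 1]) >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)

def select_split_variable(X, y, alpha=0.05):
    """Pick the feature most significantly associated with y
    (Bonferroni-corrected). Returning None means stop splitting,
    which is how conditional inference trees avoid both the CART
    selection bias and the need for post-hoc pruning."""
    p = np.array([permutation_p_value(X[:, j], y) for j in range(X.shape[1])])
    best = int(np.argmin(p))
    return best if p[best] * X.shape[1] <= alpha else None

# Toy data: only feature 1 is truly associated with y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2 * X[:, 1] + 0.5 * rng.normal(size=200)
print(select_split_variable(X, y))  # picks feature 1
```

The rest of the tree-growing loop (choosing the cut point within the selected variable, recursing on the two children) is unchanged from CART, which is the "identical apart from the segmentation criterion" point above.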

@CloudChaoszero
Collaborator

Hello @andrewnolanhall,

Hope you are doing well! I am one of the Teaching Fellows helping out at Delta Analytics.

Thanks for reaching out about these two points, and Delta Analytics appreciates your feedback! These are great conceptual topics I can ask the technical lead or other leads about, for sure.

I will keep you posted! Have a good one until then.

@andrewnolanhall
Author

andrewnolanhall commented Mar 23, 2020 via email

@brianspiering
Member

Closing this issue. Conditional inference trees are outside the scope of this introductory course.
