Extremely randomised trees in caret using ranger #581

Closed
hadjipantelis opened this issue Jan 20, 2017 · 9 comments
@hadjipantelis (Contributor) commented Jan 20, 2017

Hello,

Since version 0.6.4, the ranger package supports extremely randomised trees. Strictly speaking, the only "new thing" would be setting the argument splitrule = "extratrees"; nevertheless, I have the feeling that people will miss out on extremely randomised trees if they are not careful. There is a package already included in caret, extraTrees, that implements the same algorithm. That implementation, though, does not do multi-core computation out of the box (and its code base hasn't changed in the last 2 years either, which worries me too), so I think there is a case for using ranger directly.
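
For illustration, a minimal sketch of the switch described above, assuming ranger 0.6.4 or later is installed. The iris data, the tree count, and the seed are arbitrary choices for the example and do not come from this thread.

```r
# Sketch only: fit extremely randomised trees through ranger's splitrule argument.
# The guard skips the fit when ranger is missing or older than 0.6.4.
if (requireNamespace("ranger", quietly = TRUE) &&
    utils::packageVersion("ranger") >= "0.6.4") {
  fit <- ranger::ranger(
    Species ~ .,
    data = iris,               # example data, an assumption
    num.trees = 100,           # arbitrary illustrative value
    splitrule = "extratrees",  # the argument discussed in this issue
    seed = 42
  )
  # Out-of-bag prediction error of the fitted forest
  print(fit$prediction.error)
} else {
  message("ranger >= 0.6.4 is not installed; skipping the example fit.")
}
```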

Should one:

  1. Leave things as they are and hope/trust the user to find the e.r.t. functionality themselves, or
  2. Add a method like rangerExtreme so users can directly invoke extremely randomised trees?

Let me note that 0.6.4 is not yet available on CRAN, so you might want to park the issue of adding the e.r.t. functionality until it is included in a CRAN release.

All best,
Pantelis

@topepo (Owner) commented Jan 23, 2017

My thought is that we should have a different method and update caret when the newer ranger is on CRAN. I don't have a good general way of checking the versions of those package dependencies in caret, so a check could be written into the method (and it can be removed after a year or some other suitable time).
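
A version check of the kind mentioned above could look roughly like the following. This is a sketch using base R only; the helper name `ranger_has_extratrees` and its arguments are hypothetical and are not part of caret or ranger.

```r
# Hypothetical helper (not part of caret): does the installed ranger
# meet the minimum version that supports splitrule = "extratrees"?
ranger_has_extratrees <- function(installed = NULL, minimum = "0.6.4") {
  if (is.null(installed)) {
    # No version supplied: look up the installed package, if any
    if (!requireNamespace("ranger", quietly = TRUE)) {
      return(FALSE)
    }
    installed <- utils::packageVersion("ranger")
  }
  # Version objects compare component by component, so "0.10.0" > "0.6.4"
  as.package_version(installed) >= as.package_version(minimum)
}

# A method could then fail early with a readable message:
# if (!ranger_has_extratrees()) stop("ranger >= 0.6.4 is required for extratrees")
```

Passing a version string explicitly, for example `ranger_has_extratrees("0.6.3")`, makes the check easy to exercise without changing the installed package.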

@hadjipantelis (Contributor, Author) commented Jan 23, 2017

Perfect; I think what you suggested is very reasonable.

@topepo (Owner) commented Apr 7, 2017

I'm coding this up but it looks like it might need to wait until the next release. Now that I've looked into it, I will just make splitrule a tuning parameter but have it default to gini/variance. There seems to be a bug in ranger so it won't happen right now.

@hadjipantelis (Contributor, Author) commented Apr 7, 2017

No problem; thank you for following this up.

@topepo (Owner) commented Aug 19, 2017

I'll try to put this together for the imminent CRAN release since the CRAN ranger is at 0.8.0.

@topepo (Owner) commented Aug 19, 2017

I decided to incorporate the splitting rule into the basic ranger model. With regular grid search, it will try both the usual splitting rule and the randomized trees. That is included in random search too.
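
To make the grid search idea concrete, here is a sketch of a tuning grid that crosses the default classification rule with the randomized one. It assumes a caret version whose ranger model exposes `mtry`, `splitrule`, and `min.node.size` as tuning parameters; the specific values are arbitrary.

```r
# Sketch only: a tuning grid covering both splitting rules.
# Column names are assumed to match caret's ranger tuning parameters.
grid <- expand.grid(
  mtry = c(2, 3, 4),                    # arbitrary candidate values
  splitrule = c("gini", "extratrees"),  # "gini" applies to classification;
                                        # regression would use "variance"
  min.node.size = 1,                    # assumed tuning parameter, fixed here
  stringsAsFactors = FALSE
)

# Each row is one candidate model: 3 mtry values x 2 split rules
print(grid)
```

Such a grid would typically be passed as `tuneGrid = grid` to `caret::train()` with `method = "ranger"`, so that resampling compares the two rules directly.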

topepo added a commit that referenced this issue Aug 19, 2017

@topepo (Owner) commented Aug 19, 2017

Give it a spin and let me know if there are any issues.

@hadjipantelis (Contributor, Author) commented Aug 19, 2017

Thank you for this! I agree, having gini/extratrees as a choice is neater.

Unfortunately, no spin out of the box because of the is.tbl call. I just changed the call to dplyr::is.tbl and it is fine. Thank you! Extra-trees were really missing from caret (and from R in general). It should work out of the box with PR #711.

@hadjipantelis (Contributor, Author) commented Aug 19, 2017

Feel free to close this issue. It seems sorted.

@topepo topepo closed this Aug 22, 2017