Add Poisson splitting rule #495

lorentzenchr · 2020-03-22T15:56:30Z

What does this PR do?

This PR implements splitrule = "poisson", with the additional option poisson.tau to handle pure nodes in which every response is y = 0.

References

Solves #433.

Further info

The Poisson splitrule is based on the Poisson deviance, rearranged algebraically so that the split criterion is faster to compute.
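
The rearrangement itself is not spelled out in this thread. As an illustration only, the Python sketch below shows one standard simplification, assuming each child node predicts its own mean: the (y - mu) terms of the deviance then sum to zero within each child, and the sum of y * log(y) does not depend on the split, so minimizing the deviance amounts to maximizing sum(y) * log(mean(y)) summed over both children. The function names are hypothetical and this is not ranger's C++ code.

```python
import math

def poisson_deviance(ys, mu):
    """Poisson deviance of responses ys against a constant prediction mu."""
    total = 0.0
    for y in ys:
        # y * log(y / mu) is taken as 0 when y == 0.
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += term - (y - mu)
    return 2.0 * total

def split_deviance(left, right):
    """Summed deviance of two children, each predicting its own mean."""
    return (poisson_deviance(left, sum(left) / len(left))
            + poisson_deviance(right, sum(right) / len(right)))

def split_score(left, right):
    """Simplified criterion: sum(y) * log(mean(y)) per child, larger is better."""
    score = 0.0
    for child in (left, right):
        s = sum(child)
        if s > 0:  # a child whose responses sum to 0 contributes 0
            score += s * math.log(s / len(child))
    return score

# Toy responses, assumed to be already ordered by the candidate covariate.
ys = [0, 1, 0, 3, 2, 5, 4, 7]
cuts = range(1, len(ys))
best_by_deviance = min(cuts, key=lambda k: split_deviance(ys[:k], ys[k:]))
best_by_score = max(cuts, key=lambda k: split_score(ys[:k], ys[k:]))
```

On this toy data both criteria select the same cut position. The simplified score needs only the sum and count of each child, which is what makes it cheap to evaluate.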

The option poisson.tau takes effect when a terminal node contains only y = 0. The value of that node is then estimated as alpha * 0 + (1 - alpha) * mean(parent node), with alpha = samples(node) * mean(parent node) / (poisson.tau + samples(node) * mean(parent node)). The larger the value of poisson.tau, the closer the prediction is to the parent node's mean. rpart does something similar.
An alternative (perhaps for the future) would be an option such as "minimum sum of responses per node".
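
To make the shrinkage formula concrete, here is a minimal Python sketch of the estimate for an all-zero terminal node. The function and argument names are hypothetical, tau is assumed to be strictly positive, and this is not the code added in the PR.

```python
def zero_node_estimate(n_samples, parent_mean, tau):
    """Estimate for a terminal node whose responses are all 0.

    n_samples   -- number of samples in the terminal node
    parent_mean -- mean response of the parent node
    tau         -- value of poisson.tau, assumed > 0
    """
    weight = n_samples * parent_mean
    alpha = weight / (tau + weight)
    # Blend the node's own mean (0) with the parent's mean.
    return alpha * 0.0 + (1.0 - alpha) * parent_mean

small_tau = zero_node_estimate(n_samples=10, parent_mean=0.5, tau=1.0)
large_tau = zero_node_estimate(n_samples=10, parent_mean=0.5, tau=100.0)
```

With these inputs, `large_tau` lies closer to the parent mean of 0.5 than `small_tau` does, and both stay strictly above 0, so the node never predicts exactly zero.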

lorentzenchr · 2020-03-22T15:57:31Z

@mnwright It would be really great if you could have a look. Feedback is very warmly welcome.

R/ranger.R

mnwright · 2020-03-24T06:21:43Z

tests/testthat/test_quantreg.R

@@ -2,7 +2,7 @@ library(ranger)
 context("ranger_quantreg")

 rf.quant <- ranger(mpg ~ ., mtcars[1:26, ], quantreg = TRUE, 
-                   keep.inbag = TRUE, num.trees = 50)
+                   keep.inbag = TRUE, num.trees = 100, seed = 0)


Any reason for the change? Or just a mistake commit?

lorentzenchr · 2020-03-25

Before this change, I got

Error in ranger(mpg ~ ., mtcars[1:26, ], quantreg = TRUE, keep.inbag = TRUE, : Error: Too few trees for out-of-bag quantile regression.

Whether the error occurs also depends on the seed. This change fixed it on my machine. I should have put it in a separate commit.

tests/testthat/test_poissonsplit.R

src/TreeRegression.cpp

mnwright · 2020-03-25T06:02:47Z

Looks great, thanks!

What I have to do before merge (notes to myself):

Add to pure C++ version (help etc.)
Read and understand the splitting rule

lorentzenchr · 2020-03-25T10:35:39Z

Thank you for the fast review!

lorentzenchr · 2020-03-25T18:12:40Z

I refactored the Poisson splitting rule a bit to follow findBestSplitValueSmallQ instead of findBestSplitValueBeta. This gave me a huge speedup (up to 20x). It is now roughly half as fast as the variance splitting rule.
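
The refactored code is not shown in this thread. As a generic illustration of why a split search built on running sums can be much faster than re-evaluating every candidate split from scratch, the Python sketch below sorts once and then updates the left and right sums incrementally. It assumes a criterion of the form sum(y) * log(mean(y)) per child; all names are hypothetical, and it does not reproduce findBestSplitValueSmallQ.

```python
import math

def child_score(total, count):
    """sum(y) * log(mean(y)) for one child; 0 if the responses sum to 0."""
    return total * math.log(total / count) if total > 0 else 0.0

def best_split_running_sums(xs, ys):
    """Return (threshold, score) of the best split, using one sorted pass."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    n = len(ys)
    total_sum = sum(ys)
    left_sum, left_n = 0, 0
    best_threshold, best_score = None, -math.inf
    for pos in range(n - 1):
        i, nxt = order[pos], order[pos + 1]
        # Move one sample from the right child to the left child.
        left_sum += ys[i]
        left_n += 1
        if xs[i] == xs[nxt]:
            continue  # no split point between equal covariate values
        score = (child_score(left_sum, left_n)
                 + child_score(total_sum - left_sum, n - left_n))
        if score > best_score:
            best_threshold = (xs[i] + xs[nxt]) / 2.0
            best_score = score
    return best_threshold, best_score

threshold, score = best_split_running_sums([4, 1, 6, 3, 2, 5], [5, 0, 7, 1, 0, 6])
```

After the sort, each candidate costs constant time, so the whole scan is linear in the number of samples rather than quadratic.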

I assume a similar speedup could be possible for the beta splitting rule?

mnwright · 2020-03-26T07:03:18Z

> I assume a similar speedup could be possible for the beta splitting rule?

Yes, that should be possible.

lorentzenchr · 2023-12-23T10:40:15Z

@mnwright I merged master and still think this would be nice to have.

mnwright · 2024-05-16T06:43:45Z

Sorry for the long silence. I still think this is useful. Let's try to merge it for the next release.

mnwright · 2024-05-16T14:34:03Z

Looks good, I think we are ready to merge?

lorentzenchr · 2024-05-30T06:44:04Z

Yes, why not just merge?

mnwright · 2024-06-11T06:28:41Z

Merged 🎉

Christian Lorentzen added 3 commits March 22, 2020 16:20

ENH add poisson splitting rule to regression tree

a2090bb

ENH check in forest with poisson splitrule for valid range of y

1a975f0

MNT indend comments correctly

cb43089

Merge branch 'master' into poisson

7edc12a

mnwright reviewed Mar 25, 2020

ENH address review comments for poisson splitrule

b593d65

ENH make Poisson splitrule analogous to findBestSplitValueSmallQ

d4f732a

Merge branch 'master' into poisson

5463db7

mnwright added the Next release label May 16, 2024

mnwright added 4 commits May 16, 2024 13:46

merge with new master

8e41190

forgot some merge conflicts...

4ac8f57

min bucket for Poisson splitting

ed2b73d

add Poisson splitrule to pure C++ version

24bd170

new version for Poisson splitting

6e9d42b

mnwright merged commit 10b73fd into imbs-hl:master Jun 11, 2024
8 checks passed

lorentzenchr deleted the poisson branch June 11, 2024 15:44
