
Operations when merging EBM #532

Open
JWKKWJ123 opened this issue Apr 22, 2024 · 6 comments

Comments

JWKKWJ123 commented Apr 22, 2024

Hi all,
I am currently using ebm.merge(). However, I haven't found a formula or pseudocode describing how EBMs are merged. If I want to explain the merging of EBMs to others, is there a paper, formula, or pseudocode that I can refer to?

paulbkoch (Collaborator) commented Apr 22, 2024

Hi @JWKKWJ123 -- I'm not aware of a paper that describes the merge function. The effect of merge_ebms() is the same as traditional model ensembling, which you could do with any model. In the case of EBMs, though, since the predictions come from additive functions, instead of averaging the predictions after they have been made, you can, through associativity, move the averaging back into the partial response functions. There is more complexity in practice because the bins of the EBMs don't necessarily line up; we handle that by making new bins that form a superset of all the bins across the EBMs being merged.
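
The associativity argument can be checked with a toy additive model. The score tables and feature names below are made up for illustration; this is a minimal sketch of the idea, not interpret's implementation:

```python
# Two toy additive models over identical bins: prediction = intercept + sum
# of per-feature bin scores. Averaging the two models' predictions equals
# predicting with the per-term averaged scores and the averaged intercept.

# hypothetical per-feature score lookups for two models
m1_scores = {"age": {"young": -1.0, "old": 2.0}, "bmi": {"low": 0.5, "high": 1.5}}
m2_scores = {"age": {"young": -0.5, "old": 1.0}, "bmi": {"low": 0.0, "high": 2.5}}
m1_intercept, m2_intercept = 0.3, 0.7

def predict(scores, intercept, x):
    # additive model: sum the looked-up score for each feature's bin
    return intercept + sum(scores[f][x[f]] for f in scores)

x = {"age": "old", "bmi": "high"}
avg_pred = (predict(m1_scores, m1_intercept, x) + predict(m2_scores, m2_intercept, x)) / 2

# merge by averaging each bin score and the intercept
merged_scores = {
    f: {b: (m1_scores[f][b] + m2_scores[f][b]) / 2 for b in m1_scores[f]}
    for f in m1_scores
}
merged_intercept = (m1_intercept + m2_intercept) / 2
merged_pred = predict(merged_scores, merged_intercept, x)

assert avg_pred == merged_pred  # same number either way
```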

JWKKWJ123 (Author) commented

Dear Paul,
Thank you very much! I am not familiar with decision trees. To my understanding, the 'partial response functions' = 'shape functions' = 'decision trees' in an EBM, so merging EBMs means merging each pair of corresponding decision trees, and I guess the intercept is the arithmetic mean. But I have no idea how to define the new bins in the merged EBM, so I drew pseudocode for merging EBMs and an example of merging two decision trees (attached below). I would like to ask whether my understanding is correct (I suspect it is wrong)?

[Attached images: pseudocode for merging EBMs and an example of merging two decision trees]

paulbkoch (Collaborator) commented

Hi @JWKKWJ123 -- The range of x should be from -inf to +inf. Using your example mostly, let's say I have:

EBM1, bin_range:score
[-inf, 1): 0
[1, 3): 1.5
[3, +inf): 3

EBM2, bin_range:score
[-inf, 2): 0
[2, 4): 2
[4, +inf]: 2.5

The new bins and scores will be:
[-inf, 1): 0 ------> (0+0)/2
[1, 2): 0.75 ------> (1.5+0)/2
[2, 3): 1.75 ------> (1.5+2)/2
[3, 4): 2.5 ------> (3+2)/2
[4, +inf): 2.75 ------> (3+2.5)/2

And if EBM1 and EBM2 have intercepts, then take the average of that too.
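
The worked example above can be sketched in code. This is a minimal sketch assuming left-closed [lo, hi) bins; it is illustrative only, not the actual merge_ebms() implementation:

```python
# Merge two single-feature piecewise-constant shape functions on the
# superset of their cut points, averaging the scores on each new bin.

def merge_shape_functions(cuts_a, scores_a, cuts_b, scores_b):
    """cuts_* are sorted interior cut points; scores_* has one entry per
    bin, i.e. len(cuts_*) + 1 entries covering (-inf, +inf)."""
    merged_cuts = sorted(set(cuts_a) | set(cuts_b))

    def score_at(cuts, scores, x):
        # number of cut points at or below x = index of x's bin
        return scores[sum(1 for c in cuts if x >= c)]

    bounds = [float("-inf")] + merged_cuts + [float("inf")]
    merged_scores = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # any point inside [lo, hi) works; use lo, or hi - 1 for the first bin
        rep = lo if lo != float("-inf") else hi - 1.0
        merged_scores.append(
            (score_at(cuts_a, scores_a, rep) + score_at(cuts_b, scores_b, rep)) / 2
        )
    return merged_cuts, merged_scores

# EBM1 and EBM2 from the example above
cuts, scores = merge_shape_functions([1, 3], [0, 1.5, 3], [2, 4], [0, 2, 2.5])
print(cuts)    # [1, 2, 3, 4]
print(scores)  # [0.0, 0.75, 1.75, 2.5, 2.75]
```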

JWKKWJ123 (Author) commented


Hi Paul,
Thank you very much! The merging of EBMs and trees is much clearer to me now.
I have one more question: the merged model will have more bins. If the number of bins exceeds max_bins, do the bins in the merged model need to be merged with each other?
My understanding is that merging two bins would combine their boundaries and average their scores.

paulbkoch (Collaborator) commented

max_bins only applies when fitting EBMs. If you merge EBMs afterwards, there is no upper limit. max_bins only applies to continuous features too, BTW.
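
A quick illustration of why a merged model can end up with more bins than either input's max_bins would allow during fitting (toy cut points, not from a real fit):

```python
# The merged model's bins come from the union of the two models' cut
# points, so its bin count can exceed either original model's bin count.
cuts_a = [1, 3]   # EBM1: 3 bins
cuts_b = [2, 4]   # EBM2: 3 bins
merged_cuts = sorted(set(cuts_a) | set(cuts_b))
n_merged_bins = len(merged_cuts) + 1
print(n_merged_bins)  # 5: more bins than either original model
```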

JWKKWJ123 (Author) commented


Hi Paul,
Thank you very much! Everything is clear to me now.
