Add sklearn's HistGradientBoosting #64
Comments
Hi Matteo, |
Hi Ahmed,
Next, to add a new operator in Hummingbird:
Please share any doubt or question you may have! |
I'm one of the authors of the histgradientboosting estimators, feel free to ping me if you have any question related to them! |
Glad to see you here Nicholas :) |
Thanks @interesaaat for the detailed introduction and @NicolasHug for offering help! I've installed the dependencies, and built the library using Anyways, I started implementing the First, I believe that the equivalent of The problem is that On the other hand, So my question is, how can we map the |
A However the predictor object of the hist-GBDT is different: it's a single structured numpy array, i.e. it's an array whose elements have a specific dtype with multiple entries. It's basically an array of structs, if we were in C. For example the |
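To make the "array of structs" description concrete, here is a minimal sketch of a NumPy structured array. The dtype below is a hypothetical, trimmed-down node record written for illustration only; the actual hist-GBDT node dtype has more fields, and the exact field set is an assumption here.

```python
import numpy as np

# Hypothetical, trimmed-down node record (illustrative only; the real
# predictor dtype in scikit-learn contains more fields than this).
NODE_DTYPE = np.dtype([
    ("threshold", np.float64),
    ("left", np.uint32),
    ("right", np.uint32),
    ("is_leaf", np.uint8),
])

# A 3-node tree: root (index 0) with two leaf children (indices 1 and 2).
# np.zeros initializes every field of every record to 0.
nodes = np.zeros(3, dtype=NODE_DTYPE)
nodes[0] = (0.5, 1, 2, 0)
nodes[1]["is_leaf"] = 1
nodes[2]["is_leaf"] = 1

# Indexing by position returns one record (one "struct").
root = nodes[0]

# Indexing by field name returns that field for every node at once.
lefts = nodes["left"]
```

Indexing by field name yields the whole column as an ordinary array, so a per-node loop is not needed to collect one field across all nodes.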
Thanks @NicolasHug for the clarification! IIUC, the equivalent of:
    tree_info = operator.raw_operator.estimators_[0][0]
    lefts = tree_info.tree_.children_left
should be:
    tree_info = operator.raw_operator._predictors[0][0]
    lefts = [tree_info.nodes[x]['left'] for x in range(len(tree_info.nodes))]
when using hist-GBDT. If that is the case, it seems like nodes which don't have left nodes are represented with |
Yup, I think you got it right. The array of nodes is initialized with all fields being 0. If a node doesn't have a left child, that means it doesn't have a right child either, so the left/right fields are 0, and the |
I think that for left, right and threshold we should have -1 instead of 0 in Hummingbird because the implementation looks for -1 values. (I might be wrong, but I am on my phone and I am having a hard time checking the code.) Anyway, this is not hard :) Thanks Nicolas for the help! |
OK great,
and
|
@interesaaat would it be a good idea to compare against both |
Why not convert the array "upstream" so that you can rely on the existing code for the non-hist estimators?
    lefts = [tree_info.nodes[x]['left'] for x in range(len(tree_info.nodes))]
    lefts = [idx if idx != 0 else -1 for idx in lefts] |
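The same upstream conversion can be sketched in vectorized form. This assumes, as discussed above, that the root sits at index 0, so 0 can never be a real child index and safely marks "no child". The helper name `remap_missing_children` and the sample arrays are hypothetical, made up for illustration.

```python
import numpy as np

def remap_missing_children(children, missing=-1):
    """Replace the 0 'no child' marker with `missing` (default -1).

    Assumes the root is node 0, so 0 is never a valid child index.
    """
    # Cast to a signed type first: an unsigned index array cannot hold -1.
    children = np.asarray(children, dtype=np.int64)
    return np.where(children == 0, missing, children)

# Hypothetical 5-node tree: node 0 -> (1, 2), node 2 -> (3, 4).
lefts = np.array([1, 0, 3, 0, 0], dtype=np.uint32)
rights = np.array([2, 0, 4, 0, 0], dtype=np.uint32)

lefts_converted = remap_missing_children(lefts)
rights_converted = remap_missing_children(rights)
```

The signed cast matters because child indices stored as unsigned integers would wrap around instead of becoming -1.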
Yeah, that's better! |