variable importance option #52
I ran the feature significance and compared the results to Sklearn's output. Not only are the results different, but the results I'm getting from this implementation don't make any sense (given what I know about my data). Maybe I don't know how to read the output properly? Tsachi
Thanks @Howard-ll , we are happy to hear the tools are useful! There are various definitions of "feature importance" -- they are all metrics about the model/dataset, but there is no absolute "truth" or single best one. We should have a clear documentation page listing all the feature importances we support -- with pointers to the papers that define some of them -> marking this as an "enhancement" for us to work on.
The list of feature importances and their definitions is given here in the Yggdrasil user manual.
Note that different variable importances have different semantics. Unless specified otherwise, the greater the value, the more important the feature.
Sklearn's "mean decrease in impurity" is likely close to the
See the Variable importance section of the user documentation and the model-specific documentation (for example, Random Forest).
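To make the "mean decrease in impurity" idea concrete, here is a minimal, hand-rolled sketch (not TF-DF or sklearn code; the function names and the class counts are made up for illustration). Each split is credited with the impurity reduction it achieves, weighted by the fraction of samples reaching the node; summing these credits per feature gives that feature's importance.

```python
# Illustration only: "mean decrease in impurity" credits a split with the
# Gini impurity reduction it achieves, weighted by the fraction of the
# training samples that reach the node being split.

def gini(counts):
    """Gini impurity of a node, given its per-class sample counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def split_importance(parent, left, right, n_total):
    """Weighted impurity decrease contributed by a single split."""
    n = sum(parent)
    decrease = gini(parent) - (
        sum(left) / n * gini(left) + sum(right) / n * gini(right)
    )
    # Weight by the fraction of all samples that reach this node.
    return (n / n_total) * decrease

# Root split of a 10-sample binary dataset with class counts [6, 4]:
imp = split_importance(parent=[6, 4], left=[5, 1], right=[1, 3], n_total=10)
```

Summing `split_importance` over every split that uses a given feature (and normalizing) yields that feature's mean-decrease-in-impurity score; this is why the score depends on the tree structure and can differ across implementations.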
Hello!
First of all, I highly appreciate your efforts for TFDF
I found that there are multiple options for variable importance, such as NUM_AS_ROOT:
variable_importance = model.make_inspector().variable_importances()['NUM_AS_ROOT']
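Building on the snippet above: in TF-DF, `inspector.variable_importances()` returns a dict mapping an importance name (e.g. `"NUM_AS_ROOT"`) to a list of (feature, value) pairs, and the keys that are present depend on the model type and training settings, so it is worth printing the keys before indexing one. The helper and the sample numbers below are hypothetical, shown with a plain dict standing in for the inspector's return value:

```python
# With a real model you would first list the available importances:
#   inspector = model.make_inspector()
#   print(inspector.variable_importances().keys())
# and then index one, e.g. inspector.variable_importances()["NUM_AS_ROOT"].

def top_features(importances, name, k=3):
    """Return the k feature names with the highest value for one importance."""
    pairs = importances[name]
    ranked = sorted(pairs, key=lambda p: -p[1])
    return [feature for feature, _ in ranked][:k]

# Stand-in for inspector.variable_importances() (made-up feature names/values):
fake_importances = {
    "NUM_AS_ROOT": [("age", 12.0), ("income", 3.0), ("city", 1.0)],
    "SUM_SCORE": [("income", 854.2), ("age", 512.9), ("city", 90.1)],
}

top_features(fake_importances, "NUM_AS_ROOT", k=2)  # -> ["age", "income"]
```

Note that the two importances rank the features differently here on purpose: as the maintainers say above, each importance has its own semantics, so disagreement between them (or with sklearn) is expected rather than a bug.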
Thank you!