Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support gradient boosting classifiers #23

Merged
merged 12 commits into from
May 21, 2021
Merged

Support gradient boosting classifiers #23

merged 12 commits into from
May 21, 2021

Conversation

iamDecode
Copy link
Owner

@iamDecode iamDecode commented May 20, 2021

The long awaited gradient boosting model support is finally here.

I have found that PMML generation for GBM models is unfortunately not very consistent. I had to account for many different versions which resulted in a rather massive change.

There are some parts I am yet uncertain about:

  • For a yet unknown reason scores in regression trees for xgboost have to be multiplied by 10. PMML's Target.rescaleFactor seems related. I've added a FIXME note, and it should be revisited when supporting regression trees outside of gradient boosting.
  • I noticed for some PMML exports, the prediction probabilities to be slightly off compared to the original and pypmml. This seems happen at
    value -= 0.00000000001
    , as switching the - for a + fixes the probabilities for those models. However, that inevitably breaks the predictions of other models, which in my testing is way more significant. I am not sure what causes this discrepancy, but was not able to fix it just yet. However, this is such a slight effect I'm willing to accept it for now. Should reinvestigate later.

After this PR, supporting regression should be fairly straightforward.

Closes #20

@codecov
Copy link

codecov bot commented May 20, 2021

Codecov Report

Merging #23 (6e698c0) into master (0a42d68) will increase coverage by 0.22%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master       #23      +/-   ##
===========================================
+ Coverage   99.77%   100.00%   +0.22%     
===========================================
  Files          11        12       +1     
  Lines         440       559     +119     
===========================================
+ Hits          439       559     +120     
+ Misses          1         0       -1     
Impacted Files Coverage Δ
sklearn_pmml_model/base.py 100.00% <100.00%> (ø)
sklearn_pmml_model/ensemble/__init__.py 100.00% <100.00%> (ø)
sklearn_pmml_model/ensemble/forest.py 100.00% <100.00%> (ø)
sklearn_pmml_model/ensemble/gb.py 100.00% <100.00%> (ø)
sklearn_pmml_model/linear_model/implementations.py 100.00% <100.00%> (ø)
sklearn_pmml_model/naive_bayes/implementations.py 100.00% <100.00%> (ø)
sklearn_pmml_model/tree/__init__.py 100.00% <100.00%> (ø)
sklearn_pmml_model/tree/tree.py 100.00% <100.00%> (+1.02%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a42d68...6e698c0. Read the comment docs.

@lgtm-com
Copy link

lgtm-com bot commented May 20, 2021

This pull request introduces 4 alerts when merging 6df5bcb into 0a42d68 - view on LGTM.com

new alerts:

  • 2 for Except block handles 'BaseException'
  • 1 for Conflicting attributes in base classes
  • 1 for Unused import

@lgtm-com
Copy link

lgtm-com bot commented May 21, 2021

This pull request introduces 1 alert when merging 63d5ca3 into 0a42d68 - view on LGTM.com

new alerts:

  • 1 for Conflicting attributes in base classes

@lgtm-com
Copy link

lgtm-com bot commented May 21, 2021

This pull request fixes 7 alerts when merging cf9da61 into 0a42d68 - view on LGTM.com

fixed alerts:

  • 6 for Conflicting attributes in base classes
  • 1 for Unused import

The case of mismatched type between target field and score distribution value was unaccounted for
@lgtm-com
Copy link

lgtm-com bot commented May 21, 2021

This pull request fixes 8 alerts when merging 6e698c0 into 0a42d68 - view on LGTM.com

fixed alerts:

  • 7 for Conflicting attributes in base classes
  • 1 for Unused import

@iamDecode iamDecode merged commit f365e9a into master May 21, 2021
@iamDecode iamDecode deleted the decode/gbm branch May 21, 2021 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GBM Importer
1 participant