
Is it possible to combine LGBMClassifier and IsotonicRegressor into a single PMML? #146

Closed
liamjoy opened this issue Aug 4, 2020 · 10 comments


@liamjoy

liamjoy commented Aug 4, 2020

I have been able to do this by creating separate PMML files for the LGBMClassifier and the IsotonicRegression, and then copying the IsotonicRegression PMML into the LGBM PMML as the final Segment of a model chain. I have looked into using StackingClassifier/StackingRegressor, but because LGBM is a classifier and Isotonic is a regressor, they do not allow it. I am also unable to use two models in a single pipeline as only one estimator is allowed. Is it possible to do this using a single pipeline, or is there some other workaround?

@vruusmann
Member

In plain English, what is this model chain supposed to do? What is the function of LGBMClassifier, and what is the function of IsotonicRegression?

Are you trying to "smooth" the prediction of the classifier?

I have looked into using StackingClassifier/StackingRegressor, but because LGBM is a classifier and Isotonic is a regressor, they do not allow it.

I assume you're referring to Scikit-Learn's stacking estimator classes, and that it is Scikit-Learn that prevents you from building such a model chain (not the SkLearn2PMML/JPMML-SkLearn stack).

I am also unable to use two models in a single pipeline as only one estimator is allowed

Possible workaround - the first estimator should be packaged as a transformer: jpmml/sklearn2pmml#118

@liamjoy
Author

liamjoy commented Aug 4, 2020

The LGBMClassifier takes in around 100 features to predict a binary target class. The isotonic regression is used to calibrate the model predictions to match a different distribution. The output should be the probability of the target being 1, after calibration.
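
In plain Scikit-Learn terms, the setup looks roughly like this (a minimal sketch; X_train, y_train, X_calib, y_calib and X_new are placeholder names, and the use of a separate calibration set is an assumption):

from lightgbm import LGBMClassifier
from sklearn.isotonic import IsotonicRegression

# Fit the classifier on ~100 features and a binary target
classifier = LGBMClassifier()
classifier.fit(X_train, y_train)

# Calibrate the positive-class probability against a calibration set
calibrator = IsotonicRegression(out_of_bounds = "clip")
calibrator.fit(classifier.predict_proba(X_calib)[:, 1], y_calib)

# Calibrated probability of the target being 1
p_calibrated = calibrator.predict(classifier.predict_proba(X_new)[:, 1])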

Thank you, I will look into packaging the LGBMClassifier as a transformer.

@vruusmann
Member

vruusmann commented Aug 4, 2020

The isotonic regression is used to calibrate the model predictions to match a different distribution.

This looks like a "decision engineering" problem - taking the prediction of a model, and then doing something extra with it.

In such a case LGBMClassifier is still the primary/final estimator of the pipeline, and the challenge is about applying IsotonicRegression to its predicted probability.

Decision engineering is not supported by Scikit-Learn pipelines. However, the sklearn2pmml.pipeline.PMMLPipeline class lets you specify three attributes predict_transformer, predict_proba_transformer and apply_transformer to accomplish it: https://github.com/jpmml/sklearn2pmml/blob/0.61.0/sklearn2pmml/pipeline/__init__.py#L47-L51

Suppose you want to manually correct the predicted probability of a binary classifier:

from sklearn2pmml.preprocessing import ExpressionTransformer

pipeline = PMMLPipeline(.., predict_proba_transformer = ExpressionTransformer("X[1] * 0.95 + 0.1"))

In your case, you should package the IsotonicRegression as a transformer instead of hand-writing an expression.

@sidelmary

Hi Villu!

Could you suggest a way to package IsotonicRegression as a transformer, please?
I tried ModelTransformer from jpmml/sklearn2pmml#118, but ran into the following error:

SEVERE: Failed to convert
java.lang.IllegalArgumentException: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.predict_proba_transformer' has an unsupported value (Python class __main__.ModelTransformer)
	at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
	at org.jpmml.sklearn.PyClassDict.get(PyClassDict.java:57)
	at org.jpmml.sklearn.PyClassDict.getOptional(PyClassDict.java:67)
	at sklearn2pmml.pipeline.PMMLPipeline.getTransformer(PMMLPipeline.java:441)
	at sklearn2pmml.pipeline.PMMLPipeline.getPredictProbaTransformer(PMMLPipeline.java:433)
	at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:101)
	at org.jpmml.sklearn.Main.run(Main.java:145)
	at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
	at java.lang.Class.cast(Unknown Source)
	at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
	... 7 more


I'm also not sure how to represent isotonic regression as an expression for ExpressionTransformer. The only idea that came to my mind is to iteratively build a string of "if/else" clauses that interpolate between the threshold values of scipy.interpolate.interp1d, which sklearn's IsotonicRegression is based on. But that doesn't seem like a good solution to me.

Are there any other options to wrap IsotonicRegression in a transformer? Or maybe is there a better solution with ExpressionTransformer?

@vruusmann
Member

java.lang.IllegalArgumentException: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.predict_proba_transformer' has an unsupported value (Python class __main__.ModelTransformer)

Looks like you're trying to develop a custom transformer. You've implemented the Python side, but you haven't implemented the Java side yet, nor informed the SkLearn2PMML package about it.

This was discussed recently here: jpmml/sklearn2pmml#283

And I'm also not sure how to represent isotonic regression as an expression for ExpressionTransformer.

See the EstimatorTransformer class from the Scikit-Lego package (I decided to reuse an existing 3rd party class instead of coming up with my own).

Something like this:

from sklego.meta import EstimatorTransformer

# A pre-fitted Isotonic regression
isotonicRegression = ..

pipeline = PMMLPipeline(.., predict_proba_transformer = EstimatorTransformer(isotonicRegression))

@sidelmary

Thanks for the quick response!

I found EstimatorTransformer in the supported packages list at https://github.com/jpmml/jpmml-sklearn and tried to use it, but ran into two issues:

  1. I can't dump the pipeline to PMML; I get the same error: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.predict_proba_transformer' has an unsupported value (Python class sklego.meta.estimator_transformer.EstimatorTransformer). It is probably a library version issue (I use sklearn2pmml==0.49.3).
  2. I can't use predict_proba_transform; I get ValueError: Isotonic regression input should be a 1d array, since the output of the model's predict_proba is a 2D array, but isotonic regression expects a 1D array.

Is there a workaround for using EstimatorTransformer, or is building a custom transformer the only way?

@vruusmann
Member

vruusmann commented Jun 23, 2021

It is probably a library version issue (I use sklearn2pmml==0.49.3)

Exactly - support for the sklego.meta.EstimatorTransformer transformation type was added in SkLearn2PMML version 0.73.0 (released ~3 days ago).

Can't use predict_proba_transform, since the output of the model's predict_proba is a 2D array, but isotonic regression expects a 1D array.

Use a helper transformer to select a single column (e.g., the probability of class Z) out of the available ones:

pipeline = PMMLPipeline(..,
  predict_proba_transformer = Pipeline([
    ("select_col", ExpressionTransformer("X[1]")),
    ("transform_col", IsotonicRegression())
  ])
)

Is there a workaround for using EstimatorTransformer, or is building a custom transformer the only way?

Honestly, just upgrade the SkLearn2PMML package to the latest version.
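
Putting the pieces of this thread together, a hedged end-to-end sketch (classifier and calibrator stand for the pre-fitted LGBMClassifier and IsotonicRegression; whether PMMLPipeline.fit would also re-fit the predict_proba_transformer is not covered here and may depend on the SkLearn2PMML version):

from sklearn.pipeline import Pipeline
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml.preprocessing import ExpressionTransformer

# `classifier` is a fitted LGBMClassifier, `calibrator` is a fitted IsotonicRegression
pipeline = PMMLPipeline([
    ("classifier", classifier)
  ],
  predict_proba_transformer = Pipeline([
    # Select the positive-class probability column, then calibrate it
    ("select_col", ExpressionTransformer("X[1]")),
    ("transform_col", calibrator)
  ])
)

sklearn2pmml(pipeline, "LGBMClassifier_IsotonicRegression.pmml")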

@sidelmary

Hi, Villu!

I updated the sklearn2pmml library and ran into a new issue while building the PMMLPipeline.
Code:

model = XGBClassifier( ... )
model.fit(x, y)
pipeline = PMMLPipeline([('classifier', model)])

Error:

 53                 self.apply_transformer = apply_transformer
   54                 # SkLearn 0.24+
---> 55                 super(PMMLPipeline, self).__init__(steps = steps, memory = memory, verbose = verbose)
   56 
   57         def __repr__(self):

TypeError: __init__() got an unexpected keyword argument 'verbose'

sklearn2pmml version:
0.73.0

0.60.0 and older work well, but EstimatorTransformer isn't supported there.

@vruusmann
Member

TypeError: __init__() got an unexpected keyword argument 'verbose'

The sklearn.pipeline.Pipeline constructor introduced the verbose parameter in Scikit-Learn 0.21.0:
https://scikit-learn.org/0.21/modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline

Why would anyone use a pre-0.21 version in June 2021?

@sidelmary
Copy link

It works with the updated libraries!
Thank you for your suggestions, they helped a lot!
