[FEATURE] Pass additional parameters to fit underlying estimator in `EstimatorTransformer` #530

CarloLepelaars · 2022-09-12T11:18:12Z

In EstimatorTransformer, the underlying estimator is fitted without any way to pass additional arguments along to self.estimator_.fit.

This limits the use cases for EstimatorTransformer. For example, if the underlying estimator is an XGBClassifier, we would like to pass eval_set to monitor validation performance and enable early stopping. This is currently not possible. Accepting *args, **kwargs in fit and forwarding them to the underlying estimator should fix this.

scikit-lego/sklego/meta/estimator_transformer.py

Line 31 in b4d087f

self.estimator_.fit(X, y)
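
A minimal, dependency-free sketch of the requested behaviour, forwarding keyword arguments only. The class names `KwargsForwardingTransformer` and `RecordingEstimator` are hypothetical stand-ins, and `copy.deepcopy` is used here in place of whatever cloning the real class does; this is not sklego's actual implementation.

```python
import copy


class RecordingEstimator:
    """Hypothetical estimator whose fit() accepts extra keyword arguments."""

    def fit(self, X, y, **fit_params):
        # Remember what was forwarded so the behaviour can be inspected.
        self.fit_params_ = fit_params
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class KwargsForwardingTransformer:
    """Sketch of a meta-transformer that forwards fit-time kwargs."""

    def __init__(self, estimator, predict_func="predict"):
        self.estimator = estimator
        self.predict_func = predict_func

    def fit(self, X, y, **kwargs):
        # Fit a copy so the estimator passed to __init__ stays untouched.
        self.estimator_ = copy.deepcopy(self.estimator)
        # The one-line change: pass the extra keyword arguments along.
        self.estimator_.fit(X, y, **kwargs)
        return self

    def transform(self, X):
        # Expose the predictions as a single feature column.
        predictions = getattr(self.estimator_, self.predict_func)(X)
        return [[p] for p in predictions]


transformer = KwargsForwardingTransformer(RecordingEstimator())
transformer.fit([[1], [2]], [1.0, 3.0], sample_weight=[1, 2])
forwarded = transformer.estimator_.fit_params_
```

After the call, `forwarded` holds whatever was passed to `fit`, which is the kind of check a unit test for this feature could make.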


CarloLepelaars · 2022-09-12T11:44:49Z

@koaning

> In the future, please don't make a PR until the direction of the solution has been discussed in the issues.
>
> I'll pick up the conversation there.

Ok, no problem! Will keep that in mind.

koaning · 2022-09-12T11:44:57Z

It's a bit of an sklearn antipattern to pass lots of settings via .fit(). Is XGBoost part of sklearn or a 3rd party lib?

CarloLepelaars · 2022-09-12T11:58:10Z

XGBoost is a 3rd party library maintained by dmlc.

I agree hyperparameters shouldn't be passed via .fit(). Unfortunately, some parameters, like eval_set, can often only be passed via .fit(). I believe sample_weight for scikit-learn estimators can also only be passed through .fit(), but I am not completely sure.

Other example use cases include CatBoost and LightGBM parameters that can only be passed through .fit(). These libraries are also 3rd party, but they are very often used within scikit-learn Pipelines.

UPDATE: From looking at the scikit-learn source code, it seems sample_weight can only be passed through .fit() and never as a class initialization parameter. Adding it as an optional parameter to EstimatorTransformer.fit() would not necessarily extend this library beyond scikit-learn. I understand that the case for generalizing to *args, **kwargs is a bit trickier.
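
To illustrate the sample_weight point with plain scikit-learn, here is a small sketch. It assumes scikit-learn is installed and uses `DummyRegressor` only because its weighted mean is easy to verify by hand; the data and step names are made up for the example.

```python
from sklearn.dummy import DummyRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X = [[0.0], [1.0], [2.0]]
y = [1.0, 2.0, 6.0]
weights = [1.0, 1.0, 0.0]  # a zero weight effectively ignores the third sample

# sample_weight is not a constructor argument; it is supplied at fit time.
unweighted = DummyRegressor(strategy="mean").fit(X, y)
weighted = DummyRegressor(strategy="mean").fit(X, y, sample_weight=weights)

# Inside a Pipeline, fit-time keyword arguments are addressed to a step
# with the "<step name>__<parameter>" convention.
pipe = Pipeline(
    [("scale", StandardScaler()), ("model", DummyRegressor(strategy="mean"))]
)
pipe.fit(X, y, model__sample_weight=weights)

unweighted_pred = float(unweighted.predict([[0.0]])[0])  # mean of 1, 2, 6
weighted_pred = float(weighted.predict([[0.0]])[0])      # mean of 1, 2
pipeline_pred = float(pipe.predict([[0.0]])[0])          # same weighted mean
```

A meta-estimator whose fit() only accepts X and y has no way to receive such an argument, which is the limitation described above.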

koaning · 2022-09-12T12:10:29Z

I'm a bit uneasy about extending this library beyond scikit-learn because the dependencies quickly start to stack up. @MBrouns what's your opinion on this?

MBrouns · 2022-09-12T13:11:39Z

sklearn does describe the use of kwargs in fit methods: https://scikit-learn.org/stable/developers/develop.html#fitting. I'm not sure I like varargs, but I don't see a lot of problems with varkwargs. I would like to see a test added in the PR before accepting it, though.

koaning · 2022-09-12T13:36:03Z

TIL.

Yeah so if scikit-learn supports **kwargs then I won't mind.

CarloLepelaars · 2022-09-12T13:54:08Z

Sounds great! I removed the *args option and added a test.

@koaning Can we reopen this PR or should I create a new one?

koaning · 2022-09-12T20:42:02Z

Either option is fine. Just as long as an issue is discussed before a solution is implemented.

koaning · 2022-09-12T20:42:44Z

Oh! And one more thing. If you're adding this behavior to the estimator transformer, could you also add it to the estimatorpredictor?

CarloLepelaars · 2022-09-12T23:15:13Z

Sure! Will check that out tomorrow. Do you mean pass **kwargs through the .transform method in EstimatorTransformer (+ a test case)?

koaning · 2022-09-13T09:01:08Z

This issue is about the .fit() method, no?

CarloLepelaars · 2022-09-13T10:00:22Z

Yes, but what exactly do you mean by estimatorpredictor otherwise? Don't see an EstimatorPredictor object in this repository (on the main branch).

koaning · 2022-09-13T11:00:14Z

Ah! Crud. My bad.

I was confusing this with the Grouped variants of the meta estimators. Those come with a predictor variant.

Please ignore the previous comment.

CarloLepelaars · 2022-09-13T11:14:48Z

Aha, thanks for clearing that up! Then I think we are ready for the PR. Will create a fresh one.

CarloLepelaars added the enhancement label Sep 12, 2022

CarloLepelaars mentioned this issue Sep 12, 2022

WIP: Pass along arbitrary parameters to fit EstimatorTransformer #531

Closed

CarloLepelaars changed the title ~~[FEATURE] Pass parameters to fit of underlying estimator in EstimatorTransformer~~ [FEATURE] Pass additional parameters to fit underlying estimator in EstimatorTransformer Sep 12, 2022

CarloLepelaars mentioned this issue Sep 13, 2022

Pass kwargs to fit method of EstimatorTransformer. #532

Merged

CarloLepelaars closed this as completed Sep 27, 2022
