Refactor transformed estimators #51
Comments
@aazuspan, I like it! I especially like the prevention of instantiation by code rather than by convention and, as you say, it gets rid of some duplicated code as well. A couple of quick questions/comments just to make sure I get it:
Bingo!
Yes, great point! That was unintentional - we should definitely choose a better name. Is
Separating out the transformer fitting is a workaround to an inheritance problem that arises when It's 100% possible there's a better workaround that I didn't think of though!
I guess I have a slight preference for the latter, but either is fine by me.
Oof, let's avoid that complexity! Having a separate
Agreed! Should we stick with
Sounds like a plan! I already worked most of this up to make sure it was possible, so I'll make a PR shortly.
Ooo, maybe |
This resolves #51 by:
* Refactoring the TransformedKNeighborsRegressor into an abstract class
* Moving transformer instantiation out of the estimator fit methods and into an abstract _get_transformer method
* Moving transformer fitting out of the estimator fit methods and into the TransformedKNeighborsRegressor._set_fitted_transformer method to reduce duplication
* Creating a YFitMixin to handle transformer fitting for GNN and MSN

This also:
* Renames the transform_ attribute to transformer_ for consistency with the new methods
* Adds a _validate_data check with force_all_finite=True when fitting all transformed estimators. This was needed by MSN, but also fixed an xfailing estimator check for GNN, which allowed us to drop that from the tags.
Resolved by #52
@grovduck I have a proposal to run by you, inspired by your refactor of the inheritance in #50. This design builds off of the changes there, so that would be merged before tackling this. I thought about proposing this as part of that PR, but didn't want to derail things with even more refactoring.
I suggest we turn TransformedKNeighborsRegressor into an abstract class by inheriting from ABC, then add an abstract method, _get_transform, that all subclasses would be required to implement.

Next, we would get rid of the fit methods on those estimators and move that functionality into TransformedKNeighborsRegressor.fit, using a _set_fit_transform method to handle the instantiation and fitting of the transformer.

To accommodate fitting transformers with separate y data in GNN and MSN, we could add a kNN-independent YFitMixin that overrides the fit method to accept the additional argument, store it as an attribute, and fit the transformer with it using an overridden _set_fit_transform.

Overall, this should reduce some code duplication in the fit methods, prevent instantiation of TransformedKNeighborsRegressor without having to rely on making it private, and add a runtime check to ensure that all transformed estimators define a transformer function. The main downside is the need to store a reference to y_fit_. There may be another way to handle the YFitMixin, but that was the best solution I could come up with after trying a few different strategies.

Curious to hear your thoughts on this design, and if you foresee any limitations.
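For reference, here is a minimal, dependency-free sketch of how the pieces described above could fit together. It is not the project's actual code: TransformedRegressor stands in for TransformedKNeighborsRegressor with the kNN logic left out, MeanCenterer and YScaler are hypothetical toy transformers, and the y_fit argument name and its fallback to y are assumptions. Only _get_transform, _set_fit_transform, YFitMixin, transform_ and y_fit_ come from the proposal.

```python
from abc import ABC, abstractmethod


class MeanCenterer:
    """Hypothetical toy transformer that is fit on X only."""

    def fit(self, X, y=None):
        n = len(X)
        self.means_ = [sum(col) / n for col in zip(*X)]
        return self

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]


class YScaler:
    """Hypothetical toy transformer whose fit needs separate y data."""

    def fit(self, X, y):
        self.scale_ = max(abs(v) for v in y) or 1.0
        return self

    def transform(self, X):
        return [[v / self.scale_ for v in row] for row in X]


class TransformedRegressor(ABC):
    """Stand-in for TransformedKNeighborsRegressor (kNN logic omitted)."""

    @abstractmethod
    def _get_transform(self):
        """Return an unfitted transformer; every subclass must define this."""

    def _set_fit_transform(self, X, y):
        # Default behaviour: instantiate the transformer and fit it on X only.
        self.transform_ = self._get_transform().fit(X)

    def fit(self, X, y):
        self._set_fit_transform(X, y)
        # A real estimator would fit the kNN model on the transformed X here.
        self.X_ = self.transform_.transform(X)
        self.y_ = y
        return self


class YFitMixin:
    """Mixin for estimators whose transformer is fit with separate y data."""

    def fit(self, X, y, y_fit=None):
        # Assumption: fall back to y when no separate y_fit is supplied.
        self.y_fit_ = y if y_fit is None else y_fit
        return super().fit(X, y)

    def _set_fit_transform(self, X, y):
        # Fit the transformer with the stored y_fit_ instead of X alone.
        self.transform_ = self._get_transform().fit(X, self.y_fit_)


class XOnlyRegressor(TransformedRegressor):
    def _get_transform(self):
        return MeanCenterer()


class YFitRegressor(YFitMixin, TransformedRegressor):
    def _get_transform(self):
        return YScaler()


est = YFitRegressor().fit([[2.0], [4.0]], [0, 1], y_fit=[1.0, -2.0])
```

Calling TransformedRegressor() directly raises a TypeError because _get_transform is abstract, which is the instantiation guard described above. The mixin has to come first in the base class list so that its fit and _set_fit_transform take precedence in the method resolution order.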