Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
[MRG] ENH add missing indicator in KNNImputer #15010
For another PR
thomasjpfan left a comment •
This type of abstraction usually makes code harder to read for people unfamiliar with the code because one needs to go to the parent class to know information such as "when is
An alternative would be something like this:
class _BaseImputer: def __init__(self, missing_values=np.nan, add_indicator=False): self.missing_values = missing_values self.add_indicator = add_indicator def _fit_indicator(self, X): if self.add_indicator: self.indicator_ = MissingIndicator( missing_values=self.missing_values, error_on_new=False) self.indicator_.fit(X) else: self.indicator_ = None def _concatenate_indicator(self, X): if self.add_indicator is None: return X X_trans_indicator = self.indicator_.transform(X) hstack = sparse.hstack if sparse.issparse(X) else np.hstack return hstack((X_trans_imputer, X_trans_indicator)) class SimpleImputer(_BaseImputer): def fit(self, X): # fit things super()._fit_indicator(X) return self def transform(self, X): # do transform things return super()._concatenate_indicator(self, X)
Personally, I can read both abstraction styles about the same. But I think the above implementation is a little more explicit.
We need to decide which design is easier to maintain.
The current design, has the parent call functions that a child will define. This API exposed this contract through the abstractness of
My proposal has the child calls methods in the parent class, which I consider more explicit. The parent class provides methods that the child class calls to do common operations. (The operations may change state, or use that state to do
So I still need to go to the parents to know what it does.
Where I find your design a bit dangerous is that nothing will avoid me to create a new imputer without calling
This said, if @jnothman could give some advice, I'll be happy to go with one design or the other (I am far to be good with those).
You would know to go to the parent class to see what it does, since it explicitly calls the parent method. Imagine a new contributor with this PR:
With the my proposal:
I am also happy with either design.