Detect single-column transformers when they are the first step of a Pipeline#1900
Conversation
|
|
||
| This is done by checking the special attribute | ||
| __single_column_transformer__ (thus inheriting from the | ||
| SingleColumnTransformer class is not mandatory). We treat scikit-learn |
There was a problem hiding this comment.
In what situations would it be better to just set single_column_transformer rather than inheriting from the class?
There was a problem hiding this comment.
for example you don't want the checks and default methods provided by the base class, you're a third-party module that wants to interoperate with skrub but not depend on it, ... there can be a few reasons so a kind of "protocol" like this attribute can be less cumbersome than imposing inheritance. in any case that is already what is in place so not really related to this PR
There was a problem hiding this comment.
Makes sense. I wonder if this should be explained somewhere in the documentation (for very advanced users).
There was a problem hiding this comment.
probably -- we can polish the SingleColumnTransformer docs in #1851
| return True | ||
| if isinstance(transformer, Pipeline): | ||
| try: | ||
| return is_single_column_transformer(transformer.steps[0][1]) |
There was a problem hiding this comment.
There should be a comment here to explain why the function is calling itself, and why the 1 transformers.steps[0][1]
It's because each step is a tuple (name, transformer), but it's not clear from the code
rcap107
left a comment
There was a problem hiding this comment.
Looks good to me, thanks a lot @jeromedockes
when we inspect an estimator to find out if it is a single-column transformer, special-case Pipelines and look at the first step