FIX FunctionTransformer overwrite column names if not consistent #28241

Merged
merged 16 commits into from
Feb 1, 2024

Conversation

glemaitre
Member

closes #28232

This makes the FunctionTransformer more lenient by overwriting the column names when the output is not consistent with the behaviour of get_feature_names_out. Previously we raised an error instead, but we relied on this inconsistency within the ColumnTransformer.

One question remains: do we want to trigger a copy when setting the columns? I think it is weird that the input X gets modified once it is passed to the FunctionTransformer.
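The concern about modifying X can be illustrated with plain pandas. This is only a sketch of the general trade-off, not the code in this PR: `rename_in_place` and `rename_on_copy` are hypothetical helpers, and the deep `copy()` is one possible way to protect the caller's frame.

```python
import pandas as pd


def rename_in_place(df, names):
    # Assigning to .columns mutates the object the caller passed in.
    df.columns = list(names)
    return df


def rename_on_copy(df, names):
    # Copy first so the caller's DataFrame keeps its original column names.
    out = df.copy()
    out.columns = list(names)
    return out


X = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

copied = rename_on_copy(X, ["x", "y"])
print(list(X.columns), list(copied.columns))  # X keeps its names here

renamed = rename_in_place(X, ["x", "y"])
print(list(X.columns), renamed is X)  # X itself has been renamed
```

The copy costs extra memory, which is the trade-off behind the question above. If the user function already returns a new object, renaming that object does not touch the original input either way.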

github-actions bot commented Jan 24, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 6cc0ea8.

@glemaitre glemaitre added this to the 1.4.1 milestone Jan 24, 2024
Member

@adrinjalali adrinjalali left a comment

This ended up a large diff 😁

@glemaitre
Member Author

glemaitre commented Jan 31, 2024

The failure will be solved by merging #28262

Member

@adrinjalali adrinjalali left a comment

otherwise LGTM.

@celestinoxp
As I mentioned before, my pull request in pycaret to add scikit-learn 1.4 support failed. I have now patched the scikit-learn code on my laptop with the modifications from this PR and re-run, against pycaret, the test that failed because of get_feature_names_out. The test now passes!
I'm commenting to share this result in the hope that it helps.

For more context, my pycaret PR with the tests failing because of get_feature_names_out is here: pycaret/pycaret#3857

@thomasjpfan added the "To backport" label (PR merged in master that need a backport to a release branch defined based on the milestone) on Feb 1, 2024
Labels
module:compose, module:preprocessing, To backport (PR merged in master that need a backport to a release branch defined based on the milestone)
Development

Successfully merging this pull request may close these issues.

Regression in ColumnTransformer due to internal FunctionTransformer
4 participants