ENH Add sparse_threshold keyword to ColumnTransformer (#11614)
Reasoning: when, for example, OneHotEncoder is used as part of a ColumnTransformer, it causes the final result to be a sparse matrix. Since this is the typical case for mixed-dtype data, many pipelines would have to deal with sparse data implicitly, even when there are only a few categorical features with low cardinality. The first idea was to change the default of `OneHotEncoder` sparse to False, but based on a gitter discussion (https://gitter.im/scikit-learn/dev?at=5b4e5a69a94c5255523bc9fc) we decided to let ColumnTransformer switch between sparse and dense output based on a threshold. The user still has full control to always or never get sparse output.
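As a rough illustration of the switching behaviour described above, the sketch below builds a small ColumnTransformer and varies `sparse_threshold`. The toy data and the helper name `transform_with_threshold` are hypothetical and not part of this commit. It assumes the rule from the scikit-learn documentation: when any transformer returns a sparse matrix, the stacked result stays sparse only if its overall density is below the threshold.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: column 0 is a categorical code with three
# distinct values, column 1 is a numeric feature.
X = np.array([
    [0.0, 1.5],
    [1.0, 2.5],
    [2.0, 0.5],
    [0.0, 3.0],
])


def transform_with_threshold(threshold):
    """Fit a small ColumnTransformer and return its stacked output."""
    ct = ColumnTransformer(
        [
            ("onehot", OneHotEncoder(), [0]),   # sparse output by default
            ("scale", StandardScaler(), [1]),   # dense output
        ],
        sparse_threshold=threshold,
    )
    return ct.fit_transform(X)


# The one-hot block is 4x3 with 4 non-zeros; the dense 4x1 block is
# counted as fully populated, so the overall density is 8 / 16 = 0.5.
for threshold in (0.3, 0.8, 0):
    out = transform_with_threshold(threshold)
    kind = "sparse" if sparse.issparse(out) else "dense"
    print(f"sparse_threshold={threshold}: {kind} output, shape {out.shape}")
```

With this data, a threshold of 0.3 should give a dense array because the density of 0.5 is not below it, 0.8 should keep the result sparse, and 0 should always give dense output.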
1 parent c2b7478, commit cf897de
Showing 3 changed files with 117 additions and 27 deletions.