When using, for example, the transformers `CountVectorizer`, `TruncatedSVD`, and `SelectKBest` with `GridSearchCV`, the search can choose a value for `n_features` of `CountVectorizer` that is less than `n_components` for `TruncatedSVD` or `k` for `SelectKBest`.
This leads to an error: ValueError: n_components must be < n_features
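A minimal reproduction might look like the sketch below. It skips `GridSearchCV` and fits one failing parameter combination directly. The toy corpus is made up for illustration, and using `max_features` to limit the number of columns produced by `CountVectorizer` is an assumption about what the grid varies. The exact error text may differ between scikit-learn versions.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline

# Toy corpus (made up for illustration) with more than 5 distinct words.
docs = [
    "the quick brown fox jumps",
    "over the lazy dog today",
    "a quick brown dog barks",
    "the lazy fox sleeps over there",
]

# max_features=5 caps the vocabulary at 5 columns, but the SVD asks for 10.
pipe = make_pipeline(
    CountVectorizer(max_features=5),
    TruncatedSVD(n_components=10),
)

try:
    pipe.fit(docs)
    error = None
except ValueError as exc:
    error = exc  # exact wording depends on the scikit-learn version

print(type(error).__name__, error)
```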
For `SelectKBest` I found a temporary solution:
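from sklearn.feature_selection import SelectKBest

class SelectAtMostKBest(SelectKBest):
    def _check_params(self, X, y):
        # Fall back to keeping every feature when k exceeds the available columns.
        if not (self.k == "all" or 0 <= self.k <= X.shape[1]):
            self.k = "all"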
But there is no equivalent for `TruncatedSVD`.
Is this behaviour intended? If yes, what can I do about this?
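One possible workaround for `TruncatedSVD`, in the same spirit as `SelectAtMostKBest`, is sketched below. The class name `TruncatedSVDAtMost` is hypothetical and not part of scikit-learn. The sketch assumes `X` exposes `.shape` (a NumPy array or SciPy sparse matrix) and caps `n_components` at `n_features - 1` only for the duration of the fit, so the value set by the grid search stays visible through `get_params()`. It does not handle the edge case of a single input feature.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD


class TruncatedSVDAtMost(TruncatedSVD):
    """Hypothetical helper: caps n_components at n_features - 1 during fit."""

    def fit_transform(self, X, y=None):
        requested = self.n_components
        # Assumes X exposes .shape (NumPy array or SciPy sparse matrix).
        self.n_components = max(1, min(requested, X.shape[1] - 1))
        try:
            return super().fit_transform(X, y)
        finally:
            # Restore the requested value so get_params() and clone() still see it.
            self.n_components = requested


X = np.random.RandomState(0).rand(20, 5)   # only 5 features available
svd = TruncatedSVDAtMost(n_components=10)  # asks for more than exist
X_reduced = svd.fit_transform(X)
print(X_reduced.shape)
```

Overriding `fit_transform` rather than `fit` relies on `TruncatedSVD.fit` delegating to `fit_transform`, which should be verified against the installed scikit-learn version.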