Closed
Description
I have a sparse dataset that is too large for main memory if I call X.todense(). If I understand correctly, GradientBoostingClassifier.fit will accept my sparse X, but it is not currently possible to use GradientBoostingClassifier.predict on the results. It would be great if that were not the case.
Here is a minimal example of the issue:
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
X, y = make_classification(n_samples=20, n_features=5, random_state=0)
X_sp = sparse.coo_matrix(X)
clf = GradientBoostingClassifier()
clf.fit(X, y)
clf.predict(X) # works
clf.fit(X_sp, y) # works
clf.predict(X_sp) # fails with TypeError: A sparse matrix was passed, but dense data is required.
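Until predict accepts sparse input, one possible workaround is to densify only a few rows at a time, so the full dense matrix never has to fit in memory. The sketch below assumes the row chunks themselves fit comfortably in memory; predict_in_chunks and its chunk_size parameter are hypothetical names introduced here for illustration, not part of scikit-learn.

```python
import numpy as np
from scipy import sparse
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier


def predict_in_chunks(clf, X_sparse, chunk_size=1000):
    """Hypothetical helper: predict on a sparse matrix, densifying
    at most chunk_size rows at a time."""
    # CSR supports efficient row slicing; COO does not support slicing at all.
    X_csr = sparse.csr_matrix(X_sparse)
    parts = []
    for start in range(0, X_csr.shape[0], chunk_size):
        # Only this block of rows is ever held in dense form.
        block = X_csr[start:start + chunk_size].toarray()
        parts.append(clf.predict(block))
    return np.concatenate(parts)


X, y = make_classification(n_samples=20, n_features=5, random_state=0)
X_sp = sparse.coo_matrix(X)

clf = GradientBoostingClassifier(random_state=0).fit(X_sp, y)
pred = predict_in_chunks(clf, X_sp, chunk_size=8)
```

This trades some speed for bounded memory use: a larger chunk_size means fewer predict calls but a bigger temporary dense array.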