Fit for AutoSklearn2Classifier fails when input matrix is in CSR sparse format #1204

Closed
JohnnyC08 opened this issue Aug 2, 2021 · 3 comments · Fixed by #1245

Describe the bug

When fitting AutoSklearn2Classifier on a CSR sparse matrix, fit fails with the error: Trying to set illegal value 'gradient_boosting'

python version: 3.9
auto-sklearn version: 0.13.0

To Reproduce

Steps to reproduce the behavior:

import sklearn.datasets
import sklearn.model_selection
from scipy import sparse
from autosklearn.experimental.askl2 import AutoSklearn2Classifier

X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)

X_train = sparse.csr_matrix(X_train)
automl = AutoSklearn2Classifier(
    time_left_for_this_task=120,
    per_run_time_limit=30
)
automl.fit(X_train, y_train, dataset_name='breast_cancer')

Note that the same code works fine with autosklearn.classification.AutoSklearnClassifier.

The use case for a CSR matrix is fitting text classification models after the text has been transformed into a TF-IDF format.
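To illustrate why CSR input comes up in that use case, here is a minimal sketch using scikit-learn's TfidfVectorizer. The toy corpus and variable names are made up for illustration and are not part of the original report; the point is only that the vectorizer's output is a SciPy sparse matrix in CSR format, the same kind of input that triggers the failure above.

```python
from scipy import sparse
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, invented for this sketch only.
docs = [
    "the tumor is benign",
    "the tumor is malignant",
    "benign growth, no treatment needed",
]

# fit_transform learns the vocabulary and returns the TF-IDF matrix.
vectorizer = TfidfVectorizer()
X_text = vectorizer.fit_transform(docs)

# The result is sparse and stored in CSR format, one row per document.
print(sparse.issparse(X_text), X_text.format, X_text.shape[0])
```

A matrix like X_text would then be passed to fit in place of the converted breast cancer data in the reproduction script.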

@eddiebergman
Contributor

Hi @JohnnyC08, thanks for the bug report and the reproducible code. I can confirm this causes the same error but I can't immediately tell why.

I will address it as soon as I can, as sparsity is a frequent culprit of issues, but in the meantime I'm afraid you will have to use the non-experimental version.

@mfeurer
Contributor

mfeurer commented Sep 13, 2021

We just added support for sparse data via #1245 and it will be available in the upcoming release.

@mfeurer mfeurer closed this as completed Sep 13, 2021