This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

This TuneGridSearchCV instance is not fitted yet. #107

Closed
yijing332 opened this issue Sep 18, 2020 · 6 comments

yijing332 commented Sep 18, 2020

I have a question: does tune-sklearn only support estimators that have a 'partial_fit' method or a 'warm_start' attribute?

When I use RandomForestClassifier as the estimator, the program always reports this error:
This TuneGridSearchCV instance is not fitted yet.

Code:

# from sklearn.model_selection import GridSearchCV
from tune_sklearn import TuneGridSearchCV
from sklearn.ensemble import RandomForestClassifier
# Other imports
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier

# Set training and validation sets
X, y = make_classification(n_samples=11000, n_features=1000, n_informative=50, n_redundant=0, n_classes=10, class_sep=2.5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1000)

# Example parameters to tune for RandomForestClassifier
param_grid = {
        'n_estimators': [100,200,300],
        'max_depth':range(5,30,5),
        'min_samples_split':[1,2,5,10,15],
        'min_samples_leaf':[1,2,5,10],
        'max_features': ['log2','sqrt']
    }
forest_clf = RandomForestClassifier(random_state=42,warm_start=True)

grid_search = TuneGridSearchCV(forest_clf, param_grid, cv=5,scoring='accuracy',use_gpu=True)

start=time.time()
grid_search.fit(X_train, y_train)
end=time.time()
print('Tune time: ',end-start)
score=grid_search.score(X_test,y_test)
print("Tune Score:", score)

inventormc commented Sep 18, 2020

I think this is an issue with the parameter grid: min_samples_split must be greater than 1, and the grid includes 1. Can you fix that and see if it works? That said, this error message is not informative at all, and we'll add a better one for this case.
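
As an illustration of the constraint above, here is a minimal sketch of a check that could be run on a grid before passing it to the search. The helper name invalid_min_samples_split is hypothetical and is not part of tune-sklearn or scikit-learn. It assumes scikit-learn's documented rule that min_samples_split is either an integer of at least 2 or a float in (0.0, 1.0], and it only handles plain Python numbers.

```python
# Hypothetical pre-flight check; not part of tune-sklearn or scikit-learn.
def invalid_min_samples_split(values):
    """Return the values that would be rejected for min_samples_split.

    Assumed rule (from the scikit-learn docs): an int must be >= 2,
    a float must lie in (0.0, 1.0]. Anything else is treated as invalid.
    """
    bad = []
    for v in values:
        if isinstance(v, bool):
            # bool is a subclass of int in Python; treat it as invalid here.
            bad.append(v)
        elif isinstance(v, int):
            if v < 2:
                bad.append(v)
        elif isinstance(v, float):
            if not (0.0 < v <= 1.0):
                bad.append(v)
        else:
            bad.append(v)
    return bad


# Grid copied from the original post.
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": range(5, 30, 5),
    "min_samples_split": [1, 2, 5, 10, 15],
    "min_samples_leaf": [1, 2, 5, 10],
    "max_features": ["log2", "sqrt"],
}

rejected = invalid_min_samples_split(param_grid["min_samples_split"])
print("Rejected min_samples_split values:", rejected)
```

For the grid in the original post, the only value flagged is 1; dropping it leaves [2, 5, 10, 15].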

To answer your question, tune-sklearn should work with any sklearn estimator (it can do everything that GridSearchCV or RandomizedSearchCV can do); it only requires the estimator to have partial_fit or warm_start when you want to use early stopping.
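
The requirement above can be sketched as a simple duck-typing check. This is only an illustration: supports_early_stopping is a hypothetical helper, tune-sklearn's internal check may differ, and the three classes are stand-ins so that the sketch runs without scikit-learn installed.

```python
# Hypothetical helper; tune-sklearn's own check may be implemented differently.
def supports_early_stopping(estimator):
    """Return True if the estimator exposes partial_fit or warm_start."""
    return hasattr(estimator, "partial_fit") or hasattr(estimator, "warm_start")


# Stand-in classes used only to exercise the helper.
class IncrementalModel:
    def partial_fit(self, X, y):
        return self


class WarmStartModel:
    def __init__(self, warm_start=False):
        self.warm_start = warm_start


class PlainModel:
    def fit(self, X, y):
        return self


for model in (IncrementalModel(), WarmStartModel(), PlainModel()):
    print(type(model).__name__, supports_early_stopping(model))
```

A real scikit-learn estimator can be passed in the same way: SGDClassifier has a partial_fit method and RandomForestClassifier has a warm_start parameter, so both would pass this check.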

@richardliaw

yijing332 commented

Thank you very much for your answer and for helping me find the mistake!
I adjusted the parameters and tried again, but it still didn't work. However, when I removed the use_gpu parameter and reduced n_jobs, it ran normally.

inventormc commented

Great! Glad it works now. I'm assuming you have a GPU on the machine you ran this on, right?

yijing332 commented

Yes, there are two GPUs on my laptop. Do I need to configure anything when I use that parameter?

inventormc commented

By default, tune-sklearn doesn't use the GPU during training, so yes. It should work with both n_jobs and use_gpu set, so we'll take a look at this.

richardliaw commented

Closing this, as we seem to have addressed the issue!
