Support fit_params for cross_val_score in StackingClassifier #177

Open · agzamovr opened this issue Apr 16, 2017 · 4 comments · May be fixed by #255
Comments

@agzamovr

Is it possible to pass fit params for individual classifiers? I tried to pass fit params for XGBClassifier but got an error. My code is the following:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from mlxtend.classifier import StackingClassifier

clf1 = xgbooster()   # user-defined helper returning an xgboost.XGBClassifier instance
clf2 = linear_svc()  # user-defined helper returning a linear SVC classifier
lr = LogisticRegression()
fit_params = {
    'xgbclassifier__eval_metric': 'mlogloss',
    'xgbclassifier__eval_set': [(X_test, y_test)],
    'xgbclassifier__early_stopping_rounds': 100,
    'xgbclassifier__verbose': False}
estimator = StackingClassifier(classifiers=[clf1, clf2],
                               meta_classifier=lr)
mean_score = cross_val_score(estimator=estimator,
                             X=X_train,
                             y=y_train,
                             scoring='neg_log_loss',
                             cv=5,
                             verbose=5,
                             fit_params=fit_params,
                             n_jobs=-1).mean()
@rasbt (Owner) commented Apr 16, 2017

Hi @agzamovr ,

Is it possible to pass fit params for individual classifiers?

Oh yes, of course :). I have an example here: http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/#example-3-stacked-classification-and-gridsearch

The syntax for accessing estimator params is similar to the one used by make_pipeline in scikit-learn. E.g.,

from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

pipe = make_pipeline(StandardScaler(), LogisticRegression())

params = {'logisticregression__C': [0.1, 1., 10.]}

grid = GridSearchCV(estimator=pipe, 
                    param_grid=params, 
                    cv=5)
grid.fit(X, y)  # X, y: training features and labels

(Essentially, it is just lowercasing the class name. If you have multiple objects from the same class, it would enumerate them, e.g., 'logisticregression-1', 'logisticregression-2', etc.)
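As a quick illustration of that naming scheme (a small sketch, not from the original thread; the expected keys are based on scikit-learn's handling of duplicate step names):

from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression

pipe = make_pipeline(LogisticRegression(C=0.1), LogisticRegression(C=10.))
print(list(pipe.named_steps))
# expected: ['logisticregression-1', 'logisticregression-2']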

So, looking at your code above, it looks like you have a small typo, and it should be 'xgbooster' in

fit_params = {
    'xgbclassifier__eval_metric': 'mlogloss',
...

PS: If you're unsure what the actual parameter names are, you can get a list via

estimator = StackingClassifier(classifiers=[clf1, clf2],
                               meta_classifier=lr)

estimator.get_params().keys()
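For instance, to narrow that list down to one base classifier (the 'xgbclassifier' prefix is an assumption based on the lowercased XGBClassifier class name):

# parameters belonging to the xgboost base classifier only
sorted(k for k in estimator.get_params() if k.startswith('xgbclassifier'))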

Let me know if it solves the problem!

@agzamovr (Author)

Thank you for the response!
xgbooster is not a class name; it's a method that creates an xgboost.XGBClassifier instance. For this reason I used the xgbclassifier__ prefix. In your example you use the param_grid parameter, but GridSearchCV also has a fit_params parameter. param_grid is used for constructing the estimator, while fit_params is used when calling its fit method.
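For reference, here is a minimal sketch of that distinction (the data, the weights, and the use of sample_weight are illustrative assumptions, not from the thread): param_grid entries configure the estimator's constructor settings that GridSearchCV searches over, while fit parameters are keyword arguments handed through to the estimator's fit() call.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=100, random_state=0)
weights = np.ones(len(y))  # illustrative per-sample fit parameter

grid = GridSearchCV(estimator=LogisticRegression(),
                    param_grid={'C': [0.1, 1., 10.]},  # constructor params, searched over
                    cv=3)
grid.fit(X, y, sample_weight=weights)  # fit params, forwarded to LogisticRegression.fit()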

@rasbt (Owner) commented Apr 16, 2017

Oh I see what you mean now. I don't know how I could have misread your issue so badly :P.

Yeah, unfortunately, this doesn't work yet. But I guess it shouldn't be too hard to add this feature; it could be pretty useful imho.
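For context, and purely as a hypothetical sketch (not the implementation proposed in #255), routing such prefixed fit parameters could look roughly like this: split the 'name__param' keys per named base classifier, mirroring how Pipeline routes fit parameters to its steps.

# Hypothetical helper only: split '<name>__<param>'-style fit_params per named estimator.
def route_fit_params(named_estimators, fit_params):
    # named_estimators: list of (name, estimator) pairs
    routed = {name: {} for name, _ in named_estimators}
    for key, value in fit_params.items():
        name, sep, param = key.partition('__')
        if sep and name in routed:
            routed[name][param] = value
    return routed

# e.g. routed['xgbclassifier'] would then be unpacked into that classifier's fit() call.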

@rasbt changed the title from "Fit params for cross_val_score" to "Support fit_params for cross_val_score in StackingClassifier" on Apr 16, 2017
@jrbourbeau linked a pull request on Sep 22, 2017 that will close this issue
@fredsamhaak

Hi @rasbt, I've encountered a problem when using GridSearchCV and cross_val_score.
I embedded GridSearchCV (inner CV) in cross_val_score (outer CV) following the method described in your book (Python Machine Learning), so that my training set is split into an inner and an outer part.
The inner part is used to tune hyper-parameters, while the outer part is used to evaluate models with the best hyper-parameters. But how do I pass fit parameters to GridSearchCV itself rather than to the model? It seems like fit_params only covers the fit parameters of the model, right?
Looking forward to your reply, thank you very much!
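For what it's worth, a minimal sketch of the nested-CV setup described above (the data, sample_weight, and the hyper-parameter grid are illustrative assumptions; scikit-learn versions contemporary with this thread still accept fit_params here, newer releases rename it to params): fit parameters given to cross_val_score are forwarded to GridSearchCV.fit(), which in turn forwards them to the fit() of the underlying estimator on every inner split.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

X_train, y_train = make_classification(n_samples=200, random_state=0)
sample_weight = np.ones(len(y_train))  # illustrative per-sample fit parameter

inner = GridSearchCV(estimator=LogisticRegression(),
                     param_grid={'C': [0.1, 1., 10.]},
                     cv=2)  # inner loop: hyper-parameter tuning

outer_scores = cross_val_score(estimator=inner,
                               X=X_train,
                               y=y_train,
                               cv=5,  # outer loop: model evaluation
                               fit_params={'sample_weight': sample_weight})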
