How to tune a single model? #39

Closed
ahbon123 opened this issue Sep 24, 2017 · 6 comments

@ahbon123

ahbon123 commented Sep 24, 2017

Hi Marios,

Thanks for sharing StackNet, it is a great tool for stacking, but I'm still not clear on how to tune a single model. For example, if my parameter file is as follows:

XgboostRegressor booster:gblinear objective:reg:linear max_leaves:0 num_round:500 eta:0.1 threads:3 gamma:1 max_depth:4 colsample_bylevel:1.0 min_child_weight:4.0 max_delta_step:0.0 subsample:0.8 colsample_bytree:0.5 scale_pos_weight:1.0 alpha:10.0 lambda:1.0 seed:1 verbose:false

LightgbmRegressor verbose:false

What should I put on the command line? Is it the same as this one?

java -Xmx12048m -jar StackNet.jar train task=regression sparse=true has_head=false output_name=datasettwo model=model2 pred_file=pred2.csv train_file=dataset2_train.txt test_file=dataset2_test.txt test_target=false params=dataset2_params.txt verbose=true threads=1 metric=mae stackdata=false seed=1 folds=4 bins=3

Thank you!

@kaz-Anova
Owner

That is right. You put the model you want to tune on top and then a dummy model (it does not matter what you put) as the level 2 model. Then you run that command and look at the cross-validation score. Then you terminate the process and change one parameter, for example from colsample_bylevel:1.0 to colsample_bylevel:0.9, re-run the same process (i.e. the same command) and take note of the performance. You keep doing this until there is no improvement, then you switch to another parameter, for example subsample.
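The one-parameter-at-a-time loop described above can be sketched in Python. This is only an illustration under stated assumptions: `set_param`, `tune_one_at_a_time` and `toy_score` are hypothetical helpers, not part of StackNet, and `toy_score` is a made-up stand-in for the cross-validation score you would read manually after each StackNet run.

```python
from typing import Callable, Dict, List, Tuple


def set_param(model_line: str, name: str, value: str) -> str:
    """Return a params-file line with `name:<old>` replaced by `name:<value>`."""
    head, *tokens = model_line.split()
    out, found = [], False
    for tok in tokens:
        key, _, _old = tok.partition(":")  # splits on the first ':' only
        if key == name:
            out.append(f"{name}:{value}")
            found = True
        else:
            out.append(tok)
    if not found:  # parameter was absent, so append it
        out.append(f"{name}:{value}")
    return " ".join([head] + out)


def tune_one_at_a_time(
    model_line: str,
    candidates: Dict[str, List[str]],
    evaluate: Callable[[str], float],
) -> Tuple[str, float]:
    """Greedy search: try values for one parameter until the score stops
    improving (lower is better, as with MAE), then move to the next one."""
    best_line, best_score = model_line, evaluate(model_line)
    for name, values in candidates.items():
        for value in values:
            trial = set_param(best_line, name, value)
            score = evaluate(trial)
            if score < best_score:
                best_line, best_score = trial, score
            else:
                break  # no improvement: switch to another parameter
    return best_line, best_score


def toy_score(model_line: str) -> float:
    """ASSUMPTION: fake score for demonstration only. In practice this is
    the CV metric printed by StackNet for the edited parameter file."""
    params = dict(tok.split(":", 1) for tok in model_line.split()[1:])
    return abs(float(params["colsample_bylevel"]) - 0.8) + abs(
        float(params["subsample"]) - 0.7
    )


start = "XgboostRegressor colsample_bylevel:1.0 subsample:0.8"
grid = {"colsample_bylevel": ["0.9", "0.8", "0.7"], "subsample": ["0.7", "0.6"]}
best_line, best_score = tune_one_at_a_time(start, grid, toy_score)
print(best_line, best_score)
```

In real use the `evaluate` step is the manual part: edit the first line of the parameter file, re-run the same command, and note the score.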

This process is briefly explained in the last 2 minutes of this video if it helps...

@ahbon123
Author

ahbon123 commented Sep 25, 2017

[Screenshot: tune a single model]
Another question: when I tune one model, I should focus on the score labelled "Average of all folds model 0" (circled in red), right? I tried tuning xgboost by changing n_estimator from 500 to 2000, and the score changed from 0.05328404590186987 to 0.05310419857868738. Does that mean performance gets better? 2000 seems too large...

@goldentom42

Hi @ahbon123, yes performance gets better, for Mean Absolute Error the closer to 0 the better.
2000 rounds for a booster is not that large.
Good luck ;-)
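To put the two scores quoted above in perspective, here is a small sketch that computes the relative change in MAE. The function name `relative_improvement` is an assumption made for illustration; the two numbers are the ones reported in this thread.

```python
def relative_improvement(old_mae: float, new_mae: float) -> float:
    """Fractional reduction in MAE; positive means the new score is better."""
    return (old_mae - new_mae) / old_mae


old_mae = 0.05328404590186987  # 500 rounds, as reported above
new_mae = 0.05310419857868738  # 2000 rounds, as reported above
gain = relative_improvement(old_mae, new_mae)
print(f"MAE reduced by {gain:.2%}")
```

The improvement is real but small (well under 1%), so whether the extra training time is worth it is a judgment call.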

@kaz-Anova
Owner

@ahbon123 Yes, you can look at the value you highlighted. When I am in a hurry, I only look at the first 2 or so folds and then I terminate the process.

@ahbon123
Author

Thanks for your prompt reply. Did you try K-fold CV with K greater than 4? I tried 5 and the predictions seemed to get a worse score, so I stopped there and didn't try 10. Maybe 4 is just right.

@kaz-Anova
Owner

Normally 4 or 5 is what I use. I don't expect much difference between the two.
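One generic way to see why 4 and 5 folds tend to behave similarly (this is standard K-fold arithmetic, not something specific to StackNet): each fold's model is trained on (K-1)/K of the data. The helper name `train_fraction` below is hypothetical.

```python
def train_fraction(folds: int) -> float:
    """Share of the training data each fold's model sees in K-fold CV."""
    if folds < 2:
        raise ValueError("K-fold cross-validation needs at least 2 folds")
    return (folds - 1) / folds


for k in (4, 5, 10):
    print(f"folds={k}: each model trains on {train_fraction(k):.0%} of the rows")
```

Going from 4 to 5 folds only moves each model from 75% to 80% of the rows, while adding one more model to fit per level.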
