Additional params for xgbTree #308

Merged
merged 7 commits into from
Nov 8, 2015

Conversation

terrytangyuan
Contributor

Hi Max,

Many people have requested this, so I added a couple of params to cope with changes in the GitHub version of xgboost and switched the xgboost installation back to GitHub. Let me know if there's anything I need to change so we can get this up ASAP.

Thanks,
Yuan

@terrytangyuan
Contributor Author

BTW, I have tested all of them on my end, but there may still be some mistakes.

@topepo
Owner

topepo commented Nov 6, 2015

Doesn't gamma depend on the loss function? Do you know if it is standardized/normalized or what its default value is?

Also, you might want to widen the random search ranges to explore regions that grid search might not.
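
As a rough illustration of that point, random search draws each candidate independently, so it can land on values between and beyond the points of a regular grid. The sketch below uses base R only; the helper name `random_xgb_candidates` and every range in it are illustrative assumptions, not the values used in this pull request.

```r
# Hypothetical helper: draw `len` random candidates instead of a regular grid.
# All ranges here are assumptions chosen for illustration only.
random_xgb_candidates <- function(len, seed = 1) {
  set.seed(seed)  # fixed seed so the draw is reproducible
  data.frame(
    nrounds          = sample(1:1000, size = len, replace = TRUE),
    max_depth        = sample(1:10, size = len, replace = TRUE),
    eta              = runif(len, min = 0.001, max = 0.6),
    gamma            = runif(len, min = 0, max = 10),
    colsample_bytree = runif(len, min = 0.3, max = 0.7),
    min_child_weight = sample(0:20, size = len, replace = TRUE)
  )
}

# Five random candidates, one per row.
candidates <- random_xgb_candidates(5)
print(candidates)
```

Widening a range here only changes the `min`/`max` arguments or the sampled integer vector, which is why random search is a cheap way to probe regions a small grid never visits.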

@terrytangyuan
Contributor Author

Yes, it does depend on the loss function and is not standardized/normalized. The default value is not specified in the docs and is hidden in the C code. In my experience it is 0: if you don't supply gamma, the tree just keeps splitting until it hits max_depth, as long as splitting still yields a gain. I have widened the ranges in random search and supplied some more reasonable gamma values for grid search. Anyone using an unusual loss function will need to specify the range themselves. Let me know if you have a better idea.
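
To make the scale dependence concrete, here is a minimal sketch of the split gain as described in the xgboost boosted-trees tutorial, where gamma is subtracted from the gain of each candidate split. The function name `split_gain`, the toy data, squared-error loss with an initial prediction of 0, and `lambda = 1` are all assumptions made for illustration.

```r
# Sketch of the regularized split gain; gamma is the minimum gain required to split.
# g_* are first-order gradients and h_* second-order gradients of the loss.
split_gain <- function(g_left, h_left, g_right, h_right, lambda = 1, gamma = 0) {
  score <- function(g, h) sum(g)^2 / (sum(h) + lambda)
  0.5 * (score(g_left, h_left) + score(g_right, h_right) -
           score(c(g_left, g_right), c(h_left, h_right))) - gamma
}

# Assumed setup: squared-error loss with prediction 0, so gradient = -y, hessian = 1.
y <- c(1, 2, 10, 11)
g <- -y
h <- rep(1, length(y))

# Same split, same gamma, two outcome scales.
gain_raw   <- split_gain(g[1:2], h[1:2], g[3:4], h[3:4], gamma = 0.01)
gain_small <- split_gain(1e-4 * g[1:2], h[1:2], 1e-4 * g[3:4], h[3:4], gamma = 0.01)

print(gain_raw > 0)    # split kept on the original scale
print(gain_small > 0)  # split rejected once the outcome is rescaled by 1e-4
```

Because the gradients scale with the outcome, the unpenalized gain scales with its square, so a gamma that is harmless on one scale can block every split on another. With gamma = 0 the split is accepted on both scales.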

Thanks for the careful review.

@topepo
Owner

topepo commented Nov 6, 2015

Okay. How about defaulting it to zero for the grid search via

 out <- expand.grid(max_depth = seq(1, len),
                    nrounds = floor((1:len) * 50),
                    eta = c(.3, .4),
                    gamma = 0,
                    colsample_bytree = c(.6, .8),
                    min_child_weight = c(1))

This will let people vary the value, but it won't lead to bad performance if they don't know about it. Otherwise, I will send you all the submitted issues from when people use RMSE values that go from 0.0001 to 0.00001 =]
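
For reference, a small base-R sketch of how the default grid from the suggestion above compares with a grid a user might supply to vary gamma. The wrapper name `default_xgb_grid` and the values in `custom_grid` are hypothetical; the `train()` call in the comment uses placeholder data and assumes caret and xgboost are installed.

```r
# Default-style grid from the suggestion above, wrapped in a hypothetical helper.
default_xgb_grid <- function(len) {
  expand.grid(max_depth = seq(1, len),
              nrounds = floor((1:len) * 50),
              eta = c(.3, .4),
              gamma = 0,
              colsample_bytree = c(.6, .8),
              min_child_weight = c(1))
}

# With len = 3 the grid has 3 * 3 * 2 * 1 * 2 * 1 = 36 rows, all with gamma = 0.
default_grid <- default_xgb_grid(3)

# A user who knows the scale of their loss can vary gamma explicitly.
# These values are illustrative assumptions.
custom_grid <- expand.grid(max_depth = c(2, 4),
                           nrounds = c(50, 100),
                           eta = 0.3,
                           gamma = c(0, 0.1, 1),
                           colsample_bytree = 0.8,
                           min_child_weight = 1)

# The custom grid would then be passed to caret, for example:
#   caret::train(x, y, method = "xgbTree", tuneGrid = custom_grid)
# (`x` and `y` are placeholders; the required columns depend on the caret version.)
print(nrow(default_grid))
print(nrow(custom_grid))
```

Keeping gamma at a single value in the default grid also keeps the number of candidate models from growing, since a constant column adds no rows to `expand.grid`.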

@terrytangyuan
Contributor Author

Thanks. Changed.

@terrytangyuan
Contributor Author

On green. Could you merge? Let me know if there are any other issues.

topepo added a commit that referenced this pull request Nov 8, 2015
Additional params for xgbTree
@topepo topepo merged commit 224c5fc into topepo:master Nov 8, 2015
@topepo
Owner

topepo commented Nov 8, 2015

Thanks a bunch!
