Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random param builder for random hyperparameter search in model selectors #238

Merged
merged 10 commits into from Mar 23, 2019

Conversation

@leahmcguire
Copy link
Collaborator

commented Mar 6, 2019

Describe the proposed solution
Made a new param builder that can be used with model selectors in place of spark param grid builder. This builder will generate params randomly within the specified ranges and distributions (which is superior to grid search)

@leahmcguire leahmcguire requested a review from tovbinm as a code owner Mar 6, 2019
leahmcguire added 2 commits Mar 7, 2019
@codecov

This comment has been minimized.

Copy link

commented Mar 7, 2019

Codecov Report

Merging #238 into master will increase coverage by 0.02%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #238      +/-   ##
==========================================
+ Coverage   86.41%   86.44%   +0.02%     
==========================================
  Files         312      313       +1     
  Lines       10187    10223      +36     
  Branches      336      564     +228     
==========================================
+ Hits         8803     8837      +34     
- Misses       1384     1386       +2
Impacted Files Coverage Δ
...e/op/stages/impl/selector/RandomParamBuilder.scala 94.44% <94.44%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9826b38...b383351. Read the comment docs.

@leahmcguire leahmcguire requested review from Jauntbox and kinfaikan Mar 8, 2019
@tovbinm

This comment has been minimized.

Copy link
Collaborator

commented Mar 12, 2019

@leahmcguire please let us if this PR is ready for review


sealed abstract class Distribution extends EnumEntry with Serializable

private object Distribution extends Enum[Distribution] {

This comment has been minimized.

Copy link
@tovbinm

tovbinm Mar 14, 2019

Collaborator

don't make Distribution object private unless you have a very good reason.

This comment has been minimized.

Copy link
@leahmcguire

leahmcguire Mar 22, 2019

Author Collaborator

it is only used inside this class - why would I make it public?

* @param max maximum value for param
*/
def uniform(param: DoubleParam, min: Double, max: Double): this.type = {
assert(min < max, "min must be less than max")

This comment has been minimized.

Copy link
@tovbinm

tovbinm Mar 14, 2019

Collaborator

use require instead of assert, as assertions can be skipped at runtime and requires cannot.

/**
* Builder for a param sets used in random search-based model selection.
*/
class RandomParamBuilder {

This comment has been minimized.

Copy link
@tovbinm

tovbinm Mar 14, 2019

Collaborator

consider allowing users to pass an instance of Random and/or seed

val values: Seq[Distribution] = findValues
case object Uniform extends Distribution
case object Subset extends Distribution
case object Exponential extends Distribution // TODO inverse exponential?

This comment has been minimized.

Copy link
@tovbinm

tovbinm Mar 14, 2019

Collaborator

are you planning to add inverse exponential?

This comment has been minimized.

Copy link
@leahmcguire

leahmcguire Mar 22, 2019

Author Collaborator

I cant decide, it is only for parameters that are bounded by 0 and 1 where you want to have more fine grained exploration close to 1...so not sure how useful it is

private val xgb = new OpXGBoostClassifier()


it should "build a param grid of the desired length with one param variable" in {

This comment has been minimized.

Copy link
@tovbinm

tovbinm Mar 14, 2019

Collaborator

Spec[RandomParamBuilder] should ".."

@tovbinm

This comment has been minimized.

Copy link
Collaborator

commented Mar 22, 2019

@leahmcguire some scala style issues, otherwise lgtm

leahmcguire added 2 commits Mar 22, 2019
@leahmcguire leahmcguire merged commit 304adb0 into master Mar 23, 2019
9 checks passed
9 checks passed
CodeFactor No issues found.
Details
Travis CI - Pull Request Build Passed
Details
ci/circleci: compile Your tests passed on CircleCI!
Details
ci/circleci: compile-hw Your tests passed on CircleCI!
Details
ci/circleci: scala-style Your tests passed on CircleCI!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
codecov/patch 94.44% of diff hit (target 86.41%)
Details
codecov/project 86.44% (+0.02%) compared to 9826b38
Details
salesforce-cla All contributors have signed the CLA
Details
@leahmcguire leahmcguire deleted the lm/randomParams branch Mar 23, 2019
@tovbinm tovbinm referenced this pull request Apr 10, 2019
@tovbinm tovbinm referenced this pull request Jul 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.