
TODO Master issue #2

Closed
19 tasks done
nipunbatra opened this issue Jun 13, 2019 · 8 comments
nipunbatra commented Jun 13, 2019

  • Write about hyperparameter vs. parameter: https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)
  • Write about Thompson sampling
  • Motivate the high computational cost of grid search
  • Formally introduce the properties/problem of Bayesian optimization (as in the Frazier video) and relate each point to the gold mining problem. This should come after we have worked through the gold mining problem and, more generally, decided how to introduce the problem.
  • Explain the role of epsilon in PI
  • Explain the role of epsilon in EI
  • Write about the plethora of packages and services, including SigOpt, Hyperopt, Spearmint, scikit-optimize, etc.
  • Look into the examples shipped with SigOpt/Hyperopt/Spearmint/scikit-optimize for inspiration
  • BO vs. GD: https://stats.stackexchange.com/questions/161923/bayesian-optimization-or-gradient-descent This thread also mentions some properties that can be problematic for GPs (such as inference cost that grows cubically with the number of observations)
  • Use prettier fonts and experiment with the FiveThirtyEight style for figures
  • Read this excellent article and the citations in it: https://thuijskens.github.io/2016/12/29/bayesian-optimisation/
  • Introduce the problems within Bayesian optimization itself (choosing the kernel, its other hyperparameters): will refer to slide 30 in Peter Frazier's Bayesian optimization tutorial, link
  • Explain why the above problem is easier to deal with
  • Show the effect of different epsilon values on PI
  • BO when there are also categorical variables: Gaussian Process for Integer and Categorical Dimensions? scikit-optimize/scikit-optimize#580; Hyperopt has an implementation, as does SigOpt
  • Neural network hyperparameter search for a simple CNN on MNIST: i) learning rate (0.001, 0.01, ...), which would probably require a log scale to make the search linear; ii) batch size: 1, 2, 4, ..., using a log scale with base 2; iii) number of hidden units: maybe a log scale again
  • Continuing from the above: we can also make use of gradient information when available. This should only be a discussion, but it is good for us to know (after submission!)
  • Explain the intuition behind GP-UCB
  • As with the SVM, build an optimisation on random forests with the two parameters being the number of estimators and the max depth
  • [ ] Graph showing the time saved
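A couple of the items above (the role of epsilon in PI, and the effect of different epsilon values on PI) can be illustrated with a small sketch. Assuming we already have a fitted surrogate giving a posterior mean `mu` and standard deviation `sigma` at each candidate point (the toy numbers below are made up for illustration), PI with an epsilon margin looks like:

```python
import numpy as np
from scipy.stats import norm

def probability_of_improvement(mu, sigma, f_best, epsilon=0.01):
    """PI acquisition: P(f(x) >= f_best + epsilon) under the surrogate posterior.

    mu, sigma : posterior mean and std at candidate points (maximization setting)
    f_best    : best objective value observed so far
    epsilon   : exploration margin; larger values demand a bigger improvement,
                pushing the search toward uncertain (high-sigma) regions.
    """
    sigma = np.maximum(sigma, 1e-12)       # guard against division by zero
    z = (mu - f_best - epsilon) / sigma
    return norm.cdf(z)

# Toy posterior over three candidates: one exploitative, two more uncertain
mu = np.array([1.00, 0.90, 0.50])
sigma = np.array([0.01, 0.50, 1.00])
f_best = 0.95

for eps in (0.0, 0.3):
    pi = probability_of_improvement(mu, sigma, f_best, eps)
    print(eps, np.argmax(pi))
```

With epsilon = 0 the confident near-optimal point wins; with epsilon = 0.3 the acquisition shifts to a more uncertain candidate, which is exactly the exploration/exploitation trade-off we want the article to convey.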
@nipunbatra nipunbatra changed the title Hyperparameter tuning TODO Master issue Jun 13, 2019
apoorvagnihotri commented Jun 13, 2019

Dear Prof.,

I will be updating the Jupyter notebook here (in this repository itself) to address the above items.
I was planning to update the HTML file later, once things are finalized.

apoorvagnihotri commented Jun 14, 2019

Should we add something like this, sir? (Now included.)

  • Introduce the problems within Bayesian optimization itself (choosing the kernel, its other hyperparameters): will refer to slide 30 in Peter Frazier's Bayesian optimization tutorial, link
  • Explain why the above problem is easier to deal with
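On why that inner problem is easier to deal with: the GP's own hyperparameters can be tuned by maximizing the log marginal likelihood, which, unlike the outer black-box objective, is cheap to evaluate and differentiable. A minimal sketch with scikit-learn (the data here is made up for illustration):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Noisy samples of a smooth 1-D function
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)

# The kernel's length scale is itself a hyperparameter. The log marginal
# likelihood is differentiable in it, so scikit-learn tunes it internally
# with gradient-based optimization plus random restarts.
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5).fit(X, y)

print(gp.kernel_)                         # fitted kernel hyperparameters
print(gp.log_marginal_likelihood_value_)  # objective value at the optimum
```

Nothing comparable is available for the outer objective (e.g., validation accuracy of a CNN), which is why we need an acquisition-function loop there but not here.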

nipunbatra (Author) commented:
@apoorvagnihotri Yes, thanks. Those make a lot of sense.

@apoorvagnihotri apoorvagnihotri self-assigned this Jun 14, 2019
apoorvagnihotri commented Jun 14, 2019

@nipunbatra Dear Sir, could you please clarify these items:

Use prettier fonts and experiment with the FiveThirtyEight style for figures.

What is meant by the FiveThirtyEight style?

BO when there are categorical variables also: scikit-optimize/scikit-optimize#580 and hyperopt has an implementation, so does sigopt..

I have linked a tutorial that uses scikit-optimize. What should be the scope of the explanation I provide for Bayesian optimization over a categorical domain?


nipunbatra commented Jun 14, 2019 via email

apoorvagnihotri commented Jun 15, 2019

Search for "fivethirtyeight matplotlib". It's a style, named after Nate Silver's FiveThirtyEight, used for attractive graphics.
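For reference, matplotlib ships this style built in; a minimal sketch (the output filename demo_538.png is just for illustration):

```python
import matplotlib
matplotlib.use("Agg")              # headless backend for scripted use
import matplotlib.pyplot as plt
import numpy as np

# "fivethirtyeight" is one of matplotlib's bundled style sheets
plt.style.use("fivethirtyeight")

x = np.linspace(0, 10, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.legend()
fig.savefig("demo_538.png")
```

All of our figures could then pick up the style from a single `plt.style.use` call at the top of the notebook.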

Dear Prof. @nipunbatra
I found this link quite interesting.

Neural network hyper-parameter search for simple CNN for MNIST: i) learning rate (0.001, 0.01, ...) - this would require us to

Should I use scikit-optimize for this case?
I think it would be good to use scikit-optimize, since we would then also be able to show that BO can be applied to categorical variables, as shown here.
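On the log-scale point from the quoted item, a small sketch (plain NumPy, with ranges chosen purely for illustration) of building log-scale search spaces so that random draws cover each decade of the learning rate equally:

```python
import numpy as np

rng = np.random.default_rng(0)

# Log-scale grids make multiplicative ranges behave linearly for the search:
learning_rates = np.logspace(-4, -1, num=4)   # 1e-4, 1e-3, 1e-2, 1e-1
batch_sizes = 2 ** np.arange(0, 8)            # 1, 2, 4, ..., 128

# For random/Bayesian search, sample uniformly in log-space and exponentiate,
# so each decade of learning rate is equally likely to be tried.
log_lr = rng.uniform(np.log10(1e-4), np.log10(1e-1), size=5)
lr_samples = 10 ** log_lr
print(lr_samples)
```

Sampling the raw value uniformly instead would spend almost all trials near the top of the range, which is the failure mode the log scale avoids.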

apoorvagnihotri added a commit that referenced this issue Jun 16, 2019
apoorvagnihotri commented Jun 16, 2019

Dear Prof. @nipunbatra,
I have an issue and need your input on this item:

Continuing from above - we can also make use of gradient information when available. This should be only a discussion, but should be good for us to know (after submission!)

nipunbatra (Author) commented:
@apoorvagnihotri
Please see:

  1. video: https://www.youtube.com/watch?v=bRyqfGvaUr4
  2. corresponding paper: https://arxiv.org/abs/1703.04389
  3. code: https://github.com/wujian16/Cornell-MOE

In the paper abstract they mention:

In this paper we show how Bayesian optimization can exploit derivative information to decrease the number of objective function evaluations required for good performance. In particular, we develop a novel Bayesian optimization algorithm, the derivative-enabled knowledge-gradient (dKG), for which we show one-step Bayes-optimality, asymptotic consistency, and greater one-step value of information than is possible in the derivative-free setting.

We can probably just write a couple of lines about this interesting direction.

apoorvagnihotri added a commit that referenced this issue Jun 16, 2019
apoorvagnihotri added a commit that referenced this issue Jun 16, 2019
apoorvagnihotri added a commit that referenced this issue Jun 16, 2019
@apoorvagnihotri apoorvagnihotri mentioned this issue Oct 14, 2019