
New Chapter: hyperparameter selection #50

Closed · 2 tasks · mehradans92 opened this issue Sep 8, 2021 · 6 comments

Labels: enhancement (New feature or request)

Comments

mehradans92 (Contributor):

  • Initial draft
  • Applied example
whitead (Owner) commented Sep 8, 2021:

https://docs.ray.io/en/latest/tune/index.html
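
For context, a minimal Tune search might look roughly like the sketch below - a toy objective rather than anything from the chapter, and the `Tuner` API assumes a recent Ray 2.x release:

```python
# Toy Ray Tune search: minimize (x - 2)^2 over x in [-10, 10].
from ray import tune

def objective(config):
    # Each trial returns its metric(s) as a dict.
    return {"loss": (config["x"] - 2) ** 2}

tuner = tune.Tuner(
    objective,
    param_space={"x": tune.uniform(-10.0, 10.0)},
    tune_config=tune.TuneConfig(metric="loss", mode="min", num_samples=20),
)
results = tuner.fit()
print(results.get_best_result().config)
```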

kjappelbaum (Contributor) commented:

A bit more far-fetched would be experiment tracking.

I'm thinking of tools like Weights & Biases, which 1) has tooling for hyperparameter sweeps and 2) can visualize some chemistry.
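
For illustration, a W&B sweep is configured roughly like this - hypothetical project name, placeholder metric, and it assumes you are logged in to W&B:

```python
# Sketch of a Weights & Biases random sweep over a learning rate.
import wandb

sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {"lr": {"min": 1e-4, "max": 1e-1}},
}

def train():
    with wandb.init():
        lr = wandb.config.lr
        # ... build and fit a model with this learning rate ...
        wandb.log({"val_loss": (lr - 0.01) ** 2})  # placeholder metric

sweep_id = wandb.sweep(sweep_config, project="hyperparam-demo")  # hypothetical project
wandb.agent(sweep_id, function=train, count=10)
```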

kjappelbaum (Contributor) commented:

As a framework, I really enjoy using Optuna.
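
A minimal Optuna study, as a sketch with a toy objective:

```python
# Toy Optuna study: minimize (x - 2)^2 over x in [-10, 10].
import optuna

def objective(trial):
    x = trial.suggest_float("x", -10.0, 10.0)
    return (x - 2) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)
print(study.best_params)
```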

whitead added the enhancement (New feature or request) label on Oct 5, 2021
mehradans92 (Contributor, Author) commented:

@whitead Do you want to use this package (https://docs.ray.io/en/latest/tune/index.html) in this chapter?

whitead (Owner) commented Dec 6, 2021:

@mehradans92 I think that comment was just me sharing some existing tools. It would be better to stay as package-agnostic as possible, though.

whitead (Owner) commented Mar 30, 2022:

@mehradans92 I read through it briefly. Looks great - a lot of work clearly went into it, and I can tell it will be very helpful. A few proposed changes:

  • Take another look at the layers chapter; there is some overlapping material (e.g., dropout, regularization, hyperparameters).
  • Cite some papers on learning rate schedulers and maybe add some information on momentum, since it's related. Some have also mentioned warm starts, which I'm not familiar with - maybe mention it.
  • Fig 8.2 - does it need to be a movie? It can be distracting while reading. I can certainly see the benefit for 8.1.
  • Batch size - I'd love to get 1-2 citations here on batch size and its connection to randomness in estimating the gradient.
  • Dropout - can you cite the paper and add a bit more on where it should be applied (all layers?) and whether it should be combined with other regularization?
  • It is really critical to use validation data for a hyperparameter search - otherwise you're implicitly fitting to the testing data. See here. Emphasize this point strongly and early, and make sure the code/examples use the word validation, not test, for the search (see the sketch after this list).
  • On Keras, can you reduce the logging level (verbose=0) so the output isn't rendered in the chapter?
  • You've split up the code nicely, but it'd be great to have some discussion, maybe showing snippets of how the methods work, before going right into training.
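
To make the validation and logging points concrete, here is a rough sketch of what the search loop could look like - placeholder data and model, not the chapter's code, with an explicit validation split, a `Dropout` layer, and `verbose=0`:

```python
# Sketch: search hyperparameters against *validation* data; the test set
# stays untouched until the very end. Data and model are placeholders.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 16).astype("float32")  # placeholder features
y = np.random.rand(1000).astype("float32")      # placeholder labels

# Hold out a test set, then carve a validation set out of the training data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25)

def build_model(dropout_rate):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),  # the regularizer under study
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
    return model

best_loss, best_rate = float("inf"), None
for dropout_rate in (0.0, 0.2, 0.5):
    model = build_model(dropout_rate)
    # verbose=0 keeps per-epoch log lines out of the rendered chapter.
    model.fit(X_train, y_train, validation_data=(X_val, y_val),
              epochs=10, verbose=0)
    val_loss = model.evaluate(X_val, y_val, verbose=0)
    if val_loss < best_loss:
        best_loss, best_rate = val_loss, dropout_rate

print("best dropout:", best_rate, "validation loss:", best_loss)
# Only now, with hyperparameters fixed, evaluate once on (X_test, y_test).
```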
