Optuna GSoC 2020
Optuna is an open source hyperparameter optimization framework to automate hyperparameter search. Optuna provides:
- Eager search spaces: automated search for optimal hyperparameters using Python conditionals, loops, and syntax
- State-of-the-art algorithms: efficient search over large spaces, pruning unpromising trials for faster results
- Easy parallelization: hyperparameter searches over multiple threads or processes without modifying code
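The "eager" (define-by-run) style means the search space is built by ordinary Python control flow inside the objective function. The sketch below illustrates the idea with a minimal stand-in trial class, so it runs without Optuna installed; the `suggest_*` method names mirror those on Optuna's real `Trial` object, and the objective values are toy formulas chosen for illustration.

```python
import random

random.seed(0)

# Minimal stand-in for an Optuna trial, used here only so the example is
# self-contained. In real code, Optuna passes its own Trial object.
class MockTrial:
    def suggest_categorical(self, name, choices):
        return random.choice(choices)

    def suggest_float(self, name, low, high, log=False):
        return random.uniform(low, high)

    def suggest_int(self, name, low, high):
        return random.randint(low, high)

def objective(trial):
    # The search space is defined as the code runs: the branch taken
    # decides which parameters even exist for this trial.
    classifier = trial.suggest_categorical("classifier", ["svm", "forest"])
    if classifier == "svm":
        c = trial.suggest_float("svm_c", 1e-4, 1e2, log=True)
        score = 1.0 / (1.0 + abs(c - 1.0))  # toy objective for illustration
    else:
        depth = trial.suggest_int("max_depth", 2, 32)
        score = 1.0 / (1.0 + abs(depth - 8))  # toy objective for illustration
    return score

score = objective(MockTrial())
print(score)
```

With the real library, the same `objective` would be passed to `study.optimize`, and Optuna's samplers would choose the suggested values instead of `random`.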
Optuna is participating in GSoC 2020 as a member of NumFOCUS.
Contributing code to Optuna requires a solid grounding in Python. Experience working with Git and github.com will also be useful. If you're interested in applying, we recommend you take a look at the Optuna GitHub repository and try your hand at some of the issues labelled Contribution Welcome. We will evaluate applications largely based on applicants' contributions to Optuna and other open source projects.
To contact us, please email us at optuna@preferred.jp
Create Optuna Documentation and Technical Writing
Add Integration Modules for Other Open Source Projects
Implement a GP-BO-based sampler for Optuna
Revise and update the Optuna documentation to make it more helpful for users. Some ideas are:
- Create a tutorial about user-defined pruners
- Create a document about sampling/pruning algorithms
- Create a document to clarify InMemoryStorage
- Create documentation in another major language (Mandarin, Spanish, Japanese, Korean…)
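A tutorial on user-defined pruners would center on the decision a pruner makes at each intermediate step. The sketch below captures that decision in plain Python so it runs without Optuna; the real extension point is `optuna.pruners.BasePruner`, whose `prune` method receives the study and trial rather than raw numbers. The median-based rule here is a simplified, hypothetical example.

```python
import statistics

def should_prune(intermediate_value, other_values_at_step):
    """Return True if the trial looks unpromising, assuming lower is better.

    A simplified median rule: prune when this trial's latest intermediate
    value is worse than the median of other trials at the same step.
    """
    if not other_values_at_step:
        # No history to compare against: let the trial continue.
        return False
    return intermediate_value > statistics.median(other_values_at_step)

print(should_prune(0.9, [0.2, 0.3, 0.5]))  # True: clearly worse than the median
print(should_prune(0.1, [0.2, 0.3, 0.5]))  # False: better than the median
```

A tutorial could then show how to wrap this logic in a `BasePruner` subclass that reads the intermediate values from the study's trial history.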
@Crissman
Easy to Medium
You need to know:
- Python
- Hyperparameter optimization basics
- Sphinx-docs (optional)
Survey the Optuna documentation and compare it with the documentation of other hyperparameter optimization frameworks. Note what you feel are clear explanations and any inconsistencies in the Optuna documentation, then improve the Optuna documentation.
Hyperparameter optimization is critical to many complex algorithms, and many users are not aware that automated programs like Optuna are available. By improving the Optuna documentation, you will help users understand how to improve the performance of their algorithms without doing fiddly manual tuning. Learn to work with self-documenting code in a large open-source project. Build relationships with coders in a top tier startup in Japan.
Optuna has integration modules for PyTorch, TensorFlow, Keras, scikit-learn, and several other open source projects. We would like to increase the number of open source projects for which Optuna has a custom module to allow for easy integration. This could include creating pruner integrations similar to the existing modules, or making new kinds of integrations, such as implementing a callback to export data to another framework like MLflow or TensorBoard.
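One common integration shape is a study callback: Optuna's `study.optimize` accepts callbacks invoked as `callback(study, trial)` after each trial finishes, and an exporter integration can forward `trial.params` and `trial.value` to the other framework from there. The sketch below uses stand-in study/trial classes so it runs without Optuna; in a real MLflow integration the callback body would call MLflow's logging API instead of appending to a list.

```python
# Stand-ins for Optuna objects, so the example is self-contained.
class FakeTrial:
    def __init__(self, number, params, value):
        self.number, self.params, self.value = number, params, value

class FakeStudy:
    study_name = "example-study"

exported = []

def export_callback(study, trial):
    # In a real integration, this would log to another framework
    # (e.g. metrics and parameters to MLflow or TensorBoard).
    exported.append({
        "study": study.study_name,
        "trial": trial.number,
        "params": trial.params,
        "value": trial.value,
    })

export_callback(FakeStudy(), FakeTrial(0, {"lr": 0.01}, 0.95))
print(exported[0]["value"])  # 0.95
```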
@Crissman
Medium to Hard
You need to know:
- Python
- Hyperparameter optimization basics
- Open Source projects
Look at the existing Optuna integration modules, and review other popular Open Source projects to see which have hyperparameters or could provide additional functionality to Optuna.
You’ll work with both Optuna and other open source projects, and each integration module you create will make both projects more useful. Build relationships with coders in a top tier startup in Japan.
Optuna supports multiple sampling algorithms such as TPE and CMA-ES. Although Bayesian Optimization based on Gaussian Processes (GP-BO), one of the most popular algorithms for hyperparameter optimization, is also supported, it internally invokes scikit-optimize. Having Optuna’s own implementation of GP-BO instead of wrapping another library would make Optuna more efficient and extensible.
In this project, you will implement a new sampler class based on GP-BO. The sampler should meet the following requirements:
- the sampler is a subclass of optuna.samplers.BaseSampler,
- the sampler shows performance comparable with TPE (the default algorithm of Optuna) on multiple benchmark tasks, without tuning any hyper-hyperparameter,
- the sampler is compatible with the basic parameter types of Optuna: float (uniform), integer, categorical, discrete float, and log-scale float,
- the sampler is compatible with distributed optimization,
- the kernels and acquisition function of the sampler are customizable.
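To make the first requirement concrete, the sketch below shows the overall shape of a sampler in pure Python, with a stand-in distribution class so it runs without Optuna. The real base class is `optuna.samplers.BaseSampler`, whose core hook `sample_independent(study, trial, param_name, param_distribution)` returns a value for one parameter; a GP-BO sampler would fit a Gaussian process to completed trials there and maximize an acquisition function, instead of sampling uniformly as this placeholder does.

```python
import random

random.seed(0)

# Stand-in for an Optuna distribution object (the real ones live in
# optuna.distributions), so the example is self-contained.
class UniformDistribution:
    def __init__(self, low, high):
        self.low, self.high = low, high

class SketchRandomSampler:
    """Illustrates the sampler interface shape, not a real GP-BO sampler."""

    def sample_independent(self, study, trial, param_name, param_distribution):
        # Placeholder strategy: uniform random, ignoring past trials.
        # A GP-BO sampler would instead condition on study history here.
        return random.uniform(param_distribution.low, param_distribution.high)

sampler = SketchRandomSampler()
x = sampler.sample_independent(None, None, "x", UniformDistribution(-5.0, 5.0))
print(-5.0 <= x <= 5.0)  # True
```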
[Optional] Based on the new sampler class, you are encouraged to implement state-of-the-art Bayesian optimization algorithms such as:
- Bayesian Optimization with Unknown Search Space
- Scalable Global Optimization via Local Bayesian Optimization
TBD
Medium - Hard
You need to know:
- Python,
- Theory of Bayesian optimization
Go through the documentation and/or implementation of Optuna's samplers and understand how to implement a sampling algorithm. You are encouraged to write an example of a sampler class implementing one of the following algorithms and share it with us. The class need not be production-ready; for instance, it may be incompatible with categorical kernels, custom acquisition functions, or distributed optimization.
Some practice sampler classes you could write:
- simple Bayesian Optimization based on Gaussian Process
- BOHB
- Population-based Training
- Bayesian Optimization with Unknown Search Space
- Scalable Global Optimization via Local Bayesian Optimization
- any other state-of-the-art algorithm for hyperparameter optimization
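For the simple GP-BO practice class, the key ingredient beyond the Gaussian process itself is an acquisition function. The sketch below computes expected improvement (EI) for minimization, assuming the GP posterior at a candidate point gives a predictive mean `mu` and standard deviation `sigma`, and `best_so_far` is the best observed objective value. It uses only the standard library.

```python
import math

def expected_improvement(mu, sigma, best_so_far):
    """EI at a candidate point, assuming we are minimizing the objective.

    EI = (f_best - mu) * Phi(z) + sigma * phi(z), with z = (f_best - mu) / sigma,
    where Phi and phi are the standard normal CDF and PDF.
    """
    if sigma <= 0.0:
        # A degenerate (noise-free, already-observed) point has no improvement.
        return 0.0
    z = (best_so_far - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best_so_far - mu) * cdf + sigma * pdf

# A candidate predicted to beat the incumbent scores higher than one that doesn't.
print(expected_improvement(0.2, 0.1, 0.5) > expected_improvement(0.6, 0.1, 0.5))  # True
```

A GP-BO sampler would evaluate this acquisition over candidate points and suggest the maximizer; making the acquisition pluggable in this way is one route to the customizability requirement above.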
This is a challenging project that will require building simple, scalable interfaces for complex Gaussian processes and Bayesian optimization mechanisms. You’ll learn how to design interfaces and gain a practical understanding of how Bayesian optimization and Gaussian processes work. Build relationships with coders in a top tier startup in Japan.