Add random_seed to auto_train API to improve repeatability #1619
Add random_seed (defaulting to default_random_seed) to the auto_train API to improve repeatability.
This option is passed to the Ray hyperparameter search algorithm, where it seeds the random
generation of the hyperparameter sample order, and to each hyperparameter training job, where it
seeds data splitting, parameter initialization, and training set shuffling wherever possible.

Change the default AutoML search_alg from BasicVariantGenerator, which does not currently take a random
seed parameter, to HyperOptSearch, which does. Testing across the 5 validation datasets showed that
HyperOptSearch yields results similar to BasicVariantGenerator and that, with the random seed specified,
the results are much less noisy.
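
For reviewers who want to see why a single seed helps, here is a minimal, standard-library-only sketch of the idea: one random_seed drives both the order in which hyperparameter samples are drawn and the data split and shuffle. It is not the Ludwig or Ray implementation. The names `DEFAULT_RANDOM_SEED`, `sample_trials`, `split_and_shuffle`, and `toy_auto_train` are hypothetical and exist only for illustration.

```python
import random

# Hypothetical stand-in for default_random_seed; the real value is not shown here.
DEFAULT_RANDOM_SEED = 42


def sample_trials(search_space, num_samples, random_seed=DEFAULT_RANDOM_SEED):
    """Draw hyperparameter configurations in an order fixed by the seed."""
    rng = random.Random(random_seed)  # private RNG, global state untouched
    return [
        {name: rng.choice(choices) for name, choices in search_space.items()}
        for _ in range(num_samples)
    ]


def split_and_shuffle(rows, random_seed=DEFAULT_RANDOM_SEED, train_fraction=0.8):
    """Shuffle the rows with the seed, then cut them into train and test parts."""
    rng = random.Random(random_seed)
    shuffled = list(rows)  # copy so the caller's data is not reordered
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def toy_auto_train(rows, search_space, num_samples=4, random_seed=DEFAULT_RANDOM_SEED):
    """Thread one seed through both trial sampling and data preparation."""
    trials = sample_trials(search_space, num_samples, random_seed)
    train, test = split_and_shuffle(rows, random_seed)
    return {"trials": trials, "train": train, "test": test}


if __name__ == "__main__":
    space = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
    first = toy_auto_train(list(range(20)), space, random_seed=7)
    second = toy_auto_train(list(range(20)), space, random_seed=7)
    print(first == second)  # same seed, same trials and same split
```

In this sketch, two calls with the same seed return identical trial lists and identical splits, which is the repeatability property the random_seed option is meant to improve. Real training jobs also depend on sources of nondeterminism the sketch does not model, hence "where possible" in the description.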