-
Notifications
You must be signed in to change notification settings - Fork 89
Rename random state to random seed #1798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename random state to random seed #1798
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1798 +/- ##
========================================
+ Coverage 99.9% 100.0% +0.2%
========================================
Files 252 252
Lines 20117 20250 +133
========================================
+ Hits 20086 20242 +156
+ Misses 31 8 -23
Continue to review full report at Codecov.
|
0fad373
to
69756a3
Compare
@@ -19,7 +19,7 @@ def __init__(self, | |||
max_iterations=None, | |||
tuner_class=None, | |||
text_columns=None, | |||
random_state=0, | |||
random_seed=0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm replacing random_state with random_seed here, rather than adding it as an argument, because this is not a user-facing class.
if n_components: | ||
pca = SkPCA(n_components=n_components, **kwargs) | ||
pca = SkPCA(n_components=n_components, random_state=random_seed, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we were not passing random_state to PCA before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooo, good catch 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job fixing everything and raising the warnings! LGTM
@@ -1800,6 +1800,7 @@ def total_pipelines(automl, num_batches, batch_size): | |||
automl.search() | |||
assert automl._pipelines_per_batch == 2 | |||
assert automl._automl_algorithm.pipelines_per_batch == 2 | |||
assert automl._automl_algorithm.pipelines_per_batch == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
repeated line?
@@ -511,3 +511,23 @@ def save_plot(fig, filepath=None, format='png', interactive=False, return_filepa | |||
|
|||
if return_filepath: | |||
return filepath | |||
|
|||
|
|||
def deprecate_arg(old_arg, new_arg, old_value, new_value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a test for this helper since it's public?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
docs/source/release_notes.rst
Outdated
@@ -8,6 +8,7 @@ Release Notes | |||
* Sped up permutation importance for some pipelines :pr:`1762` | |||
* Fixes | |||
* Changes | |||
* Added ``random_seed`` as an argument to our automl/pipeline/component api. Using ``random_state`` will raise a warning :pr:`1798` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably can't add this to breaking_changes
, but is there another category similar that we can put it in to warn users? Just wondering since it'll soon be a breaking change once we officially deprecate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine for now? Here we're telling users we'll raise a warning and the warning itself will tell them about the impending deprecation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments but nice and thorough. Thank you, LGTM!! 🚢
@@ -50,7 +50,7 @@ def __init__(self, | |||
self._best_pipeline_info = {} | |||
self.ensembling = ensembling and len(self.allowed_pipelines) > 1 | |||
self._pipeline_params = pipeline_params or {} | |||
self._random_state = random_state | |||
self._random_state = random_seed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to update this to self._random_seed
too?
@@ -65,7 +65,7 @@ def next_batch(self): | |||
|
|||
next_batch = [] | |||
if self._batch_number == 0: | |||
next_batch = [pipeline_class(parameters=self._transform_parameters(pipeline_class, {}), random_state=self.random_state) | |||
next_batch = [pipeline_class(parameters=self._transform_parameters(pipeline_class, {}), random_seed=self.random_seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused, what is self.random_seed set to here? 👀 I ask because I made the previous comment and noticed that we have self._random_state
right now... Is this supposed to be self.random_seed
, and then we need to update self._random_seed
--> self.random_seed
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.random_seed == self._random_state
. We get self.random_seed
from the parent class. self._random_state
is used to sample params in the first batch if the user provides pipelines with frozen hyperparams. I'll get rid of self._random_state
since we don't need it anymore!
@@ -36,8 +36,8 @@ class CatBoostClassifier(Estimator): | |||
SEED_MAX = SEED_BOUNDS.max_bound |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More out of curiosity but does this mean we dont need SEED_BOUNDS
anymore? Or maybe we need to check that the seed provided is within those bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can get rid of them actually!
if n_components: | ||
pca = SkPCA(n_components=n_components, **kwargs) | ||
pca = SkPCA(n_components=n_components, random_state=random_seed, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooo, good catch 👀
evalml/automl/automl_search.py
Outdated
@@ -156,7 +158,9 @@ def __init__(self, | |||
additional_objectives (list): Custom set of objectives to score on. | |||
Will override default objectives for problem type if not empty. | |||
|
|||
random_state (int): Seed for the random number generator. Defaults to 0. | |||
random_state (None, int): Deprecated - use random_seed instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nit-picking (for no reason loool) but since the other places that were touched are now random_state (int)
, let's match that hehe
@@ -511,3 +511,23 @@ def save_plot(fig, filepath=None, format='png', interactive=False, return_filepa | |||
|
|||
if return_filepath: | |||
return filepath | |||
|
|||
|
|||
def deprecate_arg(old_arg, new_arg, old_value, new_value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this will be useful for any future dep patterns we follow 🥳
a881926
to
7c37f3e
Compare
Pull Request Description
Fixes #1737 by adding
random_seed
as an argument and keepingrandom_state
but setting it to None. A warning will show ifrandom_state
is used.After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rst
to include this pull request by adding :pr:123
.