-
Notifications
You must be signed in to change notification settings - Fork 45.5k
Adding flag to set random seeds. #3956
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. |
model_function = model_fn | ||
|
||
if flags.seed is not None: | ||
model_helpers.set_random_seed(flags.seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's roll this into one line, and just return with a no-op if none.
if flags.seed is not None: | ||
model_helpers.set_random_seed(flags.seed) | ||
|
||
if flags.multi_gpu: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for here, but, for the record: we should consider at this point rolling some of this logic into the argparser. Something like, set_conditions_from_flags() that handles all of the validation/env-wide setting that we don't need return vals for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. As some models (like MiniGo) may need more flags for different modes, this is indeed necessary.
flags = parser.parse_args(args=argv[1:]) | ||
|
||
if flags.seed is not None: | ||
model_helpers.set_random_seed(flags.seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably not be in imagenet, but in the resnet run loop. Otherwise, has to be duped for cifar as well.
batch_size: Create a flag to specify the batch size. | ||
multi_gpu: Create a flag to allow the use of all available GPUs. | ||
hooks: Create a flag to specify hooks for logging. | ||
seed: Create a flag to set random seeds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
def set_random_seed(seed): | ||
"""Sets the random seeds for available RNGs. | ||
This seeds RNGs for python's random and for Tensorflow. The intended |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
glint me. I think a \n is required here.
Raises: | ||
ValueError: if the seed is not an integer or if deemed unsuitable for | ||
seeding a the RNGs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or if deemed unsuitable? What does that mean?
"--seed", "-s", nargs="+", type=int, default=None, | ||
help="[default: %(default)s] An integer to seed random number" | ||
"generators. If unset, RNGs choose their own seeds resulting " | ||
"in each run having a different seed.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually work globally, including numpy? Can we expand on this description to describe the expected effect and also which RNGs get set specifically?
|
||
if seed: | ||
self.add_argument( | ||
"--seed", "-s", nargs="+", type=int, default=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this nargs=+? Does this take a list, or copy-pasta?
And, nit: random_seed, rs; seed is too general.
|
||
def test_random_seed(self): | ||
"""It is unclear if this test is a good idea or stable. | ||
If tests are run in parallel, this could be flakey.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should figure this out before including. CC @robieta for testing magic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably, TF does not seem to test set_random_seed in this way, which implies it is not a good idea. Perhaps we should instead test just that the seeds for python and tf are set as expected by their own reports.
if not isinstance(seed, int): | ||
raise ValueError("Random seed is not an integer: {}".format(seed)) | ||
random.seed(seed) | ||
tf.set_random_seed(seed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notably, this will persist across sessions, but not across graphs. With estimators, I think the graph is maintained throughout the life of an estimator, but this might have to get called again in the case of starting a new graph? Not sure.
if flags.seed is not None: | ||
model_helpers.set_random_seed(flags.seed) | ||
|
||
if flags.multi_gpu: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. As some models (like MiniGo) may need more flags for different modes, this is indeed necessary.
def set_random_seed(seed): | ||
"""Sets the random seeds for available RNGs. | ||
This seeds RNGs for python's random and for Tensorflow. The intended | ||
use case is for this to be called exactly once at the start of execution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which "execution" does it mean here? The execution of the training process? Or the execution of one Session? Or others? Does it act the same in the distributed environment (Multiple GPUs) vs. one single GPU?
This adds flags to set the random seed for python and tensorflow. This is intended to make timing benchmarks (more) repeatable.