
Fix RNG state when resuming synthesis #16

Open
billbrod opened this issue Nov 15, 2019 · 2 comments

@billbrod
Member

billbrod commented Nov 15, 2019

If you run synthesis twice in a row, you'll pick up more or less where you left off (assuming you set initial_image=None and learning_rate=None on the second call; this is pending the merger of the current ventral_model branch), with one major caveat: the state of the random number generator. We require a seed and always set it at the beginning of the synthesis call. If synthesis is resumed with the same object in the same session, we can simply allow the user to pass seed=None and, in that case, skip setting the seed.

However, if we save a metamer object, load it, and then resume synthesis (a common workflow when synthesis takes a long time), we currently have no good way to restore the RNG state. Something like torch.random.fork_rng, or the mechanism it uses internally (I can't find example code showing how to use it), is probably what we want, but I'm not sure how it should handle devices.

Grabbing the CPU state would be easy. My preference would be to do something like the following: at the end of synthesis, call self.cpu_rng_state = torch.get_rng_state(); make sure the cpu_rng_state attribute gets saved by adding it to the list of attributes in the save function; and then, during load, call torch.set_rng_state(metamer.cpu_rng_state).
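A minimal sketch of that proposal, using a hypothetical stand-in for the Metamer class (the class body here is illustrative, not the actual plenoptic implementation): capture the CPU RNG state at the end of synthesis, persist it with save, and restore it on load.

```python
import torch

class Metamer:
    """Hypothetical stand-in for the real Metamer class."""

    def synthesize(self, seed=0):
        torch.manual_seed(seed)
        self.result = torch.rand(3)  # stand-in for the synthesis loop
        # capture the CPU RNG state so a later session can resume from it
        self.cpu_rng_state = torch.get_rng_state()

    def save(self, path):
        # cpu_rng_state is saved alongside the other attributes
        torch.save({'result': self.result,
                    'cpu_rng_state': self.cpu_rng_state}, path)

    @classmethod
    def load(cls, path):
        obj = cls()
        state = torch.load(path)
        obj.result = state['result']
        obj.cpu_rng_state = state['cpu_rng_state']
        # restoring the state makes subsequent draws match the ones an
        # uninterrupted session would have produced
        torch.set_rng_state(obj.cpu_rng_state)
        return obj
```

After load, any call that consumes the CPU RNG continues the same stream the original session would have seen.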

However, grabbing the GPU states can apparently take a long time (see the warning in the function linked above), and we would only want to do it for the relevant devices. Currently the metamer object is not explicitly aware of which devices are relevant, which I prefer because it keeps the code completely device-agnostic. But it presents a problem here, and I see three possible solutions:

  1. Don't try to resume the GPU RNG state (the current situation).
  2. Grab the RNG state from all available GPUs (as the fork_rng function linked above does if devices isn't specified) and set them all.
  3. Figure out which devices are being used. I think this is the best solution, and my preference would be to check initial_image.device and model.device. Currently we do not require the model's device to be set, so it's quite possible there is no device attribute (my ventral stream models have one). We could start encouraging it and fall back to option 2 when it's absent.

If we do something like option 2 (or use it as the fallback in option 3), we should probably make it opt-in rather than always doing it, since it apparently takes time. And whether we do 2 or 3, it should happen at the end of the synthesis call.
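Option 3 could be sketched roughly as follows. The function names and signatures here are illustrative (not existing plenoptic API): infer the relevant devices from the tensors involved in synthesis, always grab the cheap CPU state, and only touch the CUDA state for devices actually in use.

```python
import torch

def capture_rng_states(*tensors):
    """Grab the CPU RNG state plus the CUDA RNG state of each device
    that appears among the given tensors (hypothetical helper)."""
    states = {'cpu': torch.get_rng_state()}
    for t in tensors:
        dev = t.device
        if dev.type == 'cuda':
            # per-device CUDA state; skipped entirely on CPU-only runs,
            # avoiding the slow grab-all-GPUs path of option 2
            states[str(dev)] = torch.cuda.get_rng_state(dev)
    return states

def restore_rng_states(states):
    """Restore the states captured by capture_rng_states."""
    torch.set_rng_state(states['cpu'])
    for key, state in states.items():
        if key.startswith('cuda'):
            torch.cuda.set_rng_state(state, device=key)
```

On a CPU-only machine this reduces to the cpu_rng_state proposal above; with a model and image on, say, cuda:1, only that one device's state would be captured.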

@billbrod billbrod added this to Medium term milestones in Roadmap Feb 7, 2020
@billbrod
Member Author

We have two seeds, numpy and torch, and we want to capture the state of both RNGs. For numpy that's np.random.get_state(); we still need to figure out the torch equivalent.
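For reference, both libraries expose a symmetric get/set pair; the torch equivalents of np.random.get_state()/set_state() are torch.get_rng_state()/torch.set_rng_state(). A quick demonstration that rewinding both states reproduces the same draws:

```python
import numpy as np
import torch

# capture both RNG states
np_state = np.random.get_state()
torch_state = torch.get_rng_state()

a_np = np.random.rand(3)
a_torch = torch.rand(3)

# rewind both RNGs to the captured states
np.random.set_state(np_state)
torch.set_rng_state(torch_state)

b_np = np.random.rand(3)      # identical to a_np
b_torch = torch.rand(3)       # identical to a_torch
```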

@billbrod
Member Author

Currently, we don't set the seed when seed is None, but this only works within the same (notebook, console) session; it won't help if you've saved and loaded the synthesis object.
