
Fix RNG state when resuming synthesis #16

Open
billbrod opened this issue Nov 15, 2019 · 2 comments

@billbrod
Member

billbrod commented Nov 15, 2019

If you run synthesis twice in a row, you'll pick up more or less where you left off (assuming you set initial_image=None and learning_rate=None on the second call; this is pending the merger of the current ventral_model branch), with one major caveat: the state of the random number generator. We require a seed and always set it at the beginning of the synthesis call. If synthesis is resumed with the same object in the same session, we can simply allow the user to pass seed=None and, in that case, skip setting the seed.

However, if we save a metamer object, load it, and then resume synthesis (a common workflow when synthesis takes a long time), we currently have no good way to restore the RNG state. Something like torch.random.fork_rng, or the mechanism it uses internally (I can't find example code showing how to use it), is probably what we want, but I'm not sure how it should handle devices.

Grabbing the CPU state would be easy. My preference would be to do something like the following: at the end of synthesis, call self.cpu_rng_state = torch.get_rng_state(); make sure the cpu_rng_state attribute gets saved by adding it to the list of attributes in the save function; and then, during load, call torch.set_rng_state(metamer.cpu_rng_state).
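A minimal sketch of that proposal, using a hypothetical stand-in for the Metamer class (the class body here is illustrative, not the actual plenoptic implementation): capture the CPU RNG state at the end of synthesis, persist it with save, and restore it on load.

```python
import torch

class Metamer:
    """Hypothetical stand-in for the real Metamer class."""

    def synthesize(self, seed=0):
        torch.manual_seed(seed)
        self.result = torch.rand(3)  # stand-in for the synthesis loop
        # capture the CPU RNG state so a later session can resume from it
        self.cpu_rng_state = torch.get_rng_state()

    def save(self, path):
        # cpu_rng_state is saved alongside the other attributes
        torch.save({'result': self.result,
                    'cpu_rng_state': self.cpu_rng_state}, path)

    @classmethod
    def load(cls, path):
        obj = cls()
        state = torch.load(path)
        obj.result = state['result']
        obj.cpu_rng_state = state['cpu_rng_state']
        # restoring the state makes subsequent draws match the ones an
        # uninterrupted session would have produced
        torch.set_rng_state(obj.cpu_rng_state)
        return obj
```

After load, any call that consumes the CPU RNG continues the same stream the original session would have seen.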

However, grabbing the GPU states can apparently take a long time (see the warning in the function linked above), and we would only want to do it for the relevant devices. Currently the metamer object is not explicitly aware of which devices are relevant, which I prefer because it keeps the code completely device-agnostic. But it presents a problem here, and I see three possible solutions:

  1. Don't try to resume the GPU RNG state (the current situation).
  2. Grab the RNG state from all available GPUs (as the fork_rng function linked above does if devices isn't specified) and set them all.
  3. Figure out which devices are being used. I think this is the best solution, and my preference would be to check initial_image.device and model.device. Currently we do not require the model's device to be set, so it's quite possible there is no device attribute (my ventral stream models have one). We could start encouraging it and fall back to option 2 when it's absent.

If we do something like option 2 (or use it as the fallback in option 3), we should probably make it opt-in rather than always doing it, since it apparently takes time. And whether we do 2 or 3, it should happen at the end of the synthesis call.
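Option 3 could be sketched roughly as follows. The function names and signatures here are illustrative (not existing plenoptic API): infer the relevant devices from the tensors involved in synthesis, always grab the cheap CPU state, and only touch the CUDA state for devices actually in use.

```python
import torch

def capture_rng_states(*tensors):
    """Grab the CPU RNG state plus the CUDA RNG state of each device
    that appears among the given tensors (hypothetical helper)."""
    states = {'cpu': torch.get_rng_state()}
    for t in tensors:
        dev = t.device
        if dev.type == 'cuda':
            # per-device CUDA state; skipped entirely on CPU-only runs,
            # avoiding the slow grab-all-GPUs path of option 2
            states[str(dev)] = torch.cuda.get_rng_state(dev)
    return states

def restore_rng_states(states):
    """Restore the states captured by capture_rng_states."""
    torch.set_rng_state(states['cpu'])
    for key, state in states.items():
        if key.startswith('cuda'):
            torch.cuda.set_rng_state(state, device=key)
```

On a CPU-only machine this reduces to the cpu_rng_state proposal above; with a model and image on, say, cuda:1, only that one device's state would be captured.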

@billbrod billbrod added this to Medium term milestones in Roadmap Feb 7, 2020
@billbrod
Member Author

We have two seeds, numpy and torch, and we want to capture the state of both RNGs. For numpy that's np.random.get_state(); we still need to figure out the torch equivalent.
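For reference, both libraries expose a symmetric get/set pair; the torch equivalents of np.random.get_state()/set_state() are torch.get_rng_state()/torch.set_rng_state(). A quick demonstration that rewinding both states reproduces the same draws:

```python
import numpy as np
import torch

# capture both RNG states
np_state = np.random.get_state()
torch_state = torch.get_rng_state()

a_np = np.random.rand(3)
a_torch = torch.rand(3)

# rewind both RNGs to the captured states
np.random.set_state(np_state)
torch.set_rng_state(torch_state)

b_np = np.random.rand(3)      # identical to a_np
b_torch = torch.rand(3)       # identical to a_torch
```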

@billbrod
Member Author

Currently, we don't set the seed when seed is None, but this only works within the same (notebook, console) session; it won't help if you've saved and loaded the synthesis object.
