A better implementation of device placement? #67
Comments
@mingyr Hi, thanks for your detailed comment. I've looked into this before and unfortunately there are no publicly exposed APIs from TF which allow us to have more control over this. I would like to be able to capture the surrounding device context when constructing a module, as your middle example shows. However, unlike the various kinds of scopes, there is no public way to query the current device context.
Without being able to capture the context, we cannot respect device placement directives around where we construct modules. If you read the TF source you can see that this works by accessing various private members of the graph. We've had some conversations with the TF team about creating an external API for more advanced use of device placement (and other things), and those discussions are ongoing. I hope to have some improvement in this area in the public version soon. It's very useful to hear from users about the pain points of the library, so once again thank you.
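To make the point concrete, here is a minimal sketch of the behaviour being described (the module and shapes are placeholders, not taken from this thread):

```python
import sonnet as snt
import tensorflow as tf

with tf.device('/cpu:0'):
    # Constructing the module here does not pin its variables to the CPU:
    # Sonnet modules create their variables lazily, when first connected.
    linear = snt.Linear(output_size=128)

inputs = tf.placeholder(tf.float32, [None, 64])

with tf.device('/gpu:0'):
    # Variables are actually created here, so they follow the device
    # directive in effect at connection time, not at construction time.
    outputs = linear(inputs)
```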
@malcolmreynolds I sincerely appreciate the detailed explanation and fully understand the situation. The important thing is not that we invent something perfect, but that we build something we can rely on to advance our daily work. I am glad that Sonnet is such a library; it helps me a great deal. Thanks for the contribution you engineers at DeepMind have made, which benefits the deep learning community a lot.
It is known that Sonnet modules do not honor the simple device placement directive; this was addressed in issue #61, where @kosklain offered a good solution. However, there is a small problem in practice. Just look at the demonstration code below:
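The original snippet is not reproduced here, so the following is only a minimal sketch of that kind of setup, assuming a helper named get_device_setter along the lines of the issue #61 suggestion (the model and input shapes are placeholders):

```python
import sonnet as snt
import tensorflow as tf

def get_device_setter(gpu_id):
    """Device function: variables go to the CPU, all other ops to the given GPU."""
    def _device_setter(op):
        if op.type in ('Variable', 'VariableV2'):
            return '/cpu:0'
        return '/gpu:%d' % gpu_id
    return _device_setter

# Inputs are prepared on the CPU under one directive ...
with tf.device('/cpu:0'):
    inputs = tf.placeholder(tf.float32, [None, 64])

# ... while the model is built under a separate device-setter directive,
# so that its parameters land on the CPU and the compute ops on the GPU.
with tf.device(get_device_setter(0)):
    model = snt.Linear(output_size=128)
    outputs = model(inputs)
```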
The first concern is with the following code snippet from the example above:
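That is, roughly (again a reconstruction, not the original line):

```python
with tf.device(get_device_setter(0)):
    model = snt.Linear(output_size=128)
```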
Here 0 is just a placeholder, since the intention is to have the model parameters placed on the CPU so that they can be shared across multiple GPUs. Although I could define get_device_setter with a prototype like get_device_setter(gpu_id = 0), someone might still criticise that the sole intention here is to construct everything on the CPU, so why is a GPU ID involved at all?
The criticism tends to become a little stronger when the code snippet below is added as well:
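Namely something like (reconstructed, not the original snippet):

```python
with tf.device('/cpu:0'):
    inputs = tf.placeholder(tf.float32, [None, 64])
```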
The intention here is to have all inputs prepared on the CPU. So people will ask: why do you mix directives together, and why can you not place everything under one directive? Simply answering that Sonnet doesn't support the simple device placement directive, so it cannot be done in a harmonious way, is probably not a good enough answer to ease every skepticism.
So I just wonder whether this could be done a little more nicely than the approach above, namely with a more obvious directive giving neat placement control over both input preparation and model construction.
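For illustration only, the kind of thing I have in mind is a single device function that every op, including those created inside Sonnet modules, would respect (this is a hypothetical placement policy, not an existing Sonnet feature):

```python
def place_everything(op):
    # Hypothetical policy: parameters and input placeholders on the CPU,
    # all remaining compute on the GPU.
    if op.type in ('Variable', 'VariableV2', 'Placeholder'):
        return '/cpu:0'
    return '/gpu:0'

with tf.device(place_everything):
    inputs = tf.placeholder(tf.float32, [None, 64])
    model = snt.Linear(output_size=128)
    outputs = model(inputs)
```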
Many thanks.