Allow DataParallel to wrap CPU modules #17065
Comments
This is not true. The model is broadcast at the beginning of each forward, not when constructing the `DataParallel` wrapper.
How about explicitly throwing an error when constructing `DataParallel` with a CPU module?
This SGTM :)
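For reference, a minimal sketch of the kind of construction-time check being proposed here (an illustration only, not the actual fix that landed in PyTorch; the helper name `_check_module_is_on_gpu` is hypothetical):

```python
import torch.nn as nn

def _check_module_is_on_gpu(module: nn.Module) -> None:
    # Fail fast when DataParallel is constructed, instead of waiting for the
    # opaque "Broadcast function not implemented for CPU tensors" error in forward().
    for name, param in module.named_parameters():
        if not param.is_cuda:
            raise RuntimeError(
                f"DataParallel expects the wrapped module to be on a GPU, but "
                f"parameter '{name}' is on {param.device}. Call model.cuda() first."
            )
```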
Okay, I think we should update the documentation for this then? Also, what is the best way to then move |
@douwekiela yes, I will update the docs in the fix for this issue.
You don't have to move
Right. So I guess
@douwekiela The recommended way is to define one
🚀 Feature
Creating a model on CPU and then wrapping the model with `DataParallel` should automatically replicate the model on destination GPUs. Is there any reason to enforce that `DataParallel`'s input model must be on GPU?

Motivation
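The original code snippet was not preserved in this copy of the issue; a minimal sketch of the kind of snippet being described (the `nn.Linear` model and tensor shapes are assumptions) is:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 5)                           # model is constructed on the CPU
model = nn.DataParallel(model, device_ids=[0, 1])  # wrapping succeeds...
out = model(torch.randn(20, 10))                   # ...but forward fails: the module was never moved to a GPU
```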
The code above throws `TypeError: Broadcast function not implemented for CPU tensors`. To avoid the error, users need to explicitly call `model.cuda()`. When calling `nn.DataParallel(model, device_ids=[0,1])`, we already have enough info on where the model should be replicated. It can be handled automatically regardless of whether the `model` is stored on CPU or GPU.

Pitch
Support the above code snippet.
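For completeness, a sketch of the explicit workaround described in the motivation above (assuming the same toy model and two visible GPUs):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 5).cuda()                    # explicitly move the module to GPU 0 before wrapping
model = nn.DataParallel(model, device_ids=[0, 1])
out = model(torch.randn(20, 10))                   # CPU inputs are scattered to the GPUs automatically
```

The pitch is that `DataParallel` could perform that move itself, since `device_ids` already says where the replicas should live.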