Bug of group2ctx? Wrong device placement? #7934
Comments
During
@formath What do you mean by "if an op can't be assigned a device id"? How is an op's device id assigned?
It depends on the device id of its first control node or input node. Your op may not be assigned a device id this way.
@formath So if a symbol a has an input b, and b is assigned to the CPU device, then a is also automatically assigned to the CPU device regardless of the default context? Why is it designed this way? This is surprising, and I could not find any documentation for it.
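To make the rule discussed above concrete, here is a toy model in plain Python. It is not MXNet code and does not reproduce MXNet's actual placement pass. The `Node` class, the `place` function, and the device strings are all hypothetical names invented for illustration, and control dependencies are ignored. It only encodes the rule as stated in this thread: an explicit group wins, otherwise a node inherits the device of its first input, otherwise it falls back to the default context.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical graph node; `group` mimics an explicit ctx_group attribute."""
    name: str
    inputs: List["Node"] = field(default_factory=list)
    group: Optional[str] = None


def place(node: Node, group2ctx: Dict[str, str], default_ctx: str) -> str:
    """Return the device for `node` under the toy rule described above."""
    # 1. An explicit group that appears in group2ctx decides the device.
    if node.group is not None and node.group in group2ctx:
        return group2ctx[node.group]
    # 2. Otherwise inherit the device of the first input node.
    if node.inputs:
        return place(node.inputs[0], group2ctx, default_ctx)
    # 3. No group and no inputs: fall back to the default context.
    #    (This is the behavior the issue author expected, not what was observed.)
    return default_ctx


group2ctx = {"a": "cpu(0)", "b": "gpu(0)"}

b = Node("b", group="a")          # explicitly placed in group "a"
a = Node("a_op", inputs=[b])      # no group of its own, takes b's device
lone = Node("lone")               # no group, no inputs

print(place(a, group2ctx, default_ctx="gpu(0)"))     # cpu(0)
print(place(lone, group2ctx, default_ctx="gpu(0)"))  # gpu(0)
```

Under this model, the input-less `MyOp` from the issue would land on the default context, so the reported "alphabetically smaller group wins" behavior is not explained by the inheritance rule alone.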
@formath Is the executor just a scheduler? What does "executor run on this context" mean? What does the executor actually run? Parameter updates?
@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage. For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.
@x10000year Thanks for submitting the issue - were you able to resolve it?
For the following code:
import mxnet as mx

x = mx.symbol.MyOp()  # custom C++ operator, no inputs, no ctx_group set
exe = x.bind(mx.gpu(), {}, group2ctx={"a": mx.cpu(), "b": mx.gpu()})
exe.forward()
where MyOp is a custom operator written in C++ that prints "CPU" if it runs in a CPU context and "GPU" if it runs in a GPU context. MyOp has no inputs.
I don't use mx.AttrScope to specify a group for x, so the default context (mx.gpu() here) should be used for x. However, the code above prints "CPU", which means that x runs in the CPU context. Why?
If I set group2ctx={"b": mx.cpu(), "a": mx.gpu()} instead, it prints "GPU".
After more tests, I found that the group that has the alphabetically smaller name is chosen for x. Very strange.
Is this a bug? How does device placement work?