
RPC couldn't match torch.ones with requires_grad=True #28786

Open
mrshenli opened this issue Oct 28, 2019 · 2 comments
Assignees
Labels
better-engineering Relatively self-contained tasks for better engineering contributors · high priority · module: internals Related to internal abstractions in c10 and ATen · module: rpc Related to RPC, distributed autograd, RRef, and distributed optimizer · oncall: distributed Add this issue/PR to distributed oncall triage queue · triage review · triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@mrshenli
Contributor

mrshenli commented Oct 28, 2019

🐛 Bug

I was trying to use rpc.remote to create a tensor with requires_grad=True, but hit the following error:

Couldn't match schema: aten::ones(int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor) to args: ((3, 3),) and kwargs: {'requires_grad': True}, reason: Unknown keyword argument 'requires_grad' for operator 'aten::ones'. Schema: aten::ones(int[] size, *, int? dtype=None, int? layout=None, Device? device=None, bool? pin_memory=None) -> (Tensor)

To Reproduce

rpc.remote(dst, torch.ones, args=((3, 3),), kwargs={"requires_grad":True})
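The failure comes from RPC dispatching builtin operators by matching the caller's args/kwargs against the operator's registered schema, and aten::ones has no requires_grad keyword. A minimal pure-Python sketch of that kind of kwarg check (an illustration only, not the real C++ schema matcher) shows why the call is rejected:

```python
# Illustrative-only stand-in for schema-based kwarg matching.
# The allowed-kwarg set below is copied from the aten::ones schema
# in the error message above.
SCHEMA_KWARGS = {
    "aten::ones": {"dtype", "layout", "device", "pin_memory"},
}

def match_schema(op, kwargs):
    """Reject any keyword argument the operator schema does not declare."""
    allowed = SCHEMA_KWARGS[op]
    for name in kwargs:
        if name not in allowed:
            raise RuntimeError(
                f"Unknown keyword argument '{name}' for operator '{op}'"
            )
    return True
```

Under this model, match_schema("aten::ones", {"dtype": None}) succeeds, while passing {"requires_grad": True} raises, mirroring the error reported above. requires_grad is handled by the Python factory-function bindings, not by the aten schema, which is why it is invisible to RPC's builtin dispatch.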

Workaround

Wrap torch.ones in a user function and pass that user function to remote, e.g.,

def _torch_ones(sizes, requires_grad=False):
    return torch.ones(sizes, requires_grad=requires_grad)

rpc.remote(dst, _torch_ones, args=((2, 2), ), kwargs={"requires_grad":True})

cc @ezyang @gchanan @zou3519 @bdhirsh @bhosmer @smessmer @ljk53 @ailzhang @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @rohan-varma @jjlilley @osalpekar @jiayisuse @agolynski @SciPioneer @H-Huang @mrzzd @xush6528

@mrshenli mrshenli added oncall: distributed Add this issue/PR to distributed oncall triage queue · triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module · module: rpc Related to RPC, distributed autograd, RRef, and distributed optimizer labels Oct 28, 2019
@pietern pietern removed the oncall: distributed Add this issue/PR to distributed oncall triage queue label Nov 5, 2019
@pritamdamania87 pritamdamania87 added the better-engineering Relatively self-contained tasks for better engineering contributors label Nov 25, 2019
@rohan-varma
Member

rohan-varma commented Feb 15, 2020

Even though there's a workaround (which basically uses a UDF instead of the builtin operator), this seems pretty important, since we should be able to send over tensors that require grad for distributed autograd, etc. What might be the right way to get this fixed? Should we add the optional arg to the shapes in torch/csrc/jit/passes/shape_analysis.cpp, or is there a better approach?

cc JIT folks @zdevito @jamesr66a

@pritamdamania87 pritamdamania87 self-assigned this Nov 9, 2020
@pritamdamania87 pritamdamania87 added the oncall: distributed Add this issue/PR to distributed oncall triage queue label Dec 7, 2020
@ezyang ezyang added the module: internals Related to internal abstractions in c10 and ATen label Dec 7, 2020
@ezyang
Contributor

ezyang commented Dec 7, 2020

Tagging internals on this; it is hitting a longstanding problem: #19393
