Closed
Labels
module: compiled autograd, module: dynamic shapes, oncall: pt2, triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Description
🚀 The feature, motivation and pitch
User-defined autograd functions may use ctx->saved_data to pass non-Tensor activations (e.g. int, float, bool) to the backward. These are passed as IValues, and currently we specialize on them, resulting in a recompile any time a value changes.
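The recompile behavior described above can be sketched in plain Python. This is a hedged illustration, not the real compiled-autograd internals: the cache below guards on the concrete scalar pulled out of saved_data, so every new value is a cache miss, i.e. a "recompile".

```python
# Illustrative sketch: guarding on a saved scalar's concrete value forces a
# recompile whenever the value changes. Not the real PyTorch machinery.
compile_cache = {}
recompiles = 0

def compiled_backward(saved_data, grad):
    global recompiles
    scale = saved_data["scale"]          # non-Tensor activation (a float)
    if scale not in compile_cache:       # guard: specialize on the value
        recompiles += 1
        compile_cache[scale] = lambda g, s=scale: g * s  # value burned in
    return compile_cache[scale](grad)

compiled_backward({"scale": 2.0}, 1.0)   # compile
compiled_backward({"scale": 2.0}, 1.0)   # cache hit
compiled_backward({"scale": 3.0}, 1.0)   # value changed -> recompile
print(recompiles)  # 2
```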
Workarounds that exist today:
- move all ctx->saved_data scalar conversions into a custom op to hide it from the compiler
- rewrite the c++ autograd function to only use tensors
- (if you're just trying to get perf numbers while this is unsupported) build from source and comment out this line in pytorch/torch/csrc/autograd/custom_function.h (line 205 at commit 18e75c0): `args.collect(ctx_.saved_data);`
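The second workaround above (rewriting the function to only use Tensors) can be sketched in the same toy model. This is a hedged illustration: `FakeTensor` and `guard_key` are stand-ins, not PyTorch APIs. The point is that if the scalar lives inside a tensor-like box, the guard can key on type/shape instead of the value, so changing the value no longer misses the cache.

```python
# Illustrative sketch of the "only use tensors" workaround: a shape/dtype
# style guard is value-agnostic, so new values reuse the compiled entry.
class FakeTensor:
    def __init__(self, value):
        self.value = value
    def guard_key(self):
        return ("Tensor", ())  # shape/dtype-style guard, ignores the value

cache = {}
recompiles = 0

def compiled_backward(saved, grad):
    global recompiles
    scale = saved["scale"]
    key = scale.guard_key() if isinstance(scale, FakeTensor) else ("scalar", scale)
    if key not in cache:
        recompiles += 1
        cache[key] = lambda g, s: g * (s.value if isinstance(s, FakeTensor) else s)
    return cache[key](grad, scale)

compiled_backward({"scale": FakeTensor(2.0)}, 1.0)  # compile once
compiled_backward({"scale": FakeTensor(3.0)}, 1.0)  # same guard key: no recompile
print(recompiles)  # 1
```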
Supporting dynamism for non-Tensors requires a bit of work because:
- The user-defined backward will generally call conversion methods, e.g. FBGEMM's PermutePooledEmbsFunction calls `ctx->saved_data["allow_duplicates"].toBool();`, which returns a new object built from the IValue payload (pytorch/aten/src/ATen/core/ivalue.h, lines 1341 to 1354 at commit 3d56673):

```cpp
union TriviallyCopyablePayload {
  TriviallyCopyablePayload() : as_int(0) {}
  int64_t as_int;
  double as_double;
  bool as_bool;
  // Invariant: never nullptr; null state is represented as
  // c10::UndefinedTensorImpl::singleton() for consistency of
  // representation with Tensor.
  c10::intrusive_ptr_target* as_intrusive_ptr;
  struct {
    c10::DeviceType type;
    DeviceIndex index;
  } as_device;
} u;
```

- Dynamo has a few different codepaths that specialize on scalar values, but these lifted scalars from saved activations frequently change, and specializing here will cause us to recompile
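The first point above can be illustrated with a toy model: because the conversion method hands back a brand-new plain value, any tracking attached to the boxed value is lost the moment user code unboxes it. `BoxedValue` is a hypothetical Python stand-in for IValue, for illustration only.

```python
# Illustrative sketch: unboxing via a conversion method returns a fresh plain
# value, so nothing attached to the box (e.g. compiler tracking) survives.
class BoxedValue:
    def __init__(self, payload):
        self.payload = payload
        self.tracked = True        # compiled autograd could watch the box
    def toBool(self):
        return bool(self.payload)  # plain bool carries no tracking info

box = BoxedValue(1)
unboxed = box.toBool()
print(hasattr(unboxed, "tracked"))  # False: nothing left to swap or track
```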
Proposed fixes:
- A derived class that makes the IValue's underlying scalars swappable and overrides conversion methods to return symbolic variables tracked by compiled autograd
- Dynamo + dynamic support for keeping SymFloat/SymBool/(Sym?)String as lifted inputs
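The first proposed fix might look roughly like the sketch below. This is hedged and hypothetical: `SymScalar` and `TrackedSavedData` are names invented here for illustration; the real design would live in the C++ IValue/compiled-autograd layer. The idea is that conversion methods return a proxy whose value is resolved at call time, so the underlying scalar can be swapped without recompiling.

```python
# Illustrative sketch of the "swappable scalar" fix: conversions return a
# symbolic proxy instead of a burned-in constant. Hypothetical names only.
class SymScalar:
    def __init__(self, slot, store):
        self.slot, self.store = slot, store
    def __float__(self):             # conversion resolves at runtime
        return float(self.store[self.slot])

class TrackedSavedData:
    def __init__(self):
        self.store = {}
    def save(self, name, value):
        self.store[name] = value
    def to_float(self, name):        # overridden conversion: proxy, not value
        return SymScalar(name, self.store)

saved = TrackedSavedData()
saved.save("scale", 2.5)
s = saved.to_float("scale")          # the compiled graph holds the proxy
saved.store["scale"] = 3.0           # swap the scalar between backward calls
print(float(s))  # 3.0
```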
Alternatives
A possible but annoying alternative is to ask users to rewrite their autograd functions to use only Tensors, and to move their Tensor-to-scalar conversions into custom ops.
Additional context
No response