
[compiled autograd][cppnode] eliminate recompiles on ctx->saved_data #130170

@xmfan

Description

🚀 The feature, motivation and pitch

User-defined autograd functions may use ctx->saved_data to pass non-Tensor activations (e.g. int, float, bool) to the backward. These are passed as IValues, and compiled autograd currently specializes on their concrete values, so any change in a saved value causes a recompile.
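
For concreteness, here is a minimal sketch of the pattern (ScaleGrad and its scale argument are made up for illustration; they are not from FBGEMM or PyTorch):

    #include <torch/torch.h>

    using torch::autograd::AutogradContext;
    using torch::autograd::Function;
    using torch::autograd::variable_list;

    // Hypothetical function: scales the gradient by a runtime double kept in saved_data.
    struct ScaleGrad : public Function<ScaleGrad> {
      static torch::Tensor forward(AutogradContext* ctx, torch::Tensor x, double scale) {
        ctx->saved_data["scale"] = scale;  // stored as an IValue, not a Tensor
        return x * scale;
      }

      static variable_list backward(AutogradContext* ctx, variable_list grad_output) {
        // the concrete value read here is what compiled autograd specializes on today
        double scale = ctx->saved_data["scale"].toDouble();
        return {grad_output[0] * scale, torch::Tensor()};  // no grad for the non-Tensor input
      }
    };

Under compiled autograd, switching from ScaleGrad::apply(x, 2.0) to ScaleGrad::apply(x, 3.0) triggers a recompile of the backward, since 2.0 was specialized into the first graph.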

Workarounds that exist today:

  • move all ctx->saved_data scalar conversions into a custom op to hide them from the compiler
  • rewrite the C++ autograd function to only use Tensors (a sketch of this follows the list)
  • (if you're just trying to get perf numbers while this is unsupported) build from source and comment out
    args.collect(ctx_.saved_data);
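
As a sketch of the second workaround applied to the hypothetical ScaleGrad above, the scalar can be saved as a 0-dim tensor so that backward never reads an IValue scalar:

    // Tensor-only variant of the hypothetical ScaleGrad
    struct ScaleGradTensorOnly : public torch::autograd::Function<ScaleGradTensorOnly> {
      static torch::Tensor forward(torch::autograd::AutogradContext* ctx,
                                   torch::Tensor x, double scale) {
        // keep the scalar as a 0-dim tensor instead of an IValue in saved_data
        ctx->save_for_backward({torch::scalar_tensor(scale, x.options())});
        return x * scale;
      }

      static torch::autograd::variable_list backward(torch::autograd::AutogradContext* ctx,
                                                      torch::autograd::variable_list grad_output) {
        auto saved = ctx->get_saved_variables();
        // multiply by the saved tensor rather than a concrete C++ double
        return {grad_output[0] * saved[0], torch::Tensor()};
      }
    };

The value now flows through the graph as tensor data, so there is no scalar for compiled autograd to specialize on.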

Supporting dynamism for non-Tensors requires a bit of work because:

  • The user-defined backward will generally call conversion methods, e.g. FBGEMM's PermutePooledEmbsFunction calls ctx->saved_data["allow_duplicates"].toBool();, which returns a fresh value extracted from IValue's payload union:
    union TriviallyCopyablePayload {
      TriviallyCopyablePayload() : as_int(0) {}
      int64_t as_int;
      double as_double;
      bool as_bool;
      // Invariant: never nullptr; null state is represented as
      // c10::UndefinedTensorImpl::singleton() for consistency of
      // representation with Tensor.
      c10::intrusive_ptr_target* as_intrusive_ptr;
      struct {
        c10::DeviceType type;
        DeviceIndex index;
      } as_device;
    } u;
    This is a problem for tracing a dynamic graph: we want all scalars to use their symbolic equivalents, but the conversion API returns a new object that compiled autograd cannot swap.
  • Dynamo has a few different codepaths that specialize on scalar values, but these scalars lifted from saved activations change frequently, and specializing on them causes recompiles

Proposed fixes:

  1. A derived class that makes the scalars underlying IValue swappable and overrides the conversion methods to return symbolic variables tracked by compiled autograd (see the sketch after this list)
  2. Dynamo + dynamic support for keeping SymFloat/SymBool/(Sym?)String as lifted inputs
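
To make the intent of fix 1 concrete, here is a rough, entirely hypothetical sketch of what "swappable" could look like; it is not an existing or proposed PyTorch API, and since IValue's conversion methods are not virtual, a real version would need to hook or wrap them in compiled autograd rather than literally override them:

    #include <memory>
    #include <utility>
    #include <ATen/core/ivalue.h>

    // Hypothetical indirection: the scalar lives behind a shared slot that
    // compiled autograd could update (e.g. with a SymInt/SymFloat-carrying
    // IValue), so a recorded graph re-reads the value instead of baking in
    // the constant it saw at trace time.
    struct SwappableScalarSlot {
      std::shared_ptr<c10::IValue> slot;

      explicit SwappableScalarSlot(c10::IValue v)
          : slot(std::make_shared<c10::IValue>(std::move(v))) {}

      // conversion methods read through the indirection
      bool toBool() const { return slot->toBool(); }
      double toDouble() const { return slot->toDouble(); }
      int64_t toInt() const { return slot->toInt(); }

      // compiled autograd would swap the payload here between calls
      void swap(c10::IValue v) { *slot = std::move(v); }
    };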

Alternatives

A possible but annoying alternative is to ask users to rewrite their autograd functions to only use Tensors, and to move their Tensor-to-scalar conversions into custom ops.

Additional context

No response

cc @ezyang @anijain2305 @chauhang @penguinwu
