
Allow custom arguments to torch.compile backends #98663

@gs-olive

Description

🚀 The feature, motivation and pitch

Some torch.compile backends use optional arguments, such as the tvm backend:

def tvm(gm, example_inputs, *, scheduler=None, trials=20000):

There is already an options field in torch.compile, but it applies only to TorchInductor:

pytorch/torch/__init__.py

Lines 1533 to 1566 in c68a94c

def compile(model: Optional[Callable] = None, *,
            fullgraph: builtins.bool = False,
            dynamic: builtins.bool = False,
            backend: Union[str, Callable] = "inductor",
            mode: Union[str, None] = None,
            options: Optional[Dict[str, Union[str, builtins.int, builtins.bool]]] = None,
            disable: builtins.bool = False) -> Callable:
    """
    Optimizes given model/function using TorchDynamo and specified backend.

    Args:
        model (Callable): Module/function to optimize
        fullgraph (bool): Whether it is ok to break model into several subgraphs
        dynamic (bool): Use dynamic shape tracing
        backend (str or Callable): backend to be used
            - "inductor" is the default backend, which is a good balance between performance and overhead
            - Non experimental in-tree backends can be seen with `torch._dynamo.list_backends()`
            - Experimental or debug in-tree backends can be seen with `torch._dynamo.list_backends(None)`
            - To register an out-of-tree custom backend: https://pytorch.org/docs/master/dynamo/custom-backends.html
        mode (str): Can be either "default", "reduce-overhead" or "max-autotune"
            - "default" is the default mode, which is a good balance between performance and overhead
            - "reduce-overhead" is a mode that reduces the overhead of python with CUDA graphs, useful for small batches
            - "max-autotune" is a mode that leverages Triton based matrix multiplications and convolutions
            - To see the exact configs that each mode sets you can call `torch._inductor.list_mode_options()`
        options (dict): A dictionary of options to pass to the backend. Some notable ones to try out are
            - `epilogue_fusion` which fuses pointwise ops into templates. Requires `max_autotune` to also be set
            - `max_autotune` which will profile to pick the best matmul configuration
            - `fallback_random` which is useful when debugging accuracy issues
            - `shape_padding` which pads matrix shapes to better align loads on GPUs especially for tensor cores
            - `triton.cudagraphs` which will reduce the overhead of python with CUDA graphs
            - `trace.enabled` which is the most useful debugging flag to turn on
            - `trace.graph_diagram` which will show you a picture of your graph after fusion
            - For inductor you can see the full list of configs that it supports by calling `torch._inductor.list_options()`
        disable (bool): Turn torch.compile() into a no-op for testing

The options argument could be expanded to allow providing custom arguments to any backend. Alternatively, a `**kwargs` parameter could be added to the function signature, whose keyword arguments would be passed directly to the specified backend.
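One possible shape for the options-forwarding behavior, sketched in plain Python without torch (compile_with_options and tvm_like are illustrative names, not a proposed API, and the real torch.compile would call the backend much later, with a captured graph module):

```python
import functools
import inspect

def compile_with_options(backend, options=None):
    # Hypothetical sketch: validate `options` against the backend's
    # signature, then bind them so the backend can later be called
    # as backend(gm, example_inputs) by the compiler machinery.
    options = options or {}
    params = inspect.signature(backend).parameters
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    unknown = [k for k in options if k not in params and not accepts_var_kw]
    if unknown:
        raise TypeError(f"backend does not accept options: {unknown}")
    return functools.partial(backend, **options)

def tvm_like(gm, example_inputs, *, scheduler=None, trials=20000):
    # stand-in with the same signature shape as the tvm backend above
    return {"scheduler": scheduler, "trials": trials}

backend = compile_with_options(tvm_like, options={"trials": 100})
assert backend("gm", [])["trials"] == 100
```

Validating against the backend's signature up front turns a misspelled option into an immediate TypeError rather than a failure deep inside compilation.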

Alternatives

One alternative to providing optional arguments to torch.compile is to use functools.partial or lambda expressions to pre-populate keyword arguments in backends. However, the resulting implementations are neither clean nor one-liners, and a separate custom backend must be built for each set of desired keyword arguments. This can also require additional infrastructure on the backend side, which unnecessarily increases the size of backend code. For example, to specify keyword arguments currently, one would need to do:

custom_backend = lambda gm, inputs : my_backend(gm, inputs, a=a, b=b, c=c)
torch.compile(model, backend=custom_backend)
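The functools.partial form of the same workaround looks like this (my_backend is a hypothetical backend used only for illustration; the commented-out torch.compile call shows where it would plug in):

```python
import functools

# hypothetical backend with optional keyword arguments
def my_backend(gm, inputs, *, a=None, b=None, c=None):
    return (gm, a, b, c)

# pre-populate the keyword arguments, one wrapper per configuration
custom_backend = functools.partial(my_backend, a=1, b=2, c=3)
# torch.compile(model, backend=custom_backend)
```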

As opposed to the much simpler:

torch.compile(model, backend=custom_backend, a=a, b=b, c=c)

##### OR #####

torch.compile(model, backend=custom_backend, options={"a": a, "b": b, "c": c})

Additional context

No response

cc @ezyang @soumith @msaroufim @wconstab @ngimel @bdhirsh
