# PyTorch ONNX Exporter new features and architecture

Jan 2026

![infographics](<pt-onnx-infographics.png>)

## `dynamo=True` is the default

- The New Default: Starting with PyTorch 2.9, **`dynamo=True`** is the **default and recommended** way to export models to ONNX.
- Core Shift: The exporter moves away from the older TorchScript-based capture mechanism to a modern stack built on `torch.export`.
- Deprecation Plan: The TorchScript exporter (`dynamo=False`) is still usable for now, but it is planned for eventual deprecation, in line with how PyTorch core is handling TorchScript.

## New options in `export()`

```py
onnx_program: torch.onnx.ONNXProgram = torch.onnx.export(
    model, args, kwargs=kwargs,
    # New way of expressing dynamic shapes (more examples later).
    # One entry per positional input; the trailing comma makes this a tuple
    # (shown here for a model with a single tensor input).
    dynamic_shapes=({0: "batch", 1: "sequence_len"},),
    # dynamic_axes=...,  # Deprecated
    dynamo=True,  # Default (2.9)
    report=True,  # Creates a markdown report
    verify=True,  # Runs ONNX Runtime on the example inputs
    optimize=True,  # Runs ONNX Script graph optimizations
)
```

## What happens inside `torch.onnx.export`

`torch.export()` **captures** the FX graph
-> **translate** it and build the ONNX IR
-> graph **optimization** with ONNX Script

Entry point is at: https://github.com/pytorch/pytorch/blob/0ad306cac740eaf2ce582e2bdf097cc61d929a40/torch/onnx/_internal/exporter/_core.py#L1282

![diagram](https://raw.githubusercontent.com/justinchuby/diagrams/refs/heads/main/pytorch/torch-export-flow.svg)

## FX graph and the ExportedProgram

In [28]:
import torch
import torch.export

class Mod(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(2, 3, 10, 10))

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        a = torch.sin(x)
        a.add_(y)
        b = a * self.weight
        return torch.nn.functional.scaled_dot_product_attention(b, b, b)

example_args = (torch.randn(2, 3, 10, 10), torch.randn(2, 3, 10, 10))

# Switch to eval mode before exporting so layers such as dropout behave as in inference
mod = Mod().eval()
exported_program: torch.export.ExportedProgram = torch.export.export(mod, args=example_args)
print(exported_program)

ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, p_weight: "f32[2, 3, 10, 10]", x: "f32[2, 3, 10, 10]", y: "f32[2, 3, 10, 10]"):
             # File: /tmp/ipykernel_246584/197873441.py:10 in forward, code: a = torch.sin(x)
            sin: "f32[2, 3, 10, 10]" = torch.ops.aten.sin.default(x);  x = None
            
             # File: /tmp/ipykernel_246584/197873441.py:11 in forward, code: a.add_(y)
            add_: "f32[2, 3, 10, 10]" = torch.ops.aten.add_.Tensor(sin, y);  sin = y = None
            
             # File: /tmp/ipykernel_246584/197873441.py:12 in forward, code: b = a * self.weight
            mul: "f32[2, 3, 10, 10]" = torch.ops.aten.mul.Tensor(add_, p_weight);  add_ = p_weight = None
            
             # File: /tmp/ipykernel_246584/197873441.py:13 in forward, code: return torch.nn.functional.scaled_dot_product_attention(b, b, b)
            scaled_dot_product_attention: "f32[2, 3, 10, 10]" = torch.ops.aten.scaled_dot_product_at

In [29]:
decomposed = exported_program.run_decompositions()
print(decomposed)

ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, p_weight: "f32[2, 3, 10, 10]", x: "f32[2, 3, 10, 10]", y: "f32[2, 3, 10, 10]"):
             # File: /tmp/ipykernel_246584/197873441.py:10 in forward, code: a = torch.sin(x)
            sin: "f32[2, 3, 10, 10]" = torch.ops.aten.sin.default(x);  x = None
            
             # File: /tmp/ipykernel_246584/197873441.py:11 in forward, code: a.add_(y)
            add: "f32[2, 3, 10, 10]" = torch.ops.aten.add.Tensor(sin, y);  sin = y = None
            
             # File: /tmp/ipykernel_246584/197873441.py:12 in forward, code: b = a * self.weight
            mul: "f32[2, 3, 10, 10]" = torch.ops.aten.mul.Tensor(add, p_weight);  add = p_weight = None
            
             # File: /tmp/ipykernel_246584/197873441.py:13 in forward, code: return torch.nn.functional.scaled_dot_product_attention(b, b, b)
            mul_1: "f32[2, 3, 10, 10]" = torch.ops.aten.mul.Scalar(mul, 0.5623413251903491)
            

## Translation to ONNX

In [33]:
onnx_program = torch.onnx.export(exported_program, report=True, opset_version=23)
print(onnx_program)

[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
x input_kind: InputKind.USER_INPUT persistent: None
y input_kind: InputKind.USER_INPUT persistent: None
p_weight input_kind: InputKind.PARAMETER persistent: None
[torch.onnx] Translate the graph into ONNX... ✅
[torch.onnx] Export report has been saved to 'onnx_export_2026-01-05_12-44-51-152246_success.md'.
ONNXProgram(
    model=
        <
            ir_version=10,
            opset_imports={'': 23},
            producer_name='pytorch',
            producer_version='2.10.0.dev20251028+cpu',
            domain=None,
            model_version=None,
        >
        graph(
            name=main_graph,
            inputs=(
                %"x"<FLOAT,[2,3,10,10]>,
                %"y"<FLOAT,[2,3,10,10]>
            ),
            outputs=(
                %"scaled_dot_product_attention"<FLOAT,[2,3,10,10]>
            ),
            initializers=(
                %"weight"<FL

In [34]:
onnx_program.save("model.onnx")

In [20]:
!onnxvis model.onnx

Loading extensions...
I0000 00:00:1767644749.496075  248247 port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.

Loaded 9 adapters:
 - TFLite adapter (Flatbuffer)
 - TFLite adapter (MLIR)
 - TF adapter (MLIR)
 - TF adapter (direct)
 - GraphDef adapter
 - Pytorch adapter (exported program)
 - MLIR adapter
 - ONNX adapter
 - JSON adapter

Starting Model Explorer server at:
http://localhost:8080/?data=%7B%22models%22%3A%20%5B%7B%22url%22%3A%20%22/home/justinchu/dev/talk-torch-onnx-apis-architecture/src/model.onnx%22%7D%5D%7D

Press Ctrl+C to stop.
gio: http://localhost:8080/?data=%7B%22models%22%3A%20%5B%7B%22url%22%3A%20%22/home/justinchu/dev/talk-torch-onnx-apis-architecture/src/model.onnx%22%7D%5D%7D: Operation not supported
Stopping server...
^C


## The model in `onnx_program.model` is an `onnx_ir.Model`

- You can run any ONNX -> ONNX transformation on it.
- By default, the exporter runs ONNX Script pattern replacement and whole-graph optimization. These are robust, in-memory graph passes created by the team.
- Memory consumption stays low because initializers share tensor data with the PyTorch model instead of copying it.

In [35]:
# Explore the IR model

model = onnx_program.model
print("Model has", len(model.graph), "nodes")

print("All initializers:")
for init in model.graph.initializers.values():
    print(" ", init)

Model has 4 nodes
All initializers:
  %"weight"<FLOAT,[2,3,10,10]>{TorchTensor(...)}


In [36]:
print(model.graph.initializers["weight"].const_value.raw is mod.weight)

True


In [37]:
model.graph.initializers["weight"].const_value.display()

In [38]:
print("All users of the initializer:", model.graph.initializers["weight"].uses())

All users of the initializer: (Usage(node=Node(name='node_mul', domain='', op_type='Mul', inputs=(SymbolicTensor(name='add', type=Tensor(FLOAT), shape=Shape([2, 3, 10, 10]), producer='node_add', index=0), SymbolicTensor(name='weight', type=Tensor(FLOAT), shape=Shape([2, 3, 10, 10]), const_value={TorchTensor(...)})), attributes={}, overload='', outputs=(SymbolicTensor(name='mul', type=Tensor(FLOAT), shape=Shape([2, 3, 10, 10]), producer='node_mul', index=0),), version=23, doc_string=None), idx=1),)


## Verify model outputs

Per-node differences can be visualized in Model Explorer through the ONNX adapter: https://github.com/justinchuby/model-explorer-onnx

In [39]:
from torch.onnx.verification import verify_onnx_program

from model_explorer_onnx.torch_utils import save_node_data_from_verification_info

verification_infos = verify_onnx_program(onnx_program, compare_intermediates=True)

# Produce node data for Model Explorer for visualization
save_node_data_from_verification_info(
    verification_infos, onnx_program.model, model_name="model"
)


In [40]:
!onnxvis model.onnx --node_data_paths=model_max_abs_diff.json,model_max_rel_diff.json

Starting Model Explorer server at:
http://localhost:8080/?data=%7B%22models%22%3A%20%5B%7B%22url%22%3A%20%22/home/justinchu/dev/talk-torch-onnx-apis-architecture/src/model.onnx%22%7D%5D%2C%20%22nodeData%22%3A%20%5B%22/home/justinchu/dev/talk-torch-onnx-apis-architecture/src/model_max_abs_diff.json%22%2C%20%22/home/justinchu/dev/talk-torch-onnx-apis-architecture/src/model_max_rel_diff.json%22%5D%2C%20%22nodeDataTargets%22%3A%20%5B%22%22%2C%20%22%22%5D%7D

Press Ctrl+

## Multiple ways to represent dynamic shapes

