AOT plugin: examples with RMSNORM #3529

bowang007 · 2025-05-21T22:11:12Z

Description

This PR adds an AOT plugin demo for an RMSNorm Triton kernel.
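
For readers unfamiliar with the operation, here is a minimal pure-Python reference of the computation an RMSNorm kernel performs. It assumes the standard definition `y = x / sqrt(mean(x^2) + eps) * weight`. The function name `rms_norm_reference` and the `eps` default are illustrative and are not taken from this PR. The PR's actual kernel is written in Triton and is not reproduced here.

```python
import math
from typing import List, Sequence


def rms_norm_reference(
    x: Sequence[float], weight: Sequence[float], eps: float = 1e-6
) -> List[float]:
    """Reference RMSNorm over a single row (hypothetical helper, not from the PR)."""
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    if len(x) == 0:
        raise ValueError("x must be non-empty")
    # Mean of squares across the normalized dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    # Reciprocal root-mean-square, stabilized by eps.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Scale each element by the inverse RMS and its learned weight.
    return [v * inv_rms * w for v, w in zip(x, weight)]
```

A reference like this is typically useful for checking a GPU kernel's output on small inputs within a tolerance.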

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

examples/dynamo/aot_flashinfer_plugin.py

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/lowering/passes/constant_folding.py	2025-06-12 23:42:18.553208+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/lowering/passes/constant_folding.py	2025-06-12 23:42:47.520072+00:00
@@ -98,16 +98,17 @@
class _TorchTensorRTConstantFolder(ConstantFolder):  # type: ignore[misc]
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)

    def is_impure(self, node: torch.fx.node.Node) -> bool:
-        # Set of known quantization ops to be excluded from constant folding. 
+        # Set of known quantization ops to be excluded from constant folding.
        # Currently, we exclude all quantization ops coming from modelopt library.
        quantization_ops = {}
        try:
-            # modelopt import ensures torch.ops.tensorrt.quantize_op.default is registered 
+            # modelopt import ensures torch.ops.tensorrt.quantize_op.default is registered
            import modelopt.torch.quantization as mtq
+
            assert torch.ops.tensorrt.quantize_op.default
            quantization_ops.add(torch.ops.tensorrt.quantize_op.default)
        except Exception as e:
            pass
        if quantization_ops and node.target in quantization_ops:
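
A side note on the snippet quoted in the diff above: in Python, `{}` creates an empty dict, not a set, so calling `.add(...)` on it raises `AttributeError`, which the broad `except Exception` then swallows. The sketch below reproduces that behavior in isolation. The string `"quantize_op"` is a placeholder standing in for `torch.ops.tensorrt.quantize_op.default`, and both function names are hypothetical. Whether the project intends a set here is an assumption based on the comment "Set of known quantization ops".

```python
def collect_ops_with_dict_literal():
    """Mirrors the quoted pattern: `{}` is a dict, so .add() fails silently."""
    ops = {}  # empty dict, not an empty set
    try:
        ops.add("quantize_op")  # AttributeError: 'dict' object has no attribute 'add'
    except Exception:
        pass  # the error is swallowed, so ops stays empty
    return ops


def collect_ops_with_set():
    """Same flow, but starting from set() so .add() succeeds."""
    ops = set()
    try:
        ops.add("quantize_op")
    except Exception:
        pass
    return ops
```

With the dict literal the collection stays empty, so a later check of the form `if ops and target in ops` would never match. Starting from `set()` avoids that.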

facebook-github-bot added the cla signed label May 21, 2025

narendasan reviewed May 22, 2025

examples/dynamo/aot_flashinfer_plugin.py (outdated, resolved)

narendasan reviewed May 22, 2025

examples/dynamo/aot_flashinfer_plugin.py (outdated, resolved)

bowang007 added 2 commits June 12, 2025 22:11

AOT plugin: examples with RMSNORM

ea01a78

update

4928f8b

bowang007 force-pushed the aot_rmsnorm branch from 452bdb4 to 4928f8b June 12, 2025 23:42

github-actions bot added the documentation label Jun 12, 2025

github-actions bot requested changes Jun 12, 2025

bowang007 requested review from narendasan and peri044 June 12, 2025 23:43

github-actions bot requested changes Jun 12, 2025
