From e776638289a0d5c23435b16e0bf4a45159199912 Mon Sep 17 00:00:00 2001 From: tlopex <820958424@qq.com> Date: Sun, 29 Mar 2026 15:59:44 -0400 Subject: [PATCH 1/4] finish1 --- docs/arch/index.rst | 30 +++++++++++++++---- docs/arch/pass_infra.rst | 12 ++++---- docs/arch/runtimes/vulkan.rst | 2 +- docs/deep_dive/tensor_ir/abstraction.rst | 2 +- docs/deep_dive/tensor_ir/index.rst | 16 ++++++++-- .../tensor_ir/tutorials/tir_creation.py | 2 +- .../tensor_ir/tutorials/tir_transformation.py | 8 ++--- docs/reference/api/python/tirx/tirx.rst | 2 +- 8 files changed, 53 insertions(+), 21 deletions(-) diff --git a/docs/arch/index.rst b/docs/arch/index.rst index 3d9e92b25ebb..e388363278ec 100644 --- a/docs/arch/index.rst +++ b/docs/arch/index.rst @@ -306,13 +306,33 @@ in the IRModule. Please refer to the :ref:`Relax Deep Dive ` fo tvm/tirx -------- -tirx contains the definition of the low-level program representations. We use ``tirx::PrimFunc`` to represent functions that can be transformed by tirx passes. -Besides the IR data structures, the tirx module also includes: +``tirx`` contains the core IR definitions and lowering infrastructure for +TensorIR. ``tirx::PrimFunc`` represents low-level tensor functions that can be +transformed by tirx passes. -- A set of analysis passes to analyze the tirx functions in ``tirx/analysis``. -- A set of transformation passes to lower or optimize the tirx functions in ``tirx/transform``. +The tirx module includes: -The schedule primitives and tensor intrinsics are in ``s_tir/schedule`` and ``s_tir/tensor_intrin`` respectively. +- IR data structures (PrimFunc, Buffer, SBlock, expressions, statements). +- Analysis passes in ``tirx/analysis``. +- Transformation and lowering passes in ``tirx/transform``. +- Hardware-aware layout abstractions (TileLayout, SwizzleLayout, ComposeLayout). +- Operator dispatch framework for mapping high-level operators to hardware-specific + implementations. 
+- Async pipeline primitives (MBarrier, TMABar, TCGen05Bar) for Hopper/Blackwell. + +tvm/s_tir +--------- + +``s_tir`` (Schedulable TIR) contains schedule primitives and auto-tuning tools +that operate on ``tirx::PrimFunc``: + +- Schedule primitives to control code generation (tiling, vectorization, thread + binding) in ``s_tir/schedule``. +- Builtin tensor intrinsics in ``s_tir/tensor_intrin``. +- MetaSchedule for automated performance tuning. +- DLight for pre-defined, high-performance schedules. + +``s_tir`` depends on ``tirx``; ``tirx`` does not depend on ``s_tir``. Please refer to the :ref:`TensorIR Deep Dive ` for more details. diff --git a/docs/arch/pass_infra.rst b/docs/arch/pass_infra.rst index 047a0f48b396..2034e99db429 100644 --- a/docs/arch/pass_infra.rst +++ b/docs/arch/pass_infra.rst @@ -31,7 +31,7 @@ transformation using the analysis result collected during and/or before traversa However, as TVM evolves quickly, the need for a more systematic and efficient way to manage these passes is becoming apparent. In addition, a generic framework that manages the passes across different layers of the TVM stack (e.g. -Relax and tirx) paves the way for developers to quickly prototype and plug the +Relax and TensorIR) paves the way for developers to quickly prototype and plug the implemented passes into the system. This doc describes the design of such an infra that takes the advantage of the @@ -166,7 +166,7 @@ Pass Constructs ^^^^^^^^^^^^^^^ The pass infra is designed in a hierarchical manner, and it could work at -different granularities of Relax/tirx programs. A pure virtual class ``PassNode`` is +different granularities of Relax/TensorIR programs. A pure virtual class ``PassNode`` is introduced to serve as the base of the different optimization passes. This class contains several virtual methods that must be implemented by the subclasses at the level of modules, functions, or sequences of passes. 
@@ -222,13 +222,13 @@ Function-Level Passes ^^^^^^^^^^^^^^^^^^^^^ Function-level passes are used to implement various intra-function level -optimizations for a given Relax/tirx module. It fetches one function at a time from +optimizations for a given Relax/TensorIR module. Each pass fetches one function at a time from the function list of a module for optimization and yields a rewritten Relax -``Function`` or tirx ``PrimFunc``. Most of passes can be classified into this category, such as +``Function`` or TensorIR ``PrimFunc``. Most passes can be classified into this category, such as common subexpression elimination and inference simplification in Relax as well as vectorization -and flattening storage in tirx, etc. +and flattening storage in TensorIR, etc. -Note that the scope of passes at this level is either a Relax function or a tirx primitive function. +Note that the scope of passes at this level is either a Relax function or a TensorIR primitive function. Therefore, we cannot add or delete a function through these passes as they are not aware of the global information. diff --git a/docs/arch/runtimes/vulkan.rst b/docs/arch/runtimes/vulkan.rst index e60cc4092487..720f6259c77b 100644 --- a/docs/arch/runtimes/vulkan.rst +++ b/docs/arch/runtimes/vulkan.rst @@ -254,6 +254,6 @@ string are all false boolean flags. validated with `spvValidate`_. * ``TVM_VULKAN_DEBUG_SHADER_SAVEPATH`` - A path to a directory. If - set to a non-empty string, the Vulkan codegen will save tirx, binary + set to a non-empty string, the Vulkan codegen will save TIR, binary SPIR-V, and disassembled SPIR-V shaders to this directory, to be used for debugging purposes. diff --git a/docs/deep_dive/tensor_ir/abstraction.rst b/docs/deep_dive/tensor_ir/abstraction.rst index a36e15677eee..f46cc1058002 100644 --- a/docs/deep_dive/tensor_ir/abstraction.rst +++ b/docs/deep_dive/tensor_ir/abstraction.rst @@ -15,7 +15,7 @@ specific language governing permissions and limitations under the License. -.. 
_tir-abstraction: +.. _tirx-abstraction-basics: Tensor Program Abstraction -------------------------- diff --git a/docs/deep_dive/tensor_ir/index.rst b/docs/deep_dive/tensor_ir/index.rst index 66e153ec01a5..51ada9096757 100644 --- a/docs/deep_dive/tensor_ir/index.rst +++ b/docs/deep_dive/tensor_ir/index.rst @@ -19,8 +19,20 @@ TensorIR ======== -TensorIR is one of the core abstraction in Apache TVM stack, which is used to -represent and optimize the primitive tensor functions. +TensorIR is one of the core abstractions in the Apache TVM stack, used to +represent and optimize primitive tensor functions. + +The TensorIR codebase is organized into two modules: + +- **tirx** — Core IR definitions and lowering (PrimFunc, Buffer, SBlock, + expressions, statements, lowering passes). +- **s_tir** (Schedulable TIR) — Schedule primitives, MetaSchedule, DLight, and + tensor intrinsics. These tools operate on tirx IR to apply performance + optimizations. + +In TVMScript, the recommended alias for new tirx code is +``from tvm.script import tirx as Tx``. Existing code uses +``from tvm.script import tirx as T``. .. toctree:: :maxdepth: 2 diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.py b/docs/deep_dive/tensor_ir/tutorials/tir_creation.py index 973eac4c6d34..40be0284b493 100644 --- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.py +++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.py @@ -17,7 +17,7 @@ # ruff: noqa: E402 """ -.. _tir-creation: +.. _tirx-creation: TensorIR Creation ----------------- diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py index 4e59c6c1a7f6..b010a1b1c51d 100644 --- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py +++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py @@ -19,10 +19,10 @@ """ .. 
_tirx-transform: -Transformation --------------- -In this section, we will get to the main ingredients of the compilation flows - -transformations of primitive tensor functions. +Transformation (s_tir) +---------------------- +This section covers **s_tir** (Schedulable TIR) schedule primitives — the main +tools for transforming primitive tensor functions for performance optimization. """ ###################################################################### diff --git a/docs/reference/api/python/tirx/tirx.rst b/docs/reference/api/python/tirx/tirx.rst index cf55ad5588b1..d64c248344b6 100644 --- a/docs/reference/api/python/tirx/tirx.rst +++ b/docs/reference/api/python/tirx/tirx.rst @@ -20,4 +20,4 @@ tvm.tirx .. automodule:: tvm.tirx :members: :imported-members: - :exclude-members: PrimExpr, const, StmtSRef, SBlockScope, ScheduleState, Schedule, ScheduleError + :exclude-members: PrimExpr, const From bb1323ca7945a3cd886e0d905dc30bb0ceb74e9e Mon Sep 17 00:00:00 2001 From: tlopex <820958424@qq.com> Date: Sun, 29 Mar 2026 17:11:14 -0400 Subject: [PATCH 2/4] finish2 --- docs/arch/index.rst | 16 ++++++---------- docs/deep_dive/tensor_ir/index.rst | 15 +++++++-------- 2 files changed, 13 insertions(+), 18 deletions(-) diff --git a/docs/arch/index.rst b/docs/arch/index.rst index e388363278ec..853da16185a8 100644 --- a/docs/arch/index.rst +++ b/docs/arch/index.rst @@ -306,8 +306,9 @@ in the IRModule. Please refer to the :ref:`Relax Deep Dive ` fo tvm/tirx -------- -``tirx`` contains the core IR definitions and lowering infrastructure for -TensorIR. ``tirx::PrimFunc`` represents low-level tensor functions that can be +``tirx`` is the renamed low-level portion of the former ``tir`` module. +It contains the core IR definitions and lowering infrastructure for TensorIR. +``tirx::PrimFunc`` represents low-level tensor functions that can be transformed by tirx passes. 
The tirx module includes: @@ -315,16 +316,13 @@ The tirx module includes: - IR data structures (PrimFunc, Buffer, SBlock, expressions, statements). - Analysis passes in ``tirx/analysis``. - Transformation and lowering passes in ``tirx/transform``. -- Hardware-aware layout abstractions (TileLayout, SwizzleLayout, ComposeLayout). -- Operator dispatch framework for mapping high-level operators to hardware-specific - implementations. -- Async pipeline primitives (MBarrier, TMABar, TCGen05Bar) for Hopper/Blackwell. tvm/s_tir --------- -``s_tir`` (Schedulable TIR) contains schedule primitives and auto-tuning tools -that operate on ``tirx::PrimFunc``: +``s_tir`` (Schedulable TIR) is the renamed scheduling portion of the former +``tir`` module. It contains schedule primitives and auto-tuning tools that +operate on ``tirx::PrimFunc``: - Schedule primitives to control code generation (tiling, vectorization, thread binding) in ``s_tir/schedule``. @@ -332,8 +330,6 @@ that operate on ``tirx::PrimFunc``: - MetaSchedule for automated performance tuning. - DLight for pre-defined, high-performance schedules. -``s_tir`` depends on ``tirx``; ``tirx`` does not depend on ``s_tir``. - Please refer to the :ref:`TensorIR Deep Dive ` for more details. tvm/arith diff --git a/docs/deep_dive/tensor_ir/index.rst b/docs/deep_dive/tensor_ir/index.rst index 51ada9096757..f5a09a55a473 100644 --- a/docs/deep_dive/tensor_ir/index.rst +++ b/docs/deep_dive/tensor_ir/index.rst @@ -22,16 +22,15 @@ TensorIR TensorIR is one of the core abstractions in the Apache TVM stack, used to represent and optimize primitive tensor functions. -The TensorIR codebase is organized into two modules: +The former ``tir`` module has been split into two modules: -- **tirx** — Core IR definitions and lowering (PrimFunc, Buffer, SBlock, - expressions, statements, lowering passes). -- **s_tir** (Schedulable TIR) — Schedule primitives, MetaSchedule, DLight, and - tensor intrinsics. 
These tools operate on tirx IR to apply performance - optimizations. +- **tirx** — The renamed low-level portion: core IR definitions and lowering + (PrimFunc, Buffer, SBlock, expressions, statements, lowering passes). +- **s_tir** (Schedulable TIR) — The renamed scheduling portion: schedule + primitives, MetaSchedule, DLight, and tensor intrinsics. These tools operate + on tirx IR to apply performance optimizations. -In TVMScript, the recommended alias for new tirx code is -``from tvm.script import tirx as Tx``. Existing code uses +In TVMScript, both modules are accessed via ``from tvm.script import tirx as T``. .. toctree:: From 15fcc41154173c7479104f85c7f84d04a7e9bf7f Mon Sep 17 00:00:00 2001 From: tlopex <820958424@qq.com> Date: Sun, 29 Mar 2026 19:50:20 -0400 Subject: [PATCH 3/4] finish3 --- docs/arch/index.rst | 40 +++++++++---------- docs/deep_dive/tensor_ir/abstraction.rst | 2 +- docs/deep_dive/tensor_ir/index.rst | 11 +++-- .../tensor_ir/tutorials/tir_creation.py | 2 +- .../tensor_ir/tutorials/tir_transformation.py | 8 ++-- docs/reference/api/python/tirx/tirx.rst | 2 +- 6 files changed, 31 insertions(+), 34 deletions(-) diff --git a/docs/arch/index.rst b/docs/arch/index.rst index 853da16185a8..e9b2a4dd49ed 100644 --- a/docs/arch/index.rst +++ b/docs/arch/index.rst @@ -83,14 +83,14 @@ relax transformations relax transformations contain a collection of passes that apply to relax functions. The optimizations include common graph-level optimizations such as constant folding and dead-code elimination for operators, and backend-specific optimizations such as library dispatch. -tirx transformations -^^^^^^^^^^^^^^^^^^^^ +TensorIR transformations +^^^^^^^^^^^^^^^^^^^^^^^^ - **TensorIR schedule**: TensorIR schedules are designed to optimize the TensorIR functions for a specific target, with user-guided instructions and control how the target code is generated. 
- For CPU targets, tirx PrimFunc can generate valid code and execute on the target device without schedule but with very-low performance. However, for GPU targets, the schedule is essential + For CPU targets, a TensorIR PrimFunc can generate valid code and execute on the target device without a schedule, but with very low performance. However, for GPU targets, the schedule is essential for generating valid code with thread bindings. For more details, please refer to the :ref:`TensorIR Transformation ` section. Additionally, we provides ``MetaSchedule`` to automate the search of TensorIR schedule. -- **Lowering Passes**: These passes usually perform after the schedule is applied, transforming a tirx PrimFunc into another functionally equivalent PrimFunc, but closer to the +- **Lowering Passes**: These passes usually run after the schedule is applied, transforming a TensorIR PrimFunc into another functionally equivalent PrimFunc, but closer to the target-specific representation. For example, there are passes to flatten multi-dimensional access to one-dimensional pointer access, to expand the intrinsics into target-specific ones, and to decorate the function entry to meet the runtime calling convention. @@ -101,12 +101,12 @@ focus on optimizations that are not covered by them. cross-level transformations ^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Apache TVM enables cross-level optimization of end-to-end models. As the IRModule includes both relax and tirx functions, the cross-level transformations are designed to mutate +Apache TVM enables cross-level optimization of end-to-end models. As the IRModule includes both Relax and TensorIR functions, the cross-level transformations are designed to mutate the IRModule by applying different transformations to these two types of functions. 
-For example, ``relax.LegalizeOps`` pass mutates the IRModule by lowering relax operators, adding corresponding tirx PrimFunc into the IRModule, and replacing the relax operators -with calls to the lowered tirx PrimFunc. Another example is operator fusion pipeline in relax (including ``relax.FuseOps`` and ``relax.FuseTIR``), which fuses multiple consecutive tensor operations -into one. Different from the previous implementations, relax fusion pipeline analyzes the pattern of tirx functions and detects the best fusion rules automatically rather +For example, the ``relax.LegalizeOps`` pass mutates the IRModule by lowering relax operators, adding the corresponding TensorIR PrimFuncs into the IRModule, and replacing the relax operators +with calls to the lowered TensorIR PrimFuncs. Another example is the operator fusion pipeline in relax (including ``relax.FuseOps`` and ``relax.FuseTIR``), which fuses multiple consecutive tensor operations +into one. Unlike the previous implementations, the relax fusion pipeline analyzes the pattern of TensorIR functions and detects the best fusion rules automatically rather than human-defined operator fusion patterns. Target Translation @@ -306,10 +306,9 @@ in the IRModule. Please refer to the :ref:`Relax Deep Dive ` fo tvm/tirx -------- -``tirx`` is the renamed low-level portion of the former ``tir`` module. -It contains the core IR definitions and lowering infrastructure for TensorIR. -``tirx::PrimFunc`` represents low-level tensor functions that can be -transformed by tirx passes. +``tirx`` contains the core IR definitions and lowering infrastructure +for TensorIR (split from the former ``tir`` module). ``tirx::PrimFunc`` +represents low-level tensor functions that can be transformed by tirx passes. The tirx module includes: @@ -320,9 +319,8 @@ The tirx module includes: tvm/s_tir --------- -``s_tir`` (Schedulable TIR) is the renamed scheduling portion of the former -``tir`` module. 
It contains schedule primitives and auto-tuning tools that -operate on ``tirx::PrimFunc``: +``s_tir`` (Schedulable TIR) contains schedule primitives and auto-tuning +tools that operate on ``tirx::PrimFunc`` (split from the former ``tir`` module): - Schedule primitives to control code generation (tiling, vectorization, thread binding) in ``s_tir/schedule``. @@ -335,9 +333,9 @@ Please refer to the :ref:`TensorIR Deep Dive ` for more det tvm/arith --------- -This module is closely tied to tirx. One of the key problems in the low-level code generation is the analysis of the indices' +This module is closely tied to TensorIR. One of the key problems in the low-level code generation is the analysis of the indices' arithmetic properties — the positiveness, variable bound, and the integer set that describes the iterator space. arith module provides -a collection of tools that do (primarily integer) analysis. A tirx pass can use these analyses to simplify and optimize the code. +a collection of tools that do (primarily integer) analysis. A TensorIR pass can use these analyses to simplify and optimize the code. tvm/te and tvm/topi ------------------- @@ -346,7 +344,7 @@ TE stands for Tensor Expression. TE is a domain-specific language (DSL) for desc itself is not a self-contained function that can be stored into IRModule. We can use ``te.create_prim_func`` to convert a tensor expression to a ``tirx::PrimFunc`` and then integrate it into the IRModule. -While possible to construct operators directly via tirx or tensor expressions (TE) for each use case, it is tedious to do so. +While it is possible to construct operators directly via TensorIR or tensor expressions (TE) for each use case, it is tedious to do so. `topi` (Tensor operator inventory) provides a set of pre-defined operators defined by numpy and found in common deep learning workloads. 
tvm/s_tir/meta_schedule @@ -355,10 +353,10 @@ tvm/s_tir/meta_schedule MetaSchedule is a system for automated search-based program optimization, and can be used to optimize TensorIR schedules. Note that MetaSchedule only works with static-shape workloads. -tvm/dlight ----------- +tvm/s_tir/dlight +---------------- -DLight is a set of pre-defined, easy-to-use, and performant tirx schedules. DLight aims: +DLight is a set of pre-defined, easy-to-use, and performant s_tir schedules. DLight aims: - Fully support **dynamic shape workloads**. - **Light weight**. DLight schedules provides tuning-free schedule with reasonable performance. diff --git a/docs/deep_dive/tensor_ir/abstraction.rst b/docs/deep_dive/tensor_ir/abstraction.rst index f46cc1058002..a36e15677eee 100644 --- a/docs/deep_dive/tensor_ir/abstraction.rst +++ b/docs/deep_dive/tensor_ir/abstraction.rst @@ -15,7 +15,7 @@ specific language governing permissions and limitations under the License. -.. _tirx-abstraction-basics: +.. _tir-abstraction: Tensor Program Abstraction -------------------------- diff --git a/docs/deep_dive/tensor_ir/index.rst b/docs/deep_dive/tensor_ir/index.rst index f5a09a55a473..95a6a3a402cc 100644 --- a/docs/deep_dive/tensor_ir/index.rst +++ b/docs/deep_dive/tensor_ir/index.rst @@ -22,13 +22,12 @@ TensorIR TensorIR is one of the core abstractions in the Apache TVM stack, used to represent and optimize primitive tensor functions. -The former ``tir`` module has been split into two modules: +The TensorIR codebase consists of two modules (split from the former ``tir``): -- **tirx** — The renamed low-level portion: core IR definitions and lowering - (PrimFunc, Buffer, SBlock, expressions, statements, lowering passes). -- **s_tir** (Schedulable TIR) — The renamed scheduling portion: schedule - primitives, MetaSchedule, DLight, and tensor intrinsics. These tools operate - on tirx IR to apply performance optimizations. 
+- **tirx** — Core IR definitions and lowering (PrimFunc, Buffer, SBlock, + expressions, statements, lowering passes). +- **s_tir** (Schedulable TIR) — Schedule primitives, MetaSchedule, DLight, + and tensor intrinsics. In TVMScript, both modules are accessed via ``from tvm.script import tirx as T``. diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.py b/docs/deep_dive/tensor_ir/tutorials/tir_creation.py index 40be0284b493..973eac4c6d34 100644 --- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.py +++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.py @@ -17,7 +17,7 @@ # ruff: noqa: E402 """ -.. _tirx-creation: +.. _tir-creation: TensorIR Creation ----------------- diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py index b010a1b1c51d..4e59c6c1a7f6 100644 --- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py +++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.py @@ -19,10 +19,10 @@ """ .. _tirx-transform: -Transformation (s_tir) ----------------------- -This section covers **s_tir** (Schedulable TIR) schedule primitives — the main -tools for transforming primitive tensor functions for performance optimization. +Transformation +-------------- +In this section, we will get to the main ingredients of the compilation flows - +transformations of primitive tensor functions. """ ###################################################################### diff --git a/docs/reference/api/python/tirx/tirx.rst b/docs/reference/api/python/tirx/tirx.rst index d64c248344b6..cf55ad5588b1 100644 --- a/docs/reference/api/python/tirx/tirx.rst +++ b/docs/reference/api/python/tirx/tirx.rst @@ -20,4 +20,4 @@ tvm.tirx .. 
automodule:: tvm.tirx :members: :imported-members: - :exclude-members: PrimExpr, const + :exclude-members: PrimExpr, const, StmtSRef, SBlockScope, ScheduleState, Schedule, ScheduleError From fbccbb337c4f879da3f97d2936ce1d603a1256c0 Mon Sep 17 00:00:00 2001 From: tlopex <820958424@qq.com> Date: Sun, 29 Mar 2026 19:53:50 -0400 Subject: [PATCH 4/4] finish4 --- docs/arch/index.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/arch/index.rst b/docs/arch/index.rst index e9b2a4dd49ed..878717c35581 100644 --- a/docs/arch/index.rst +++ b/docs/arch/index.rst @@ -319,8 +319,8 @@ The tirx module includes: tvm/s_tir --------- -``s_tir`` (Schedulable TIR) contains schedule primitives and auto-tuning -tools that operate on ``tirx::PrimFunc`` (split from the former ``tir`` module): +``s_tir`` (Schedulable TIR, split from the former ``tir`` module) contains +schedule primitives and auto-tuning tools that operate on ``tirx::PrimFunc``: - Schedule primitives to control code generation (tiling, vectorization, thread binding) in ``s_tir/schedule``.