From c8698ed4fe3c632add239f2b91100e66a9228422 Mon Sep 17 00:00:00 2001
From: Farzon Lotfi
Date: Tue, 21 Oct 2025 22:43:35 -0400
Subject: [PATCH 1/3] Add HLSL Matrix expression fixes #354

Adds a working plan for the new Expression type.
---
 .../NNNN-hlsl-matrix-accessor-swizzle-expr.md | 109 ++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md

diff --git a/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md b/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md
new file mode 100644
index 0000000..b9b1d25
--- /dev/null
+++ b/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md
@@ -0,0 +1,109 @@
+---
+title: NNNN - HLSL Matrix Swizzle Expression
+params:
+  authors:
+    + farzonl: Farzon Lotfi
+  status: Design In Progress
+---
+
+## Introduction
+
+The MatrixSwizzleExpr node is needed to extend Clang’s AST to accurately
+represent matrix element “swizzling” syntax used in HLSL
+(e.g., M._m00_m01, M._m10_m01 = 1.xx). Existing AST constructs such as
+MatrixSubscriptExpr or ExtVectorElementExpr cannot fully capture this behavior
+because they would not exactly match source spelling and per-component locations
+and would not have correct l-value semantics and duplication rules.
+
+## Requirements for a MatrixSwizzleExpr AST Node
+
+To represent access like `M._m00_m01_m10` that can produce a vector of the
+matrix element type.
+
+For l-value cases:
+* The base is an l-value and modifiable.
+* The swizzle has no duplicate element references (same (row, col) repeated)
+  on assignment.
+  + example: this is not ok `M._m00_m00 = 1.xx;`
+
+For r-value cases:
+* Rvalues are allowed even with duplicates (like vector swizzles).
+  + example: this is ok `float2 V = M._m00_m00;`
+
+The AST Node must:
+* Preserve exact spelling (token sequence after the dot) and per-component
+  source locations for faithful printing and rewriting.
+  + That means we store source location start and stop for each matrix element
+    accessor in the swizzle.
+
+* Be able to represent a matrix swizzle sequence between one to four elements.
+
+## Implementation
+
+### AST Implementation
+
+We need a way to represent each element:
+
+```cpp
+struct Component {
+    unsigned Row, Col;
+    SourceLocation TokBegin, TokEnd;
+  };
+```
+
+We will create a new expression that has a means of knowing if duplicates are in
+the swizzle sequence. It should know the base matrix sequence, and a list to
+keep track of components. It should be able to keep track of source location
+from the dot to the last component in the sequence. It should know if we are
+in a zero or one indexed sequence so as not to mix the two.
+
+```cpp
+class MatrixSwizzleExpr final : public Expr {
+  private:
+    Stmt *Base; // matrix-typed expression
+    llvm::SmallVector<Component, 4> Comps; // selected (r,c) list
+    SourceLocation DotLoc, UnderLoc; // '.' and first '_' (after dot)
+    StringRef FullSuffixSpelling; // e.g. "_m00_m01" (owned by ASTContext)
+    bool FromIdentifierToken : 1; // was lexed as one ident (i.e., one index)
+    bool HasDuplicates : 1;
+};
+```
+
+There should be a small parser to populate the Comps SmallVector.
+
+### Codegen Implementation
+
+After we add this new AST component, we need a special emitter similar to
+`EmitExtVectorElementExpr` .
+
+A new emitter `EmitMatrixSwizzleExpr` will be added. The codegen will breakdown
+into two cases: R-Value and L-Value cases that will breakdown into either an
+`EmitLoadOfScalar` or an `EmitExtVectorElementExpr` case.
+
+R-value path:
+* Emit Base to an address/aggregate per matrix lowering.
+* For each (r, c), compute the element address and `EmitLoadOfScalar`.
+* If N==1 and scalar policy we are done, just return that scalar.
+* Else construct a VectorExt via InsertElement.
+
+L-value path:
+* Only when `VK_LValue`. Represent as a pseudo-lvalue that, on store, scatters
+  into the computed element addresses.
+* Implement similar to `ExtVectorElementExpr` l-value: materialize an
+  addressable proxy that on EmitStoreThroughLValue performs per-element stores.
+
+### AST serialization implementation
+
+Other than codegen, we also need to support AST serialization.
+The files we need to modify are `ASTReaderStmt.cpp` and `ASTWriterStmt.cpp` .
+
+This is where things like `FullSuffixSpelling` will be important.
+The implementation should make sure we can encode the base stmt of the matrix,
+that it can capture all the rows and columns of the components in the swizzle.
+We also need to consider if no duplicates and no index type mixing should be
+enforced as part of serialization/deserialization.
+
+### Miscellaneous
+
+There might be some work that needs to be done to support clang tooling.
+Investigation into ASTImporter or TreeTransform tools should be done.

From feddcdd82d1e42fe001e1f1fde38dd656799b8b4 Mon Sep 17 00:00:00 2001
From: Farzon Lotfi
Date: Mon, 8 Dec 2025 17:50:24 -0500
Subject: [PATCH 2/3] update the proposal

---
 .../NNNN-hlsl-matrix-accessor-swizzle-expr.md | 109 -------------
 proposals/NNNN-hlsl-matrix-element-expr.md    | 146 ++++++++++++++++++
 2 files changed, 146 insertions(+), 109 deletions(-)
 delete mode 100644 proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md
 create mode 100644 proposals/NNNN-hlsl-matrix-element-expr.md

diff --git a/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md b/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md
deleted file mode 100644
index b9b1d25..0000000
--- a/proposals/NNNN-hlsl-matrix-accessor-swizzle-expr.md
+++ /dev/null
@@ -1,109 +0,0 @@
----
-title: NNNN - HLSL Matrix Swizzle Expression
-params:
-  authors:
-    + farzonl: Farzon Lotfi
-  status: Design In Progress
----
-
-## Introduction
-
-The MatrixSwizzleExpr node is needed to extend Clang’s AST to accurately
-represent matrix element “swizzling” syntax used in HLSL
-(e.g., M._m00_m01, M._m10_m01 = 1.xx). Existing AST constructs such as
-MatrixSubscriptExpr or ExtVectorElementExpr cannot fully capture this behavior
-because they would not exactly match source spelling and per-component locations
-and would not have correct l-value semantics and duplication rules.
-
-## Requirements for a MatrixSwizzleExpr AST Node
-
-To represent access like `M._m00_m01_m10` that can produce a vector of the
-matrix element type.
-
-For l-value cases:
-* The base is an l-value and modifiable.
-* The swizzle has no duplicate element references (same (row, col) repeated)
-  on assignment.
-  + example: this is not ok `M._m00_m00 = 1.xx;`
-
-For r-value cases:
-* Rvalues are allowed even with duplicates (like vector swizzles).
-  + example: this is ok `float2 V = M._m00_m00;`
-
-The AST Node must:
-* Preserve exact spelling (token sequence after the dot) and per-component
-  source locations for faithful printing and rewriting.
-  + That means we store source location start and stop for each matrix element
-    accessor in the swizzle.
-
-* Be able to represent a matrix swizzle sequence between one to four elements.
-
-## Implementation
-
-### AST Implementation
-
-We need a way to represent each element:
-
-```cpp
-struct Component {
-    unsigned Row, Col;
-    SourceLocation TokBegin, TokEnd;
-  };
-```
-
-We will create a new expression that has a means of knowing if duplicates are in
-the swizzle sequence. It should know the base matrix sequence, and a list to
-keep track of components. It should be able to keep track of source location
-from the dot to the last component in the sequence. It should know if we are
-in a zero or one indexed sequence so as not to mix the two.
-
-```cpp
-class MatrixSwizzleExpr final : public Expr {
-  private:
-    Stmt *Base; // matrix-typed expression
-    llvm::SmallVector<Component, 4> Comps; // selected (r,c) list
-    SourceLocation DotLoc, UnderLoc; // '.' and first '_' (after dot)
-    StringRef FullSuffixSpelling; // e.g. "_m00_m01" (owned by ASTContext)
-    bool FromIdentifierToken : 1; // was lexed as one ident (i.e., one index)
-    bool HasDuplicates : 1;
-};
-```
-
-There should be a small parser to populate the Comps SmallVector.
-
-### Codegen Implementation
-
-After we add this new AST component, we need a special emitter similar to
-`EmitExtVectorElementExpr` .
-
-A new emitter `EmitMatrixSwizzleExpr` will be added. The codegen will breakdown
-into two cases: R-Value and L-Value cases that will breakdown into either an
-`EmitLoadOfScalar` or an `EmitExtVectorElementExpr` case.
-
-R-value path:
-* Emit Base to an address/aggregate per matrix lowering.
-* For each (r, c), compute the element address and `EmitLoadOfScalar`.
-* If N==1 and scalar policy we are done, just return that scalar.
-* Else construct a VectorExt via InsertElement.
-
-L-value path:
-* Only when `VK_LValue`. Represent as a pseudo-lvalue that, on store, scatters
-  into the computed element addresses.
-* Implement similar to `ExtVectorElementExpr` l-value: materialize an
-  addressable proxy that on EmitStoreThroughLValue performs per-element stores.
-
-### AST serialization implementation
-
-Other than codegen, we also need to support AST serialization.
-The files we need to modify are `ASTReaderStmt.cpp` and `ASTWriterStmt.cpp` .
-
-This is where things like `FullSuffixSpelling` will be important.
-The implementation should make sure we can encode the base stmt of the matrix,
-that it can capture all the rows and columns of the components in the swizzle.
-We also need to consider if no duplicates and no index type mixing should be
-enforced as part of serialization/deserialization.
-
-### Miscellaneous
-
-There might be some work that needs to be done to support clang tooling.
-Investigation into ASTImporter or TreeTransform tools should be done.
diff --git a/proposals/NNNN-hlsl-matrix-element-expr.md b/proposals/NNNN-hlsl-matrix-element-expr.md
new file mode 100644
index 0000000..fd28d64
--- /dev/null
+++ b/proposals/NNNN-hlsl-matrix-element-expr.md
@@ -0,0 +1,146 @@
+---
+title: NNNN - HLSL Matrix Element Expression
+params:
+  authors:
+    farzonl: Farzon Lotfi
+status: Design In Progress
+---
+
+## Introduction
+
+HLSL supports matrix “element accessor” syntax that can name individual elements
+or swizzles of elements, for example:
+
+- `M._m00` (zero-based, one element)
+- `M._11` (one-based, one element)
+- `M._m00_m11` (zero-based swizzle)
+- `M._11_22_33` (one-based swizzle)
+
+To represent these in Clang’s AST we introduce `MatrixElementExpr`, an
+expression node that is structurally similar to `ExtVectorElementExpr`, but
+specialized for matrices.
+
+`MatrixElementExpr`:
+
+- Represents both scalar matrix element access and matrix swizzles.
+- Preserves the accessor spelling (e.g. `_m00_m11`) as a single identifier.
+- Implements the HLSL rules for 0-based (`_mRC`) vs 1-based (`_RC`) accessors.
+- Enforces l-value semantics (no duplicate components on assignment) analogous
+  to vector swizzles.
+
+Unlike the earlier proposal for a dedicated `MatrixSwizzleExpr` with an
+explicit per-component list and per-component source locations, the final
+design:
+
+- Reuses the existing `ExtVectorElementExpr` machinery via a shared base class.
+  - Per [Aaron Ballman Feedback](https://discourse.llvm.org/t/rfc-extend-extvectorelementexpr-for-hlsl-matrix-accessors/88802/4)
+  - ExtVectorElementExpr is migrated to use this base.
+  - MatrixElementExpr is introduced as a sibling.
+- Keeps the AST node compact, storing only the base expression, the accessor
+  identifier, and the accessor source location.
+- Computes row/column indices and duplicate information on demand.
+
+## Requirements for a MatrixElementExpr AST Node
+
+We want to represent access such as `M._m00_m01_m10` that can produce a vector
+of the matrix element type, as well as single-element cases like `M._m00`.
+
+### General
+
+- The node must be usable for:
+  - Scalar element access: `float r = A._m00;`
+  - Swizzle access with length 1–4: `float3 v = A._11_22_33;`
+- The accessor must follow the HLSL matrix rules:
+  - Zero-based form: `_mRC` where `R` and `C` are decimal digits.
+  - One-based form: `_RC` where `R` and `C` are decimal digits.
+  - A swizzle is a `_`-separated sequence of those forms:
+    - `_m00_m11`, `_11_22_33`, etc.
+- Swizzle length must be between 1 and 4 (inclusive). Invalid lengths should
+  be rejected during semantic analysis.
+
+### L-value vs R-value semantics
+
+For l-value (assignment) cases:
+
+- The base must be a modifiable l-value matrix.
+- The swizzle must not contain duplicate element references (same `(row, col)`
+  pair repeated) when used as a store destination.
+  - Example (not allowed): `A._m00_m00 = 1.xx;`
+- Assignments with duplicate components should be rejected with an error akin
+  to “matrix is not assignable (contains duplicate components)”.
+
+For r-value cases:
+
+- Reads are allowed even with duplicate components, analogous to vector
+  swizzles.
+  - Example (allowed): `float2 v = A._m00_m00;`
+- The number of components in the accessor determines the result type:
+  - 1 component → scalar element type.
+  - N components (2–4) → `vector<T, N>`.
+
+### Bounds and accessor validation
+
+Sema must validate that:
+
+- The accessor uses one of the supported forms (`_mRC` or `_RC`).
+- The indices are within bounds for the matrix type:
+  - For zero-based accessors, `R` and `C` must be in `[0, rows-1]` and
+    `[0, cols-1]`.
+  - For one-based accessors, `R` and `C` must be in `[1, rows]` and
+    `[1, cols]`.
+- Clear diagnostics are produced for:
+  - Accessors that are lexically malformed (bad characters, wrong length,
+    wrong prefix, mixed forms in a single accessor string).
+  - Accessors that are syntactically correct but out-of-bounds for the
+    given matrix type.
+  - Swizzle lengths that are not in `[1, 4]`.
+
+## Implementation
+
+### AST Implementation
+
+We introduce a small CRTP base for element-like access expressions that are
+spelled via a single identifier and applied to a base expression:
+
+```cpp
+template <typename Derived>
+class ElementAccessExprBase : public Expr {
+  Stmt *Base;
+  IdentifierInfo *Accessor;
+  SourceLocation AccessorLoc;
+
+protected:
+  ElementAccessExprBase(StmtClass SC, QualType Ty, ExprValueKind VK,
+                        Expr *Base, IdentifierInfo &Accessor,
+                        SourceLocation Loc, ExprObjectKind OK)
+      : Expr(SC, Ty, VK, OK),
+        Base(Base),
+        Accessor(&Accessor),
+        AccessorLoc(Loc) {
+    setDependence(computeDependence(static_cast<Derived *>(this)));
+  }
+
+  explicit ElementAccessExprBase(StmtClass SC, EmptyShell Empty)
+      : Expr(SC, Empty) {}
+
+public:
+  const Expr *getBase() const { return cast<Expr>(Base); }
+  Expr *getBase() { return cast<Expr>(Base); }
+  void setBase(Expr *E) { Base = E; }
+
+  IdentifierInfo *getAccessor() const { return Accessor; }
+  void setAccessor(IdentifierInfo *II) { Accessor = II; }
+
+  SourceLocation getAccessorLoc() const { return AccessorLoc; }
+  void setAccessorLoc(SourceLocation L) { AccessorLoc = L; }
+
+  SourceLocation getBeginLoc() const { return getBase()->getBeginLoc(); }
+  SourceLocation getEndLoc() const { return AccessorLoc; }
+
+  /// Helpers implemented by the derived class:
+  ///
+  /// - unsigned getNumElements() const;
+  /// - bool containsDuplicateElements() const;
+  /// - void getEncodedElementAccess(SmallVectorImpl<uint32_t> &Elts) const;
+};
+```
\ No newline at end of file

From 4d998f27c948445cc6ea8ac99587af8ae6569390 Mon Sep 17 00:00:00 2001
From: Farzon Lotfi
Date: Thu, 11 Dec 2025 12:33:09 -0500
Subject: [PATCH 3/3] Apply suggestions from code review
Co-authored-by: Helena Kotas
---
 proposals/NNNN-hlsl-matrix-element-expr.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/NNNN-hlsl-matrix-element-expr.md b/proposals/NNNN-hlsl-matrix-element-expr.md
index fd28d64..51160b0 100644
--- a/proposals/NNNN-hlsl-matrix-element-expr.md
+++ b/proposals/NNNN-hlsl-matrix-element-expr.md
@@ -49,7 +49,7 @@ of the matrix element type, as well as single-element cases like `M._m00`.

 - The node must be usable for:
   - Scalar element access: `float r = A._m00;`
-  - Swizzle access with length 1–4: `float3 v = A._11_22_33;`
+  - Swizzle access for 1 to 4 elements: `float3 v = A._11_22_33;`
 - The accessor must follow the HLSL matrix rules:
   - Zero-based form: `_mRC` where `R` and `C` are decimal digits.
   - One-based form: `_RC` where `R` and `C` are decimal digits.
@@ -65,7 +65,7 @@ For l-value (assignment) cases:
 - The base must be a modifiable l-value matrix.
 - The swizzle must not contain duplicate element references (same `(row, col)`
   pair repeated) when used as a store destination.
-  - Example (not allowed): `A._m00_m00 = 1.xx;`
+  - Example (not allowed): `A._m00_m00 = float2(1, 2);`
 - Assignments with duplicate components should be rejected with an error akin
   to “matrix is not assignable (contains duplicate components)”.
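
Reviewer note (illustrative, not part of the patch series): the accessor rules the proposal lists for Sema (zero-based `_mRC`, one-based `_RC`, no mixing of the two forms, 1 to 4 components, in-bounds indices, and duplicate detection for the assignment check) could be sketched as a standalone parser. This is only a sketch under stated assumptions: `ParsedAccessor` and `parseMatrixAccessor` are hypothetical names that do not exist in Clang, and the code uses only the C++ standard library rather than Clang's `IdentifierInfo` or diagnostics machinery.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical result type for illustration; not a Clang type.
struct ParsedAccessor {
  // Zero-based (row, col) pairs in source order.
  std::vector<std::pair<unsigned, unsigned>> Elements;
  // True if any (row, col) pair repeats; such an accessor may be read
  // but, per the proposal, must not be used as a store destination.
  bool HasDuplicates = false;
};

// Hypothetical helper: parses an accessor such as "_m00_m11" or "_11_22"
// for a Rows x Cols matrix. Returns std::nullopt when the accessor is
// malformed, mixes the two forms, is out of bounds, or does not contain
// between 1 and 4 components.
std::optional<ParsedAccessor> parseMatrixAccessor(std::string_view S,
                                                  unsigned Rows,
                                                  unsigned Cols) {
  assert(Rows > 0 && Cols > 0 && "matrix dimensions must be positive");
  ParsedAccessor Result;
  bool SawZeroBased = false, SawOneBased = false;
  std::size_t I = 0;
  while (I < S.size()) {
    // Every component starts with '_'.
    if (S[I] != '_')
      return std::nullopt;
    ++I;
    // An 'm' after the underscore selects the zero-based form.
    bool ZeroBased = I < S.size() && S[I] == 'm';
    if (ZeroBased)
      ++I;
    // Exactly two decimal digits must follow: row, then column.
    if (I + 1 >= S.size() ||
        !std::isdigit(static_cast<unsigned char>(S[I])) ||
        !std::isdigit(static_cast<unsigned char>(S[I + 1])))
      return std::nullopt;
    unsigned R = static_cast<unsigned>(S[I] - '0');
    unsigned C = static_cast<unsigned>(S[I + 1] - '0');
    I += 2;
    if (ZeroBased) {
      SawZeroBased = true;
    } else {
      SawOneBased = true;
      if (R == 0 || C == 0)
        return std::nullopt; // one-based indices start at 1
      --R; // normalize to zero-based
      --C;
    }
    if (SawZeroBased && SawOneBased)
      return std::nullopt; // mixed forms in one accessor
    if (R >= Rows || C >= Cols)
      return std::nullopt; // out of bounds for this matrix type
    for (const auto &E : Result.Elements)
      if (E == std::make_pair(R, C))
        Result.HasDuplicates = true;
    Result.Elements.emplace_back(R, C);
  }
  if (Result.Elements.empty() || Result.Elements.size() > 4)
    return std::nullopt; // swizzle length must be in [1, 4]
  return Result;
}
```

For example, `parseMatrixAccessor("_m00_m00", 2, 2)` yields two elements with `HasDuplicates` set, which corresponds to the case the proposal allows as an r-value but rejects on assignment, while `parseMatrixAccessor("_m00_11", 4, 4)` returns `std::nullopt` because it mixes the two forms. A real implementation would presumably report distinct diagnostics for each failure instead of collapsing them into one empty result.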