[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp #17

guy-david · 2024-05-28T22:21:09Z

These act as constants and should be propagated whenever possible. It is safe to do so for mlir.undef and mlir.poison because they remain "dirty" through out their lifetime and can be duplicated, merged, etc. per the LangRef.

If the direction is good, I'll tend to the broken tests.

…itC (llvm#93450)

…inearize (llvm#92370) Building on top of [llvm#88204](llvm#88204), this PR adds support for converting `vector.insert` into an equivalent `vector.shuffle` operation that operates on linearized (1-D) vectors.

…93539) The pass constructor can be generated automatically. This pass is module-level and then runs on all relevant intrinsic operations inside of the module, no matter what top level operation they are inside of.

@main

…nd in dropUnitDims pass. (llvm#93317) `mlir-opt --linalg-fold-unit-extent-dims` pass on the following IR ``` #map = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1 + d4, d2 + d5, d6)> #map1 = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d3)> #map2 = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1, d2, d3)> module { func.func @main(%arg0: tensor<1x?x?x1xf32>, %arg1: index) -> tensor<?x1x61x1xf32> { %cst = arith.constant dense<1.000000e+00> : tensor<1x1x1x1xf32> %0 = tensor.empty(%arg1) : tensor<?x1x61x1xf32> %1 = linalg.generic {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]} ins(%arg0, %cst : tensor<1x?x?x1xf32>, tensor<1x1x1x1xf32>) outs(%0 : tensor<?x1x61x1xf32>) { ^bb0(%in: f32, %in_0: f32, %out: f32): %2 = arith.mulf %in, %in_0 : f32 %3 = arith.addf %out, %2 : f32 linalg.yield %3 : f32 } -> tensor<?x1x61x1xf32> return %1 : tensor<?x1x61x1xf32> } } ``` produces an incorrect tensor.expand_shape operation: ``` error: 'tensor.expand_shape' op expected dimension 0 of collapsed type to be dynamic since one or more of the corresponding dimensions in the expanded type is dynamic %1 = linalg.generic {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]} ins(%arg0, %cst : tensor<1x?x?x1xf32>, tensor<1x1x1x1xf32>) outs(%0 : tensor<?x1x61x1xf32>) { ^ /mathworks/devel/sandbox/sayans/geckWorks/g3294570/repro.mlir:8:10: note: see current operation: %5 = "tensor.expand_shape"(%4) <{reassociation = [[0, 1, 2, 3]]}> : (tensor<61xf32>) -> tensor<?x1x61x1xf32> // -----// IR Dump After LinalgFoldUnitExtentDimsPass Failed (linalg-fold-unit-extent-dims) //----- // #map = affine_map<(d0) -> (0, d0)> #map1 = affine_map<(d0) -> ()> #map2 = affine_map<(d0) -> (d0)> "builtin.module"() ({ "func.func"() <{function_type = (tensor<1x?x?x1xf32>, index) -> tensor<?x1x61x1xf32>, sym_name = "main"}> ({ ^bb0(%arg0: tensor<1x?x?x1xf32>, %arg1: index): %0 = "arith.constant"() <{value = dense<1.000000e+00> : tensor<f32>}> : () -> tensor<f32> %1 = "tensor.collapse_shape"(%arg0) <{reassociation = [[0, 1], [2, 3]]}> : (tensor<1x?x?x1xf32>) -> tensor<?x?xf32> %2 = "tensor.empty"() : () -> tensor<61xf32> %3 = "tensor.empty"() : () -> tensor<61xf32> %4 = "linalg.generic"(%1, %0, %2, %3) <{indexing_maps = [#map, #map1, #map2, #map2], iterator_types = [#linalg.iterator_type<parallel>], operandSegmentSizes = array<i32: 3, 1>}> ({ ^bb0(%arg2: f32, %arg3: f32, %arg4: f32, %arg5: f32): %6 = "arith.mulf"(%arg2, %arg3) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 %7 = "arith.addf"(%arg4, %6) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 "linalg.yield"(%7) : (f32) -> () }) : (tensor<?x?xf32>, tensor<f32>, tensor<61xf32>, tensor<61xf32>) -> tensor<61xf32> %5 = "tensor.expand_shape"(%4) <{reassociation = [[0, 1, 2, 3]]}> : (tensor<61xf32>) -> tensor<?x1x61x1xf32> "func.return"(%5) : (tensor<?x1x61x1xf32>) -> () }) : () -> () }) : () -> () ``` The reason of this is because the dimension `d0` is determined to be an unit-dim that can be dropped based on the dimensions of operand `arg0` to `linalg.generic`. Later on when iterating over operand `outs` the dimension `d0` is determined to be an unit-dim even though the shape corresponding to it is `Shape::kDynamic`. For the `linalg.generic` to be valid `d0` of `outs` does need to be `1` but that isn't properly processed in the current implementation and the dimension is dropped resulting in `outs` operand to be `tensor<61xf32>` in the example. The fix is to also check that the dimension shape is actually `1` before dropping the dimension. The IR after the fix is: ``` #map = affine_map<()[s0, s1] -> (s0 * s1)> #map1 = affine_map<(d0) -> (0, d0)> #map2 = affine_map<(d0) -> ()> module { func.func @main(%arg0: tensor<1x?x?x1xf32>, %arg1: index) -> tensor<?x1x61x1xf32> { %c0 = arith.constant 0 : index %c1 = arith.constant 1 : index %cst = arith.constant dense<1.000000e+00> : tensor<f32> %collapsed = tensor.collapse_shape %arg0 [[0, 1], [2, 3]] : tensor<1x?x?x1xf32> into tensor<?x?xf32> %0 = tensor.empty(%arg1) : tensor<?x61xf32> %1 = affine.apply #map()[%arg1, %c1] %2 = tensor.empty(%1) : tensor<?x61xf32> %3 = linalg.generic {indexing_maps = [#map1, #map2, #map1, #map1], iterator_types = ["parallel"]} ins(%collapsed, %cst, %0 : tensor<?x?xf32>, tensor<f32>, tensor<?x61xf32>) outs(%2 : tensor<?x61xf32>) { ^bb0(%in: f32, %in_0: f32, %in_1: f32, %out: f32): %4 = arith.mulf %in, %in_0 : f32 %5 = arith.addf %in_1, %4 : f32 linalg.yield %5 : f32 } -> tensor<?x61xf32> %expanded = tensor.expand_shape %3 [[0, 1], [2, 3]] output_shape [%c0, 1, 61, 1] : tensor<?x61xf32> into tensor<?x1x61x1xf32> return %expanded : tensor<?x1x61x1xf32> } } ```

Clang has some unwritten rules about diagnostic wording regarding things like punctuation and capitalization. This patch documents those rules and adds some tablegen support for checking diagnostics follow the rules. Specifically: tablegen now checks that a diagnostic does not start with a capital letter or end with punctuation, except for the usual exceptions like proper nouns or ending with a question. Now that the code base is clean of such issues, the diagnostics are emitted as an error rather than a warning to ensure that failure to follow these rules is either addressed by an author, or a new exception is added to the checking logic.

Fixes llvm#90941. Add support for ``[[msvc::noinline]]`` attribute, which is actually an alias of ``[[clang::noinline]]``.

And as an extension in older language modes. Per https://eel.is/c++draft/lex.string#nt:d-char Fixes llvm#93130

…lvm#91599) This reverts commit aa9d467.

…le::makeUniqueName()`. (llvm#89057) E.g. during inlining new symbol name can be duplicated and then `ValueSymbolTable::makeUniqueName()` will add unique suffix, exceeding the `non-global-value-max-name-size` restriction. Also fixed `unsigned` type of the option to `int` since `ValueSymbolTable`' constructor can use `-1` value that means unrestricted name size.

…93415) "const" being removed in this patch prevents the move semantics from being used in: AI.CallStack = Callback(IndexedAI.CSId); With this patch on an indexed MemProf Version 2 profile, the cycle count and instruction count go down by 13.3% and 26.3%, respectively, with "llvm-profdata show" modified to deserialize all MemProfRecords.

There was existing support for constant folding a `linalg.generic` that was actually a transpose. This commit adds support for the named op, `linalg.transpose`, as well by making use of the `LinalgOp` interface.

…2127) This change updates the dataLayout string to ensure alignment with the latest LLVM TargetMachine configuration. The aim is to maintain consistency and prevent potential compilation issues related to memory address space handling.

fir.box_rank codegen was invalid, it was assuming the rank field in the descriptor was an i32. This is not correct. Do not hard code the type, use the named position to find the type, and convert as needed in the patterns.

Rename things in a couple of places to make the code a bit clearer.

…ing when parsing declaration DIEs. (llvm#92328) This reapplies llvm@9a7262c (llvm#90663) and added llvm#91808 as a fix. It was causing tests on macos to fail because `SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map owned by this symol file. When there were two symbol files, two different maps were created for caching from compiler type to DIE even if they are for the same module. The solution is to do the same as `SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so the map is shared among multiple SymbolFileDWARF.

…ounding ops. (llvm#93356) The elements that aren't sNans need to get passed through this fadd instruction unchanged. With the agnostic mask policy they might be forced to all ones.

Summary: There was a bug here where we would initialize the plugin multiple times when there were multiple images. Fix it by putting the `is_initliaized` check later.

) This adds - `mlir::tosa::populateTosaToLinalgTypeConversion` which converts tensors of unsigned integers into tensors of signless integers - modifies the `tosa.reshape` lowering in TosaToTensor to use the type converter correctly I choose to implement the type converter in `mlir/Conversion/TosaToLinalg/TosaToLinalg.h` instead of `mlir/Conversion/TosaToTensor/TosaToTensor.h` because I need the same type converter in the TosaToLinalg lowerings (future PR). Alternatively, I could duplicate the type converter so it exists both in TosaToLinalg and TosaToTensor. Let me know if you prefer that.

This patch fixes: clang/unittests/Interpreter/IncrementalProcessingTest.cpp:39:13: error: unused function 'HostSupportsJit' [-Werror,-Wunused-function]

These interfaces are LLVM interfaces, not Clang ones; but this worked because of LLVM.h adding the interfaces to the clang namespace.

…3490) Skip explicit this check in non-valid scopes due to `null` type in lambdas with invalid captures or incomplete parameter lists during parsing Fixes llvm#91536

Limit the logic added in llvm#9230 to cases where either sink or source are loop-invariant, to avoid compile-time increases. This is not needed for correctness. I am working on follow-up changes to reduce the compile-time impact in general to allow us to enable this again for any source/sink. This should fix the compile-time regression introduced by this change: * compile-time improvement with this change: https://llvm-compile-time-tracker.com/compare.php?from=4351787fb650da6d1bfb8d6e58753c90dcd4c418&to=b89010a2eb5f98494787c1c3b77f25208c59090c&stat=instructions:u * compile-time improvement with original patch reverted on top of this change: https://llvm-compile-time-tracker.com/compare.php?from=b89010a2eb5f98494787c1c3b77f25208c59090c&to=19a1103fe68115cfd7d6472c6961f4fabe81a593&stat=instructions:u

… to CMLTz (llvm#92915) This patch mirrors the following SelectionDAG patch for GlobalISel: https://reviews.llvm.org/D130874

…92922)

We know that this gep is inbounds. Constant expression construction already infers this fact, but make it explicit.

This patch adds support for lowering `bitreverse` with types smaller than 8 bits. It also fixes an existing assertion failure in `llvm::APInt::getSplat`: https://godbolt.org/z/7crs8xrcG The lowering logic is copied from SDAG: https://github.com/llvm/llvm-project/blob/2034f2fc8729bd4645ef7caa3c5c6efa284d2d3f/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp#L9384-L9398

We know that this GEP is inbounds, so make it explicit. NFCI because constant expression construction already infers this.

…m#90964)

Summary: The `add_llvm_library` interface handles installing the llvm libraries, however we want to do our own handling. Otherwise, this will install into the `./lib` location instead of the `./lib/<target>` one.

Calling code explicitly checks that ArgNo is inbounds. NFCI because constant expression creation already infers it, this just makes it explicit.

Co-authored-by: klensy <nightouser@gmail.com>

gysit

LGTM modulo some small nits.

Let's prepare an upstream PR?

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

mlir/test/Dialect/LLVMIR/constant-folding.mlir

llvm#93614) This patch generates ExtensionDependency pairs {Earlier, Later} inferred by the 'Implies' field of every Extension defined in tablegen. Implied Subtarget Features that are not Extensions are skipped.

mlir/include/mlir/Dialect/LLVMIR/LLVMAttrDefs.td

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

These are known to be inbounds, create them as such. NFCI because constant expression construction currently already infers this. Also drop the unnecessary zero-index GEP: This is equivalent to the pointer itself nowadays.

…90964) The regression in one test is due to a SUB instruction being pushed through the extend, leaving behind the abs instruction, which prevents it from selecting uabdl instructions shown below: `i32 abs(i32 sub(i32 ext i8, i32 ext i8))` => `i32 abs(i32 ext(i16 sub(i16 ext i8, i16 ext i8)))` This is intended to be fixed in a follow up patch

…umf (llvm#93278) For maxnumf and minnumf, the result of calculations involving NaN will be another value, so their neutral element is set to NaN.

As suggested in a review of some new code for this file, Stream is more general. The code does not need to know that it's backed by a string.

All of these are inbounds as they access known offsets in fixed globals. NFCI because constant expression construction currently already infers this, this patch just makes it explicit.

…ess instruction (llvm#92543) The Prefix instruction is introduced on PowerPC ISA3_1. In the PR, 1. The `FeaturePrefixInstrs` do not imply the `FeatureP8Vector` ,`FeatureP9Vector` . 2. `FeaturePrefixInstrs` implies only the FeatureISA3_1. 3. For the prefix instructions `paddi` and `pli` , they have `Predicates = [PrefixInstrs] ` 4. For the prefix instructions `plfs` and `plfd`, they have `Predicates = [PrefixInstrs, HasFPU] ` 5. For the prefix instructions "plxv` , "plxssp` and `plxsd` , they have `Predicates = [PrefixInstrs, HasP10Vector]` Fixes llvm#62372

These act as constants and should be propagated whenever possible. It is safe to do so for mlir.undef and mlir.poison because they remain "dirty" through out their lifetime and can be duplicated, merged, etc. per the LangRef. Signed-off-by: Guy David <guy.david@nextsilicon.com>

josel-amd and others added 30 commits May 28, 2024 14:36

[mlir][emitc] Support conversion of arith.divsi and arith.remsi to Em…

125bd06

…itC (llvm#93450)

AMDGPU: Add some multi-use negative tests for minimum3/maximum3

fe5d791

[flang] Fix typos PPC intrinsics tests (NFC) (llvm#92943)

1da52ca

[mlir][vector] Add support for linearizing Insert VectorOp in VectorL…

01fbc56

…inearize (llvm#92370) Building on top of [llvm#88204](llvm#88204), this PR adds support for converting `vector.insert` into an equivalent `vector.shuffle` operation that operates on linearized (1-D) vectors.

[bazel] Port 17ecd23

bdd4e8b

[X86][tablgen] Add assertions when emitting NF transform table

5988c79

[gn] port 17ecd23 (-gen-x86-instr-mapping)

2c7c9df

Fix failure after d46e373

6e1a042

[Clang] Add support for [[msvc::noinline]] attribute. (llvm#91720)

8995ccc

Fixes llvm#90941. Add support for ``[[msvc::noinline]]`` attribute, which is actually an alias of ``[[clang::noinline]]``.

[Clang] allow ` @$ `` in raw string delimiters in C++26 (llvm#93216)

2ace7bd

And as an extension in older language modes. Per https://eel.is/c++draft/lex.string#nt:d-char Fixes llvm#93130

[gn build] Port 23e1ed6

57790db

Reland "[AArch64] NFC: Add RUN lines for streaming-compatible code." (l…

46a30df

…lvm#91599) This reverts commit aa9d467.

[mlir][linalg] Add linalg.transpose constant folding (llvm#92589)

74ed79f

There was existing support for constant folding a `linalg.generic` that was actually a transpose. This commit adds support for the named op, `linalg.transpose`, as well by making use of the `LinalgOp` interface.

[lldb][NativePDB] Fix uninitialized values found by msan.

cde1ae4

[Frontend][OpenMP] Rename some variables, NFC

8890214

Rename things in a couple of places to make the code a bit clearer.

[RISCV] Use mask undisturbed policy when silencing sNans for strict r…

d490ce2

…ounding ops. (llvm#93356) The elements that aren't sNans need to get passed through this fadd instruction unchanged. With the agnostic mask policy they might be forced to all ones.

[Offload][Fix] Fix lazy initialization with multiple images

f284af4

Summary: There was a bug here where we would initialize the plugin multiple times when there were multiple images. Fix it by putting the `is_initliaized` check later.

[clang] Fix a warning

fea7399

This patch fixes: clang/unittests/Interpreter/IncrementalProcessingTest.cpp:39:13: error: unused function 'HostSupportsJit' [-Werror,-Wunused-function]

clang:: to llvm::; NFC

273777e

These interfaces are LLVM interfaces, not Clang ones; but this worked because of LLVM.h adding the interfaces to the clang namespace.

[Clang] Fix an assertion failure when checking invalid this (llvm#9…

259caad

…3490) Skip explicit this check in non-valid scopes due to `null` type in lambdas with invalid captures or incomplete parameter lists during parsing Fixes llvm#91536

ChuongGoh-arm and others added 10 commits May 29, 2024 14:15

[AArch64][GlobalISel] Combine MUL(AND(LSHR(X, 15), 0x10001), 0xffff)…

23366d4

… to CMLTz (llvm#92915) This patch mirrors the following SelectionDAG patch for GlobalISel: https://reviews.llvm.org/D130874

[InstSimplify] Generate test checks (NFC)

4ffe263

[InferAddressSpaces] Generate test checks (NFC)

a49b5ca

[clang][OpenMP] Remove unused include of UniqueVector.h, NFC

9377412

[AArch64] Add patterns for conversions using fixed-point scvtf (llvm#…

1ea8cae

…92922)

[LTT] Directly create inbounds gep (NFCI)

0dfd2bf

We know that this gep is inbounds. Constant expression construction already infers this fact, but make it explicit.

[AMDGPU] Reduce use of continue in SIWholeQuadMode. NFC. (llvm#93659)

180448b

[WPD] Directly create geteleementptr inbounds (NFCI)

fba84ec

We know that this GEP is inbounds, so make it explicit. NFCI because constant expression construction already infers this.

[AArch64][NFC] Pre-commit test for Push ADD/SUB through {S|Z}EXT (llv…

886d316

…m#90964)

guy-david force-pushed the llvm-constant-like branch from cc21c9a to c2753fe Compare May 29, 2024 13:45

guy-david requested a review from gysit May 29, 2024 13:45

s-barannikov and others added 4 commits May 29, 2024 16:54

[RISCV][test] Add missing check-prefix to a test (NFC) (llvm#93683)

753ac47

[OpenMP] Fix multiply installing libomp.so (llvm#93685)

df9701b

Summary: The `add_llvm_library` interface handles installing the llvm libraries, however we want to do our own handling. Otherwise, this will install into the `./lib` location instead of the `./lib/<target>` one.

[DFSan] Directly create gep inbounds for arg origin tls (NFCI)

7af5b68

Calling code explicitly checks that ArgNo is inbounds. NFCI because constant expression creation already infers it, this just makes it explicit.

[AMDGPU] Fix filecheck annotation typos

fbe98da

Co-authored-by: klensy <nightouser@gmail.com>

gysit reviewed May 29, 2024

View reviewed changes

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp Outdated Show resolved Hide resolved

mlir/test/Dialect/LLVMIR/constant-folding.mlir Outdated Show resolved Hide resolved

mlir/test/Dialect/LLVMIR/constant-folding.mlir Outdated Show resolved Hide resolved

[AArch64][TargetParser] Move ExtensionDependencies into tablegen [NFC] (

e8e5ba0

llvm#93614) This patch generates ExtensionDependency pairs {Earlier, Later} inferred by the 'Implies' field of every Extension defined in tablegen. Implied Subtarget Features that are not Extensions are skipped.

definelicht reviewed May 29, 2024

View reviewed changes

nikic and others added 2 commits May 29, 2024 16:13

[WasmEHPrepare] Explicitly create inbounds GEP (NFCI)

e20f0fe

These are known to be inbounds, create them as such. NFCI because constant expression construction currently already infers this. Also drop the unnecessary zero-index GEP: This is equivalent to the pointer itself nowadays.

guy-david force-pushed the llvm-constant-like branch from c2753fe to 92b51bc Compare May 29, 2024 14:19

cxy-1993 and others added 4 commits May 29, 2024 10:20

[mlir][arith] Add neutral element support to arith.maxnumf/arith.minn…

0981dca

…umf (llvm#93278) For maxnumf and minnumf, the result of calculations involving NaN will be another value, so their neutral element is set to NaN.

[lldb][NFC] Pass Stream& to ToXML methods in RegisterFlags

799316f

As suggested in a review of some new code for this file, Stream is more general. The code does not need to know that it's backed by a string.

[CGBuiltin] Explicitly use inbounds GEP (NFCI)

975477e

All of these are inbounds as they access known offsets in fixed globals. NFCI because constant expression construction currently already infers this, this patch just makes it explicit.

guy-david force-pushed the llvm-constant-like branch from 92b51bc to dab7be7 Compare May 29, 2024 21:02

guy-david force-pushed the llvm-constant-like branch from dab7be7 to 8fefca1 Compare May 29, 2024 21:03

guy-david closed this May 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp #17

[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp #17

Uh oh!

guy-david commented May 28, 2024

Uh oh!

gysit left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

105 participants

[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp #17

[mlir][llvm] Implement ConstantLike for ZeroOp, UndefOp, PoisonOp #17

Uh oh!

Conversation

guy-david commented May 28, 2024

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

105 participants