Skip to content

Conversation

@guy-david
Copy link

These act as constants and should be propagated whenever possible. It is safe to do so for mlir.undef and mlir.poison because they remain "dirty" through out their lifetime and can be duplicated, merged, etc. per the LangRef.

If the direction is good, I'll tend to the broken tests.

josel-amd and others added 30 commits May 28, 2024 14:36
…inearize (llvm#92370)

Building on top of
[llvm#88204](llvm#88204), this PR adds
support for converting `vector.insert` into an equivalent
`vector.shuffle` operation that operates on linearized (1-D) vectors.
…93539)

The pass constructor can be generated automatically.

This pass is module-level and then runs on all relevant intrinsic
operations inside of the module, no matter what top level operation they
are inside of.
…nd in dropUnitDims pass. (llvm#93317)

`mlir-opt --linalg-fold-unit-extent-dims` pass on the following IR

```
#map = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1 + d4, d2 + d5, d6)>
#map1 = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d3)>
#map2 = affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1, d2, d3)>
module {
  func.func @main(%arg0: tensor<1x?x?x1xf32>, %arg1: index) -> tensor<?x1x61x1xf32> {
    %cst = arith.constant dense<1.000000e+00> : tensor<1x1x1x1xf32>
    %0 = tensor.empty(%arg1) : tensor<?x1x61x1xf32>
    %1 = linalg.generic {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]} ins(%arg0, %cst : tensor<1x?x?x1xf32>, tensor<1x1x1x1xf32>) outs(%0 : tensor<?x1x61x1xf32>) {
    ^bb0(%in: f32, %in_0: f32, %out: f32):
      %2 = arith.mulf %in, %in_0 : f32
      %3 = arith.addf %out, %2 : f32
      linalg.yield %3 : f32
    } -> tensor<?x1x61x1xf32>
    return %1 : tensor<?x1x61x1xf32>
  }
}
```

produces an incorrect tensor.expand_shape operation:

```
error: 'tensor.expand_shape' op expected dimension 0 of collapsed type to be dynamic since one or more of the corresponding dimensions in the expanded type is dynamic
    %1 = linalg.generic {indexing_maps = [#map, #map1, #map2], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]} ins(%arg0, %cst : tensor<1x?x?x1xf32>, tensor<1x1x1x1xf32>) outs(%0 : tensor<?x1x61x1xf32>) {
         ^
/mathworks/devel/sandbox/sayans/geckWorks/g3294570/repro.mlir:8:10: note: see current operation: %5 = "tensor.expand_shape"(%4) <{reassociation = [[0, 1, 2, 3]]}> : (tensor<61xf32>) -> tensor<?x1x61x1xf32>
// -----// IR Dump After LinalgFoldUnitExtentDimsPass Failed (linalg-fold-unit-extent-dims) //----- //
#map = affine_map<(d0) -> (0, d0)>
#map1 = affine_map<(d0) -> ()>
#map2 = affine_map<(d0) -> (d0)>
"builtin.module"() ({
  "func.func"() <{function_type = (tensor<1x?x?x1xf32>, index) -> tensor<?x1x61x1xf32>, sym_name = "main"}> ({
  ^bb0(%arg0: tensor<1x?x?x1xf32>, %arg1: index):
    %0 = "arith.constant"() <{value = dense<1.000000e+00> : tensor<f32>}> : () -> tensor<f32>
    %1 = "tensor.collapse_shape"(%arg0) <{reassociation = [[0, 1], [2, 3]]}> : (tensor<1x?x?x1xf32>) -> tensor<?x?xf32>
    %2 = "tensor.empty"() : () -> tensor<61xf32>
    %3 = "tensor.empty"() : () -> tensor<61xf32>
    %4 = "linalg.generic"(%1, %0, %2, %3) <{indexing_maps = [#map, #map1, #map2, #map2], iterator_types = [#linalg.iterator_type<parallel>], operandSegmentSizes = array<i32: 3, 1>}> ({
    ^bb0(%arg2: f32, %arg3: f32, %arg4: f32, %arg5: f32):
      %6 = "arith.mulf"(%arg2, %arg3) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32
      %7 = "arith.addf"(%arg4, %6) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32
      "linalg.yield"(%7) : (f32) -> ()
    }) : (tensor<?x?xf32>, tensor<f32>, tensor<61xf32>, tensor<61xf32>) -> tensor<61xf32>
    %5 = "tensor.expand_shape"(%4) <{reassociation = [[0, 1, 2, 3]]}> : (tensor<61xf32>) -> tensor<?x1x61x1xf32>
    "func.return"(%5) : (tensor<?x1x61x1xf32>) -> ()
  }) : () -> ()
}) : () -> ()
```

The reason of this is because the dimension `d0` is determined to be an
unit-dim that can be dropped based on the dimensions of operand `arg0`
to `linalg.generic`. Later on when iterating over operand `outs` the
dimension `d0` is determined to be an unit-dim even though the shape
corresponding to it is `Shape::kDynamic`. For the `linalg.generic` to be
valid `d0` of `outs` does need to be `1` but that isn't properly
processed in the current implementation and the dimension is dropped
resulting in `outs` operand to be `tensor<61xf32>` in the example.

The fix is to also check that the dimension shape is actually `1` before
dropping the dimension. The IR after the fix is:

```
#map = affine_map<()[s0, s1] -> (s0 * s1)>
#map1 = affine_map<(d0) -> (0, d0)>
#map2 = affine_map<(d0) -> ()>
module {
  func.func @main(%arg0: tensor<1x?x?x1xf32>, %arg1: index) -> tensor<?x1x61x1xf32> {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %cst = arith.constant dense<1.000000e+00> : tensor<f32>
    %collapsed = tensor.collapse_shape %arg0 [[0, 1], [2, 3]] : tensor<1x?x?x1xf32> into tensor<?x?xf32>
    %0 = tensor.empty(%arg1) : tensor<?x61xf32>
    %1 = affine.apply #map()[%arg1, %c1]
    %2 = tensor.empty(%1) : tensor<?x61xf32>
    %3 = linalg.generic {indexing_maps = [#map1, #map2, #map1, #map1], iterator_types = ["parallel"]} ins(%collapsed, %cst, %0 : tensor<?x?xf32>, tensor<f32>, tensor<?x61xf32>) outs(%2 : tensor<?x61xf32>) {
    ^bb0(%in: f32, %in_0: f32, %in_1: f32, %out: f32):
      %4 = arith.mulf %in, %in_0 : f32
      %5 = arith.addf %in_1, %4 : f32
      linalg.yield %5 : f32
    } -> tensor<?x61xf32>
    %expanded = tensor.expand_shape %3 [[0, 1], [2, 3]] output_shape [%c0, 1, 61, 1] : tensor<?x61xf32> into tensor<?x1x61x1xf32>
    return %expanded : tensor<?x1x61x1xf32>
  }
}
```
Clang has some unwritten rules about diagnostic wording regarding things
like punctuation and capitalization. This patch documents those rules
and adds some tablegen support for checking diagnostics follow the
rules.

Specifically: tablegen now checks that a diagnostic does not start with
a capital letter or end with punctuation, except for the usual
exceptions like proper nouns or ending with a question.

Now that the code base is clean of such issues, the diagnostics are
emitted as an error rather than a warning to ensure that failure to
follow these rules is either addressed by an author, or a new exception
is added to the checking logic.
Fixes llvm#90941.
Add support for ``[[msvc::noinline]]`` attribute, which is actually an
alias of ``[[clang::noinline]]``.
…le::makeUniqueName()`. (llvm#89057)

E.g. during inlining new symbol name can be duplicated and then
`ValueSymbolTable::makeUniqueName()` will add unique suffix, exceeding
the `non-global-value-max-name-size` restriction.

Also fixed `unsigned` type of the option to `int` since `ValueSymbolTable`'
constructor can use `-1` value that means unrestricted name size.
…93415)

"const" being removed in this patch prevents the move semantics from
being used in:

  AI.CallStack = Callback(IndexedAI.CSId);

With this patch on an indexed MemProf Version 2 profile, the cycle
count and instruction count go down by 13.3% and 26.3%, respectively,
with "llvm-profdata show" modified to deserialize all MemProfRecords.
There was existing support for constant folding a `linalg.generic` that
was actually a transpose. This commit adds support for the named op,
`linalg.transpose`, as well by making use of the `LinalgOp` interface.
…2127)

This change updates the dataLayout string to ensure alignment with the
latest LLVM TargetMachine configuration. The aim is to
maintain consistency and prevent potential compilation issues related to
memory address space handling.
fir.box_rank codegen was invalid, it was assuming the rank field in the
descriptor was an i32. This is not correct. Do not hard code the type,
use the named position to find the type, and convert as needed in the
patterns.
Rename things in a couple of places to make the code a bit clearer.
…ing when parsing declaration DIEs. (llvm#92328)

This reapplies
llvm@9a7262c
(llvm#90663) and added llvm#91808 as a
fix.

It was causing tests on macos to fail because
`SymbolFileDWARF::GetForwardDeclCompilerTypeToDIE` returned the map
owned by this symol file. When there were two symbol files, two
different maps were created for caching from compiler type to DIE even
if they are for the same module. The solution is to do the same as
`SymbolFileDWARF::GetUniqueDWARFASTTypeMap`: inquery
SymbolFileDWARFDebugMap first to get the shared underlying SymbolFile so
the map is shared among multiple SymbolFileDWARF.
…ounding ops. (llvm#93356)

The elements that aren't sNans need to get passed through this fadd
instruction unchanged. With the agnostic mask policy they might be
forced to all ones.
Summary:
There was a bug here where we would initialize the plugin multiple times
when there were multiple images. Fix it by putting the `is_initliaized`
check later.
)

This adds 
- `mlir::tosa::populateTosaToLinalgTypeConversion` which converts
tensors of unsigned integers into tensors of signless integers
- modifies the `tosa.reshape` lowering in TosaToTensor to use the type
converter correctly

I choose to implement the type converter in
`mlir/Conversion/TosaToLinalg/TosaToLinalg.h` instead of
`mlir/Conversion/TosaToTensor/TosaToTensor.h` because I need the same
type converter in the TosaToLinalg lowerings (future PR).
Alternatively, I could duplicate the type converter so it exists both in
TosaToLinalg and TosaToTensor. Let me know if you prefer that.
This patch fixes:

  clang/unittests/Interpreter/IncrementalProcessingTest.cpp:39:13:
  error: unused function 'HostSupportsJit' [-Werror,-Wunused-function]
These interfaces are LLVM interfaces, not Clang ones; but this worked
because of LLVM.h adding the interfaces to the clang namespace.
…3490)

Skip explicit this check in non-valid scopes due to `null` type in
lambdas with invalid captures or incomplete parameter lists during
parsing


Fixes llvm#91536
Limit the logic added in llvm#9230
to cases where either sink or source are loop-invariant, to avoid
compile-time increases. This is not needed for correctness.

I am working on follow-up changes to reduce the compile-time impact in
general to allow us to enable this again for any source/sink.

This should fix the compile-time regression introduced by this change:

* compile-time improvement with this change:
  https://llvm-compile-time-tracker.com/compare.php?from=4351787fb650da6d1bfb8d6e58753c90dcd4c418&to=b89010a2eb5f98494787c1c3b77f25208c59090c&stat=instructions:u

* compile-time improvement with original patch reverted on top of this
  change:
  https://llvm-compile-time-tracker.com/compare.php?from=b89010a2eb5f98494787c1c3b77f25208c59090c&to=19a1103fe68115cfd7d6472c6961f4fabe81a593&stat=instructions:u
ChuongGoh-arm and others added 10 commits May 29, 2024 14:15
… to CMLTz (llvm#92915)

This patch mirrors the following SelectionDAG patch for GlobalISel:
https://reviews.llvm.org/D130874
We know that this gep is inbounds. Constant expression construction
already infers this fact, but make it explicit.
This patch adds support for lowering `bitreverse` with types smaller
than 8 bits. It also fixes an existing assertion failure in
`llvm::APInt::getSplat`: https://godbolt.org/z/7crs8xrcG

The lowering logic is copied from SDAG:

https://github.com/llvm/llvm-project/blob/2034f2fc8729bd4645ef7caa3c5c6efa284d2d3f/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp#L9384-L9398
We know that this GEP is inbounds, so make it explicit. NFCI
because constant expression construction already infers this.
@guy-david guy-david force-pushed the llvm-constant-like branch from cc21c9a to c2753fe Compare May 29, 2024 13:45
@guy-david guy-david requested a review from gysit May 29, 2024 13:45
s-barannikov and others added 4 commits May 29, 2024 16:54
Summary:
The `add_llvm_library` interface handles installing the llvm libraries,
however we want to do our own handling. Otherwise, this will install
into the `./lib` location instead of the `./lib/<target>` one.
Calling code explicitly checks that ArgNo is inbounds. NFCI
because constant expression creation already infers it, this just
makes it explicit.
Co-authored-by: klensy <nightouser@gmail.com>
Copy link
Collaborator

@gysit gysit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo some small nits.

Let's prepare an upstream PR?

llvm#93614)

This patch generates ExtensionDependency pairs {Earlier, Later} inferred
by the 'Implies' field of every Extension defined in tablegen. Implied
Subtarget Features that are not Extensions are skipped.
nikic and others added 2 commits May 29, 2024 16:13
These are known to be inbounds, create them as such. NFCI because
constant expression construction currently already infers this.

Also drop the unnecessary zero-index GEP: This is equivalent to
the pointer itself nowadays.
…90964)

The regression in one test is due to a SUB instruction being pushed
through the extend, leaving behind the abs instruction, which prevents
it from selecting uabdl instructions shown below:

`i32 abs(i32 sub(i32 ext i8, i32 ext i8))` => 
`i32 abs(i32 ext(i16 sub(i16 ext i8, i16 ext i8)))`

This is intended to be fixed in a follow up patch
@guy-david guy-david force-pushed the llvm-constant-like branch from c2753fe to 92b51bc Compare May 29, 2024 14:19
cxy-1993 and others added 4 commits May 29, 2024 10:20
…umf (llvm#93278)

For maxnumf and minnumf, the result of calculations involving NaN will
be another value, so their neutral element is set to NaN.
As suggested in a review of some new code for this file, Stream
is more general. The code does not need to know that it's backed
by a string.
All of these are inbounds as they access known offsets in fixed
globals. NFCI because constant expression construction currently
already infers this, this patch just makes it explicit.
…ess instruction (llvm#92543)

The Prefix instruction is introduced on PowerPC ISA3_1.

In the PR,  
1. The `FeaturePrefixInstrs` do not imply the `FeatureP8Vector`
,`FeatureP9Vector` .
2. `FeaturePrefixInstrs`  implies only the FeatureISA3_1.
3. For the prefix instructions `paddi` and `pli` , they have `Predicates
= [PrefixInstrs] `
4. For the prefix instructions `plfs` and `plfd`, they have `Predicates
= [PrefixInstrs, HasFPU] `
5. For the prefix instructions "plxv` , "plxssp` and `plxsd` , they have
`Predicates = [PrefixInstrs, HasP10Vector]`

Fixes llvm#62372
@guy-david guy-david force-pushed the llvm-constant-like branch from 92b51bc to dab7be7 Compare May 29, 2024 21:02
These act as constants and should be propagated whenever possible.
It is safe to do so for mlir.undef and mlir.poison because they remain
"dirty" through out their lifetime and can be duplicated, merged, etc.
per the LangRef.

Signed-off-by: Guy David <guy.david@nextsilicon.com>
@guy-david guy-david force-pushed the llvm-constant-like branch from dab7be7 to 8fefca1 Compare May 29, 2024 21:03
@guy-david guy-david closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.