[mlir][transform] Add an op for replacing values with function calls #78398

Merged
merged 4 commits into llvm:main on Jan 19, 2024

qedawkins
Contributor

Adds transform.func.cast_and_call that takes a set of inputs and
outputs and replaces the uses of those outputs with a call to a function
at a specified insertion point.

The idea with this operation is to allow users to author independent IR
outside of a to-be-compiled module, and then match and replace a slice of
the program with a call to the external function.

Additionally adds a mechanism for populating a type converter with a set
of conversion materialization functions that allow insertion of
casts on the inputs/outputs to and from the types of the function
signature.

Depends on #78397

@llvmbot
Collaborator

llvmbot commented Jan 17, 2024

@llvm/pr-subscribers-mlir-memref
@llvm/pr-subscribers-mlir-linalg
@llvm/pr-subscribers-mlir-tensor
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-func

Author: Quinn Dawkins (qedawkins)

Patch is 36.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78398.diff

10 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Func/TransformOps/FuncTransformOps.td (+65)
  • (modified) mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td (+13)
  • (modified) mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.td (+22)
  • (modified) mlir/include/mlir/Dialect/Transform/IR/TransformOps.td (+26-5)
  • (modified) mlir/lib/Dialect/Func/TransformOps/FuncTransformOps.cpp (+197)
  • (modified) mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp (+40)
  • (modified) mlir/lib/Dialect/Transform/IR/TransformOps.cpp (+40-1)
  • (added) mlir/test/Dialect/Func/func-transform.mlir (+120)
  • (added) mlir/test/Dialect/Tensor/transform-op-casting.mlir (+65)
  • (modified) mlir/test/Dialect/Transform/test-interpreter.mlir (+73)
diff --git a/mlir/include/mlir/Dialect/Func/TransformOps/FuncTransformOps.td b/mlir/include/mlir/Dialect/Func/TransformOps/FuncTransformOps.td
index 7a7e991c786188..e5086c26c55a4f 100644
--- a/mlir/include/mlir/Dialect/Func/TransformOps/FuncTransformOps.td
+++ b/mlir/include/mlir/Dialect/Func/TransformOps/FuncTransformOps.td
@@ -12,6 +12,8 @@
 include "mlir/Dialect/Transform/IR/TransformDialect.td"
 include "mlir/Dialect/Transform/IR/TransformInterfaces.td"
 include "mlir/Dialect/Transform/IR/TransformTypes.td"
+include "mlir/Interfaces/SideEffectInterfaces.td"
+include "mlir/IR/RegionKindInterface.td"
 include "mlir/IR/OpBase.td"
 
 def ApplyFuncToLLVMConversionPatternsOp : Op<Transform_Dialect,
@@ -26,4 +28,67 @@ def ApplyFuncToLLVMConversionPatternsOp : Op<Transform_Dialect,
   let assemblyFormat = "attr-dict";
 }
 
+def CastAndCallOp : Op<Transform_Dialect,
+    "func.cast_and_call",
+    [DeclareOpInterfaceMethods<TransformOpInterface>,
+     DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
+     AttrSizedOperandSegments,
+     ReportTrackingListenerFailuresOpTrait]
+        # GraphRegionNoTerminator.traits> {
+  let summary = "Casts values to the signature of a function and replaces them "
+                "with a call";
+  let description = [{
+    This transform takes a set of |input| and |output| value handles and
+    attempts to cast them to the function signature of the attached function
+    op, then builds a call to the function and replaces the users of the
+    outputs. It is the responsibility of the user to ensure that the slice of
+    the program replaced by this operation makes sense, i.e. there is no
+    verification that the inputs to this operation have any relation to the
+    outputs outside of basic dominance requirements needed for the replacement.
+
+    The casting materialization functions are specified in the graph region of
+    this op. They must implement the `TypeConversionOpInterface`. The order of
+    ops within the region is irrelevant.
+
+    The target function can be specified by a symbol name or by a handle to the
+    operation.
+
+    This transform only reads the target handles and only replaces the users of
+    the outputs with the results of the call. No handles are consumed and no
+    operations are removed. Users are expected to run cleanup separately if
+    desired.
+
+    This transform will emit a silenceable failure if:
+     - The set of outputs isn't unique
+     - The handle for the insertion point does not include exactly one operation
+     - The insertion point op does not dominate any of the output users
+     - The insertion point op is not dominated by any of the inputs
+     - The function signature does not match the number of inputs/outputs
+     - Any of the input conversions fail to be materialized
+
+    This transform will emit a definite failure if it fails to resolve the
+    target function, or if it fails to materialize the conversion from the call
+    results to the output types.
+  }];
+
+  let arguments = (ins
+    TransformHandleTypeInterface:$insertion_point,
+    UnitAttr:$insert_after,
+    Optional<TransformValueHandleTypeInterface>:$inputs,
+    Optional<TransformValueHandleTypeInterface>:$outputs,
+    OptionalAttr<SymbolRefAttr>:$function_name,
+    Optional<TransformHandleTypeInterface>:$function);
+  let results = (outs TransformHandleTypeInterface:$result);
+  let regions = (region MaxSizedRegion<1>:$conversions);
+
+  let assemblyFormat = [{
+    ($function_name^)? ($function^)?
+    ( `(` $inputs^ `)` )?
+    ( `->` $outputs^ )?
+    (`after` $insert_after^):(`before`)? $insertion_point
+    ($conversions^)? attr-dict `:` functional-type(operands, results)
+  }];
+  let hasVerifier = 1;
+}
+
 #endif // FUNC_TRANSFORM_OPS
diff --git a/mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td b/mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
index 8556d9570fd120..28e9249c82e309 100644
--- a/mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
+++ b/mlir/include/mlir/Dialect/Tensor/TransformOps/TensorTransformOps.td
@@ -169,4 +169,17 @@ def MakeLoopIndependentOp
   }];
 }
 
+def TypeConversionCastOp : Op<Transform_Dialect,
+    "type_conversion.tensor.cast",
+    [DeclareOpInterfaceMethods<TypeConversionOpInterface>]> {
+  let description = [{
+    Populates a type converter with source/target materializations that insert
+    `tensor.cast` ops between cast-compatible tensor types.
+  }];
+  let arguments = (ins UnitAttr:$ignore_dynamic_info);
+
+  let assemblyFormat =
+      "(`ignore_dynamic_info` $ignore_dynamic_info^)? attr-dict";
+}
+
 #endif // TENSOR_TRANSFORM_OPS
diff --git a/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.td b/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.td
index f29efaee620d84..3b601f42a6452d 100644
--- a/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.td
+++ b/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.td
@@ -280,6 +280,28 @@ def PatternDescriptorOpInterface : OpInterface<"PatternDescriptorOpInterface"> {
   ];
 }
 
+def TypeConversionOpInterface : OpInterface<"TypeConversionOpInterface"> {
+  let description = [{
+    This interface should be implemented by ops that populate type casting
+    of a `transform.func.cast_and_call` op. It provides a method to populate a
+    type converter with source/target materialization patterns.
+  }];
+
+  let cppNamespace = "::mlir::transform";
+
+  let methods = [
+    InterfaceMethod<
+      /*desc=*/[{
+        Populate the given type converter with source/target materialization
+        functions.
+      }],
+      /*returnType=*/"void",
+      /*name=*/"populateTypeMaterializations",
+      /*arguments=*/(ins "::mlir::TypeConverter &":$converter)
+    >
+  ];
+}
+
 def TypeConverterBuilderOpInterface
     : OpInterface<"TypeConverterBuilderOpInterface"> {
   let description = [{
diff --git a/mlir/include/mlir/Dialect/Transform/IR/TransformOps.td b/mlir/include/mlir/Dialect/Transform/IR/TransformOps.td
index fe2c28f45aea04..6637d81dab5e2a 100644
--- a/mlir/include/mlir/Dialect/Transform/IR/TransformOps.td
+++ b/mlir/include/mlir/Dialect/Transform/IR/TransformOps.td
@@ -725,22 +725,43 @@ def GetProducerOfOperand : TransformDialectOp<"get_producer_of_operand",
                        "functional-type(operands, results)";
 }
 
+def GetOperandOp : TransformDialectOp<"get_operand",
+    [DeclareOpInterfaceMethods<TransformOpInterface>,
+     NavigationTransformOpTrait, MatchOpInterface, MemoryEffectsOpInterface]> {
+  let summary = "Get a handle to the operand(s) of the targeted op";
+  let description = [{
+    The handle defined by this Transform op corresponds to the Operands of the
+    given `target` operation. Optionally `operand_number` can be specified to
+    select a specific operand.
+    
+    This transform fails silently if the targeted operation does not have enough
+    operands. It reads the target handle and produces the result handle.
+  }];
+
+  let arguments = (ins TransformHandleTypeInterface:$target,
+                       OptionalAttr<I64Attr>:$operand_number);
+  let results = (outs TransformValueHandleTypeInterface:$result);
+  let assemblyFormat = "$target (`[` $operand_number^ `]`)? attr-dict `:` "
+                       "functional-type(operands, results)";
+}
+
 def GetResultOp : TransformDialectOp<"get_result",
     [DeclareOpInterfaceMethods<TransformOpInterface>,
      NavigationTransformOpTrait, MemoryEffectsOpInterface]> {
-  let summary = "Get handle to the a result of the targeted op";
+  let summary = "Get a handle to the result(s) of the targeted op";
   let description = [{
-    The handle defined by this Transform op corresponds to the OpResult with
-    `result_number` that is defined by the given `target` operation.
+    The handle defined by this Transform op corresponds to the OpResults of the
+    given `target` operation. Optionally `result_number` can be specified to
+    select a specific result.
     
     This transform fails silently if the targeted operation does not have enough
     results. It reads the target handle and produces the result handle.
   }];
 
   let arguments = (ins TransformHandleTypeInterface:$target,
-                       I64Attr:$result_number);
+                       OptionalAttr<I64Attr>:$result_number);
   let results = (outs TransformValueHandleTypeInterface:$result);
-  let assemblyFormat = "$target `[` $result_number `]` attr-dict `:` "
+  let assemblyFormat = "$target (`[` $result_number^ `]`)? attr-dict `:` "
                        "functional-type(operands, results)";
 }
 
diff --git a/mlir/lib/Dialect/Func/TransformOps/FuncTransformOps.cpp b/mlir/lib/Dialect/Func/TransformOps/FuncTransformOps.cpp
index 9e9b6bcea790de..14b6e633520d6c 100644
--- a/mlir/lib/Dialect/Func/TransformOps/FuncTransformOps.cpp
+++ b/mlir/lib/Dialect/Func/TransformOps/FuncTransformOps.cpp
@@ -15,6 +15,7 @@
 #include "mlir/Dialect/Transform/IR/TransformDialect.h"
 #include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
 #include "mlir/Dialect/Transform/IR/TransformOps.h"
+#include "mlir/Transforms/DialectConversion.h"
 
 using namespace mlir;
 
@@ -36,6 +37,202 @@ transform::ApplyFuncToLLVMConversionPatternsOp::verifyTypeConverter(
   return success();
 }
 
+//===----------------------------------------------------------------------===//
+// CastAndCallOp
+//===----------------------------------------------------------------------===//
+
+DiagnosedSilenceableFailure
+transform::CastAndCallOp::apply(transform::TransformRewriter &rewriter,
+                                transform::TransformResults &results,
+                                transform::TransformState &state) {
+  SmallVector<Value> inputs;
+  if (getInputs())
+    for (Value input : state.getPayloadValues(getInputs()))
+      inputs.push_back(input);
+  SmallVector<Value> outputs;
+  if (getOutputs())
+    for (Value output : state.getPayloadValues(getOutputs()))
+      outputs.push_back(output);
+
+  // Verify that the set of output values to be replaced is unique.
+  llvm::SmallDenseSet<Value> outputSet;
+  for (Value output : outputs) {
+    outputSet.insert(output);
+  }
+  if (outputSet.size() != outputs.size()) {
+    return emitSilenceableFailure(getLoc())
+           << "cast and call output values must be unique";
+  }
+
+  // Get the insertion point for the call.
+  auto insertionOps = state.getPayloadOps(getInsertionPoint());
+  if (!llvm::hasSingleElement(insertionOps)) {
+    return emitSilenceableFailure(getLoc())
+           << "Only one op can be specified as an insertion point";
+  }
+  bool insertAfter = getInsertAfter();
+  Operation *insertionPoint = *insertionOps.begin();
+
+  // Check that all inputs dominate the insertion point, and the insertion
+  // point dominates all users of the outputs.
+  DominanceInfo dom(insertionPoint);
+  for (Value output : outputs) {
+    for (Operation *user : output.getUsers()) {
+      // If we are inserting after the insertion point operation, the
+      // insertion point operation must properly dominate the user. Otherwise
+      // basic dominance is enough.
+      bool doesDominate = insertAfter
+                              ? dom.properlyDominates(insertionPoint, user)
+                              : dom.dominates(insertionPoint, user);
+      if (!doesDominate) {
+        return emitDefiniteFailure()
+               << "User " << user << " is not dominated by insertion point "
+               << insertionPoint;
+      }
+    }
+  }
+
+  for (Value input : inputs) {
+    // If we are inserting before the insertion point operation, the
+    // input must properly dominate the insertion point operation. Otherwise
+    // basic dominance is enough.
+    bool doesDominate = insertAfter
+                            ? dom.dominates(input, insertionPoint)
+                            : dom.properlyDominates(input, insertionPoint);
+    if (!doesDominate) {
+      return emitDefiniteFailure()
+             << "input " << input << " does not dominate insertion point "
+             << insertionPoint;
+    }
+  }
+
+  // Get the function to inline. This can either be specified by symbol or as a
+  // transform handle.
+  func::FuncOp targetFunction = nullptr;
+  if (getFunctionName()) {
+    targetFunction = SymbolTable::lookupNearestSymbolFrom<func::FuncOp>(
+        insertionPoint, *getFunctionName());
+    if (!targetFunction) {
+      return emitDefiniteFailure()
+             << "unresolved symbol " << *getFunctionName();
+    }
+  } else if (getFunction()) {
+    auto payloadOps = state.getPayloadOps(getFunction());
+    if (!llvm::hasSingleElement(payloadOps)) {
+      return emitDefiniteFailure() << "requires a single function to call";
+    }
+    targetFunction = dyn_cast<func::FuncOp>(*payloadOps.begin());
+    if (!targetFunction) {
+      return emitDefiniteFailure() << "invalid non-function callee";
+    }
+  } else {
+    llvm_unreachable("Invalid CastAndCall op without a function to call");
+    return emitDefiniteFailure();
+  }
+  assert(targetFunction && "no target function found");
+
+  // Verify that the function argument and result lengths match the inputs and
+  // outputs given to this op.
+  if (targetFunction.getNumArguments() != inputs.size()) {
+    return emitSilenceableFailure(targetFunction.getLoc())
+           << "mismatch between number of function arguments "
+           << targetFunction.getNumArguments() << " and number of inputs "
+           << inputs.size();
+  }
+  if (targetFunction.getNumResults() != outputs.size()) {
+    return emitSilenceableFailure(targetFunction.getLoc())
+           << "mismatch between number of function results "
+           << targetFunction.getNumResults() << " and number of outputs "
+           << outputs.size();
+  }
+
+  // Gather all specified converters.
+  MLIRContext *ctx = insertionPoint->getContext();
+  mlir::TypeConverter converter;
+  if (!getRegion().empty()) {
+    for (Operation &op : getRegion().front()) {
+      cast<transform::TypeConversionOpInterface>(&op)
+          .populateTypeMaterializations(converter);
+    }
+  }
+
+  OpBuilder builder(ctx);
+  if (insertAfter)
+    builder.setInsertionPointAfter(insertionPoint);
+  else
+    builder.setInsertionPoint(insertionPoint);
+
+  for (auto [input, type] :
+       llvm::zip_equal(inputs, targetFunction.getArgumentTypes())) {
+    if (input.getType() != type) {
+      Value newInput = converter.materializeSourceConversion(
+          builder, input.getLoc(), type, input);
+      if (!newInput) {
+        return emitSilenceableFailure(input.getLoc())
+               << "Failed to materialize conversion of " << input << " to type "
+               << type;
+      }
+      input = newInput;
+    }
+  }
+
+  auto callOp = builder.create<func::CallOp>(insertionPoint->getLoc(),
+                                             targetFunction, inputs);
+
+  // Cast the call results back to the expected types. If any conversions fail
+  // this is a definite failure as the call has been constructed at this point.
+  for (auto [output, newOutput] :
+       llvm::zip_equal(outputs, callOp.getResults())) {
+    Value convertedOutput = newOutput;
+    if (output.getType() != newOutput.getType()) {
+      convertedOutput = converter.materializeTargetConversion(
+          builder, output.getLoc(), output.getType(), newOutput);
+      if (!convertedOutput) {
+        return emitDefiniteFailure()
+               << "Failed to materialize conversion of " << newOutput
+               << " to type " << output.getType();
+      }
+    }
+    output.replaceAllUsesExcept(convertedOutput, callOp);
+  }
+  results.set(cast<OpResult>(getResult()), {callOp});
+  return DiagnosedSilenceableFailure::success();
+}
+
+LogicalResult transform::CastAndCallOp::verify() {
+  if (!getRegion().empty()) {
+    for (Operation &op : getRegion().front()) {
+      if (!isa<transform::TypeConversionOpInterface>(&op)) {
+        InFlightDiagnostic diag = emitOpError()
+                                  << "expected children ops to implement "
+                                     "TypeConversionOpInterface";
+        diag.attachNote(op.getLoc()) << "op without interface";
+        return diag;
+      }
+    }
+  }
+  if (!getFunction() && !getFunctionName()) {
+    return emitOpError() << "expected a function handle or name to call";
+  }
+  if (getFunction() && getFunctionName()) {
+    return emitOpError() << "function handle and name are mutually exclusive";
+  }
+  return success();
+}
+
+void transform::CastAndCallOp::getEffects(
+    SmallVectorImpl<MemoryEffects::EffectInstance> &effects) {
+  transform::onlyReadsHandle(getInsertionPoint(), effects);
+  if (getInputs())
+    transform::onlyReadsHandle(getInputs(), effects);
+  if (getOutputs())
+    transform::onlyReadsHandle(getOutputs(), effects);
+  if (getFunction())
+    transform::onlyReadsHandle(getFunction(), effects);
+  transform::producesHandle(getResult(), effects);
+  transform::modifiesPayload(effects);
+}
+
 //===----------------------------------------------------------------------===//
 // Transform op registration
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp b/mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp
index ed274238704713..0c89ba2a1f1895 100644
--- a/mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp
+++ b/mlir/lib/Dialect/Tensor/TransformOps/TensorTransformOps.cpp
@@ -15,6 +15,8 @@
 #include "mlir/Dialect/Tensor/Utils/Utils.h"
 #include "mlir/Dialect/Transform/IR/TransformDialect.h"
 #include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
+#include "mlir/IR/Builders.h"
+#include "mlir/Transforms/DialectConversion.h"
 
 using namespace mlir;
 using namespace tensor;
@@ -128,6 +130,44 @@ void transform::ApplyRewriteTensorOpsAsConstantPatternsOp::populatePatterns(
   tensor::populateRewriteAsConstantPatterns(patterns);
 }
 
+//===----------------------------------------------------------------------===//
+// TypeConversionCastOp
+//===----------------------------------------------------------------------===//
+
+void transform::TypeConversionCastOp::populateTypeMaterializations(
+    TypeConverter &converter) {
+  bool ignoreDynamicInfo = getIgnoreDynamicInfo();
+  converter.addSourceMaterialization([ignoreDynamicInfo](
+                                         OpBuilder &builder, Type resultType,
+                                         ValueRange inputs,
+                                         Location loc) -> std::optional<Value> {
+    if (inputs.size() != 1) {
+      return std::nullopt;
+    }
+    Value input = inputs[0];
+    if (!ignoreDynamicInfo &&
+        !tensor::preservesStaticInformation(resultType, input.getType())) {
+      return std::nullopt;
+    }
+    if (!tensor::CastOp::areCastCompatible(input.getType(), resultType)) {
+      return std::nullopt;
+    }
+    return builder.create<tensor::CastOp>(loc, resultType, input).getResult();
+  });
+  converter.addTargetMaterialization([](OpBuilder &builder, Type resultType,
+                                        ValueRange inputs,
+                                        Location loc) -> std::optional<Value> {
+    if (inputs.size() != 1) {
+      return std::nullopt;
+    }
+    Value input = inputs[0];
+    if (!tensor::CastOp::areCastCompatible(input.getType(), resultType)) {
+      return std::nullopt;
+    }
+    return builder.create<tensor::CastOp>(loc, resultType, input).getResult();
+  });
+}
+
 //===----------------------------------------------------------------------===//
 // MakeLoopIndependentOp
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/Transform/IR/TransformOps.cpp b/mlir/lib/Dialect/Transform/IR/TransformOps.cpp
index b80fc09751d2aa..59524c4c14d4fe 100644
--- a/mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+++ b/mlir/lib/Dialect/Transform/IR/TransformOps.cpp
@@ -16,10 +16,12 @@
 #include "mlir/Dialect/Transform/IR/TransformDialect.h"
 #include "mlir/Dialect/Transform/IR/TransformInterfaces.h"
 #include "mlir/Dialect/Transform/IR/TransformTypes.h"
+#include "mlir/IR/BuiltinAttributes.h"
 #include "mlir/IR/Diagnostics.h"
 #include "mlir/IR/Dominance.h"
 #include "mlir/IR/PatternMatch.h"
 #i...
[truncated]

Comment on lines 56 to 59
This transform only reads the target handles and only replaces the users of
the outputs with the results of the call. No handles are consumed and no
operations are removed. Users are expected to run cleanup separately if
desired.
Member

We don't have a memory effect to indicate "replaces the uses"... I am pondering whether we want to invalidate handles to those users after this transformation (and we would need to take the list of users as an additional handle). They keep existing and have the same signature, so maybe we can indeed get away with not invalidating them. Thoughts?

Contributor Author

I'm thinking the users are probably not worth invalidating, but it probably is worth invalidating the output value handles themselves. (Also, "users" in the description is a typo; it should be "uses".) If I have a handle to some consumer, I can't think of a case where I'd want to invalidate that handle because one of its producers changed. That said, the uses (outputs) probably are worth invalidating because at that point those values should have no users, unless they are used by the call itself, i.e.

func.func {
  %0 = foo
  %1 = bar %0
}

to

func.func {
  %0 = foo
  %1 = call @trace(%0)
  %2 = bar %1
}

We might not want to invalidate %0 in this case, but I can't think of a way to "accidentally" do this, so the caller should be aware of the IR structure at this point and can always rematch %0. You might still be right; I haven't worked with enough examples of this op to know that I'm getting the handle invalidation rules correct here.

I was also thinking of adding a cast_and_replace_with_call op or something like that which takes a slice of the program and does the full replacement, so users might prefer that transform if they care about being precise with handle invalidation.
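
The `%0`/`@trace` case above can be sketched with a toy model. This is only an illustration under loud assumptions: `ToyOp`, `replaceAllUsesExcept`, `countUses`, and `buildTraceExample` are hypothetical stand-ins written for this note, not MLIR APIs, and the call result is named `%call` instead of renumbering the values. It only mimics the intent of the `output.replaceAllUsesExcept(convertedOutput, callOp)` line in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an operation: a name plus the names of the values it reads.
// This is NOT mlir::Operation; it only models operand lists.
struct ToyOp {
  std::string name;
  std::vector<std::string> operands;
};

// Replace every use of `from` with `to`, skipping the operands of `except`.
// Returns the number of operands that were rewritten.
int replaceAllUsesExcept(std::vector<ToyOp> &ops, const std::string &from,
                         const std::string &to, const ToyOp *except) {
  assert(from != to && "replacing a value with itself would be a no-op");
  int replaced = 0;
  for (ToyOp &op : ops) {
    if (&op == except)
      continue; // The call keeps reading the original value.
    for (std::string &operand : op.operands) {
      if (operand == from) {
        operand = to;
        ++replaced;
      }
    }
  }
  return replaced;
}

// Count how many operands across all ops still read `value`.
int countUses(const std::vector<ToyOp> &ops, const std::string &value) {
  int uses = 0;
  for (const ToyOp &op : ops)
    uses += static_cast<int>(
        std::count(op.operands.begin(), op.operands.end(), value));
  return uses;
}

// Build the rewritten sequence from the comment: foo, call @trace(%0), bar.
std::vector<ToyOp> buildTraceExample() {
  // Before: %0 = foo ; %1 = bar %0
  std::vector<ToyOp> ops = {ToyOp{"foo", {}}, ToyOp{"bar", {"%0"}}};
  // Insert the call right after foo; it reads %0 and defines %call.
  ops.insert(ops.begin() + 1, ToyOp{"call @trace", {"%0"}});
  // Redirect uses of %0 to the call result, except inside the call itself.
  replaceAllUsesExcept(ops, "%0", "%call", &ops[1]);
  return ops;
}
```

In this toy, `%0` is left with exactly one use (the call) and `bar` reads `%call`, which is the situation where `%0` stays live even though it was passed as an output.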

Member

Invalidation is primarily meant to avoid the dangling-reference situation and secondarily to catch invariant violations. If we had a type along the lines of "a payload operation using a value produced by arithmetic ops" (some affine operations are like this), we'd want to invalidate that one since the invariant no longer holds. Similarly, I don't think that we necessarily want to invalidate a handle to the %0 value (we don't have handles to OpOperands, those would have to be invalidated): the value still exists, and we haven't made any strong promises about it. Maybe what I'm arguing for here is that we can occasionally re-run the verifier that checks if the contents of a handle correspond to its type, which is currently different from invalidation. And maybe we should merge this process with invalidation, i.e., failing the conditions specified by the handle type invalidates the handle.

Contributor Author

I see, this makes more sense then. IIUC there are no in-tree examples of something like "a payload operation using a value produced by arithmetic ops" then? (people could have written anything downstream). So would it suffice for now to leave a warning in the op description that such cases currently won't properly track invalidation? And then adding additional verification/invalidation mechanisms can come as a follow up?

Member

Yes, we are not using much of types upstream right now. And none of the downstreams I am aware of are using them either. So it's okay to leave a warning for now.

If you have an idea where to put a summary of our discussion above so it remains visible, it would be nice.

Contributor Author

I added a link to this discussion directly to the op description. We could also crystallize the warning here in the documentation as well; I suspect there are a number of other transform ops that could run into similar validation issues so adding a disclaimer for such cases could be useful. Maybe as a note here? https://mlir.llvm.org/docs/Dialects/Transform/#handle-invalidation

Member

Contributor Author

Sounds good, I'll send it as a follow up.

@qedawkins qedawkins requested a review from ftynse January 18, 2024 15:47
@qedawkins qedawkins merged commit 42b1603 into llvm:main Jan 19, 2024
3 of 4 checks passed
  Add a "don't override" mapping for -fvisibility-from-dllstorageclass (#74629)
  [lld][WebAssembly] Reset context object after each link (#78770)
  [libc++][hardening] In production hardening modes, trap rather than abort (#78561)
  [Statepoint][NFC] Use uint16_t and add an assert (#78717)
  [InstrProf] Adding utility weights to BalancedPartitioning (#72717)
  [CMake] Detect properly new linker introduced in Xcode 15 (#77806)
  Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)"
  [lld][ELF] Simplify handleLibcall. NFC (#78659)
  [libc] remove extra -Werror (#78761)
  Remove an unused API; NFC
  [RISCV] Prevent RISCVMergeBaseOffsetOpt from calling getVRegDef on a physical register. (#78762)
  [SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)
  [llvm-exegesis] Make duplicate snippet repetitor produce whole snippets (#77224)
  [SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374)
  [mlir][transform]: fix broken bazel build for TensorTransformOps (#78766)
  [RISCV] Add support for Smepmp 1.0 (#78489)
  [mlir][transform]: fix broken bazel build (#78757)
  [LLVM][NVPTX] Add cp.async.bulk.commit/wait intrinsics (#78698)
  [mlir][transform] Add an op for replacing values with function calls (#78398)
  Re-exec TSan with no ASLR if memory layout is incompatible on Linux (#78351)
  [lld][WebAssembly] Fix use of undefined funcs under --warn-unresolved-symbols (#78643)
  [AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457)
  [libc] Add float.h header. (#78737)
  [lldb][test] Apply @expectedFailureAll/@skipIf early for debug_info tests (#73067)
  [libc] Fix test failing on GPU using deprecated 'add_unittest'
  [mlir][docs] Fix broken link
  [AArch64] NFC: Simplify discombobulating 'requiresSMChange' interface (#78703)
  [Clang] Refactor diagnostics for SME builtins. (#78258)
  [AMDGPU] Update comment on SIInstrInfo::isLegalFLATOffset for GFX12
  [AMDGPU] Make use of CPol::SWZ_* in SelectionDAG. NFC.
  [MLIR][OpenMP] Better error reporting for unsupported `nowait` (#78551)
  [AMDGPU] Remove gws feature from GFX12 (#78711)
  [AMDGPU] Update hazard recognition for new GFX12 wait counters (#78722)
  [AMDGPU] Do not widen scalar loads on GFX12 (#78724)
  [Flang][OpenMP] Consider renames when processing reduction intrinsics (#70822)
  [FileCheck]: Fix diagnostics for NOT prefixes (#78412)
  [AsmParser] Add support for reading incomplete IR (part 1) (#78421)
  [gn build] Port 9ff4be640fb1
  [X86] Fix -Wsign-compare in X86MCInstLower.cpp (NFC)
  [AMDGPU] Do not emit `V_DOT2C_F32_F16_e32` on GFX12 (#78709)
  LoopDeletion: Move EH pad check before the isLoopNeverExecuted Check (#78189)
  [InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636)
  [X86] movsd/movss/movd/movq - add support for constant comments (#78601)
  [Clang] Fix build with GCC 14 on ARM (#78704)
  [flang] use setsid to assign the child to prevent zombie as it will be clean up by init process (#77944)
  [flang] Expand parent component in procedure pointer component ref (#78593)
  [llvm-jitlink] Fix MSVC "not all control paths return a value" warning. NFC.
  [llvm-exegesis] Fix MSVC "not all control paths return a value" warning. NFC.
  [AsmParser] Deduplicate argument list parsing code (NFC)
  [builtins] Mark `int_lib.h` builtins as `static` (#69305)
  [DWARFLinker][NFC] Decrease DWARFLinker dependence on DwarfStreamer. (#77932)
  [AsmParser] Don't require value numbers to be consecutive (#78171)
  [AMDGPU] Misc formatting fixes. NFC.
  Fix typo "widended"
  [AArch64][SME] Remove combination of private-ZA and preserves_za. (#78563)
  [VPlan] Use DebugLoc from recipe in VPWidenCallRecipe (NFCI).
  [Clang] [NFC] Remove default argument in ASTUnit.h (#78566)
  [Statepoint] Optimize Location structure size (#78600)
  [Tooling] Fix FixedCompilationDatabase with header compile flags (#73913)
  [NFC][LV] Test precommit for interleaved linear args
  [clang-apply-replacements] Deduplicate Implementation of `collectReplacementsFromDirectory` (NFC) (#78630)
  [mlir][bufferization] Simplify helper `potentiallyAliasesMemref` (#78690)
  [AMDGPU] Remove GFX12 encoding hack (#78702)
  Revert "[llvm][AArch64] Copy all operands when expanding BLR_BTI bundle (#78267)"
  [AMDGPU] Fix predicates for BUFFER_ATOMIC_CSUB pattern (#78701)
  [LTO] Require asserts for discard-value-names.ll test.
  [llvm-jitlink] Refactor GOT and stubs registration (NFC) (#78365)
  [mlir][NFC] Remove unused variable.
  [llvm][AArch64] Copy all operands when expanding BLR_BTI bundle (#78267)
  [mlir][vector] Drop innermost unit dims on transfer_write. (#78554)
  [AMDGPU][GFX12] Add tests for flat_atomic_pk (#78683)
  [clang][ExtractAPI] Record availability information only for the target platform (#76823)
  Revert "Revert "[Flang][OpenMP] NFC: Minor refactoring of Reduction lowering code" (#73139)"
  [AMDGPU] Fix test for expensive-checks build (#78687)
  [VPlan] Introduce VPSingleDefRecipe. (#77023)
  [AMDGPU] Remove unnecessary add instructions in ctlz.i8 (#77615)
  [mlir][llvm] Drop unreachable basic block during import (#78467)
  [TableGen] Integrate TableGen-based macro fusion (#73115)
  [libc++] Implement LWG3940: std::expected<void, E>::value() also needs E to be copy constructible (#71819)
  [llvm-exegesis] Add support for validation counters (#76653)
  Revert "[sanitizer] Skip /include/c++/ from summary (#78534)"
  [flang][re-apply] Fix seg fault CodeGenAction::executeAction() (#78672)
  [mlir][bufferization][NFC] Clean up code (#78594)
  [gn build] Port 4e7cf1b1ed38
  [clang][Interp] Add an EvaluationResult class (#71315)
  [libc][NFC] Use Sign in NormalFloat (#78579)
  [mlir][vector] Add 2 invalid tests for vector.xfer Ops (#78608)
  [asan,test] Disable alloca_loop_unpoisoning.cpp on s390{{.*}}
  [libc] Fix is_subnormal for Intel Extended Precision (#78592)
  [libc][NFC] Simplify FPBits expressions (#78590)
  [clang][Interp] Implement integral->complex casts (#75590)
  [libc][NFC] Fix "type qualifiers ignored on cast result type"  GCC warning (#78509)
  [llvm] Use StringRef::contains (NFC)
  [Support] Use llvm::children and llvm::inverse_children (NFC)
  [Remarks] Use StringRef::consume_{front,back} (NFC)
  [tools] Use SmallString::operator std::string (NFC)
  Replace `exec_tools` with `tools` in bazel genrule. (#77510)
  [compiler-rt] Add a prefix on the windows mmap symbols (#78037)
  [Release Notes] [C++20] [Modules] Summarize modules related change to the release note for clang 18
  Clean up PlatformDarwinKernel::GetSharedModule, document (#78652)
  [coroutine] Create coroutine body in the correct eval context (#78589)
  [RISCV] Re-order riscv-target-features.c to put non-experimental extensions together. (#78675)
  [RISCV] Don't look through EXTRACT_ELEMENT in lowerScalarInsert if the element types are different. (#78668)
  [MLIR][ODS] Check hasProperties when generating populateDefaultAttrs (#78525)
  [libc] Provide sys/queue.h (#78081)
  [flang][openacc] Support multiple device_type when lowering (#78634)
  Revert "[flang] Fix seg fault `CodeGenAction::executeAction()` (#78269)" (#78667)
  [NFC][PowerPC] remove the redundant spill related flags setting
  [AMDGPU] Precommit lit test.
  [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
  Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)"
  [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
  [clangd] Don't collect templated decls for builtin templates (#78466)
  [Coroutines] Fix inline comment about frame layout (#78626)
  [RISCV] Key VectorIntrinsicCostTable by SEW [nfc-ish]
  [SLP][NFC]Add a test with extending the types for vectorized stores/insertelement instructions, NFC.
  Apply clang-tidy fixes for performance-unnecessary-value-param in ConvertConv2DToImg2Col.cpp (NFC)
  Apply clang-tidy fixes for llvm-qualified-auto in LinalgTransformOps.cpp (NFC)
  Apply clang-tidy fixes for bugprone-macro-parentheses in LinalgTransformOps.cpp (NFC)
  Apply clang-tidy fixes for llvm-else-after-return in LinalgOps.cpp (NFC)
  Apply clang-tidy fixes for performance-unnecessary-value-param in LinalgInterfaces.cpp (NFC)
  Revert "[ASan][libc++] Turn on ASan annotations for short strings" (#78627)
  [RISCV] Add statistic support for VSETVL insertion pass (#78543)
  [RISCV] Add support for new unprivileged extensions defined in profiles spec (#77458)
  [lld][WebAssembly] Move input vectors from symtab to ctx. NFC (#78640)
  [llvm-lib][Object][COFF] Use ARM64 machine type for import library descriptor objects. (#78537)
  Revert #76246 and #76083
  [dfsan] Don't clear shadow on dlopen(NULL, flags)
  [lld][WebAssembly] Move linker global state in to context object. NFC (#78629)
  [lld][WebAssembly] Rename fetch() to extract() to match ELF linker. NFC (#78625)
  [lldb] Stop creating BreakpointEventData raw pointers (#78508)
  [BOLT] Use continuous output addresses in delta encoding in BAT
  [AArch64] Add tests for operations on vectors with 3 elements.
  [X86] Fix RTTI proxy emission for 32-bit (#78622)
  [clang-repl][test] Suppress memory lease after #76218
  [Hexagon] Flip subreg bit for reverse pairs hvx .new (#75873)
  [libc] Use clang's scoped atomics if available from the compiler (#74769)
  [sanitizer] Skip /include/c++/ from summary (#78534)
  [lldb] Remove redundant severity substring within a diagnostic message. (#76111)
  [flang] Fix seg fault `CodeGenAction::executeAction()` (#78269)
  [clang] Pass `-n` to llvm-cxxfilt in even more codegen tests
  [CompilerRT][ASan] Add new substitutions for tests while using lto to (#78523)
  Revert "[lldb] Silence narrowing conversion warning with MSVC"
  [clang] Add size filter for stack auto init (#74777)
  [flang] Don't use \uXXXX encodings unless \-escapes are enabled (#78326)
  [ThinLTO][DebugInfo] Emit full type definitions when importing anonymous types. (#78461)
  [openmp] Revert 64874e5ab5fd102344d43ac9465537a44130bf19 since it was committed by mistake and the PR (https://github.com/llvm/llvm-project/pull/77853) wasn't approved yet.
  [clang] Fix parenthesized list initialization of arrays not working with `new` (#76976)
  [lld-macho] Fix for objc_msgSend stubs (#78557)
  [llvm][utils] Fix SmallString summary provider (#78527)
  [RISCV] Adjust select shuffle cost to reflect mask creation cost (#77963)
  [NVPTX][NFC] Remove unused parameter of getArgumentAlignment (#78604)
  [lldb] Silence narrowing conversion warning with MSVC
  [lldb] Silence warning with latest MSVC
  [lldb] Silence warning with latest MSVC on Windows
  [lldb] Silence warning when building with latest MSVC Fixes: ``` C:\git\llvm-project\lldb\unittests\Core\DumpDataExtractorTest.cpp(140): warning C4305: 'argument': truncation from 'double' to 'const std::complex<float>::_Ty' ```
  [HLSL][SPIR-V] Add Vulkan to target triple (#76749)
  [libc++] Renames ABI tag. (#78342)
  [libc++] Un-xfail module tests in picolibc tests (#78580)
  [clang] Pass `-n` to llvm-cxxfilt in codegen tests
  [SystemZ] i128 cost model (#78528)
  [LLVM][ADT] Convert llvm::hash_code to unsigned explicitly in DenseMapInfo (#77743)
  [ORC][MachO] Support common load commands in the platform's mach-o header builder
  [ORC] Specialize MachOBuilder support for the LC_ID_DYLIB command.
  [mlir][flang][openacc] Device type support on acc routine op (#78375)
  [CUDA] Disable registering surfaces and textures with the new driver
  [SPIR-V] improve performance of Module Analysis stage in the part of processing "other instructions" (#76047)
  [lldb] Correct function names in ProcessGDBRemote::ParseFlagsFields log messages
  [LinkerWrapper][Obvious] Fix move on temporary object
  [mlir][irdl] Add `irdl.base` op (#76400)
  [X86][MC] Fix wrong encoding of promoted BMI instructions due to missing NoCD8 (#78386)
  [X86] Fix failures on EXPENSIVE_CHECKS builds
  [libc][arm] add more math.h entrypoints (#77839)
  [libc] reverts for 32b arm (#78307)
  [lldb][Format] Fix missing inlined function names in frame formatting. (#78494)
  [CVP] Add test with nested cycle (NFC)
  [mlir][sparse] add a 3-d block and fiber test (#78529)
  [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (#78359)
  [X86] X86MCInstLower.cpp - fix spelling mistake
  [mlir][ROCDL] Stop setting amdgpu-implicitarg-num-bytes (#78498)
  [mlir][Math] Add pass to legalize math functions to f32-or-higher (#78361)
  [clang][CoverageMapping] Refactor setting MC/DC True/False Condition IDs (#78202)
  [DAG] Set nneg flag when forming zext in demanded bits (#72281)
  [Flang][MLIR][OpenMP] Remove the early outlining interface (#78450)
  [BranchFolding] Use isSuccessor to confirm fall through (#77923)
  [clang][Interp] IndirectMember initializers (#69900)
  [Profile][CoverageMapping] MC/DC Fix passing FileID for DecisionRegion
  [flang] Match the length size in comparison (NFC) (#78302)
  [flang] fix unsafe memory access using mlir::ValueRange (#78435)
  [formatv][FmtAlign] Use fill count of type size_t instead of uint32_t (#78459)
  [X86] Emit verbose (constant) comments before EVEX compression tag (#78585)
  [IR] Allow type change in ValueAsMetadata::handleRAUW (#76969)
  Fix typo (#78587)
  [RISCV] Use regexp to check negative extensions in test. NFC
  [NFC][OpenMP] Fix typo in CHECK line (#78586)
  [mlir][transform] Add transform.get_operand op (#78397)
  [Clang][NFC] Rename CXXMethodDecl::isPure -> is VirtualPure (#78463)
  [NFC][OpenMP][Flang] Add test for OpenMP target parallel do (#77776)
  [flang][driver] Fix Driver/isysroot.f90 test (#78478)
  [clang][Interp] Fix diagnosing non-const variables pre-C++11 (#76718)
  [AMDGPU] Add global_load_tr for GFX12 (#77772)
  [clang-repl] Add a interpreter-specific overload of operator new for C++ (#76218)
  [libc][NFC] Use the Sign type for DyadicFloat (#78577)
  [Flang][OpenMP][Lower] NFC: Combine two calls to ClauseProcessor::processTODO (#78451)
  [AMDGPU] Regenerate tests for #77892 after #77438
  [Flang][OpenMP] Push genEval closer to leaf lowering functions (#77760)
  [AMDGPU] Update uses of new VOP2 pseudos for GFX12 (#78155)
  [clang] Add test for CWG1807 (#77637)
  [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (#75917)
  AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (#77892)
  [clang][Interp] Implement ComplexToReal casts (#77294)
  [libc][NFC] Introduce a Sign type for FPBits (#78500)
  [InstCombine] Recognize more rotation patterns (#78107)
  [InstCombine] combine mul(abs(x),abs(y)) to abs(mul(x,y)) (#78395)
  Revert "[CodeGen] Support start/stop in CodeGenPassBuilder" (#78567)
  [X86] Add X86::getConstantFromPool helper function to replace duplicate implementations.
  [DWARFLinker][NFC] Move common code into the base library: IndexedValuesMap. (#77437)
  [AMDGPU] Fix -Wunused-variable in SIInsertWaitcnts.cpp (NFC)
  [OpenMP][omp_lib] Restore compatibility with more restrictive Fortran compilers (#77780)
  [coroutines][coro_lifetimebound] Detect lifetime issues with lambda captures (#77066)
  [CGP] Avoid replacing a free ext with multiple other exts. (#77094)
  [AMDGPU] CodeGen for GFX12 S_WAIT_* instructions (#77438)
  [AMDGPU] Work around s_getpc_b64 zero extending on GFX12 (#78186)
  [AMDGPU] Add GFX12 __builtin_amdgcn_s_sleep_var (#77926)
  [AMDGPU] Allow potentially negative flat scratch offsets on GFX12 (#78193)
  [AMDGPU][ELF] Reserve 0x4f and 0x50 EFLAGS
  [dsymutil][llvm-dwarfutil] Rename command line options to avoid using vendor names. (#78135)
  [Clang][SME] Add missing IsStreamingCompatible flag to svget, svcreate & svset (#78430)
  [ConstantFold] Clean up binop identity folding
  [libc][NFC] Selectively disable GCC warnings (#78462)
  [AArch64][SME] Conditionally do smstart/smstop (#77113)
  [RISCV] Vectorize phi for loop carried @llvm.vector.reduce.fadd (#78244)
  [C++20] [Modules] Allow to merge enums with the same underlying integer types
  [gn build] Port 1d286ad59b90
  [AMDGPU][True16] Support V_FLOOR_F16. (#78446)
  [flang] Allow user to define free via BIND(C) (#78428)
  [AMDGPU] Add mark last scratch load pass (#75512)
  [LV][AArch64] LoopVectorizer allows scalable frem instructions (#76247)
  [clang][ASTImporter] Improve structural equivalence of overloadable operators. (#72242)
  [clang][dataflow] Consider `CXXDefaultInitExpr` to be an "original record ctor". (#78423)
  [clang][dataflow] Use `Formula::isLiteral()` in a couple more places. (#78404)
  [AMDGPU][NFC] Rename feature FP8Insts to FP8ConversionInsts (#78439)
  [AMDGPU] Use alias info to relax waitcounts for LDS DMA (#74537)
  [AsmPrinter] Fix gcc -Wparentheses warning [NFC]
  DAG: Fix chain mismanagement in SoftenFloatRes_FP_EXTEND (#74558)
  [Clang] NFC: Move Arm type attributes to separate trailing object. (#78424)
  [Path] Fix off-by-one in finding filename for win style paths (#78055)
  [CodeGen] Support start/stop in CodeGenPassBuilder (#70912)
  [clangd] Handle an expanded token range that ends in the `eof` token in TokenBuffer::spelledForExpanded() (#78092)
  [gn] attempt to port 8dfc67d6724e (__assertion_handler)
  [Clang] Fix dependency of SourceLocExpr. (#78436)
  [libc++] <experimental/simd> Fix vector_aligned_tag (#76611)
  [compiler-rt] making getrandom call blocking. (#78340)
  [CI] Add lld as compiler-rt dependency (#78536)
  [clang] Upstream XROS support in Clang (#78392)
  [CodeGen] Port GlobalMerge to new pass manager (#77474)
  [ASan][libc++] Turn on ASan annotations for short strings (#75882)
  [Support] Use llvm::inverse_children (NFC)
  [DebugInfo] Use StringRef::consume_front (NFC)
  [Support] Use SmallString::operator std::string (NFC)
  Revert "[clang] Fix CTAD for aggregates for nested template classes" (#78541)
  DAG: Fix ABI lowering with FP promote in strictfp functions (#74405)
  [mlir][complex] Convert complex.tan to libm ctan call (#78250)
  [libc++][hardening] Rework how the assertion handler can be overridden. (#77883)
  Recommit "[RISCV][ISel] Combine scalable vector add/sub/mul with zero/sign extension." (#76785)
  [X86] Support lowering for APX promoted BMI instructions. (#77433)
  [X86][test] Add --show-mc-encoding for lowering tests of NDD arithmetic instructions (#78406)
  [AMDGPU] Reapply 'Sign extend simm16 in setreg intrinsic' (#78492)
  [X86] Support "f16c" and "avx512fp16" for __builtin_cpu_supports (#78384)
  workflows: Refactor release-tasks.yml (#69523)
  [dfsan] Make sprintf interceptor compatible with glibc 2.37+ and musl (#78363)
  [NVPTX] extend type support for nvvm.{min,max,mulhi,sad} (#78385)
  [RISCV] Add LUI/AUIPC+ADDI fusions to sifive-p450. (#78501)
  [AArch64] Fix -Wreturn-type in AArch64TargetParser.cpp (NFC)
  [lld/ELF] Hint if R_X86_64_PC32 overflows and references a SHF_X86_64_LARGE section (#73045)
  [X86] Don't respect large data threshold for globals with an explicit section (#78348)
  [Hurd] Fix -Wswitch in Hurd::getDynamicLinker (NFC)
  [Clang] Support MSPropertyRefExpr as placement arg to new-expression (#75883)
  [clang] Fix CTAD for aggregates for nested template classes (#78387)
  Add Variadic 'dropAttrs' (#78476)
  [clang] Disable gch-probe.c on AIX as `-gmodules` is not supported there yet. (#78513)
  [Driver] Add -fandroid-pad-segment/-fno-android-pad-segment (#77244)
  Hurd: Add x86_64 support (#78065)
  [LLD][RISCV] Report error for unsatisfiable RISCV_ALIGN (#74121)
  [WPD][LLD] Allow glob matching of --lto-known-safe-vtables (#78505)
  [libc][obvious] disable fabsf128 on aarch64 (#78511)
  [OpenACC] Implement 'bind' clause parsing.
  [flang] Avoid new spurious error under -fopenacc (#78504)
  [Clang] Update Unicode version to 15.1 (#77147)
  [llvm-readobj][Object][COFF] Include COFF import file machine type in format string. (#78366)
  [AArch64] Improve cost computations for odd vector mem ops. (#78181)
  [GISel][RISCV] Implement selectShiftMask. (#77572)
  [GlobalIsel][AArch64] more legal icmps (#78239)
  [RISCV] Prefer vsetivli for VLMAX when VLEN is exactly known (#75509)
  [CodeGen][MISched][NFC] Rename some instances of Cycle -> ReleaseAtCycle
  [WebAssembly] Use ValType instead of integer types to model wasm tables (#78012)
  Revert "[RISCV] Implement RISCVInsrInfo::getConstValDefinedInReg"
  [lldb] Support changes to TLS on macOS (#77988)
  BalancedPartitioning: minor updates (#77568)
  [mlir][openacc][NFC] Use interleaveComma in printers (#78347)
  [NVPTX] Add tex.grad.cube{array} intrinsics (#77693)
  [llvm] Teach MachO about XROS (#78373)
  [AMDGPU][GFX12] Add Atomic cond_sub_u32 (#76224)
  Remove maximum OSX version for sanitizers (#77543)
  Revert "[SimplifyCFG] `switch`: Do Not Transform the Default Case if the Condition is Too Wide" (#78469)
  [gn build] Port 3b6a8f823bf8
  [lldb] Upstream xros support in lldb (#78389)
  [Headers][X86] Add more descriptions to ia32intrin.h and immintrin.h (#77686)
  [lldb-dap] Adjusting how repl-mode auto determines commands vs variable expressions. (#78005)
  Apply clang-tidy fixes for llvm-else-after-return in IRDLVerifiers.cpp (NFC)
  Apply clang-tidy fixes for readability-identifier-naming in IRDLLoading.cpp (NFC)
  Apply clang-tidy fixes for llvm-qualified-auto in DecomposeMemrefs.cpp (NFC)
  Apply clang-tidy fixes for readability-identifier-naming in Utils.cpp (NFC)
  Apply clang-tidy fixes for performance-unnecessary-value-param in Utils.cpp (NFC)
  [gn] fix mistake from 92289db82fb2
  [flang] Lower BIND(C) module variables (#78279)
  Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#78468)
  [reland][libc][NFC] Refactor FPBits and remove LongDoubleBits specialization (#78465)
  [Flang][MLIR] Add basic initial support for alloca and program address space handling in FIR->LLVMIR codegen (#77518)
  [BasicAA] Fix new test Analysis/BasicAA/separate_storage-alias-sets.ll
  [flang] Add structure constructor with allocatable component (#77845)
  [AArch64][Driver] Better handling of target feature dependencies (#78270)
  On Windows, make the release script work with the local git checkout (#78273)
  [bazel]Sort loads in llvm/BUILD.bazel
  Revert faecc736e2ac3cd8c77 #74443 [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443)
  [AMDGPU] Src1 of VOP3 DPP instructions can be SGPR on GFX12 (#77929)
  Revert "Create overloads of debug intrinsic utilities for DPValues (#78313)"
  Revert "[reland][libc][NFC] Refactor FPBits and remove LongDoubleBits specialization" (#78457)
  [RemoveDIs][DebugInfo] Create overloads of debug intrinsic utilities for DPValues (#78313)
  [AMDGPU] Do not run GCNNSAReassign pass for GFX12 (#78185)
  [reland][libc][NFC] Refactor FPBits and remove LongDoubleBits specialization (#78447)
  [AST] Don't merge memory locations in AliasSetTracker (#65731)
  [MLIR][LLVM] Add explicit target_cpu attribute to llvm.func (#78287)
  [OpenACC] Implement 'collapse' clause parsing.
  [gn] fix minor mistake from f7cb1afa0633
  AMDGPU: Allocate special SGPRs before user SGPR arguments (#78234)
  [RISCV] Don't scale cost by LMUL for TCK_CodeSize in getMemoryOpCost (#78407)
  [AArch64] Fix a minor issue with AArch64LoopIdiomTransform (#78136)
  [Clang][Sema] Diagnose function/variable templates that shadow their own template parameters (#78274)
  [APINotes] Upstream dependencies of Sema logic to apply API Notes to decls
  Revert "[AMDGPU] Add InstCombine rule for ballot.i64 intrinsic in wave32 mode." (#78429)
  [NFC][Clang][Headers] Update refs to ACLE in comments (#78305)
  [Clang][Sema] improve sema check of clang::musttail attribute (#77727)
  Revert "[libc++] Fix `regex_search` to match `$` alone with `match_default` flag" (#78349)
  Revert "[RISCV] Remove dead early exit in performCombineVMergeAndVOps. NFC"
  [RISCV] Remove dead early exit in performCombineVMergeAndVOps. NFC
  [lldb] On Windows, silence warning with latest MSVC
  [compiler-rt] On Windows, silence warning when building with latest MSVC
  [openmp] Silence warnings when building the LLVM release with MSVC
  [lldb] Silence warning with Clang ToT
  [lldb] Use `LLVM_FALLTHROUGH` to avoid a compiler error when building with MSVC.
  [lldb] Replace deprecated `std::unique_ptr::unique()` to silence a warning with MS-STL. NFC.
  [compiler-rt][ORC] Silence warning when building on MSVC/Windows
  [compiler-rt] Fix fuzzer tests on Windows
  [clang][tools] Silence const cast warning when building with Clang ToT
  [ORC-RT] Silence warning when building with Clang ToT on Windows
  [compiler-rt] Silence warnings when building with Clang ToT
  [lldb] Silence warning when building with Clang ToT
  [third-party] Silence warning on benchmark when building with Clang ToT
  [compiler-rt] Silence warning with MSVC 19.38 (Visual Studio 2022 17.8.3)
  [openmp] Remove extra ';' outside of function
  [clang][CodeGen] Fix gcc warning about unused variable [NFC]
  [clang-format] TableGen multi line string support. (#78032)
  [RemoveDIs][DebugInfo] Make DIAssignID always replaceable (#78300)
  [AMDGPU] Fix llvm.amdgcn.s.wait.event.export.ready for GFX12 (#78191)
  [AMDGPU] Disable V_MAD_U64_U32/V_MAD_I64_I32 workaround for GFX12 (#77927)
  [clang][AST] Invalidate DecompositionDecl if it has invalid initializer. (#72428)
  AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#76145)
  [Clang][Sema][NFC] Remove unused Scope* parameter from Sema::GetTypeForDeclarator and Sema::ActOnTypeName (#78325)
  [AMDGPU] Add comments on SITargetLowering::widenLoad
  [lldb] Skip TestExecutableFirst.test_executable_is_first_before_run on ELF
  [GitHub][workflows] Replace curl with sparse checkout (#78303)
  [AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (#77633)
  [AMDGPU] Increase max scratch allocation for GFX12 (#77625)
  [AMDGPU] Fix hang caused by VS_CNT handling at calls (#78318)
  [mlir][IR] Rename "update root" to "modify op" in rewriter API (#78260)
  [AMDGPU] Add InstCombine rule for ballot.i64 intrinsic in wave32 mode. (#71556)
  [AArch64] Use parseArchExtension function. NFC (#78158)
  [VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
  AMDGPU: Remove fixed fixme from a test
  [InstCombine] Add log-pow simplification for FP exponent edge case. (#76641)
  Revert "[libc++] Clang-tidy enable modernize-use-nullptr. (#76659)" (#78409)
  [mlir] fix wording in transform dialect docs
  [GitHub] Add python 3.7 to libclang python test (#77219)
  [libclang/python] Bump minimum compatibility to Python 3.6 (#77228)
  [AsmParser] Implicitly declare intrinsics (#78251)
  [GISel] Add debug counter to force sdag fallback (#78257)
  [DAGCombine] Add debug counter (#78259)
  [BasicAA] Remove incorrect rule about constant pointers (#76815)
  [GlobalISel] Improve combines for extend operation by taking hint ins… (#74125)
  [RISCV][test] Test showing missed optimisation for spills/fills of GPR<->FPR moves
  Return high address masks correctly in Process (#78379)
  [RISCV] Add run line for code size in rvv load/store cost model test. NFC
  [X86] Add "Ws" constraint and "p" modifier for symbolic address/label reference (#77886)
  [ASan][libc++][NFC] refactor vector annotations arguments (#78322)
  [libc++][modules] Fixes clang-tidy exports. (#76288)
  [libc++] Clang-tidy enable modernize-use-nullptr. (#76659)
  [libc++][modules] Increase clang-tidy version used. (#76268)
  [libc++][modules] Adds module testing.  (#76246)
  [libc++][modules] Removes module testing. (#76083)
  [mlir][Tosa]: Add folder to ReciprocalOp of splat constant inputs (#78137)
  [clang][docs] Improve "Obtaining Clang" section (#71313)
  [RISCV] Overwrite cpu target features for full arch string in target attribute (#77426)
  [RISCV][GISel] Don't create generic virtual registers in selectSHXADDOp/selectSHXADD_UWOp. (#78396)
  [SelectionDAG] Fix isKnownNeverZeroFloat for vectors (#78308)
  [AMDGPU,test] Change llc -march= to -mtriple= (#75982)
  [lldb] Fix trailing whitespace & formatting in Core/Module.cpp (NFC)
  [lldb] Store SupportFile in CompileUnit (NFC)
  [Clang] Implement CWG2598: Union of non-literal types (#78195)
  [lldb] Store SupportFile in LineEntry (NFC) (#77999)
  [clang-cl] document correct defaults for `-fms-compatibility-version` / `-fmsc-version` (#76418)
  Sanitizer/MIPS: Use $t9 for preemptible function call (#76894)
  Disable ConstraintSystemTest for now
  [X86] Use vXi1 for `k` constraint in inline asm (#77733)
  [LoongArch] Add LoongArch V1.1 instructions definitions and MC tests (#78238)
  [mlir][arith] Add overflow flags support to arith ops (#78376)
  [X86][BF16] Improve float -> bfloat lowering under AVX512BF16 and AVXNECONVERT (#78042)
  [clang] Emit error for invalid friend functions under [temp.friend]p9 (#78083)
  Add the Linux "you can use this binary" bits to run_to_source_breakpoint (#78377)
  [Clang][Sema] Extract ellipsis location from CXXFoldExpr for reattaching constraints on NTTPs (#78080)
  Ensure that the executable module is ModuleList[0] (#78360)
  [compiler-rt] Drop COMPILER_RT_BUILD_CRT workaround (#78331)
  Revert "[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)"
  Revert "[Fix] Disable fdefine-target-os-macros for now" (#78353)
  Revert "[AMDGPU] Sign extend simm16 in setreg intrinsic" (#78372)
  [lldb] Remove unused LineEntry ctor (NFC)
  [llvm][MC] silence xros platform warnings, NFC
  [llvm] Introduce XROS platform  (#77707)
  [lldb] Build the TestRosetta.py executable with system stdlib (#78370)
  [lldb] Store SupportFile as shared_ptr (NFC)
  [JITLink][AArch32] Refactor StubsManager (NFC)
  [JITLink][AArch32] Rename stubs flavor Thumbv7 to v7 (NFC)
  [JITLink][AArch32] Fix typos in Thumb stubs test (NFC)
  [JITLink][AArch32] Streamline file-names of tests (NFC)
  [JITLink][AArch32] In warning output add decimal value for CPUArch and missing newline
  [clang][Driver] Don't ignore -gmodules .gch files (#77711)
  [llvm] Indirect symbol replacement with GOTPCREL for aarch64 and risc… (#78003)
  [lldb][Progress] Fix test for trimmed progress reports (#78357)
  [Clang] Implement the 'counted_by' attribute (#76348)
  [lldb] Hoist SupportFile out of FileSpecList (NFC)
  [SHT_LLVM_BB_ADDR_MAP,NFC] Add SCOPED_TRACE for convenient mapping of failures to test cases. (#78335)
  [BasicBlockSections] Always keep the entry block in the beginning of the function. (#74696)
  [InstCombine] Only fold bitcast(fptrunc) if destination type matches fptrunc result type. (#77046)
  Recommit "[AST] Use APIntStorage to fix memory leak in EnumConstantDecl. (#78311)"
  [BoundsSafety] Initial documentation for -fbounds-safety (#70749)
  [AMDGPU] Fix predicates for V_DOT instructions. (#78198)
  [ASAN][sanitizers][win] Allow windows-asan to be built with /MDd and intercept functions from the debug runtimes. (#77840)
  Revert "[AST] Use APIntStorage to fix memory leak in EnumConstantDecl. (#78311)"
  [clang-format] Add parse tests for SeparateDefinitionBlocks option (#78256)
  [AST] Use APIntStorage to fix memory leak in EnumConstantDecl. (#78311)
  Revert "[libc][NFC] Refactor FPBits and remove LongDoubleBits specialization (#78192)" (#78329)
  [gn build] Manually port 8566cd61
  [MC][ARM] Fix test
  [llvm-exegesis] Fix snippet value scaling (#77226)
  [OpenACC] Implement 'reduction' clause parsing.
  Work around a bug in the interaction between newer dyld's and older simulator dyld's (#78004)
  [MemProf][NFC] Explicitly specify llvm version of function_ref (#77783)
  [libc++][modules] Fixes RTTI build.
  [NVPTX] Fix generating permute bytes from register pair when the initial values are undefined (#74437)
  [libc++][CI] Fixes documentation builder. (#78327)
  [CMake] Include riscv32-unknown-elf runtimes in Fuchsia toolchain (#78323)
  [libc++] Rewrite the IWYU generation (#78295)
  [libc++][NFC] Add deprecated mention for _LIBCPP_ENABLE_CXX20_REMOVED_ALLOCATOR_MEMBERS in the docs
  [libc++][print] Enables it on Apple backdeployment. (#76293)
  [libc++] Improves _LIBCPP_HAS_NO_THREADS guards. (#76624)
  [libc++] Deprecate removed features macros. (#77879)
  [clang][dataflow] Fix bug in `Value` comparison. (#76746)
  [mlir][sparse][codeowners] add MLIR sparsifier team to codeowners (#78319)
  [OpenACC] Implement 'self' clause parsing on 'update'.
  [X86] Add test case for Issue #77805
  [NFC] sentinal -> sentinel
  [SLP]Fix PR78298: Assertion `GEP->getNumIndices() == 1 && !isa<Constant>(GEPIdx)' failed.
  [AMDGPU] Sign extend simm16 in setreg intrinsic (#77997)
  [MC][ARM] Fix test.
  [flang] Add install target to install flang headers (#78151)
  [RISCV] Add scheduler model for sifive-p450. (#77989)
  [RISCV] Bump Zfbfmin, Zvfbfmin, and Zvfbfwma to 1.0. (#78021)
  [Coroutines] Fix incorrect attribute name coroutine.presplit (NFC) (#78296)
  [X86] Add test case for Issue #78109
  [libc++][print] Includes <format>. (#76333)
  [libc++][print] Renames __use_unicode. (#76290)
  [libc++][doc] Removes LLVM-17 release notes. (#78062)
  [AArch64][SME2] Add ZT0 attributes to SMEAttrs (#77607)
  [libc][cmake] reset COMPILE_DEFINITIONS (#77810)
  [clang][ASTMatcher] Add matchers for CXXFoldExpr (#71245)
  [gn build] Port 9fa9d9a7e1cd
  [gn build] Port 8e514c572e44
  [gn build] Port 8e21557d0401
  [X86][NFC] Simplify the patterns of BMI shift/rotate instructions in X86InstrCompiler.td
  [lldb][Progress] Separate title and details (#77547)
  [libc++abi] Fix typo in CMake error message
  [lldb] Add LLDB_BUG_REPORT_URL macro to allow a different URL for lldb bug reporting. (#78210)
  [LV] Make DL optional argument for VPBuilder member functions (NFCI).
  [OpenACC] Implement copyin/create clause parsing.
  [SPARC] Prefer RDPC over CALL to implement GETPCX for 64-bit target
  [mlir][llvm] Fix loop annotation parser (#78266)
  [libunwind] Move errno.h and signal.h includes under the block where they're needed (#78054)
  [LV] Fix indent for loop in adjustRecipesForReductions (NFC).
  [flang][Driver] Support -pthread in the frontend (#77360)
  [OpenACC] Implement 'copyout' clause parsing.
  rename to 'try' isntead of 'Try'x
  [OpenACC} Improve diagnostics for 'tag's on clauses/directives
  Revert "[bazel][llvm] Sort load statements"
  [X86][NFC] Simplify the patterns of BMI shift/rotate instructions in X86InstrShiftRotate.td
  [SLP]Fix PR78236: correctly track external values, replaced several times during reduction vectorization.
  [TLI][AArch64] Add extra SLEEF mappings and tests (#78140)
  [clang][dataflow] Use `ignoreCFGOmittedNodes()` in `setValue()`. (#78245)
  [gn] port 8e7f073eb42c (-gen-clang-regular-keyword-attr-info)
  [libc][NFC] Refactor FPBits and remove LongDoubleBits specialization (#78192)
  [VPlan] Use start value of reduction phi to determine type (NFCI).
  [RemoveDIs][DebugInfo] Explicitly convert iterator to pointer for std::distance
  [OpenACC} Improve diagnostics for 'tag's on clauses/directives (#77957)
  [RISCV] Remove -riscv-v-vector-bits-min flag that was left behind. NFC
  [bazel][llvm] Sort load statements
  [libunwind][WebAssembly] Fix libunwind.cpp guard (#78230)
  [RemoveDIs][DebugInfo] Add DPVAssign variant of DPValue (#77912)
  [libc] Fix libc-hdrgen crosscompiling (#78227)
  Improve modeling of two functions in StdLibraryFunctionsChecker (#78079)
  [Flang][OpenMP] Remove space before :: in member function definition,… (#78205)
  [RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)
  [mlir][Arm] Fix invalid rewrite pattern API violations (#78246)
  [X86][test] Add test for lowering NDD AND
  [X86][NFC] Simplify the definitions of BMI shift/rotate instructions
  [clang][dataflow] Tighten checking for existence of a function body. (#78163)
  [Clang][AArch64] Remove unnecessary and incorrect attributes from arm_sme.h.
  [bazel] Add dependencies for 8e514c572e44eda237417236b4c92176dfce9cd9
  [libc++][utility][NFC] Refactored safe integer `cmp_xxxx` functions to use the `__libcpp_is_integer` concept (#78115)
  [MoveAutoInit] Ignore unreachable basicblocks and handle catchswitch (#78232)
  Revert "Simplify `(a % b) lt/ge (b-1)` into `(a % b) eq/ne (b-1)` (#72504)"
  [Clang] Make sdot builtins available to SME (#77792)
  [ARM] Add missing earlyclobber to sqrshr and uqrshl instructions. (#77782)
  [ARM] Fix phi operand order issue in MVEGatherScatterLowering (#78208)
  [AMDGPU] Remove VT helpers isFloatType, isPackedType, simplify isIntType (#77987)
  [AArch64] Fix a typo in predicate expression (NFC) (#78162)
  Reapply [TLI] Fix replace-with-veclib crash with invalid arguments (#77945)
  [llvm-exegesis] Refactor individual counter data to ConfiguredEvent (#77900)
  [clang-tidy] Fix missing parentheses in readability-implicit-bool-conversion fixes (#74891)
  [X86][NFC] Simplify the definitions of double precision shift instructions
  [NFC][clang-tidy]improve performance for misc-unused-using-decls check (#78231)
  [AArch64] Add native CPU detection for Microsoft Azure Cobalt 100. (#77793)
  [RemoveDIs][DebugInfo][NFC] Add Instruction and convenience functions to DPValue (#77896)
  [llvm-exegesis] Refactor Counter to CounterGroup (#77887)
  [BasicAA] Handle disjoint or as add in DecomposeGEP. (#78209)
  Simplify `(a % b) lt/ge (b-1)` into `(a % b) eq/ne (b-1)` (#72504)
  [mlir][bufferization] Add `BufferizableOpInterface::hasTensorSemantics` (#75273)
  [GlobalISel] Fix buildCopyFromRegs for split vectors (#77448)
  [clang-tidy] Handle C++ structured bindings in `performance-for-range-copy` (#77105)
  [clang][analyzer] Improve modeling of 'fseeko' and 'ftello' in StdLibraryFunctionsChecker (#77902)
  [clang] Fix CTAD not work for function-type and array-type arguments. (#78159)
  [NameAnonGlobals] Mark the pass as required (#78161)
  [mlir][vector] Fix invalid IR in `RewriteBitCastOfTruncI` (#78146)
  [mlir][vector] Fix invalid IR in `ContractionOpLowering` (#78130)
  AMDGPU/GlobalISel: Handle inreg arguments as SGPRs (#78123)
  [mlir][Transforms] `GreedyPatternRewriteDriver`: Better expensive checks encapsulation (#78175)
  [clang][dataflow] Add an early-out to `flowConditionImplies()` / `flowConditionAllows()`. (#78172)
  [bazel] Fix build after 9fa9d9a7e1cd0a7fd8c35bdfc642793447bf70aa
  [bazel] Fix build after 9fa9d9a7e1cd0a7fd8c35bdfc642793447bf70aa
  [MachineScheduler] Add option to control reordering for store/load clustering (#75338)
  [RISCV] Implement RISCVInsrInfo::getConstValDefinedInReg (#77610)
  [bazel] Fix build after 9fa9d9a7e1cd0a7fd8c35bdfc642793447bf70aa
  [RISCV] Remove vmv.s.x and vmv.x.s lmul pseudo variants (#71501)
  [dsymutil] Use StringRef::consume_front (NFC)
  [TableGen] Use llvm::drop_begin (NFC)
  [BOLT] Use SmallString::operator std::string (NFC)
  [modularize] Use SmallString::operator std::string (NFC)
  [Transforms] Use a range-based for loop (NFC)
  [Basic] Use StringRef::consume_front (NFC)
  [LoongArch] Add relaxDwarfLineAddr and relaxDwarfCFA to handle the mutable label diff in dwarfinfo (#77728)
  [X86] Fix error: unused variable 'isMemOp' after #78019, NFCI
  Apply clang-tidy fixes for bugprone-macro-parentheses in Utils.cpp (NFC)
  Apply clang-tidy fixes for readability-simplify-boolean-expr in LegalizeForLLVMExport.cpp (NFC)
  Apply clang-tidy fixes for performance-move-const-arg in IntRangeOptimizations.cpp (NFC)
  Apply clang-tidy fixes for readability-identifier-naming in CLOptionsSetup.cpp (NFC)
  Apply clang-tidy fixes for readability-simplify-boolean-expr in VectorToGPU.cpp (NFC)
  [mlir] Attribute add printStripped (#78008)
  [DFAJumpThreading][NFC] Reduce tests
  [AMDGPU][MC] Remove incorrect `_e32` suffix from `v_dot2c_f32_f16` and `v_dot4c_i32_i8` (#77993)
  [X86] Fix -Wunused-variable in X86InstrInfo.cpp (NFC)
  [SPIR-V] Do not emit spv_ptrcast if GEP result is of expected type (#78122)
  [NFC] Improve test for clang/test/Modules/GH77953.cpp
  [X86][NFC] Simplify the definitions of rotate instructions
  [LV] Skipping all debug instructions when native vplan is enabled (#77413)
  [X86] Add MI-layer routine for getting the index of the first address operand, NFC (#78019)
  Use Log2_64_Ceil to compute PowerOf2Ceil (#67580)
  Revert "[Clang] Implement the 'counted_by' attribute (#76348)"
  [GlobalISel] Fix the select->minmax combine from trying to operate on pointer types.
  [libunwind]  fix dynamic .eh_frame registration (#77185)
  [DFAJumpThreading] Handle circular determinator (#78177)
  [Libomptarget] Remove temporary files in AMDGPU JIT impl (#77980)
  [Clang] Only compare template params of potential overload after checking their decl context (#78139)
  [NFC]add - at the beginning for alignment
  [mlir] Fix -Wsign-compare in MeshOps.cpp (NFC)
  [mlir][gpu] Add an offloading handler attribute to `gpu.module` (#78047)
  [flang] allow _POSIX_SOURCE to be defined without a value (#78179)
  [LSR] Add test showing incorrectly adding nuw with #77827.
  [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (#78057)
  [flang][NFC] Restore documentation (#78211)
  [lldb][ValueObject][NFC] Further remove redundant parameters to ReadPointedString (#78029)
  [flang] Catch more initialization errors (#77850)
  [sanitizer] Fix builds after #77991
  [LLD] [COFF] Prefer paths specified with -libpath: over toolchain paths (#78039)
  [flang][runtime] Emit leading space before zero-length list-directed … (#77835)
  [flang] Catch name resolution error due to global scoping (#77683)
  [flang] Add portability warning for F'2008 feature (#77526)
  Hashpin sensitive dependencies and configure dependabot to update them automatically (#75859)
  [flang] Don't change size of allocatable in error situation (#77386)
  [flang] More support for assumed-size Cray pointees (#77381)
  [InstCombine] Add folds for `(add/sub/disjoint_or/icmp C, (ctpop (not x)))`
  [InstCombine] Add tests for folding `(add/sub/disjoint_or/icmp C, (ctpop (not x)))`; NFC
  [flang] Fix semantic checks for MOVE_ALLOC (#77362)
  [flang][runtime] Better real MOD/MODULO results (#77167)
  [flang] Refine IMPORT processing in module file generation (#77133)
  [ARM] Extra test for MVE gather optimization with commuted phi operands. NFC
  [flang] Weird restrictions on index variables (#77019)
  [llvm][MC][ARM] Don't autoresolve fixups (#76574)
  [flang][runtime] Resume rounding hexadecimal floating-point input (#77006)
  [SystemZ] Don't use FP Load and Test as comparisons to same reg (#78074)
  [flang][runtime] Treatment of NaN in MAXVAL/MAXLOC/MINVAL/MINLOC (#76999)
  [AArch64][GlobalISel] Combine vecreduce(ext) to {U/S}ADDLV (#75832)
  [AMDGPU] Do not generate s_set_inst_prefetch_distance for GFX12 (#78190)
  [AMDGPU] Disable hasVALUPartialForwardingHazard for GFX12 (#78188)
  [AMDGPU] Disable hasVALUMaskWriteHazard for GFX12 (#78187)
  [flang][runtime] Fix total MAXLOC/MINLOC over CHARACTER data (#76880)
  [libc] Give more functions restrict qualifiers (NFC) (#78061)
  Fix test output for 3b16d8c
  [flang][runtime] Emit leading spaces in NAMELIST output (#76846)
  [AArch64][GlobalISel] Pre-commit for Combine vecreduce(ext) to {U/S}ADDLV
  [flang][runtime] Extension: allow a comma to terminate a fixed input … (#76768)
  [DebugInfo][RemoveDIs][NFC] Split findDbgDeclares into two functions (#77478)
  [flang] Support \u Unicode escape sequences (#76757)
  [Flang][OpenMP] Handle SECTION construct from within SECTIONS (#77759)
  [Flang] Remove dead -mvscale-{min,max} logic from getVScaleRange. NFCI (#78133)
  [flang] Allow use of COMMON in PURE (#76741)
  Reland [mlir][ExecutionEngine] Add support for global constructors and destructors #78070  (#78170)
  [flang][runtime] Clean up code to unblock development (#78063)
  [mlir][SCF] Do not verify step size of `scf.for` (#78141)
  [GlobalISel] Refactor extractParts() (#75223)
  [SystemZ] Don't assert for i128 vectors in getInterleavedMemoryOpCost() (#78009)
  [GlobalISel] Make IRTranslator able to handle PHIs with empty types. (#73235)
  Require asserts for llvm/test/CodeGen/PowerPC/fence.ll
  [SystemZ] Don't crash on undef source in shouldCoalesce() (#78056)
  [AMDGPU] Remove functions with incompatible gws attribute (#78143)
  [AArch64][GlobalISel] Fix not extending GPR32->GPR64 result of anyext indexed load.
  [clang] Reword apologetic Clang diagnostic messages (#76310)
  [Flang][OpenMP] Avoid default none errors for seq loop indices in par… (#76258)
  [TargetParser] Define AEK_FCMA and AEK_JSCVT for tsv110 (#75516)
  [mlir][mesh] Remove rank attribute and rename dim_sizes to shape in ClusterOp (#77838)
  [SystemZ][z/OS] Add support for recognizing z/OS personality function in Clang (#76073)
  [libc++abi] Improve error message when libunwind is missing from LLVM_ENABLE_RUNTIMES (#77991)
  [Clang] Rename and enable boolean get, set, create and undef for sme2 (#77338)
  [clang][Interp] Support __real/__imag on primitives (#75485)
  [ARM] add execute-only Armv6-M support to the release notes (#77937)
  [Flang][OpenMP] Minor cosmetic changes post-PR#77758, NFC
  Revert "[mlir][ExecutionEngine] Add support for global constructors and destructors" (#78164)
  [Flang][OpenMP] Push genEval calls to individual operations, NFC (#77758)
  [Format] Fix isStartOfName to recognize attributes (#76804)
  [MLIR][NVVM] Add support for aligned variants of cluster barriers (#78142)
  [Support] Windows Filesystem fs::status Conditionally Call GetFileAttributes (#78118)
  [clang][test][NFC] Enable more tests with new constexpr interpreter
  [lldb] Skip part of TestDataFormatterAdv (#72233)
  [bazel] Port 8e7f073eb42c92aa7a2b651ca314d7fcebf296e3
  [AMDGPU] Simplify GFX12 FLAT Real instruction definitions. NFC. (#78147)
  [AArch64] Add costmodel tests for vectors with non-power-of-2 elements.
  [clang][ASTImporter] Fix import of variable template redeclarations. (#72841)
  [InstCombine] pow-1.ll - regenerate checks
  [Flang] Clean up LoopVersioning LLVM_DEBUG blocks. NFC (#77818)
  [clang][Interp][NFC] Remove outdated FIXME comment
  [flang][runtime] Fix seg fault in intrinsic execute_command_line (#78126)
  [mlir][nvgpu] Add `nvgpu.tma.async.store` (#77811)
  [SPIR-V] Strip convergence intrinsics before ISel (#75948)
  [DFAJumpThreading] Extends the bitwidth of state from uint64_t to APInt (#78134)
  [clang-tidy]Add new check readability-avoid-nested-conditional-operator (#78022)
  [mlir] Reformat whitespace in dependent dialects codegen (#78090)
  [AMDGPU][NFC] Add GFX numbers to DefaultComponent feature (#77894)
  [X86][NFC] Simplify the definitions of shift instructions
  [Clang][AArch64] Change SME attributes for shared/new/preserved state. (#76971)
  [MLIR][transform][python] Introduce abstractions for handles to values and parameters (#77305)
  [PowerPC] Fix shuffle combine with undef elements (#77787)
  [C++20] [Modules] [Serialization] Don't record '#pragma once' information in named modules
  [Clang][SME2] Fix PSEL builtin predicates (#77097)
  [Flang][RISCV] Set vscale_range based off zvl*b (#77277)
  [flang][driver] Limit the usage of -mvscale-max and -mvscale-min (#77905)
  [clang][Interp] Implement IntegralAP::{div, rem} (#72614)
  [flang][fir] update block argument types in boxed-procedure pass (#77914)
  [mlir][Mesh] Fix invalid IR in rewrite pattern (#78094)
  Fix crash with modules and constexpr destructor (#69076)
  [NFC] Pre-commit case of ppcf128 extractelt soften
  [RISCV] Don't check haveNoCommonBitsSet in RISCVGatherScatterLowering
  [RISCV] Add disjoint flag to or ops in RISCVGatherScatterLowering tests. NFC
  [mlir][Bazel] Add missing dependency needed after a1eaed7a21e1cc750e78420f298514edee1cb1ad
  [RISCV] Handle disjoint or in RISCVGatherScatterLowering (#77800)
  [MLIR][LLVM] Enable export of DISubprograms on function declarations (#78026)
  [RISCV] Add sifive-p450 to release notes. NFC
  [RISCV] Lower vfmv.s.f intrinsics to VFMV_S_F_VL first (#76699)
  [PowerPC] Implement fence builtin (#76495)
  [clang-format] Stop aligning the to continuation lines (#76378)
  [mlir][ExecutionEngine] Add support for global constructors and destructors (#78070)
  [mlir][gpu] Fix GPU YieldOP format and traits (#78006)
  [CMake] Fix building on Haiku and Solaris after c0d5d36dda04cdd409aabc015da0beb810842fcd (#78084)
  [clang-tidy]fix readability-implicit-bool-conversion false-positives when comparison bool bitfield (#77878)
  [RISCV] Combine repeated calls to MachineFunction::getSubtarget. NFC
  [mlir] fix IRPrinterInstrumentation to use the user-provided IRPrinting config (#70023)
  [libc++][concepts] Implements  concept helper `__libcpp_integer` (#78086)
  [clang-doc] Use SmallString::operator std::string (NFC)
  [CodeGen] Use a range-based for loop (NFC)
  [clang-tidy] Use StringRef::consume_front (NFC)
  [LV] Add test case where variable induction step needs truncating.
  [clang-format] Add PenaltyBreakScopeResolution option. (#78015)
  [clang-tidy] Fix false-positives in readability-container-size-empty (#74140)
  [clang-tidy] Add support for in-class initializers in readability-redundant-member-init (#77206)
  [Analysis] 'static' function 'shortenFileName' should be declared 'static inline' (NFC)
  PR#72453 : Exceeding maximum file name length (#72654)
  [mlir][Transforms] `OneToNTypeConversion.cpp`: Fix invalid IR (#77922)
  [clang] SyntaxWarning: invalid escape sequence '\s' with Python3.12 (#78036)
  [clang-tidy][DOC] Fix some speling mistakes in release notes
  Fix #75686: add iter_swap and iter_move to the matched name (#76117)
  [clang-format][NFC] Use FileCheck for clang-format-ignore lit test (#77977)
  [mlir][ArmSME] Workaround for old versions of GCC (NFC) (#78046)
  [Target] Use getConstantOperandVal (NFC)
  [IR] Use range-based for loops (NFC)
  [IR] Use StringRef::consume_front (NFC)
  [X86] Fix SLH crash on llvm.eh.sjlh.longjmp (#77959)
  [CodeGen] Use getConstantOperandVal (NFC)
  [Support] Use StringRef::consume_front (NFC)
  [llvm] Use range-based for loops with llvm::drop_begin (NFC)
  [AVX10][Doc] Add documentation about AVX10 options and their attentions (#77925)
  [clang-tidy] Add option to ignore macros in `readability-simplify-boolean-expr` check (#78043)
  [clang][NFC] Improve formatting in C++ DR tests
  [WebAssembly] Use DebugValueManager only when subprogram exists (#77978)
  [clang] Add tests for DRs about complete-class context (#77444)
  [clang] Add test for CWG1350 (#78040)
  Revert "[InstCombine] Fold `icmp pred (inttoptr X), (inttoptr Y) -> icmp pred X, Y`" (#78023)
  [libc++][doc] Bump required GCC version.
  [libc++][NFC] Release notes: fixed formatting (#78058)
  [clang-tidy] Invalid Fix-It generated for implicit-widening-multiplication-result (#76315)

Change-Id: I61c76cff95fdb35f971d2561bd23f636aa7a2f29
Signed-off-by: greenforce-auto-merge <greenforce-auto-merge@users.noreply.github.com>