Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reland [AMDGPU] Add AMDGPU specific variadic operation MCExprs #84562

Merged
merged 13 commits into from
Mar 14, 2024

Conversation

JanekvO
Copy link
Contributor

@JanekvO JanekvO commented Mar 8, 2024

Adds AMDGPU specific variadic MCExpr operations 'max' and 'or'. Required for moving function/program resource usage information propagation to MC layer as propagation of the maximum and logic OR values will go through MCExprs. Propagation of said resource usage info through these MCExpr operations compared to current handling in AMDGPUResourceUsageAnalysis will ease the constraint on the order for a function's resource usage analysis as the symbolic definition of a resource can occur after its use.

Previous merge with main triggered asan memory leaks as shown in #82022 (comment) by @fmayer. The leak was caused by SmallVector's grow functionality that got invoked as the variadic argument count exceeds the local storage limit (and results in heap allocation). However, MCTargetExpr and all its derived classes do not deallocate meaning any normal heap allocation within AMDGPUVariadicMCExpr does not have a paired deallocation. As a solution, the storage used by AMDGPUVariadicMCExpr for its variadic MCExpr container is now bump-allocated as well through the MCContext allocation.

Relands #82022 with fixes

@JanekvO JanekvO requested review from arsenm and Pierre-vh March 8, 2024 20:37
@llvmbot llvmbot added backend:AMDGPU mc Machine (object) code labels Mar 8, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 8, 2024

@llvm/pr-subscribers-mc

Author: Janek van Oirschot (JanekvO)

Changes

Adds AMDGPU specific variadic MCExpr operations 'max' and 'or'. Required for moving function/program resource usage information propagation to MC layer as propagation of the maximum and logic OR values will go through MCExprs. Propagation of said resource usage info through these MCExpr operations compared to current handling in AMDGPUResourceUsageAnalysis will ease the constraint on the order for a function's resource usage analysis as the symbolic definition of a resource can occur after its use.

Previous merge with main triggered asan memory leaks as shown in #82022 (comment) by @fmayer. The leak was caused by SmallVector's grow functionality that got invoked as the variadic argument count exceeds the local storage limit (and results in heap allocation). However, MCTargetExpr and all its derived classes do not deallocate meaning any normal heap allocation within AMDGPUVariadicMCExpr does not have a paired deallocation. As a solution, the storage used by AMDGPUVariadicMCExpr for its variadic MCExpr container is now bump-allocated as well through the MCContext allocation.

Relands #82022 with fixes


Full diff: https://github.com/llvm/llvm-project/pull/84562.diff

7 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+19)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+55)
  • (added) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp (+100)
  • (added) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h (+73)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt (+1)
  • (added) llvm/test/MC/AMDGPU/mcexpr_amd.s (+130)
  • (added) llvm/test/MC/AMDGPU/mcexpr_amd_err.s (+53)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 7f39f69cae60db..c7cb06ffbef1e0 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1534,6 +1534,25 @@ The AMDGPU backend supports the following calling conventions:
 
      =============================== ==========================================================
 
+AMDGPU MCExpr
+-------------
+
+As part of the AMDGPU MC layer, AMDGPU provides the following target specific
+``MCExpr``\s.
+
+  .. table:: AMDGPU MCExpr types:
+     :name: amdgpu-mcexpr-table
+
+     =================== ================= ========================================================
+     MCExpr              Operands          Return value
+     =================== ================= ========================================================
+     ``max(arg, ...)``   1 or more         Variadic signed operation that returns the maximum
+                                           value of all its arguments.
+
+     ``or(arg, ...)``    1 or more         Variadic signed operation that returns the bitwise-or
+                                           result of all its arguments.
+
+     =================== ================= ========================================================
 
 .. _amdgpu-elf-code-object:
 
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index cb4eddfe5320fa..16a5d6879ce777 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "AMDKernelCodeT.h"
+#include "MCTargetDesc/AMDGPUMCExpr.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
 #include "MCTargetDesc/AMDGPUTargetStreamer.h"
 #include "SIDefines.h"
@@ -1816,6 +1817,7 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
 
 public:
   void onBeginOfFile() override;
+  bool parsePrimaryExpr(const MCExpr *&Res, SMLoc &EndLoc) override;
 
   ParseStatus parseCustomOperand(OperandVector &Operands, unsigned MCK);
 
@@ -8277,6 +8279,59 @@ void AMDGPUAsmParser::onBeginOfFile() {
     getTargetStreamer().EmitDirectiveAMDGCNTarget();
 }
 
+/// Parse AMDGPU specific expressions.
+///
+///  expr ::= or(expr, ...) |
+///           max(expr, ...)
+///
+bool AMDGPUAsmParser::parsePrimaryExpr(const MCExpr *&Res, SMLoc &EndLoc) {
+  using AGVK = AMDGPUVariadicMCExpr::VariadicKind;
+
+  if (isToken(AsmToken::Identifier)) {
+    StringRef TokenId = getTokenStr();
+    AGVK VK = StringSwitch<AGVK>(TokenId)
+                  .Case("max", AGVK::AGVK_Max)
+                  .Case("or", AGVK::AGVK_Or)
+                  .Default(AGVK::AGVK_None);
+
+    if (VK != AGVK::AGVK_None && peekToken().is(AsmToken::LParen)) {
+      SmallVector<const MCExpr *, 4> Exprs;
+      uint64_t CommaCount = 0;
+      lex(); // Eat 'max'/'or'
+      lex(); // Eat '('
+      while (true) {
+        if (trySkipToken(AsmToken::RParen)) {
+          if (Exprs.empty()) {
+            Error(getToken().getLoc(),
+                  "empty " + Twine(TokenId) + " expression");
+            return true;
+          }
+          if (CommaCount + 1 != Exprs.size()) {
+            Error(getToken().getLoc(),
+                  "mismatch of commas in " + Twine(TokenId) + " expression");
+            return true;
+          }
+          Res = AMDGPUVariadicMCExpr::create(VK, Exprs, getContext());
+          return false;
+        }
+        const MCExpr *Expr;
+        if (getParser().parseExpression(Expr, EndLoc))
+          return true;
+        Exprs.push_back(Expr);
+        bool LastTokenWasComma = trySkipToken(AsmToken::Comma);
+        if (LastTokenWasComma)
+          CommaCount++;
+        if (!LastTokenWasComma && !isToken(AsmToken::RParen)) {
+          Error(getToken().getLoc(),
+                "unexpected token in " + Twine(TokenId) + " expression");
+          return true;
+        }
+      }
+    }
+  }
+  return getParser().parsePrimaryExpr(Res, EndLoc, nullptr);
+}
+
 ParseStatus AMDGPUAsmParser::parseOModSI(OperandVector &Operands) {
   StringRef Name = getTokenStr();
   if (Name == "mul") {
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
new file mode 100644
index 00000000000000..f0fa014547f1af
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
@@ -0,0 +1,100 @@
+//===- AMDGPUMCExpr.cpp - AMDGPU specific MC expression classes -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPUMCExpr.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/MC/MCValue.h"
+#include "llvm/Support/Allocator.h"
+#include "llvm/Support/raw_ostream.h"
+#include <optional>
+
+using namespace llvm;
+
+const AMDGPUVariadicMCExpr *
+AMDGPUVariadicMCExpr::create(VariadicKind Kind, ArrayRef<const MCExpr *> Args,
+                             MCContext &Ctx) {
+  // Storage for the argument's 'const MCExpr*' allocated through MCContext new placement which means that AMDGPUVariadicMCExpr objects and all of its contents will now be allocated through MCContext new placement.
+  //
+  // Will result in an asan failure if allocated on the heap (e.g., through SmallVector's grow).
+  const MCExpr **CtxArgs = new (Ctx) const MCExpr*[Args.size()];
+  for (size_t i = 0; i < Args.size(); ++i)
+    CtxArgs[i] = Args[i];
+  return new (Ctx) AMDGPUVariadicMCExpr(Kind, ArrayRef(CtxArgs, Args.size()));
+}
+
+const MCExpr *AMDGPUVariadicMCExpr::getSubExpr(size_t Index) const {
+  assert(Index < Args.size() &&
+         "Indexing out of bounds AMDGPUVariadicMCExpr sub-expr");
+  return Args[Index];
+}
+
+void AMDGPUVariadicMCExpr::printImpl(raw_ostream &OS,
+                                     const MCAsmInfo *MAI) const {
+  switch (Kind) {
+  default:
+    llvm_unreachable("Unknown AMDGPUVariadicMCExpr kind.");
+  case AGVK_Or:
+    OS << "or(";
+    break;
+  case AGVK_Max:
+    OS << "max(";
+    break;
+  }
+  for (auto It = Args.begin(); It != Args.end(); ++It) {
+    (*It)->print(OS, MAI, /*InParens=*/false);
+    if ((It + 1) != Args.end())
+      OS << ", ";
+  }
+  OS << ')';
+}
+
+static int64_t op(AMDGPUVariadicMCExpr::VariadicKind Kind, int64_t Arg1,
+                  int64_t Arg2) {
+  switch (Kind) {
+  default:
+    llvm_unreachable("Unknown AMDGPUVariadicMCExpr kind.");
+  case AMDGPUVariadicMCExpr::AGVK_Max:
+    return std::max(Arg1, Arg2);
+  case AMDGPUVariadicMCExpr::AGVK_Or:
+    return Arg1 | Arg2;
+  }
+}
+
+bool AMDGPUVariadicMCExpr::evaluateAsRelocatableImpl(
+    MCValue &Res, const MCAsmLayout *Layout, const MCFixup *Fixup) const {
+  std::optional<int64_t> Total;
+
+  for (const MCExpr *Arg : Args) {
+    MCValue ArgRes;
+    if (!Arg->evaluateAsRelocatable(ArgRes, Layout, Fixup) ||
+        !ArgRes.isAbsolute())
+      return false;
+
+    if (!Total.has_value())
+      Total = ArgRes.getConstant();
+    Total = op(Kind, *Total, ArgRes.getConstant());
+  }
+
+  Res = MCValue::get(*Total);
+  return true;
+}
+
+void AMDGPUVariadicMCExpr::visitUsedExpr(MCStreamer &Streamer) const {
+  for (const MCExpr *Arg : Args)
+    Streamer.visitUsedExpr(*Arg);
+}
+
+MCFragment *AMDGPUVariadicMCExpr::findAssociatedFragment() const {
+  for (const MCExpr *Arg : Args) {
+    if (Arg->findAssociatedFragment())
+      return Arg->findAssociatedFragment();
+  }
+  return nullptr;
+}
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
new file mode 100644
index 00000000000000..cb28556e8be537
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
@@ -0,0 +1,73 @@
+//===- AMDGPUMCExpr.h - AMDGPU specific MC expression classes ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
+#define LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/MC/MCExpr.h"
+
+namespace llvm {
+
+/// AMDGPU target specific variadic MCExpr operations.
+///
+/// Takes in a minimum of 1 argument to be used with an operation. The supported
+/// operations are:
+///   - (bitwise) or
+///   - max
+///
+/// \note If the 'or'/'max' operations are provided only a single argument, the
+/// operation will act as a no-op and simply resolve as the provided argument.
+///
+class AMDGPUVariadicMCExpr : public MCTargetExpr {
+public:
+  enum VariadicKind { AGVK_None, AGVK_Or, AGVK_Max };
+
+private:
+  VariadicKind Kind;
+  ArrayRef<const MCExpr *> Args;
+
+  AMDGPUVariadicMCExpr(VariadicKind Kind, ArrayRef<const MCExpr *> Args)
+      : Kind(Kind), Args(Args) {
+    assert(Args.size() >= 1 && "Needs a minimum of one expression.");
+    assert(Kind != AGVK_None &&
+           "Cannot construct AMDGPUVariadicMCExpr of kind none.");
+  }
+
+public:
+  static const AMDGPUVariadicMCExpr *
+  create(VariadicKind Kind, ArrayRef<const MCExpr *> Args, MCContext &Ctx);
+
+  static const AMDGPUVariadicMCExpr *createOr(ArrayRef<const MCExpr *> Args,
+                                              MCContext &Ctx) {
+    return create(VariadicKind::AGVK_Or, Args, Ctx);
+  }
+
+  static const AMDGPUVariadicMCExpr *createMax(ArrayRef<const MCExpr *> Args,
+                                               MCContext &Ctx) {
+    return create(VariadicKind::AGVK_Max, Args, Ctx);
+  }
+
+  VariadicKind getKind() const { return Kind; }
+  const MCExpr *getSubExpr(size_t Index) const;
+
+  void printImpl(raw_ostream &OS, const MCAsmInfo *MAI) const override;
+  bool evaluateAsRelocatableImpl(MCValue &Res, const MCAsmLayout *Layout,
+                                 const MCFixup *Fixup) const override;
+  void visitUsedExpr(MCStreamer &Streamer) const override;
+  MCFragment *findAssociatedFragment() const override;
+  void fixELFSymbolsInTLSFixups(MCAssembler &) const override{};
+
+  static bool classof(const MCExpr *E) {
+    return E->getKind() == MCExpr::Target;
+  }
+};
+
+} // end namespace llvm
+
+#endif // LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
index 5dc76071b0594e..0842a58f794b32 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
@@ -5,6 +5,7 @@ add_llvm_component_library(LLVMAMDGPUDesc
   AMDGPUInstPrinter.cpp
   AMDGPUMCAsmInfo.cpp
   AMDGPUMCCodeEmitter.cpp
+  AMDGPUMCExpr.cpp
   AMDGPUMCTargetDesc.cpp
   AMDGPUTargetStreamer.cpp
   R600InstPrinter.cpp
diff --git a/llvm/test/MC/AMDGPU/mcexpr_amd.s b/llvm/test/MC/AMDGPU/mcexpr_amd.s
new file mode 100644
index 00000000000000..a9639c3acc305a
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/mcexpr_amd.s
@@ -0,0 +1,130 @@
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa < %s | FileCheck --check-prefix=ASM %s
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa -filetype=obj < %s > %t
+// RUN: llvm-objdump --syms %t | FileCheck --check-prefix=OBJDUMP %s
+
+// OBJDUMP: SYMBOL TABLE:
+// OBJDUMP-NEXT: 0000000000000000 l       *ABS*  0000000000000000 zero
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 one
+// OBJDUMP-NEXT: 0000000000000002 l       *ABS*  0000000000000000 two
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 three
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 i64_max
+// OBJDUMP-NEXT: 8000000000000000 l       *ABS*  0000000000000000 i64_min
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 max_expression_all
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 five
+// OBJDUMP-NEXT: 0000000000000004 l       *ABS*  0000000000000000 four
+// OBJDUMP-NEXT: 0000000000000002 l       *ABS*  0000000000000000 max_expression_two
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 max_expression_one
+// OBJDUMP-NEXT: 000000000000000a l       *ABS*  0000000000000000 max_literals
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 max_with_max_sym
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 max
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 neg_one
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 max_neg_numbers
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 max_neg_number
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 max_with_subexpr
+// OBJDUMP-NEXT: 0000000000000006 l       *ABS*  0000000000000000 max_as_subexpr
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 max_recursive_subexpr
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_one_max
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_two_max
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_three_max
+// OBJDUMP-NEXT: 8000000000000000 l       *ABS*  0000000000000000 max_expr_one_min
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 max_expr_two_min
+// OBJDUMP-NEXT: 0000000000989680 l       *ABS*  0000000000000000 max_expr_three_min
+// OBJDUMP-NEXT: 0000000000000007 l       *ABS*  0000000000000000 or_expression_all
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 or_expression_two
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 or_expression_one
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 or_literals
+// OBJDUMP-NEXT: 0000000000000000 l       *ABS*  0000000000000000 or_false
+// OBJDUMP-NEXT: 00000000000000ff l       *ABS*  0000000000000000 or_with_or_sym
+// OBJDUMP-NEXT: 00000000000000ff l       *ABS*  0000000000000000 or
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 or_with_subexpr
+// OBJDUMP-NEXT: 0000000000000008 l       *ABS*  0000000000000000 or_as_subexpr
+// OBJDUMP-NEXT: 0000000000000007 l       *ABS*  0000000000000000 or_recursive_subexpr
+
+// ASM: .set zero, 0
+// ASM: .set one, 1
+// ASM: .set two, 2
+// ASM: .set three, 3
+// ASM: .set i64_max, 9223372036854775807
+// ASM: .set i64_min, -9223372036854775808
+
+.set zero, 0
+.set one, 1
+.set two, 2
+.set three, 3
+.set i64_max, 0x7FFFFFFFFFFFFFFF
+.set i64_min, 0x8000000000000000
+
+// ASM: .set max_expression_all, max(1, 2, five, 3, four)
+// ASM: .set max_expression_two, 2
+// ASM: .set max_expression_one, 1
+// ASM: .set max_literals, 10
+// ASM: .set max_with_max_sym, max(max, 4, 3, 1, 2)
+
+.set max_expression_all, max(one, two, five, three, four)
+.set max_expression_two, max(one, two)
+.set max_expression_one, max(one)
+.set max_literals, max(1,2,3,4,5,6,7,8,9,10)
+.set max_with_max_sym, max(max, 4, 3, one, two)
+
+// ASM: .set max_neg_numbers, -1
+// ASM: .set max_neg_number, -1
+
+.set neg_one, -1
+.set max_neg_numbers, max(-5, -4, -3, -2, neg_one)
+.set max_neg_number, max(neg_one)
+
+// ASM: .set max_with_subexpr, 3
+// ASM: .set max_as_subexpr, 1+(max(4, 3, five))
+// ASM: .set max_recursive_subexpr, max(max(1, four), 3, max_expression_all)
+
+.set max_with_subexpr, max(((one | 3) << 3) / 8)
+.set max_as_subexpr, 1 + max(4, 3, five)
+.set max_recursive_subexpr, max(max(one, four), three, max_expression_all)
+
+// ASM: .set max_expr_one_max, 9223372036854775807
+// ASM: .set max_expr_two_max, max(9223372036854775807, five)
+// ASM: .set max_expr_three_max, max(9223372036854775807, five, 10000000)
+
+.set max_expr_one_max, max(i64_max)
+.set max_expr_two_max, max(i64_max, five)
+.set max_expr_three_max, max(i64_max, five, 10000000)
+
+// ASM: .set max_expr_one_min, -9223372036854775808
+// ASM: .set max_expr_two_min, 3
+// ASM: .set max_expr_three_min, 10000000
+
+.set max_expr_one_min, max(i64_min)
+.set max_expr_two_min, max(i64_min, three)
+.set max_expr_three_min, max(i64_min, three, 10000000)
+
+// ASM: .set or_expression_all, or(1, 2, five, 3, four)
+// ASM: .set or_expression_two, 3
+// ASM: .set or_expression_one, 1
+// ASM: .set or_literals, 15
+// ASM: .set or_false, 0
+// ASM: .set or_with_or_sym, or(or, 4, 3, 1, 2)
+
+.set or_expression_all, or(one, two, five, three, four)
+.set or_expression_two, or(one, two)
+.set or_expression_one, or(one)
+.set or_literals, or(1,2,3,4,5,6,7,8,9,10)
+.set or_false, or(zero, 0, (2-2), 5 > 6)
+.set or_with_or_sym, or(or, 4, 3, one, two)
+
+// ASM: .set or_with_subexpr, 3
+// ASM: .set or_as_subexpr, 1+(or(4, 3, five))
+// ASM: .set or_recursive_subexpr, or(or(1, four), 3, or_expression_all)
+
+.set or_with_subexpr, or(((one | 3) << 3) / 8)
+.set or_as_subexpr, 1 + or(4, 3, five)
+.set or_recursive_subexpr, or(or(one, four), three, or_expression_all)
+
+// ASM: .set four, 4
+// ASM: .set five, 5
+// ASM: .set max, 15
+// ASM: .set or, 255
+
+.set four, 4
+.set five, 5
+.set max, 0xF
+.set or, 0xFF
diff --git a/llvm/test/MC/AMDGPU/mcexpr_amd_err.s b/llvm/test/MC/AMDGPU/mcexpr_amd_err.s
new file mode 100644
index 00000000000000..ea02e013627218
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/mcexpr_amd_err.s
@@ -0,0 +1,53 @@
+// RUN: not llvm-mc -triple amdgcn-amd-amdhsa %s 2>&1 | FileCheck --check-prefix=ASM %s
+
+.set one, 1
+.set two, 2
+.set three, 3
+
+.set max_empty, max()
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: empty max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set or_empty, or()
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: empty or expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_post_aux_comma, max(one,)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: mismatch of commas in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_pre_aux_comma, max(,one)
+// asm: :[[@line-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_double_comma, max(one,, two)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_no_comma, max(one two)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_missing_paren, max(two
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_expression_one, max(three, four,
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set or_expression_one, or(four, five
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in or expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_no_lparen, max four, five)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set max_no_paren, max one, two, three
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set max_rparen_only, max)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set four, 4
+.set five, 5

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 8, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Janek van Oirschot (JanekvO)

Changes

Adds AMDGPU specific variadic MCExpr operations 'max' and 'or'. Required for moving function/program resource usage information propagation to MC layer as propagation of the maximum and logic OR values will go through MCExprs. Propagation of said resource usage info through these MCExpr operations compared to current handling in AMDGPUResourceUsageAnalysis will ease the constraint on the order for a function's resource usage analysis as the symbolic definition of a resource can occur after its use.

Previous merge with main triggered asan memory leaks as shown in #82022 (comment) by @fmayer. The leak was caused by SmallVector's grow functionality that got invoked as the variadic argument count exceeds the local storage limit (and results in heap allocation). However, MCTargetExpr and all its derived classes do not deallocate meaning any normal heap allocation within AMDGPUVariadicMCExpr does not have a paired deallocation. As a solution, the storage used by AMDGPUVariadicMCExpr for its variadic MCExpr container is now bump-allocated as well through the MCContext allocation.

Relands #82022 with fixes


Full diff: https://github.com/llvm/llvm-project/pull/84562.diff

7 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+19)
  • (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+55)
  • (added) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp (+100)
  • (added) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h (+73)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt (+1)
  • (added) llvm/test/MC/AMDGPU/mcexpr_amd.s (+130)
  • (added) llvm/test/MC/AMDGPU/mcexpr_amd_err.s (+53)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 7f39f69cae60db..c7cb06ffbef1e0 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1534,6 +1534,25 @@ The AMDGPU backend supports the following calling conventions:
 
      =============================== ==========================================================
 
+AMDGPU MCExpr
+-------------
+
+As part of the AMDGPU MC layer, AMDGPU provides the following target specific
+``MCExpr``\s.
+
+  .. table:: AMDGPU MCExpr types:
+     :name: amdgpu-mcexpr-table
+
+     =================== ================= ========================================================
+     MCExpr              Operands          Return value
+     =================== ================= ========================================================
+     ``max(arg, ...)``   1 or more         Variadic signed operation that returns the maximum
+                                           value of all its arguments.
+
+     ``or(arg, ...)``    1 or more         Variadic signed operation that returns the bitwise-or
+                                           result of all its arguments.
+
+     =================== ================= ========================================================
 
 .. _amdgpu-elf-code-object:
 
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index cb4eddfe5320fa..16a5d6879ce777 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "AMDKernelCodeT.h"
+#include "MCTargetDesc/AMDGPUMCExpr.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
 #include "MCTargetDesc/AMDGPUTargetStreamer.h"
 #include "SIDefines.h"
@@ -1816,6 +1817,7 @@ class AMDGPUAsmParser : public MCTargetAsmParser {
 
 public:
   void onBeginOfFile() override;
+  bool parsePrimaryExpr(const MCExpr *&Res, SMLoc &EndLoc) override;
 
   ParseStatus parseCustomOperand(OperandVector &Operands, unsigned MCK);
 
@@ -8277,6 +8279,59 @@ void AMDGPUAsmParser::onBeginOfFile() {
     getTargetStreamer().EmitDirectiveAMDGCNTarget();
 }
 
+/// Parse AMDGPU specific expressions.
+///
+///  expr ::= or(expr, ...) |
+///           max(expr, ...)
+///
+bool AMDGPUAsmParser::parsePrimaryExpr(const MCExpr *&Res, SMLoc &EndLoc) {
+  using AGVK = AMDGPUVariadicMCExpr::VariadicKind;
+
+  if (isToken(AsmToken::Identifier)) {
+    StringRef TokenId = getTokenStr();
+    AGVK VK = StringSwitch<AGVK>(TokenId)
+                  .Case("max", AGVK::AGVK_Max)
+                  .Case("or", AGVK::AGVK_Or)
+                  .Default(AGVK::AGVK_None);
+
+    if (VK != AGVK::AGVK_None && peekToken().is(AsmToken::LParen)) {
+      SmallVector<const MCExpr *, 4> Exprs;
+      uint64_t CommaCount = 0;
+      lex(); // Eat 'max'/'or'
+      lex(); // Eat '('
+      while (true) {
+        if (trySkipToken(AsmToken::RParen)) {
+          if (Exprs.empty()) {
+            Error(getToken().getLoc(),
+                  "empty " + Twine(TokenId) + " expression");
+            return true;
+          }
+          if (CommaCount + 1 != Exprs.size()) {
+            Error(getToken().getLoc(),
+                  "mismatch of commas in " + Twine(TokenId) + " expression");
+            return true;
+          }
+          Res = AMDGPUVariadicMCExpr::create(VK, Exprs, getContext());
+          return false;
+        }
+        const MCExpr *Expr;
+        if (getParser().parseExpression(Expr, EndLoc))
+          return true;
+        Exprs.push_back(Expr);
+        bool LastTokenWasComma = trySkipToken(AsmToken::Comma);
+        if (LastTokenWasComma)
+          CommaCount++;
+        if (!LastTokenWasComma && !isToken(AsmToken::RParen)) {
+          Error(getToken().getLoc(),
+                "unexpected token in " + Twine(TokenId) + " expression");
+          return true;
+        }
+      }
+    }
+  }
+  return getParser().parsePrimaryExpr(Res, EndLoc, nullptr);
+}
+
 ParseStatus AMDGPUAsmParser::parseOModSI(OperandVector &Operands) {
   StringRef Name = getTokenStr();
   if (Name == "mul") {
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
new file mode 100644
index 00000000000000..f0fa014547f1af
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
@@ -0,0 +1,100 @@
+//===- AMDGPUMCExpr.cpp - AMDGPU specific MC expression classes -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPUMCExpr.h"
+#include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCStreamer.h"
+#include "llvm/MC/MCSymbol.h"
+#include "llvm/MC/MCValue.h"
+#include "llvm/Support/Allocator.h"
+#include "llvm/Support/raw_ostream.h"
+#include <optional>
+
+using namespace llvm;
+
+const AMDGPUVariadicMCExpr *
+AMDGPUVariadicMCExpr::create(VariadicKind Kind, ArrayRef<const MCExpr *> Args,
+                             MCContext &Ctx) {
+  // Storage for the argument's 'const MCExpr*' allocated through MCContext new placement which means that AMDGPUVariadicMCExpr objects and all of its contents will now be allocated through MCContext new placement.
+  //
+  // Will result in an asan failure if allocated on the heap (e.g., through SmallVector's grow).
+  const MCExpr **CtxArgs = new (Ctx) const MCExpr*[Args.size()];
+  for (size_t i = 0; i < Args.size(); ++i)
+    CtxArgs[i] = Args[i];
+  return new (Ctx) AMDGPUVariadicMCExpr(Kind, ArrayRef(CtxArgs, Args.size()));
+}
+
+const MCExpr *AMDGPUVariadicMCExpr::getSubExpr(size_t Index) const {
+  assert(Index < Args.size() &&
+         "Indexing out of bounds AMDGPUVariadicMCExpr sub-expr");
+  return Args[Index];
+}
+
+void AMDGPUVariadicMCExpr::printImpl(raw_ostream &OS,
+                                     const MCAsmInfo *MAI) const {
+  switch (Kind) {
+  default:
+    llvm_unreachable("Unknown AMDGPUVariadicMCExpr kind.");
+  case AGVK_Or:
+    OS << "or(";
+    break;
+  case AGVK_Max:
+    OS << "max(";
+    break;
+  }
+  for (auto It = Args.begin(); It != Args.end(); ++It) {
+    (*It)->print(OS, MAI, /*InParens=*/false);
+    if ((It + 1) != Args.end())
+      OS << ", ";
+  }
+  OS << ')';
+}
+
+static int64_t op(AMDGPUVariadicMCExpr::VariadicKind Kind, int64_t Arg1,
+                  int64_t Arg2) {
+  switch (Kind) {
+  default:
+    llvm_unreachable("Unknown AMDGPUVariadicMCExpr kind.");
+  case AMDGPUVariadicMCExpr::AGVK_Max:
+    return std::max(Arg1, Arg2);
+  case AMDGPUVariadicMCExpr::AGVK_Or:
+    return Arg1 | Arg2;
+  }
+}
+
+bool AMDGPUVariadicMCExpr::evaluateAsRelocatableImpl(
+    MCValue &Res, const MCAsmLayout *Layout, const MCFixup *Fixup) const {
+  std::optional<int64_t> Total;
+
+  for (const MCExpr *Arg : Args) {
+    MCValue ArgRes;
+    if (!Arg->evaluateAsRelocatable(ArgRes, Layout, Fixup) ||
+        !ArgRes.isAbsolute())
+      return false;
+
+    if (!Total.has_value())
+      Total = ArgRes.getConstant();
+    Total = op(Kind, *Total, ArgRes.getConstant());
+  }
+
+  Res = MCValue::get(*Total);
+  return true;
+}
+
+void AMDGPUVariadicMCExpr::visitUsedExpr(MCStreamer &Streamer) const {
+  for (const MCExpr *Arg : Args)
+    Streamer.visitUsedExpr(*Arg);
+}
+
+MCFragment *AMDGPUVariadicMCExpr::findAssociatedFragment() const {
+  for (const MCExpr *Arg : Args) {
+    if (Arg->findAssociatedFragment())
+      return Arg->findAssociatedFragment();
+  }
+  return nullptr;
+}
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
new file mode 100644
index 00000000000000..cb28556e8be537
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.h
@@ -0,0 +1,73 @@
+//===- AMDGPUMCExpr.h - AMDGPU specific MC expression classes ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
+#define LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/MC/MCExpr.h"
+
+namespace llvm {
+
+/// AMDGPU target specific variadic MCExpr operations.
+///
+/// Takes in a minimum of 1 argument to be used with an operation. The supported
+/// operations are:
+///   - (bitwise) or
+///   - max
+///
+/// \note If the 'or'/'max' operations are provided only a single argument, the
+/// operation will act as a no-op and simply resolve as the provided argument.
+///
+class AMDGPUVariadicMCExpr : public MCTargetExpr {
+public:
+  enum VariadicKind { AGVK_None, AGVK_Or, AGVK_Max };
+
+private:
+  VariadicKind Kind;
+  ArrayRef<const MCExpr *> Args;
+
+  AMDGPUVariadicMCExpr(VariadicKind Kind, ArrayRef<const MCExpr *> Args)
+      : Kind(Kind), Args(Args) {
+    assert(Args.size() >= 1 && "Needs a minimum of one expression.");
+    assert(Kind != AGVK_None &&
+           "Cannot construct AMDGPUVariadicMCExpr of kind none.");
+  }
+
+public:
+  static const AMDGPUVariadicMCExpr *
+  create(VariadicKind Kind, ArrayRef<const MCExpr *> Args, MCContext &Ctx);
+
+  static const AMDGPUVariadicMCExpr *createOr(ArrayRef<const MCExpr *> Args,
+                                              MCContext &Ctx) {
+    return create(VariadicKind::AGVK_Or, Args, Ctx);
+  }
+
+  static const AMDGPUVariadicMCExpr *createMax(ArrayRef<const MCExpr *> Args,
+                                               MCContext &Ctx) {
+    return create(VariadicKind::AGVK_Max, Args, Ctx);
+  }
+
+  VariadicKind getKind() const { return Kind; }
+  const MCExpr *getSubExpr(size_t Index) const;
+
+  void printImpl(raw_ostream &OS, const MCAsmInfo *MAI) const override;
+  bool evaluateAsRelocatableImpl(MCValue &Res, const MCAsmLayout *Layout,
+                                 const MCFixup *Fixup) const override;
+  void visitUsedExpr(MCStreamer &Streamer) const override;
+  MCFragment *findAssociatedFragment() const override;
+  void fixELFSymbolsInTLSFixups(MCAssembler &) const override{};
+
+  static bool classof(const MCExpr *E) {
+    return E->getKind() == MCExpr::Target;
+  }
+};
+
+} // end namespace llvm
+
+#endif // LLVM_LIB_TARGET_AMDGPU_MCTARGETDESC_AMDGPUMCEXPR_H
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt b/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
index 5dc76071b0594e..0842a58f794b32 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
@@ -5,6 +5,7 @@ add_llvm_component_library(LLVMAMDGPUDesc
   AMDGPUInstPrinter.cpp
   AMDGPUMCAsmInfo.cpp
   AMDGPUMCCodeEmitter.cpp
+  AMDGPUMCExpr.cpp
   AMDGPUMCTargetDesc.cpp
   AMDGPUTargetStreamer.cpp
   R600InstPrinter.cpp
diff --git a/llvm/test/MC/AMDGPU/mcexpr_amd.s b/llvm/test/MC/AMDGPU/mcexpr_amd.s
new file mode 100644
index 00000000000000..a9639c3acc305a
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/mcexpr_amd.s
@@ -0,0 +1,130 @@
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa < %s | FileCheck --check-prefix=ASM %s
+// RUN: llvm-mc -triple amdgcn-amd-amdhsa -filetype=obj < %s > %t
+// RUN: llvm-objdump --syms %t | FileCheck --check-prefix=OBJDUMP %s
+
+// OBJDUMP: SYMBOL TABLE:
+// OBJDUMP-NEXT: 0000000000000000 l       *ABS*  0000000000000000 zero
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 one
+// OBJDUMP-NEXT: 0000000000000002 l       *ABS*  0000000000000000 two
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 three
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 i64_max
+// OBJDUMP-NEXT: 8000000000000000 l       *ABS*  0000000000000000 i64_min
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 max_expression_all
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 five
+// OBJDUMP-NEXT: 0000000000000004 l       *ABS*  0000000000000000 four
+// OBJDUMP-NEXT: 0000000000000002 l       *ABS*  0000000000000000 max_expression_two
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 max_expression_one
+// OBJDUMP-NEXT: 000000000000000a l       *ABS*  0000000000000000 max_literals
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 max_with_max_sym
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 max
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 neg_one
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 max_neg_numbers
+// OBJDUMP-NEXT: ffffffffffffffff l       *ABS*  0000000000000000 max_neg_number
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 max_with_subexpr
+// OBJDUMP-NEXT: 0000000000000006 l       *ABS*  0000000000000000 max_as_subexpr
+// OBJDUMP-NEXT: 0000000000000005 l       *ABS*  0000000000000000 max_recursive_subexpr
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_one_max
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_two_max
+// OBJDUMP-NEXT: 7fffffffffffffff l       *ABS*  0000000000000000 max_expr_three_max
+// OBJDUMP-NEXT: 8000000000000000 l       *ABS*  0000000000000000 max_expr_one_min
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 max_expr_two_min
+// OBJDUMP-NEXT: 0000000000989680 l       *ABS*  0000000000000000 max_expr_three_min
+// OBJDUMP-NEXT: 0000000000000007 l       *ABS*  0000000000000000 or_expression_all
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 or_expression_two
+// OBJDUMP-NEXT: 0000000000000001 l       *ABS*  0000000000000000 or_expression_one
+// OBJDUMP-NEXT: 000000000000000f l       *ABS*  0000000000000000 or_literals
+// OBJDUMP-NEXT: 0000000000000000 l       *ABS*  0000000000000000 or_false
+// OBJDUMP-NEXT: 00000000000000ff l       *ABS*  0000000000000000 or_with_or_sym
+// OBJDUMP-NEXT: 00000000000000ff l       *ABS*  0000000000000000 or
+// OBJDUMP-NEXT: 0000000000000003 l       *ABS*  0000000000000000 or_with_subexpr
+// OBJDUMP-NEXT: 0000000000000008 l       *ABS*  0000000000000000 or_as_subexpr
+// OBJDUMP-NEXT: 0000000000000007 l       *ABS*  0000000000000000 or_recursive_subexpr
+
+// ASM: .set zero, 0
+// ASM: .set one, 1
+// ASM: .set two, 2
+// ASM: .set three, 3
+// ASM: .set i64_max, 9223372036854775807
+// ASM: .set i64_min, -9223372036854775808
+
+.set zero, 0
+.set one, 1
+.set two, 2
+.set three, 3
+.set i64_max, 0x7FFFFFFFFFFFFFFF
+.set i64_min, 0x8000000000000000
+
+// ASM: .set max_expression_all, max(1, 2, five, 3, four)
+// ASM: .set max_expression_two, 2
+// ASM: .set max_expression_one, 1
+// ASM: .set max_literals, 10
+// ASM: .set max_with_max_sym, max(max, 4, 3, 1, 2)
+
+.set max_expression_all, max(one, two, five, three, four)
+.set max_expression_two, max(one, two)
+.set max_expression_one, max(one)
+.set max_literals, max(1,2,3,4,5,6,7,8,9,10)
+.set max_with_max_sym, max(max, 4, 3, one, two)
+
+// ASM: .set max_neg_numbers, -1
+// ASM: .set max_neg_number, -1
+
+.set neg_one, -1
+.set max_neg_numbers, max(-5, -4, -3, -2, neg_one)
+.set max_neg_number, max(neg_one)
+
+// ASM: .set max_with_subexpr, 3
+// ASM: .set max_as_subexpr, 1+(max(4, 3, five))
+// ASM: .set max_recursive_subexpr, max(max(1, four), 3, max_expression_all)
+
+.set max_with_subexpr, max(((one | 3) << 3) / 8)
+.set max_as_subexpr, 1 + max(4, 3, five)
+.set max_recursive_subexpr, max(max(one, four), three, max_expression_all)
+
+// ASM: .set max_expr_one_max, 9223372036854775807
+// ASM: .set max_expr_two_max, max(9223372036854775807, five)
+// ASM: .set max_expr_three_max, max(9223372036854775807, five, 10000000)
+
+.set max_expr_one_max, max(i64_max)
+.set max_expr_two_max, max(i64_max, five)
+.set max_expr_three_max, max(i64_max, five, 10000000)
+
+// ASM: .set max_expr_one_min, -9223372036854775808
+// ASM: .set max_expr_two_min, 3
+// ASM: .set max_expr_three_min, 10000000
+
+.set max_expr_one_min, max(i64_min)
+.set max_expr_two_min, max(i64_min, three)
+.set max_expr_three_min, max(i64_min, three, 10000000)
+
+// ASM: .set or_expression_all, or(1, 2, five, 3, four)
+// ASM: .set or_expression_two, 3
+// ASM: .set or_expression_one, 1
+// ASM: .set or_literals, 15
+// ASM: .set or_false, 0
+// ASM: .set or_with_or_sym, or(or, 4, 3, 1, 2)
+
+.set or_expression_all, or(one, two, five, three, four)
+.set or_expression_two, or(one, two)
+.set or_expression_one, or(one)
+.set or_literals, or(1,2,3,4,5,6,7,8,9,10)
+.set or_false, or(zero, 0, (2-2), 5 > 6)
+.set or_with_or_sym, or(or, 4, 3, one, two)
+
+// ASM: .set or_with_subexpr, 3
+// ASM: .set or_as_subexpr, 1+(or(4, 3, five))
+// ASM: .set or_recursive_subexpr, or(or(1, four), 3, or_expression_all)
+
+.set or_with_subexpr, or(((one | 3) << 3) / 8)
+.set or_as_subexpr, 1 + or(4, 3, five)
+.set or_recursive_subexpr, or(or(one, four), three, or_expression_all)
+
+// ASM: .set four, 4
+// ASM: .set five, 5
+// ASM: .set max, 15
+// ASM: .set or, 255
+
+.set four, 4
+.set five, 5
+.set max, 0xF
+.set or, 0xFF
diff --git a/llvm/test/MC/AMDGPU/mcexpr_amd_err.s b/llvm/test/MC/AMDGPU/mcexpr_amd_err.s
new file mode 100644
index 00000000000000..ea02e013627218
--- /dev/null
+++ b/llvm/test/MC/AMDGPU/mcexpr_amd_err.s
@@ -0,0 +1,53 @@
+// RUN: not llvm-mc -triple amdgcn-amd-amdhsa %s 2>&1 | FileCheck --check-prefix=ASM %s
+
+.set one, 1
+.set two, 2
+.set three, 3
+
+.set max_empty, max()
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: empty max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set or_empty, or()
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: empty or expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_post_aux_comma, max(one,)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: mismatch of commas in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_pre_aux_comma, max(,one)
+// asm: :[[@line-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_double_comma, max(one,, two)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_no_comma, max(one two)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_missing_paren, max(two
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in max expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_expression_one, max(three, four,
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unknown token in expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set or_expression_one, or(four, five
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: unexpected token in or expression
+// ASM: :[[@LINE-2]]:{{[0-9]+}}: error: missing expression
+
+.set max_no_lparen, max four, five)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set max_no_paren, max one, two, three
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set max_rparen_only, max)
+// ASM: :[[@LINE-1]]:{{[0-9]+}}: error: expected newline
+
+.set four, 4
+.set five, 5

Copy link

github-actions bot commented Mar 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Comment on lines 26 to 28
const MCExpr **CtxArgs = new (Ctx) const MCExpr*[Args.size()];
for (size_t i = 0; i < Args.size(); ++i)
CtxArgs[i] = Args[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any special low level management of this should be down in the constructor; the SmallVector's allocator should be the Context's allocator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping so as well, but when I looked into it SmallVector did not have any way to provide an alternate allocator nor does MCContext provide the llvm Allocator interface. The BumpPtrAllocator used in MCContext is also not directly accessible leaving not that many options.

The AMDGPUVariadicMCExpr::create function is currently the only place that does the direct creation of the AMDGPUVariadicMCExpr and uses MCContext placement new so it seems the most fitting place for this copy code. For now I've replaced the manual copy with std::copy.

Do let me know if I'm missing something obvious in terms of allocator behaviour in SmallVector or ArrayRef.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just use a regular old array, plus Ctx.allocate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regular old array

Sorry, I'm not sure I understand for what aspect I'd use the array. I can, however, replace the placement new approach with one that uses Ctx.allocate.

Comment on lines 29 to 31
const MCExpr **CtxArgs = static_cast<const MCExpr **>(
Ctx.allocate(sizeof(const MCExpr *) * Args.size()));
std::uninitialized_copy(Args.begin(), Args.end(), CtxArgs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done in the AMDGPUVariadicMCExpr constructor. Also, I see MCContext does have operator new/delete wrappers around allocate.

Plus you're missing the deallocate which needs to go in the destructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm following convention for MCExpr to the extend that I can for AMDGPUVariadicMCExpr: no (target specific) MCExpr is provided the MCContext object and all explicitly have create functions to ensure the MCExpr object is allocated through MCContext (i.e., through the MCContext placement new). Additionally, MCTargetExpr requires subclasses to have trivial destructors as it is never deallocated. This is further reinforced as MCContext's deallocate function has an empty body.

Of course, I could be missing something but I do believe this is as good as it gets for supporting variadic arguments in a (target specific) MCExpr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regular non-target MCExpr's all have an MCContext argument, I don't see why target MCExprs shouldn't have one. It's not clear to me exactly what that comment means by "trivial". I would presume a destructor that calls a deallocate function which does nothing to be trivial. I think it's much better to properly express the alloc/dealloc in the constructor/destructor than relying on whatever the allocation behavior is today

Copy link
Contributor Author

@JanekvO JanekvO Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regular non-target MCExpr's all have an MCContext argument

I believe it's only their create functions that have it but it is not forwarded beyond that (i.e., to the MCExpr constructor).

It's not clear to me exactly what that comment means by "trivial"

I assume it's referring to the C++ standard definition of trivial destructors but I don't believe the MCTargetExpr class itself has a trivial destructor according to those rules.

I'm okay with going off-script with the AMDGPUVariadicMCExpr and adding the MCContext as member + user defined destructor, it doesn't trigger the asan leak either way.

@JanekvO JanekvO merged commit f7bebc1 into llvm:main Mar 14, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants