Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++17] Support __GCC_[CON|DE]STRUCTIVE_SIZE #89446

Merged
merged 14 commits into from
Apr 26, 2024

Conversation

AaronBallman
Copy link
Collaborator

These macros are used by STL implementations to support implementation of std::hardware_destructive_interference_size and std::hardware_constructive_interference_size

Fixes #60174

These macros are used by STL implementations to support implementation
of std::hardware_destructive_interference_size and
std::hardware_constructive_interference_size

Fixes llvm#60174
@AaronBallman AaronBallman added c++17 clang:frontend Language frontend issues, e.g. anything involving "Sema" extension:gnu labels Apr 19, 2024
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Apr 19, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-libcxx
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-backend-sparc
@llvm/pr-subscribers-backend-m68k
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-systemz

@llvm/pr-subscribers-clang

Author: Aaron Ballman (AaronBallman)

Changes

These macros are used by STL implementations to support implementation of std::hardware_destructive_interference_size and std::hardware_constructive_interference_size

Fixes #60174


Full diff: https://github.com/llvm/llvm-project/pull/89446.diff

6 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+13)
  • (modified) clang/include/clang/Basic/TargetInfo.h (+5)
  • (modified) clang/lib/Frontend/InitPreprocessor.cpp (+7)
  • (modified) clang/test/Preprocessor/init-aarch64.c (+7-5)
  • (modified) clang/test/Preprocessor/init.c (+29-20)
  • (modified) clang/test/Preprocessor/predefined-win-macros.c (+4-2)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index aea8fda35bb29c..7414856fd6a03e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -89,6 +89,19 @@ sections with improvements to Clang's support for those languages.
 C++ Language Changes
 --------------------
 
+C++17 Feature Support
+^^^^^^^^^^^^^^^^^^^^^
+- Clang now exposes ``__GCC_DESTRUCTIVE_SIZE`` and ``__GCC_CONSTRUCTIVE_SIZE``
+  predefined macros to support standard library implementations of
+  ``std::hardware_destructive_interference_size`` and
+  ``std::hardware_constructive_interference_size``, respectively. These macros
+  are predefined in all C and C++ language modes. These macros can be
+  overridden on the command line with ``-D``, if desired. The values the macros
+  expand to are not stable between releases of Clang and do not need to match
+  the values produced by GCC, so these macros should not be used from header
+  files because they may not be stable across multiple TUs (the values may vary
+  based on compiler version as well as CPU tuning). #GH60174
+
 C++20 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index e1ef7454f01669..a7b112a41f1b99 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1792,6 +1792,11 @@ class TargetInfo : public TransferrableTargetInfo,
   /// Whether to support HIP image/texture API's.
   virtual bool hasHIPImageSupport() const { return true; }
 
+  /// The minimum offset between two objects to avoid false sharing.
+  virtual unsigned hardwareDestructiveInterferenceSize() const { return 64; }
+  // The maximum size of contiguous memory to promote true sharing.
+  virtual unsigned hardwareConstructiveInterferenceSize() const { return 64; }
+
 protected:
   /// Copy type and layout related info.
   void copyAuxTarget(const TargetInfo *Aux);
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp b/clang/lib/Frontend/InitPreprocessor.cpp
index 4f44c3b7b89d4d..b88cd86e09fa1e 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -1309,6 +1309,13 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
     Builder.defineMacro("__GCC_ATOMIC_TEST_AND_SET_TRUEVAL", "1");
   }
 
+  // GCC defines these macros in both C and C++ modes despite them being needed
+  // mostly for STL implementations in C++.
+  Builder.defineMacro("__GCC_DESTRUCTIVE_SIZE",
+                      Twine(TI.hardwareDestructiveInterferenceSize()));
+  Builder.defineMacro("__GCC_CONSTRUCTIVE_SIZE",
+                      Twine(TI.hardwareConstructiveInterferenceSize()));
+
   auto addLockFreeMacros = [&](const llvm::Twine &Prefix) {
     // Used by libc++ and libstdc++ to implement ATOMIC_<foo>_LOCK_FREE.
 #define DEFINE_LOCK_FREE_MACRO(TYPE, Type)                                     \
diff --git a/clang/test/Preprocessor/init-aarch64.c b/clang/test/Preprocessor/init-aarch64.c
index cf96870b27acb3..f0845985c9efc3 100644
--- a/clang/test/Preprocessor/init-aarch64.c
+++ b/clang/test/Preprocessor/init-aarch64.c
@@ -119,6 +119,8 @@
 // AARCH64-NEXT: #define __FP_FAST_FMA 1
 // AARCH64-NEXT: #define __FP_FAST_FMAF 1
 // AARCH64-NEXT: #define __GCC_ASM_FLAG_OUTPUTS__ 1
+// AARCH64-NEXT: #define __GCC_CONSTRUCTIVE_SIZE {{.+}}
+// AARCH64-NEXT: #define __GCC_DESTRUCTIVE_SIZE {{.+}}
 // AARCH64-NEXT: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
 // AARCH64-NEXT: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 1
 // AARCH64-NEXT: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
@@ -220,11 +222,11 @@
 // AARCH64-NEXT: #define __LONG_MAX__ 9223372036854775807L
 // AARCH64-NEXT: #define __LONG_WIDTH__ 64
 // AARCH64-NEXT: #define __LP64__ 1
-// AARCH64-NEXT: #define __MEMORY_SCOPE_DEVICE 1 
-// AARCH64-NEXT: #define __MEMORY_SCOPE_SINGLE 4 
-// AARCH64-NEXT: #define __MEMORY_SCOPE_SYSTEM 0 
-// AARCH64-NEXT: #define __MEMORY_SCOPE_WRKGRP 2 
-// AARCH64-NEXT: #define __MEMORY_SCOPE_WVFRNT 3 
+// AARCH64-NEXT: #define __MEMORY_SCOPE_DEVICE 1
+// AARCH64-NEXT: #define __MEMORY_SCOPE_SINGLE 4
+// AARCH64-NEXT: #define __MEMORY_SCOPE_SYSTEM 0
+// AARCH64-NEXT: #define __MEMORY_SCOPE_WRKGRP 2
+// AARCH64-NEXT: #define __MEMORY_SCOPE_WVFRNT 3
 // AARCH64-NEXT: #define __NO_INLINE__ 1
 // AARCH64-NEXT: #define __NO_MATH_ERRNO__ 1
 // AARCH64-NEXT: #define __OBJC_BOOL_IS_BOOL 0
diff --git a/clang/test/Preprocessor/init.c b/clang/test/Preprocessor/init.c
index c4a55efca6f712..2641fee940231f 100644
--- a/clang/test/Preprocessor/init.c
+++ b/clang/test/Preprocessor/init.c
@@ -1,3 +1,10 @@
+// RUN: %clang_cc1 -E -dM < /dev/null | FileCheck -match-full-lines -check-prefix INTERFERENCE %s
+//
+// We purposefully do not test the values produced, only that the macros are
+// predefined to some value.
+// INTERFERENCE:#define __GCC_CONSTRUCTIVE_SIZE {{.+}}
+// INTERFERENCE:#define __GCC_DESTRUCTIVE_SIZE {{.+}}
+
 // RUN: %clang_cc1 -E -dM -x assembler-with-cpp < /dev/null | FileCheck -match-full-lines -check-prefix ASM %s
 //
 // ASM:#define __ASSEMBLER__ 1
@@ -1697,6 +1704,8 @@
 // WEBASSEMBLY-NEXT:#define __GCC_ATOMIC_SHORT_LOCK_FREE 2
 // WEBASSEMBLY-NEXT:#define __GCC_ATOMIC_TEST_AND_SET_TRUEVAL 1
 // WEBASSEMBLY-NEXT:#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
+// WEBASSEMBLY-NEXT:#define __GCC_CONSTRUCTIVE_SIZE {{.+}}
+// WEBASSEMBLY-NEXT:#define __GCC_DESTRUCTIVE_SIZE {{.+}}
 // WEBASSEMBLY-NEXT:#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
 // WEBASSEMBLY-NEXT:#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
 // WEBASSEMBLY-NEXT:#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
@@ -1806,11 +1815,11 @@
 // WEBASSEMBLY64-NEXT:#define __LONG_MAX__ 9223372036854775807L
 // WEBASSEMBLY64-NEXT:#define __LONG_WIDTH__ 64
 // WEBASSEMBLY64-NEXT:#define __LP64__ 1
-// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_DEVICE 1 
-// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_SINGLE 4 
-// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_SYSTEM 0 
-// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_WRKGRP 2 
-// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_WVFRNT 3 
+// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_DEVICE 1
+// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_SINGLE 4
+// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_SYSTEM 0
+// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_WRKGRP 2
+// WEBASSEMBLY-NEXT:#define __MEMORY_SCOPE_WVFRNT 3
 // WEBASSEMBLY-NEXT:#define __NO_INLINE__ 1
 // WEBASSEMBLY-NEXT:#define __NO_MATH_ERRNO__ 1
 // WEBASSEMBLY-NEXT:#define __OBJC_BOOL_IS_BOOL 0
@@ -2126,11 +2135,11 @@
 // AVR:#define __LDBL_MIN__ 1.17549435e-38L
 // AVR:#define __LONG_LONG_MAX__ 9223372036854775807LL
 // AVR:#define __LONG_MAX__ 2147483647L
-// AVR:#define __MEMORY_SCOPE_DEVICE 1 
-// AVR:#define __MEMORY_SCOPE_SINGLE 4 
-// AVR:#define __MEMORY_SCOPE_SYSTEM 0 
-// AVR:#define __MEMORY_SCOPE_WRKGRP 2 
-// AVR:#define __MEMORY_SCOPE_WVFRNT 3 
+// AVR:#define __MEMORY_SCOPE_DEVICE 1
+// AVR:#define __MEMORY_SCOPE_SINGLE 4
+// AVR:#define __MEMORY_SCOPE_SYSTEM 0
+// AVR:#define __MEMORY_SCOPE_WRKGRP 2
+// AVR:#define __MEMORY_SCOPE_WVFRNT 3
 // AVR:#define __NO_INLINE__ 1
 // AVR:#define __ORDER_BIG_ENDIAN__ 4321
 // AVR:#define __ORDER_LITTLE_ENDIAN__ 1234
@@ -2422,11 +2431,11 @@
 // RISCV32: #define __LITTLE_ENDIAN__ 1
 // RISCV32: #define __LONG_LONG_MAX__ 9223372036854775807LL
 // RISCV32: #define __LONG_MAX__ 2147483647L
-// RISCV32: #define __MEMORY_SCOPE_DEVICE 1 
-// RISCV32: #define __MEMORY_SCOPE_SINGLE 4 
-// RISCV32: #define __MEMORY_SCOPE_SYSTEM 0 
-// RISCV32: #define __MEMORY_SCOPE_WRKGRP 2 
-// RISCV32: #define __MEMORY_SCOPE_WVFRNT 3 
+// RISCV32: #define __MEMORY_SCOPE_DEVICE 1
+// RISCV32: #define __MEMORY_SCOPE_SINGLE 4
+// RISCV32: #define __MEMORY_SCOPE_SYSTEM 0
+// RISCV32: #define __MEMORY_SCOPE_WRKGRP 2
+// RISCV32: #define __MEMORY_SCOPE_WVFRNT 3
 // RISCV32: #define __NO_INLINE__ 1
 // RISCV32: #define __POINTER_WIDTH__ 32
 // RISCV32: #define __PRAGMA_REDEFINE_EXTNAME 1
@@ -2634,11 +2643,11 @@
 // RISCV64: #define __LONG_LONG_MAX__ 9223372036854775807LL
 // RISCV64: #define __LONG_MAX__ 9223372036854775807L
 // RISCV64: #define __LP64__ 1
-// RISCV64: #define __MEMORY_SCOPE_DEVICE 1 
-// RISCV64: #define __MEMORY_SCOPE_SINGLE 4 
-// RISCV64: #define __MEMORY_SCOPE_SYSTEM 0 
-// RISCV64: #define __MEMORY_SCOPE_WRKGRP 2 
-// RISCV64: #define __MEMORY_SCOPE_WVFRNT 3 
+// RISCV64: #define __MEMORY_SCOPE_DEVICE 1
+// RISCV64: #define __MEMORY_SCOPE_SINGLE 4
+// RISCV64: #define __MEMORY_SCOPE_SYSTEM 0
+// RISCV64: #define __MEMORY_SCOPE_WRKGRP 2
+// RISCV64: #define __MEMORY_SCOPE_WVFRNT 3
 // RISCV64: #define __NO_INLINE__ 1
 // RISCV64: #define __POINTER_WIDTH__ 64
 // RISCV64: #define __PRAGMA_REDEFINE_EXTNAME 1
diff --git a/clang/test/Preprocessor/predefined-win-macros.c b/clang/test/Preprocessor/predefined-win-macros.c
index b830dc39d477dd..14e2f584bd0938 100644
--- a/clang/test/Preprocessor/predefined-win-macros.c
+++ b/clang/test/Preprocessor/predefined-win-macros.c
@@ -3,7 +3,7 @@
 // RUN: %clang_cc1 %s -x c++ -E -dM -triple x86_64-pc-win32 -fms-extensions -fms-compatibility \
 // RUN:     -fms-compatibility-version=19.00 -std=c++14 -o - | FileCheck -match-full-lines %s --check-prefix=CHECK-MS64
 // RUN: %clang_cc1 %s -x c++ -E -dM -triple x86_64-pc-win32 -fms-extensions -fms-compatibility \
-// RUN:     -fms-compatibility-version=19.00 -std=c++14 -o - | grep GCC | count 5
+// RUN:     -fms-compatibility-version=19.00 -std=c++14 -o - | grep GCC | count 7
 // CHECK-MS64: #define _INTEGRAL_MAX_BITS 64
 // CHECK-MS64: #define _ISO_VOLATILE 1
 // CHECK-MS64: #define _MSC_EXTENSIONS 1
@@ -26,7 +26,7 @@
 // RUN: %clang_cc1 %s -x c++ -E -dM -triple i686-pc-win32 -fms-extensions -fms-compatibility \
 // RUN:     -fms-compatibility-version=19.00 -std=c++17 -o - | FileCheck -match-full-lines %s --check-prefix=CHECK-MS
 // RUN: %clang_cc1 %s -x c++ -E -dM -triple i686-pc-win32 -fms-extensions -fms-compatibility \
-// RUN:     -fms-compatibility-version=19.00 -std=c++17 -o - | grep GCC | count 5
+// RUN:     -fms-compatibility-version=19.00 -std=c++17 -o - | grep GCC | count 7
 // CHECK-MS: #define _INTEGRAL_MAX_BITS 64
 // CHECK-MS: #define _ISO_VOLATILE 1
 // CHECK-MS: #define _MSC_EXTENSIONS 1
@@ -39,6 +39,8 @@
 // CHECK-MS-NOT: GNU
 // CHECK-MS-NOT: GXX
 // CHECK-MS: #define __GCC_ASM_FLAG_OUTPUTS__ 1
+// CHECK-MS: #define __GCC_CONSTRUCTIVE_SIZE {{.+}}
+// CHECK-MS: #define __GCC_DESTRUCTIVE_SIZE {{.+}}
 // CHECK-MS: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
 // CHECK-MS: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
 // CHECK-MS: #define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1

@AaronBallman
Copy link
Collaborator Author

During today's office hours, @philnik777 mentioned that support for these macros would be very beneficial to libc++ folks. This is intended as a starting point for the support in that it does not currently have any special logic for any particular target and it does not have a diagnostic for using these macros within a header file.

In support of getting some target-specific values that make sense, I'm hopeful that the following folks can suggest proper values for the targets they know about (I pulled this list from the LLVM code owners list, so if there are others I should CC, please add them): @arsenm @asb @asl @jholewinski @mshockwave @SixWeining @benshi001 @kaz7 @TNorthover @nigelp-xmos @jpienaar @phoebewang @andreisfr @markschimmel @uweigand @wuzish @iliya-diyachkov

If any of you are able to let me know what the minimum offset is between two objects to avoid false sharing and what the maximum size of contiguous memory is to promote true sharing for your target, I can add it as part of this PR. If you don't know or are happy with the defaults I picked, that's fine too. I figure we can improve these values as we see fit. And if the answer is "it's complicated", I'd appreciate if we could handle that in a follow-up patch so we can get libc++ folks unblocked.

@phoebewang
Copy link
Contributor

Sorry, I know nothing about it. But it looks to me it's to match with GCC, why don't borrow the value from GCC as a beginning?

@cor3ntin
Copy link
Contributor

@AaronBallman Did you read https://discourse.llvm.org/t/rfc-c-17-hardware-constructive-destructive-interference-size/48674/42 ? I want to make sure we do abide by it and it's unclear a consensus was ever formed or called

@uweigand
Copy link
Member

For SystemZ the correct value is 256. In general I agree it makes sense to look at the GCC implementation as a source of reasonable values. Also, I think there probably should be no generic default value at all - it there is no platform-specific value known, it seems better to not define those values rather than define them to some incorrect value ...

@arsenm
Copy link
Contributor

arsenm commented Apr 22, 2024

For AMDGPU 64 is probably right

@AaronBallman
Copy link
Collaborator Author

@AaronBallman Did you read https://discourse.llvm.org/t/rfc-c-17-hardware-constructive-destructive-interference-size/48674/42 ? I want to make sure we do abide by it and it's unclear a consensus was ever formed or called

Yes, I looked at the thread and while there are very valid criticisms of what was standardized, the end result is still that we need this support 1) for Clang to support libc++ implementing a C++17 feature for conformance reasons and 2) for GCC compatibility (re: the macros that are exposed). We can either let libc++ solve this problem entirely on their own and not aim for GCC compatibility here, or we can bite the bullet and do our best. Given that GCC exposes these macros, I think it makes more sense for Clang to predefine them than for libc++ to try to do this from their end.

@AaronBallman
Copy link
Collaborator Author

Sorry, I know nothing about it. But it looks to me it's to match with GCC, why don't borrow the value from GCC as a beginning?

Because the values depend on the target and the compiler options picked. I don't have that information myself, and given how many targets and options we support, I am hoping target owners can help rather than having to guess.

@AaronBallman
Copy link
Collaborator Author

For SystemZ the correct value is 256.

Thanks! Double-checking: for both constructive and destructive?

In general I agree it makes sense to look at the GCC implementation as a source of reasonable values. Also, I think there probably should be no generic default value at all - it there is no platform-specific value known, it seems better to not define those values rather than define them to some incorrect value ...

On the one hand, I agree. But then libc++ will still have to pick default values to expose (these are constexpr size_t variables in libc++ rather than macros or std::optional values), so that just moves the problem elsewhere. Also, as best I can tell, that doesn't seem to be how GCC behaves (at least, from trying various GCC flavors on Compiler Explorer).

@uweigand
Copy link
Member

For SystemZ the correct value is 256.

Thanks! Double-checking: for both constructive and destructive?

Yes, for both. Every SystemZ model (supported by GCC/LLVM) has a L1 cache line size of 256 bytes.

In general I agree it makes sense to look at the GCC implementation as a source of reasonable values. Also, I think there probably should be no generic default value at all - it there is no platform-specific value known, it seems better to not define those values rather than define them to some incorrect value ...

On the one hand, I agree. But then libc++ will still have to pick default values to expose (these are constexpr size_t variables in libc++ rather than macros or std::optional values), so that just moves the problem elsewhere. Also, as best I can tell, that doesn't seem to be how GCC behaves (at least, from trying various GCC flavors on Compiler Explorer).

Looks like GCC's libstdc++ will simply not provide those variables either if GCC does not know those values for the current target:

#ifdef __cpp_lib_hardware_interference_size // C++ >= 17 && defined(gcc_dest_sz)
  inline constexpr size_t hardware_destructive_interference_size = __GCC_DESTRUCTIVE_SIZE;
  inline constexpr size_t hardware_constructive_interference_size = __GCC_CONSTRUCTIVE_SIZE;
#endif // __cpp_lib_hardware_interference_size

@AaronBallman
Copy link
Collaborator Author

For SystemZ the correct value is 256.

Thanks! Double-checking: for both constructive and destructive?

Yes, for both. Every SystemZ model (supported by GCC/LLVM) has a L1 cache line size of 256 bytes.

Excellent, thank you!

In general I agree it makes sense to look at the GCC implementation as a source of reasonable values. Also, I think there probably should be no generic default value at all - it there is no platform-specific value known, it seems better to not define those values rather than define them to some incorrect value ...

On the one hand, I agree. But then libc++ will still have to pick default values to expose (these are constexpr size_t variables in libc++ rather than macros or std::optional values), so that just moves the problem elsewhere. Also, as best I can tell, that doesn't seem to be how GCC behaves (at least, from trying various GCC flavors on Compiler Explorer).

Looks like GCC's libstdc++ will simply not provide those variables either if GCC does not know those values for the current target:

#ifdef __cpp_lib_hardware_interference_size // C++ >= 17 && defined(gcc_dest_sz)
  inline constexpr size_t hardware_destructive_interference_size = __GCC_DESTRUCTIVE_SIZE;
  inline constexpr size_t hardware_constructive_interference_size = __GCC_CONSTRUCTIVE_SIZE;
#endif // __cpp_lib_hardware_interference_size

@ldionne do you have a preference? I can always skip defining the predefined macros unless a target has opted into providing that information.

@cor3ntin
Copy link
Contributor

I think it makes sense to go with this patch and let target maintainers tweak the value subsequently.
It seems that trying to not provide a value unless explicitly specified by the target would be worse for library vendors as
hardware_destructive_interference_size is specified to exist unconditionally.

``std::hardware_destructive_interference_size`` and
``std::hardware_constructive_interference_size``, respectively. These macros
are predefined in all C and C++ language modes. These macros can be
overridden on the command line with ``-D``, if desired. The values the macros
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These macros can be overridden on the command line with -D, if desired.

I would rather we don't say that at it could lead to ODR violations.
IE consider alignas(std::hardware_destructive_interference_size()) struct S{}; with different values in different TUs

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, given that I'm not implementing the on-by-default diagnostic at this time, it's probably reasonable to remove that bit. Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this resolution.

In order for these values to be at all useful, we MUST reserve the ability to change these values arbitrarily based on CPU tuning flags or in new compiler versions. If we end up having a release (of clang OR libc++) that exposes these values without a warning, I think that is indeed likely to result in users triggering ODR problems.

But, the answer to this problem must be to implement a diagnostic to try to avoid ODR issues, not to hide that the values might vary across TUs in the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was only to not encourage using -D. The fact it's defined through a macro is really an implementation details, the macro is not meant to be user facing.

As for whether we need something specific for ODR here... we certainly do not for other abi-impacting feature.
you can already get into issues if the target is different in different TUs. Or if any define is different across TUs
(C++ modules already catch that sort of ODR issues when use in alignas etc)

I'm also not sure whether we usually warn when undocumented macros are used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to make sure I'm on the same page with what you're asking.

struct alignas(__GCC_CONSTRUCTIVE_SIZE) S {}; // #1
struct alignas(std::hardware_constructive_interference_size) T {}; // #2

static_assert(__GCC_CONSTRUCTIVE_SIZE > 0);  // #3
static_assert(std::hardware_constructive_interference_size > 0); // #4

There is nothing wrong with either (3) or (4) and I don't think we should diagnose that kind of use. So "any use in a header file" is too strict IMO. The problem with (1) and (2) is that it's a fragile diagnostic unless we're willing to put in significant effort.

The easy way to do (1) is to see if the operand to alignas (within a header) comes from a macro and see if the macro was named __GCC_CONSTRUCTIVE_SIZE (or destructive). However, that misses these kinds of scenarios:

#define DERP __GCC_CONSTRUCTIVE_SIZE
struct alignas(DERP) S {};
struct alignas(__GCC_CONSTRUCTIVE_SIZE + 0) T {};

(I'm not worried about people doing arithmetic in alignas getting false negatives, they earned them. But I am worried we're going to have to do macro chasing because of #if use.)

Resolving names in the STL is non-trivial thanks to inline namespaces, which makes (2) a bit of a challenge. But (2) runs into even worse issues than (1):

constexpr size_t MySize = std::hardware_constructive_interference_size;
struct alignas(MySize) S {};

and we're no longer doing macro chasing, now we're trying to unpack constant expressions.

Further, if the request is "ABI issues in general", all bets are off because people can do fantastic things like:

struct S {
  int array[std::hardware_constructive_interference_size];
};

and now we're playing whack-a-mole trying to identify safe vs unsafe uses in header files.

So I think the only thing that's reasonable to consider would be

struct alignas(__GCC_CONSTRUCTIVE_SIZE) S {};
struct alignas(std::hardware_constructive_interference_size) T{};

and nothing else, but that doesn't actually address your concern about catching ABI issues. FWIW, I think @cor3ntin is correct in that we don't really make special cases for all the other potentially ABI-terrible ideas people can come up with in a header file and there's nothing particularly special about this one.

We already have ODR checking facilities and that seems sufficient to catch issues that happen in practice. (Note, I'm not saying we should never add a diagnostic to help users catch potential ODR issues earlier. But I think it's not necessary for this PR to move forward and we should consider such diagnostics in a broader context than just this one narrow case.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was only to not encourage using -D. The fact it's defined through a macro is really an implementation details, the macro is not meant to be user facing.

There should to be a user-facing way to set this, in order to pin it to a particular value. If not -D, then e.g. copy GCC's --param hardware_destructive_interference_size=N command-line.

we don't really make special cases for all the other potentially ABI-terrible ideas people can come up with in a header file

IMO, the special thing about this is that it's exposed by the standard library in a way that invites and suggests the precise usage which is ABI-unstable. The purpose of hardware_destructive_interference_size is to put alignas(std::hardware_destructive_interference_size) on a struct. Yet, putting that into a header is a terrible idea almost all of the time -- it should be affected by -mtune, and may be changed by compiler upgrades. This is quite non-obvious issue for users, since the C++ stdlib generally goes to a huge amount of effort to remain ABI-stable for users. How do they know that this constant is not ABI-stable, unlike the entire rest of the library?

The way GCC handles it is to warn for any use of std::hardware_destructive_interference_size in a header -- but only that one, not hardware_constructive_interference_size. I think the rationale is that the documented/expected usage of destructive is to use to add padding and/or alignment, while the documented/expected usage of constructive is to static_assert that a struct is smaller than it. It actually doesn't
bother to warn on the internal _GCC* macros at all.

That may not be ideal, but it's a relatively simple solution that seems likely to be good enough. I think Clang could quite reasonably do the same thing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should to be a user-facing way to set this, in order to pin it to a particular value. If not -D, then e.g. copy GCC's --param hardware_destructive_interference_size=N command-line.

-D is my preference, mostly because --param is not something we already support for options. I could put something into the extensions documentation to mention this?

How do they know that this constant is not ABI-stable, unlike the entire rest of the library?

I would assume that anyone using this interface is aware of the fact that it's tightly coupled with the CPU tuning used by the program and I would hope that folks are compiling with sanitizers enabled so they'd catch ABI issues more broadly than just this one narrow case. At the same time, defense in depth is not a bad thing.

That may not be ideal, but it's a relatively simple solution that seems likely to be good enough. I think Clang could quite reasonably do the same thing.

I'll take a run at implementing that; it feels a bit user hostile because there are benign uses within a header file and the macro is exposed in C while the STL interface isn't, but it does provide some measure of protection.

@ldionne
Copy link
Member

ldionne commented Apr 22, 2024

Looks like GCC's libstdc++ will simply not provide those variables either if GCC does not know those values for the current target:

#ifdef __cpp_lib_hardware_interference_size // C++ >= 17 && defined(gcc_dest_sz)
  inline constexpr size_t hardware_destructive_interference_size = __GCC_DESTRUCTIVE_SIZE;
  inline constexpr size_t hardware_constructive_interference_size = __GCC_CONSTRUCTIVE_SIZE;
#endif // __cpp_lib_hardware_interference_size

@ldionne do you have a preference? I can always skip defining the predefined macros unless a target has opted into providing that information.

Not defining these variables when we don't have a good value for them would make us non-conforming, so I agree with @cor3ntin : I think we should define these macros all the time and simply document that it's a best effort thing. This whole feature is a best effort thing anyways.

I think it makes sense to go with this patch and let target maintainers tweak the value subsequently. It seems that trying to not provide a value unless explicitly specified by the target would be worse for library vendors as hardware_destructive_interference_size is specified to exist unconditionally.

Exactly. It's arguable whether this was a good decision on WG21's end or not, but it was specified as provided unconditionally by a conforming implementation, therefore that's what we need to do (IMO).

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want more feedback on that but otherwise, LGTM

@AaronBallman
Copy link
Collaborator Author

Here are the values that I found for GCC that don't match the defaults.

Platform Constructive Destructive Link
ARM64 64 256 https://godbolt.org/z/G7arf9nPP
AVR 32 32 https://godbolt.org/z/Yq64f9Koj
BPF 32 32 https://godbolt.org/z/zezK71zoe
M68K 32 32 https://godbolt.org/z/h9f4vaYcs
mips 32 32 https://godbolt.org/z/v3nfPMx9b
mips64 32 32 https://godbolt.org/z/7G66rc4nY
power 32 32 https://godbolt.org/z/bWjdosE3f
power64 128 128 https://godbolt.org/z/jT86TMYTK
RISC-V 32-bit 32 32 https://godbolt.org/z/neErxEbWP
RISV-V 64-bit 32 32 https://godbolt.org/z/nG5zjjxdo
s390x 256 256 https://godbolt.org/z/j34f3eEMz
sparc 32 32 https://godbolt.org/z/96hscbesM

Note, this is testing default configuration with no CPU tuning. I'll nab the values from here that I can map to our targets but target owners should validate these values and let me know if anything looks amiss.

ARM64
AVR
BPF
M68K
mips32
mips64
Power
Power64
RISC-V
SPARC
@philnik777
Copy link
Contributor

@AaronBallman thanks for picking this up! I agree with @cor3ntin and @ldionne that we should provide the macros unconditionally. This brings us one step closer to finally making libc++ fully conforming in C++17. With the PSTL finally making steady progress, we're probably only missing a few audits and the math special functions to be fully conforming.

@AaronBallman
Copy link
Collaborator Author

Precommit CI is currently failing on libc++ tests that are impacted by this change; should I solve those as part of this patch or are you okay with fixing those after this patch lands? (Or would you like to push a fix to this branch?)

@AaronBallman AaronBallman requested a review from a team as a code owner April 23, 2024 14:50
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Apr 23, 2024
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather keep the libc++ tests green while landing this PR. I tried something out, let's see if that works.

Otherwise LGTM, thanks for tackling this long-standing item on our TODO list :-)

Copy link
Member

@jyknight jyknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having the on-by-default diagnostic before we release this functionality is critically important -- but the primarily-useful part of that is going to be for the public libc++ APIs which expose these values (and, that already landed in mainline). This PR is almost an internal implementation detail, so doesn't really make things much worse than they are, so I don't feel like it needs to be held up by that.

I do really hope that diagnostic does get added soon.

I concur with previous comments that getting the values exactly right for each target in the first commit isn't super-important: each target maintainer may adjust as desired in the future, since these are abi-unstable values.

However, I do note that we already almost have the required data in the LLVM backend for many targets, via TargetTransformInfo::getCacheLineSize. That is only a single value, so it's only correct if requesting tuning for a single CPU-model, vs a generic tuning which must express the full range of possibilities. But, when collecting the values to use here, that existing data has been ignored, which I think is unfortunate. Ideally, we would actually share the data, rather than duplicating it in two different places. Though, due to architectural decisions made way-back-in-time, we often have trouble achieving actual sharing of backend-specific data...

But at the least, if we cannot actually share the same code, maybe we should attempt to be internally-consistent?

@@ -1792,6 +1792,11 @@ class TargetInfo : public TransferrableTargetInfo,
/// Whether to support HIP image/texture API's.
virtual bool hasHIPImageSupport() const { return true; }

/// The minimum offset between two objects to avoid false sharing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might add a comment here that this is not considered part of the ABI, and is okay to change, for clarity of future target maintainers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Done.

@@ -1792,6 +1792,11 @@ class TargetInfo : public TransferrableTargetInfo,
/// Whether to support HIP image/texture API's.
virtual bool hasHIPImageSupport() const { return true; }

/// The minimum offset between two objects to avoid false sharing.
virtual unsigned hardwareDestructiveInterferenceSize() const { return 64; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vaguely wonder if we should make this a single "hardwareInterferenceSizes" function, returning a std::pair<unsigned, unsigned>, because otherwise, when we start add per-CPU tuning to the values, it's very likely to result in duplicating the same series of conditionals in each of the two functions.

Also, with a single function, we are less likely to accidentally end up with an override that returns values such that destructive < constructive.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, thanks! Done.

Also, updated the comments on the base function to make it clear that
the values returned are not part of the ABI.
@cor3ntin
Copy link
Contributor

For the record, I am fairly opposed to a warning on normal use of a standard function.

I'm not very concerned about users using different target flags across TUs because this is already a massive foot gun and I do not buy that we are making things materially worse.

Instead, by having opinionated on-by-default warnings we encourage users to not care or proactively distrust our warnings. Users will use that functions in headers as a lot of people like to put most of their implementation in headers.

Don't get me wrong, I think diagnosing actual ODR-violation is a worthwhile goal, and as @AaronBallman mentioned
there are ways to do that (modules, asan, build system warning on inconsistent flags), but from within a single TU
we don't have a way to detect actual ODR issues, so we would end up forcing our opinions on users, which I'm not sure is marginally better than not implementing the feature at all.

I don't think this feature warrants deviating from our policy that on-by default warnings should be actionable and have a low false-positive rate, nor the complexity to actually produce the warning.

If we get users reporting issue with that feature, then we can always emit a deprecation warning.

@AaronBallman
Copy link
Collaborator Author

I don't think this feature warrants deviating from our policy that on-by default warnings should be actionable and have a low false-positive rate, nor the complexity to actually produce the warning.

Specific to this point, what concerns me is that there are uses within header files that are reasonable so long as the project is built for the same target. For example, we have internal header files in Clang and LLVM that could potentially make use of this; those headers are not exposed as part of our library interfaces, but we'd get these warnings and they would be false positives which would encourage us to disable the warning. So it's worth keeping in mind that there are false positives with this as well as true positives and I don't know of a good way to tell them apart aside from cross-TU checking like what is done with sanitizers.

@AaronBallman
Copy link
Collaborator Author

I'd rather keep the libc++ tests green while landing this PR. I tried something out, let's see if that works.

It looks like some further adjustments may be needed as some stage1 builders are failing:

# .---command stderr------------
# | /home/runner/_work/llvm-project/llvm-project/libcxx/test/std/language.support/support.dynamic/hardware_inference_size.compile.pass.cpp:16:32: error: no member named 'hardware_destructive_interference_size' in namespace 'std'
# |    16 | ASSERT_SAME_TYPE(decltype(std::hardware_destructive_interference_size), const std::size_t);
# |       |                           ~~~~~^
# | /home/runner/_work/llvm-project/llvm-project/libcxx/test/support/test_macros.h:271:48: note: expanded from macro 'ASSERT_SAME_TYPE'
# |   271 |     static_assert((test_macros_detail::is_same<__VA_ARGS__>::value), \
# |       |                                                ^~~~~~~~~~~
# | /home/runner/_work/llvm-project/llvm-project/libcxx/test/std/language.support/support.dynamic/hardware_inference_size.compile.pass.cpp:17:32: error: no member named 'hardware_constructive_interference_size' in namespace 'std'
# |    17 | ASSERT_SAME_TYPE(decltype(std::hardware_constructive_interference_size), const std::size_t);
# |       |                           ~~~~~^
# | /home/runner/_work/llvm-project/llvm-project/libcxx/test/support/test_macros.h:271:48: note: expanded from macro 'ASSERT_SAME_TYPE'
# |   271 |     static_assert((test_macros_detail::is_same<__VA_ARGS__>::value), \
# |       |                                                ^~~~~~~~~~~
# | 2 errors generated.
# `-----------------------------

I'm not quite certain what's causing the failure though

Otherwise LGTM, thanks for tackling this long-standing item on our TODO list :-)

Happy to help!

@ldionne
Copy link
Member

ldionne commented Apr 25, 2024

I'd rather keep the libc++ tests green while landing this PR. I tried something out, let's see if that works.

It looks like some further adjustments may be needed as some stage1 builders are failing:

[...]

Ah, yes, we're using the nightly clang which is labeled as clang-19 but doesn't contain the change yet. I'll mark the test as UNSUPPORTED on all Clang flavors for now (which is how the test should have been marked), and we will fix that in the patch that implements the library feature.

@AaronBallman
Copy link
Collaborator Author

I think this resolves all the outstanding issues. I plan to land these changes sometime today unless I hear otherwise (we can also address additional concerns post-commit).

Comment on lines 5586 to 5589
**Note: the values the macros expand to are not stable between releases of Clang
and do not need to match the values produced by GCC, so these macros should not
be used from header files because they may not be stable across multiple TUs
(the values may vary based on compiler version as well as CPU tuning).**
Copy link
Contributor

@cor3ntin cor3ntin Apr 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
**Note: the values the macros expand to are not stable between releases of Clang
and do not need to match the values produced by GCC, so these macros should not
be used from header files because they may not be stable across multiple TUs
(the values may vary based on compiler version as well as CPU tuning).**
**Note: the values the macros expand to are not guaranteed to be stable. They are are affected by architectures and CPU tuning flags, can change between releases of Clang and will not match the values defined by other compilers such as GCC.**
**Compiling different TUs depending on these flags (or `std::hardware_constructive_interference` , `std::hardware_destructive_interference`) with different compilers, defines or architecture flags will lead to ODR violations bugs and should be avoided.**

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@cor3ntin cor3ntin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AaronBallman AaronBallman merged commit 72c373b into llvm:main Apr 26, 2024
51 of 52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM backend:m68k backend:PowerPC backend:RISC-V backend:Sparc backend:SystemZ c++17 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category extension:gnu libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: Add cacheline size macros sufficient to support C++17 std::hardware_{con,de}structive_interference_size
9 participants