Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][driver] Allow unaligned access on ARMv7 and higher by default #82400

Conversation

ilovepi
Copy link
Contributor

@ilovepi ilovepi commented Feb 20, 2024

ARM's Clang and GCC embedded compilers default to allowing unaligned
access for ARMv7+. This patch changes the Clang driver default to match.
Users can opt out with -mno-unaligned-access.

Fixes #59560

Created using spr 1.3.4
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Feb 20, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 20, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Paul Kirth (ilovepi)

Changes

ARM's Clang and GCC embedded compilers default to allowing unaligned
access for ARMv7+. This patch changes the Clang driver default to match.
Users can opt out with -mno-unaligned-access.

Fixes #59560


Full diff: https://github.com/llvm/llvm-project/pull/82400.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Arch/ARM.cpp (+6-8)
  • (modified) clang/test/Driver/arm-alignment.c (+8)
diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
index e6ee2f88a84edf..3bf6056f0c3ecc 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -895,20 +895,18 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
     // defaults this bit to 0 and handles it as a system-wide (not
     // per-process) setting. It is therefore safe to assume that ARMv7+
     // Linux targets support unaligned accesses. The same goes for NaCl
-    // and Windows.
-    //
-    // The above behavior is consistent with GCC.
+    // and Windows. However, ARM's forks of GCC and Clang both allow
+    // unaligned accesses by default for all targets. We follow this
+    // behavior and enable unaligned accesses by default for ARMv7+ targets.
+    // Users can disable behavior via compiler options (-mno-unaliged-access).
+    // See https://github.com/llvm/llvm-project/issues/59560 for more info.
     int VersionNum = getARMSubArchVersionNumber(Triple);
     if (Triple.isOSDarwin() || Triple.isOSNetBSD()) {
       if (VersionNum < 6 ||
           Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m)
         Features.push_back("+strict-align");
-    } else if (Triple.isOSLinux() || Triple.isOSNaCl() ||
-               Triple.isOSWindows()) {
-      if (VersionNum < 7)
+    } else if (VersionNum < 7)
         Features.push_back("+strict-align");
-    } else
-      Features.push_back("+strict-align");
   }
 
   // llvm does not support reserving registers in general. There is support
diff --git a/clang/test/Driver/arm-alignment.c b/clang/test/Driver/arm-alignment.c
index 9177b625729b85..6d0084451e82c7 100644
--- a/clang/test/Driver/arm-alignment.c
+++ b/clang/test/Driver/arm-alignment.c
@@ -22,6 +22,14 @@
 // RUN: %clang -target armv7-windows -### %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-UNALIGNED-ARM < %t %s
 
+/// Ensure that by default before ARMv7 we default to +strict-align
+// RUN: %clang -target armv6 -### %s 2> %t
+// RUN: FileCheck --check-prefix=CHECK-ALIGNED-ARM < %t %s
+
+/// After ARMv7 by default we allow unaligned accesses for all targets
+// RUN: %clang -target armv7 -### %s 2> %t
+// RUN: FileCheck --check-prefix=CHECK-UNALIGNED-ARM < %t %s
+
 // RUN: %clang --target=aarch64 -munaligned-access -### %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-UNALIGNED-AARCH64 < %t %s
 

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 20, 2024

@llvm/pr-subscribers-backend-arm

Author: Paul Kirth (ilovepi)

Changes

ARM's Clang and GCC embedded compilers default to allowing unaligned
access for ARMv7+. This patch changes the Clang driver default to match.
Users can opt out with -mno-unaligned-access.

Fixes #59560


Full diff: https://github.com/llvm/llvm-project/pull/82400.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/Arch/ARM.cpp (+6-8)
  • (modified) clang/test/Driver/arm-alignment.c (+8)
diff --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
index e6ee2f88a84edf..3bf6056f0c3ecc 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -895,20 +895,18 @@ llvm::ARM::FPUKind arm::getARMTargetFeatures(const Driver &D,
     // defaults this bit to 0 and handles it as a system-wide (not
     // per-process) setting. It is therefore safe to assume that ARMv7+
     // Linux targets support unaligned accesses. The same goes for NaCl
-    // and Windows.
-    //
-    // The above behavior is consistent with GCC.
+    // and Windows. However, ARM's forks of GCC and Clang both allow
+    // unaligned accesses by default for all targets. We follow this
+    // behavior and enable unaligned accesses by default for ARMv7+ targets.
+    // Users can disable behavior via compiler options (-mno-unaliged-access).
+    // See https://github.com/llvm/llvm-project/issues/59560 for more info.
     int VersionNum = getARMSubArchVersionNumber(Triple);
     if (Triple.isOSDarwin() || Triple.isOSNetBSD()) {
       if (VersionNum < 6 ||
           Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m)
         Features.push_back("+strict-align");
-    } else if (Triple.isOSLinux() || Triple.isOSNaCl() ||
-               Triple.isOSWindows()) {
-      if (VersionNum < 7)
+    } else if (VersionNum < 7)
         Features.push_back("+strict-align");
-    } else
-      Features.push_back("+strict-align");
   }
 
   // llvm does not support reserving registers in general. There is support
diff --git a/clang/test/Driver/arm-alignment.c b/clang/test/Driver/arm-alignment.c
index 9177b625729b85..6d0084451e82c7 100644
--- a/clang/test/Driver/arm-alignment.c
+++ b/clang/test/Driver/arm-alignment.c
@@ -22,6 +22,14 @@
 // RUN: %clang -target armv7-windows -### %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-UNALIGNED-ARM < %t %s
 
+/// Ensure that by default before ARMv7 we default to +strict-align
+// RUN: %clang -target armv6 -### %s 2> %t
+// RUN: FileCheck --check-prefix=CHECK-ALIGNED-ARM < %t %s
+
+/// After ARMv7 by default we allow unaligned accesses for all targets
+// RUN: %clang -target armv7 -### %s 2> %t
+// RUN: FileCheck --check-prefix=CHECK-UNALIGNED-ARM < %t %s
+
 // RUN: %clang --target=aarch64 -munaligned-access -### %s 2> %t
 // RUN: FileCheck --check-prefix=CHECK-UNALIGNED-AARCH64 < %t %s
 

Copy link

github-actions bot commented Feb 20, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Created using spr 1.3.4
@davemgreen
Copy link
Collaborator

davemgreen commented Feb 20, 2024

Hi - I like the change. We have this code in the downstream compiler, which also enables this for Armv6, but specifically disables it for v6m and v8m.baseline.

      if (VersionNum < 6 ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v8m_baseline) {
        Features.push_back("+strict-align");
      }

I don't have a strong opinion about what happens with ARMv6, but this deserves a release note. And v6m/v8m.baseline probably deserve specific code and a test.

} else if (Triple.isOSLinux() || Triple.isOSNaCl() ||
Triple.isOSWindows()) {
if (VersionNum < 7)
} else if (VersionNum < 7)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite right. Armv6 (but not Armv6-M) can support unaligned accesses. V8-M.main (logical extension of Armv7-M) supports unaligned accesses but v8-M.base (logical extension of Armv6-M) does not. Yet both will get VersionNum of 8.

Although not quite at the same point Arm's downstream fork uses

if (VersionNum < 6 ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v8m_baseline) {
        Features.push_back("+strict-align");

I don't suppose many care about Arm v6 but we'll need to make sure that strict-align is added for v8-m.base.

Arm's fork of clang mentions this in the documentation for -mno-unaligned-access https://developer.arm.com/documentation/101754/0621/armclang-Reference/armclang-Command-line-Options/-munaligned-access---mno-unaligned-access

@ilovepi
Copy link
Contributor Author

ilovepi commented Feb 20, 2024

Hi - I like the change. We have this code in the downstream compiler, which also enables this for Armv6, but specifically disables it for v6m and v8m.baseline.

      if (VersionNum < 6 ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v8m_baseline) {
        Features.push_back("+strict-align");
      }

I don't have a strong opinion about what happens with ARMv6, but this deserves a release note. And v6m/v8m.baseline probably deserve specific code and a test.

Those are good points, echoed by @smithp35. I'll update the patch accordingly.

Off hand, do either of you know if there are other differences in the driver that that we would want to consider adopting?

@smithp35
Copy link
Collaborator

Off the top of my head we default to -ffunction-sections and -fdata-sections as this helps Garbage Collection, and we have a linker feature that can merge constant pool entries between functions that needs the gaps between the functions.

@@ -22,6 +22,14 @@
// RUN: %clang -target armv7-windows -### %s 2> %t
// RUN: FileCheck --check-prefix=CHECK-UNALIGNED-ARM < %t %s

/// Ensure that by default before ARMv7 we default to +strict-align
// RUN: %clang -target armv6 -### %s 2> %t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--target= for new tests.

-target has been deprecated since about Clang 3.4

Created using spr 1.3.4
@efriedma-quic
Copy link
Collaborator

Unaligned accesses require that SCTLR.A is 0, SCTLR.U is 1, and that the memory in question is configured as "normal" memory. Almost all operating systems do in fact configure their registers/memory this way, but on baremetal it's not really a safe assumption. Changing the default here is basically guaranteed to break someone's code.

We could make the value judgement that the performance gain outweighs breaking someone's code. Disabled unaligned access is a performance hit users have a hard time discovering; incorrectly enabled unaligned access will generate an obvious alignment fault on v7 and up.

On pre-v7 processors, you can get the old "rotate" behavior instead of a fault. This is controlled by SCTLR.U on v6, but I don't think there's any reason to expect the bit is configured the "right" way on baremetal. So changing the default for v6 is a bit more dubious.

The tradeoffs might be a bit different for M-class processors; I think unaligned access works by default there (except for v6m and v8m.baseline).

This change needs a release note.

Created using spr 1.3.4
Comment on lines 308 to 317
- ARMv6+ targets now default to allowing unaligned access, except Armv6-M, and
Armv8-M without the Main Extension. Baremetal targets should check that the
new default will work with their system configurations, since it requires
that SCTLR.A is 0, SCTLR.U is 1, and that the memory in question is
configured as "normal" memory. We've made the value judgment that the
performance gains here outweigh breakages, since it is difficult to identify
performance loss from disabling unaligned access, but incorrect enabling
unaligned access will generate an obvious alignment fault on ARMv7+. This is
also the default setting for ARM's downstream compilers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efriedma-quic @davemgreen Is this more in line with what you would expect in the release notes?

Comment on lines 895 to 898
if (VersionNum < 6 ||
Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m ||
Triple.getSubArch() ==
llvm::Triple::SubArchType::ARMSubArch_v8m_baseline)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efriedma-quic I'm not too sure what the correct check would be here given your comment. Maybe this?

if (Triple.isOSDarwin() || Triple.isOSNetBSD()) {
      if (VersionNum < 6 ||
          Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m)
        Features.push_back("+strict-align");
    } else if (VersionNum < 7 ||
        Triple.getSubArch() == llvm::Triple::SubArchType::ARMSubArch_v6m ||
        Triple.getSubArch() ==
            llvm::Triple::SubArchType::ARMSubArch_v8m_baseline)
        Features.push_back("+strict-align");
    } 

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on how we treat v6a. I guess I'd lean towards being conservative, I guess, like what you wrote?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good start. We can discuss relaxing our treatment of v6 later without much trouble. I'll go ahead and update the comments and release notes to match this a bit more closely.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unaligned accesses require that SCTLR.A is 0, SCTLR.U is 1, and that the memory in question is configured as "normal" memory. Almost all operating systems do in fact configure their registers/memory this way, but on baremetal it's not really a safe assumption. Changing the default here is basically guaranteed to break someone's code.

We could make the value judgement that the performance gain outweighs breaking someone's code. Disabled unaligned access is a performance hit users have a hard time discovering; incorrectly enabled unaligned access will generate an obvious alignment fault on v7 and up.

On pre-v7 processors, you can get the old "rotate" behavior instead of a fault. This is controlled by SCTLR.U on v6, but I don't think there's any reason to expect the bit is configured the "right" way on baremetal. So changing the default for v6 is a bit more dubious.

The tradeoffs might be a bit different for M-class processors; I think unaligned access works by default there (except for v6m and v8m.baseline).

This change needs a release note.

The issue that we have found is that as both GCC and Arm Compiler default to -munaligned-access, users expect it to be the default. They only notice the much bigger codesize/worse performance and don't understand the reason without a lot of digging. You are certainly right that someone who has been using clang bare metal in the past might hit problems with the new default, but there is a high chance they are just using the wrong option without noticing. And I believe aligning with GCC/Arm Compiler is a better default going forward, as more people start using Clang in bare metal. Hopefully the release note can at least make it clear.

M-Profile needs a bit set in the bootcode, IIRC.

Comment on lines 308 to 317
- ARMv7+ targets now default to allowing unaligned access, except Armv6-M, and
Armv8-M without the Main Extension. Baremetal targets should check that the
new default will work with their system configurations, since it requires
that SCTLR.A is 0, SCTLR.U is 1, and that the memory in question is
configured as "normal" memory. We've made the value judgment that the
performance gains here outweigh breakages, since it is difficult to identify
performance loss from disabling unaligned access, but incorrect enabling
unaligned access will generate an obvious alignment fault on ARMv7+. This is
also the default setting for ARM's downstream compilers. We have not changed
the default behavior for ARMv6, but may revisit that decision in the future.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would up-play the compatibility argument and downplay the judgement call a little. And mention the way to disable it. Maybe something like

- ARMv7+ targets now default to allowing unaligned access, except Armv6-M, and
  Armv8-M without the Main Extension. Baremetal targets should check that the
  new default will work with their system configurations, since it requires
  that SCTLR.A is 0, SCTLR.U is 1, and that the memory in question is
  configured as "normal" memory. This brings clang in-line with the default settings
  for GCC and Arm Compiler. The old behaviour can be restored with
  -mno-unaligned-access.

But you might want to re-add the reasoning about the performance/codesize loss.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggested edits. I've mostly kept them, but as you mentioned added a bit about the performance/size benefits.

// RUN: %clang --target=armv6 -### %s 2> %t
// RUN: FileCheck --check-prefix=CHECK-ALIGNED-ARM < %t %s

// RUN: %clang --target=armv7 -### %s 2> %t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some extra tests for things like 8-m.main, 8-m.base and 6-m, to make sure we have coverage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing that out. I had misread the RUN lines at the bottom of the file, and missed that they were emitting an error rather than checking the cc1 options. I used those same targets. Please let me know if I've missed any.

@davemgreen
Copy link
Collaborator

I would also personally add Armv6 too for consistency, but don't have a strong opinion.

@ilovepi
Copy link
Contributor Author

ilovepi commented Feb 22, 2024

I would also personally add Armv6 too for consistency, but don't have a strong opinion.

I think we should too, but everyone agrees this is a strict improvement, so I'd prefer to land something now and we can hash out the right thing for ARMv6 as a follow up. Does that sound good to you?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ilovepi ilovepi merged commit e314622 into main Feb 22, 2024
4 of 5 checks passed
@ilovepi ilovepi deleted the users/ilovepi/spr/clangdriver-allow-unaligned-access-on-armv7-and-higher-by-default branch February 22, 2024 22:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

-mcpu=cortex-m33 does not set -munaligned-access
7 participants