Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] #42747

Closed
martisch opened this issue Nov 20, 2020 · 20 comments · May be fixed by golang/sys#114
Closed

internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] #42747

martisch opened this issue Nov 20, 2020 · 20 comments · May be fixed by golang/sys#114
Labels
arch-arm64 FrozenDueToAge NeedsFix OS-Darwin release-blocker Security
Milestone

Comments

@martisch
Copy link
Contributor

martisch commented Nov 20, 2020

darwin/arm64 does not support CPUID or HWCAP CPU feature detection like other operating systems on arm64 currently supported by Go.

A new way using sysctlbyname will need to be implemented to detect if for example hw.optional.armv8_1_atomics is set to 1.

https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code

If someone has an M1 Apple Silicon Mac please post the output of:
sysctl -a | grep "hw\.optional"

@martisch martisch added the NeedsFix label Nov 20, 2020
@martisch martisch self-assigned this Nov 20, 2020
@martisch martisch added this to the Go1.17 milestone Nov 20, 2020
@FiloSottile
Copy link
Contributor

FiloSottile commented Nov 20, 2020

hw.optional.floatingpoint: 1
hw.optional.watchpoint: 4
hw.optional.breakpoint: 6
hw.optional.neon: 1
hw.optional.neon_hpfp: 1
hw.optional.neon_fp16: 1
hw.optional.armv8_1_atomics: 1
hw.optional.armv8_crc32: 1
hw.optional.armv8_2_fhm: 1
hw.optional.armv8_2_sha512: 1
hw.optional.armv8_2_sha3: 1
hw.optional.amx_version: 2
hw.optional.ucnormal_mem: 1
hw.optional.arm64: 1

from my MacBook Pro M1.

@FiloSottile FiloSottile modified the milestones: Go1.17, Go1.16 Nov 20, 2020
@FiloSottile FiloSottile changed the title internal/cpu: support darwin/arm64 CPU feature detection internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] Nov 20, 2020
@FiloSottile
Copy link
Contributor

FiloSottile commented Nov 20, 2020

I'd like to request a freeze exception for this. Without hardware support, AES is not only extremely slow, but also not constant time, which is a security issue. crypto/tls will switch away from it, but not all applications can dynamically fallback, and not all peers support good alternatives.

The implementation is Go 1.16 would have to be scoped to only darwin/arm64 (not replacing the one in the other OSes), but if it is I think it would have very limited chances of affecting other targets, and darwin/arm64 will inevitably be in flux during the freeze anyway.

/cc @rsc @golang/release

@mundaym
Copy link
Member

mundaym commented Nov 20, 2020

Can we just set aes, pmull, sha1 and sha2 to true for the darwin/arm64 target? It doesn't look like they are optional. That seems like a low risk change for Go 1.16 and then some sort of sysctl parsing could be added for Go 1.17 for atomics etc.

@martisch
Copy link
Contributor Author

martisch commented Nov 20, 2020

A low complexity workaround since there is only one Apple Silicon Chip (except the DTK which should be similar and testable) is that for darwin/arm64 we just hardcode the supported arm64 features in internal/cpu and @FiloSottile + @cherrymui can test that this works on their Apple Silicon Macs with ./all.bash. internal/cpu is only used by the standard library and I would expect any code dependent on those CPU features to have tests.

Then for go1.17 we can follow up with runtime CPU feature checks.

Im working on a CL to test the hard coding of CPU features.

@FiloSottile
Copy link
Contributor

FiloSottile commented Nov 20, 2020

We have to consider that Go 1.16 will have to work correctly until 2022, and while I don't expect new Apple Silicon chips to drop such core features, we don't really know in advance. Binaries built with Go 1.16 might also stick around longer than Go 1.16 itself, and this is logic that would end up embedded in all of them.

@gopherbot
Copy link

gopherbot commented Nov 20, 2020

Change https://golang.org/cl/271986 mentions this issue: internal/cpu: set features used by Apple Silicon M1 for darwin/arm64

@martisch
Copy link
Contributor Author

martisch commented Nov 20, 2020

I see these options to prevent regressions with future darwin/arm64 CPUs if we hard code the darwin/arm64 supported CPU features in Go 1,16:

  1. Create a minor version of Go 1.16 that sets specific features to always false again if Apple Silicon is sold that does not support the feature.

  2. GODEBUG=cpu.all=off will work to turn them off too. GODEBUG only refuses to turn on features that it didn't detect present.

  3. Find a way to reliably detect that Go 1.16 is running on Apple Silicon M1 (may be easier than CPU feature detection) and only then enable all CPU features accordingly without individual detection.

Note that we likely need an allowlist for AES or some kind of different check anyway in the future as AES is not shown in the hw.optional. list.

@dmitshur
Copy link
Contributor

dmitshur commented Nov 20, 2020

@FiloSottile Thanks for letting us know. We've discussed this on the release team, and based on the rationale you provided and the external constraints that could not be worked around in advance, we are leaning towards this freeze exception being granted.

Please keep the release schedule in mind and re-evaluate this as needed if this cannot be resolved without causing a significant additional delay or risk for Go 1.16.

Update: Approved.

@martisch martisch assigned martisch and unassigned martisch Nov 21, 2020
@martisch
Copy link
Contributor Author

martisch commented Nov 21, 2020

Looking further how to have runtime feature detection in darwin/arm64 we can do instead of hardcoding:

UPDATE: I have it working on darwin/amd64. Now need to wire it up for the posted arm64 sysctl names and then somebody needs to test it.

@gopherbot
Copy link

gopherbot commented Nov 21, 2020

Change https://golang.org/cl/271989 mentions this issue: internal/cpu: add darwin/arm64 CPU feature detection support

@martisch
Copy link
Contributor Author

martisch commented Nov 22, 2020

Since we still have some hardcoded CPU features (AES) for which no hw.optional setting seems to be available are there machdep.cpu fields that identify the CPU in sysctl -a on arm64? If so please post them.

What is the output of sysctl machdep.cpu.features on M1?

On amd64 Mac OS X I see:

machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz

@mlflr
Copy link

mlflr commented Nov 22, 2020

What is the output of sysctl machdep.cpu.features on M1?

Nothing (Mac Mini M1):

$ sysctl machdep.cpu.features
$ sysctl machdep.cpu
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple processor

@markmentovai
Copy link

markmentovai commented Nov 23, 2020

@mundaym #42747 (comment) :

Can we just set aes, pmull, sha1 and sha2 to true for the darwin/arm64 target? It doesn't look like they are optional. That seems like a low risk change for Go 1.16 and then some sort of sysctl parsing could be added for Go 1.17 for atomics etc.

I’ll go a step further and suggest that you can assume the presence of these instructions not just for Go 1.16 but for all time. There shouldn’t be any need for run-time detection of these features on mac-arm64. For that matter, although ios-arm64 has a lower baseline CPU than mac-arm64, these specific instructions are also guaranteed to be available on ios-arm64 as well.

In support, I offer clang and llvm, which in code contributed by Apple engineers:

In further support, none of the instructions under consideration (aes, pmull, sha1 or sha2) have a corresponding node in hw.optional. Detection isn’t necessary! As you’ll see from the above, cryptographic extensions (FeatureCrypto) are available on all Apple arm64 CPUs back to the very first one, the A7. All of these instructions fall under the ARMv8 cryptographic extension (ARM DDI 0487F.c §A2.3). If Apple clang is happy to emit these instructions (even if only via intrinsics without issuing a warning), golang should have no qualms either.

Similarly, even for some of the hw.optional nodes that are available, it’s completely safe to assume the presence of the feature without the need for a run-time check. hw.optional.floatingpoint and hw.optional.neon both appear, but they’re never going to be absent on mac-arm64.

@markmentovai
Copy link

markmentovai commented Nov 23, 2020

…and I’ll put my money where my mouth is: I assume the presence of crc32 in Chromium on mac-arm64.

The only reason I don’t assume pmull is that we don’t currently have any code that uses it, but there’s something in development that I expect to approve.

We’re also assuming ARMv8.3 as a baseline, for example via our use of fjcvtzs.

The clang/llvm references above support all of this, and in fact, if you’re using Apple clang (as opposed to open-source, which hasn’t caught up), you’ll even find the __ARM_ARCH_8_3__ macro defined.

@martisch
Copy link
Contributor Author

martisch commented Nov 23, 2020

@markmentovai Thanks for the information. I think https://go-review.googlesource.com/c/go/+/271989/7/src/internal/cpu/cpu_arm64_darwin.go that I came up with before your comment aligns with your comments. Go already assumes all arm64 support e.g. asimd and fp. The aes, pmull, sha1 or sha2 are set unconditionally to true on darwin and atomics and crc32 will be set depending on the existing hw.optional value.

I do not think we need to assume crc32 unconditionally as it would only save a syscall at runtime startup and checking for it keeps the property that for each CPU feature Go cares about to detect at runtime if there exist a hw.optional value it is checked.

@clausecker
Copy link

clausecker commented Nov 25, 2020

If this lands, similar changes should be made to x/sys/cpu as it suffers from the same issue. Should I open a new issue for that?

@FiloSottile
Copy link
Contributor

FiloSottile commented Dec 2, 2020

I find the arguments for none of these features being optional pretty convincing, and this late in the cycle I think the smaller CL 271986 would be a safer bet. If Apple's LLVM is happy assuming those features, we're fine doing the same.

@davidben
Copy link
Contributor

davidben commented Dec 3, 2020

In BoringSSL (and thus crypto in Chromium), we also use static armcaps for Apple ARM64 platforms. We look at __ARM_FEATURE_CRYPTO, __ARM_NEON, etc., preprocessor defines from the C compiler and exercise NEON, AES, PMULL, SHA-1, and SHA-2. (Whatever instructions the compiler believes exist in a build are baseline by construction, since the compiler feels free to emit them.) NEON's also baseline for all of ARMv8-A, not just Apple ARM64.

My impression so far (prior to ARM macOS) was that they largely used static armcaps, dispatching with universal binaries and the app store. Now they're on another generation of aarch64 features and macOS has an ARM port, it makes sense that they'd would shift more to sysctlbyname beyond that initial baseline.

(The joys of ARM's platform-specific feature detection...)

@dmitshur
Copy link
Contributor

dmitshur commented Dec 3, 2020

I've edited an earlier #42747 (comment) to clarify this freeze exception has been approved.

@martisch
Copy link
Contributor Author

martisch commented Dec 7, 2020

Created a separate issue for darwin/arm64 support in x/sys/cpu: #43046

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm64 FrozenDueToAge NeedsFix OS-Darwin release-blocker Security
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants