Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] #42747

Open
martisch opened this issue Nov 20, 2020 · 16 comments
Open

internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] #42747

martisch opened this issue Nov 20, 2020 · 16 comments

Comments

@martisch
Copy link
Contributor

@martisch martisch commented Nov 20, 2020

darwin/arm64 does not support CPUID or HWCAP CPU feature detection like other operating systems on arm64 currently supported by Go.

A new way using sysctlbyname will need to be implemented to detect if for example hw.optional.armv8_1_atomics is set to 1.

https://developer.apple.com/documentation/apple_silicon/addressing_architectural_differences_in_your_macos_code

If someone has an M1 Apple Silicon Mac please post the output of:
sysctl -a | grep "hw\.optional"

@martisch martisch added the NeedsFix label Nov 20, 2020
@martisch martisch self-assigned this Nov 20, 2020
@martisch martisch added this to the Go1.17 milestone Nov 20, 2020
@FiloSottile
Copy link
Member

@FiloSottile FiloSottile commented Nov 20, 2020

hw.optional.floatingpoint: 1
hw.optional.watchpoint: 4
hw.optional.breakpoint: 6
hw.optional.neon: 1
hw.optional.neon_hpfp: 1
hw.optional.neon_fp16: 1
hw.optional.armv8_1_atomics: 1
hw.optional.armv8_crc32: 1
hw.optional.armv8_2_fhm: 1
hw.optional.armv8_2_sha512: 1
hw.optional.armv8_2_sha3: 1
hw.optional.amx_version: 2
hw.optional.ucnormal_mem: 1
hw.optional.arm64: 1

from my MacBook Pro M1.

@FiloSottile FiloSottile modified the milestones: Go1.17, Go1.16 Nov 20, 2020
@FiloSottile FiloSottile changed the title internal/cpu: support darwin/arm64 CPU feature detection internal/cpu: support darwin/arm64 CPU feature detection [freeze exception] Nov 20, 2020
@FiloSottile
Copy link
Member

@FiloSottile FiloSottile commented Nov 20, 2020

I'd like to request a freeze exception for this. Without hardware support, AES is not only extremely slow, but also not constant time, which is a security issue. crypto/tls will switch away from it, but not all applications can dynamically fallback, and not all peers support good alternatives.

The implementation is Go 1.16 would have to be scoped to only darwin/arm64 (not replacing the one in the other OSes), but if it is I think it would have very limited chances of affecting other targets, and darwin/arm64 will inevitably be in flux during the freeze anyway.

/cc @rsc @golang/release

@mundaym
Copy link
Member

@mundaym mundaym commented Nov 20, 2020

Can we just set aes, pmull, sha1 and sha2 to true for the darwin/arm64 target? It doesn't look like they are optional. That seems like a low risk change for Go 1.16 and then some sort of sysctl parsing could be added for Go 1.17 for atomics etc.

@martisch
Copy link
Contributor Author

@martisch martisch commented Nov 20, 2020

A low complexity workaround since there is only one Apple Silicon Chip (except the DTK which should be similar and testable) is that for darwin/arm64 we just hardcode the supported arm64 features in internal/cpu and @FiloSottile + @cherrymui can test that this works on their Apple Silicon Macs with ./all.bash. internal/cpu is only used by the standard library and I would expect any code dependent on those CPU features to have tests.

Then for go1.17 we can follow up with runtime CPU feature checks.

Im working on a CL to test the hard coding of CPU features.

@FiloSottile
Copy link
Member

@FiloSottile FiloSottile commented Nov 20, 2020

We have to consider that Go 1.16 will have to work correctly until 2022, and while I don't expect new Apple Silicon chips to drop such core features, we don't really know in advance. Binaries built with Go 1.16 might also stick around longer than Go 1.16 itself, and this is logic that would end up embedded in all of them.

@gopherbot
Copy link

@gopherbot gopherbot commented Nov 20, 2020

Change https://golang.org/cl/271986 mentions this issue: internal/cpu: set features used by Apple Silicon M1 for darwin/arm64

@martisch
Copy link
Contributor Author

@martisch martisch commented Nov 20, 2020

I see these options to prevent regressions with future darwin/arm64 CPUs if we hard code the darwin/arm64 supported CPU features in Go 1,16:

  1. Create a minor version of Go 1.16 that sets specific features to always false again if Apple Silicon is sold that does not support the feature.

  2. GODEBUG=cpu.all=off will work to turn them off too. GODEBUG only refuses to turn on features that it didn't detect present.

  3. Find a way to reliably detect that Go 1.16 is running on Apple Silicon M1 (may be easier than CPU feature detection) and only then enable all CPU features accordingly without individual detection.

Note that we likely need an allowlist for AES or some kind of different check anyway in the future as AES is not shown in the hw.optional. list.

@dmitshur
Copy link
Member

@dmitshur dmitshur commented Nov 20, 2020

@FiloSottile Thanks for letting us know. We've discussed this on the release team, and based on the rationale you provided and the external constraints that could not be worked around in advance, we are leaning towards this freeze exception being granted.

Please keep the release schedule in mind and re-evaluate this as needed if this cannot be resolved without causing a significant additional delay or risk for Go 1.16.

@dmitshur dmitshur added the Security label Nov 20, 2020
@martisch martisch assigned martisch and unassigned martisch Nov 21, 2020
@martisch
Copy link
Contributor Author

@martisch martisch commented Nov 21, 2020

Looking further how to have runtime feature detection in darwin/arm64 we can do instead of hardcoding:

UPDATE: I have it working on darwin/amd64. Now need to wire it up for the posted arm64 sysctl names and then somebody needs to test it.

@gopherbot
Copy link

@gopherbot gopherbot commented Nov 21, 2020

Change https://golang.org/cl/271989 mentions this issue: internal/cpu: add darwin/arm64 CPU feature detection support

@martisch
Copy link
Contributor Author

@martisch martisch commented Nov 22, 2020

Since we still have some hardcoded CPU features (AES) for which no hw.optional setting seems to be available are there machdep.cpu fields that identify the CPU in sysctl -a on arm64? If so please post them.

What is the output of sysctl machdep.cpu.features on M1?

On amd64 Mac OS X I see:

machdep.cpu.vendor: GenuineIntel
machdep.cpu.brand_string: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz
@mlflr
Copy link

@mlflr mlflr commented Nov 22, 2020

What is the output of sysctl machdep.cpu.features on M1?

Nothing (Mac Mini M1):

$ sysctl machdep.cpu.features
$ sysctl machdep.cpu
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple processor
@markmentovai
Copy link

@markmentovai markmentovai commented Nov 23, 2020

@mundaym #42747 (comment) :

Can we just set aes, pmull, sha1 and sha2 to true for the darwin/arm64 target? It doesn't look like they are optional. That seems like a low risk change for Go 1.16 and then some sort of sysctl parsing could be added for Go 1.17 for atomics etc.

I’ll go a step further and suggest that you can assume the presence of these instructions not just for Go 1.16 but for all time. There shouldn’t be any need for run-time detection of these features on mac-arm64. For that matter, although ios-arm64 has a lower baseline CPU than mac-arm64, these specific instructions are also guaranteed to be available on ios-arm64 as well.

In support, I offer clang and llvm, which in code contributed by Apple engineers:

In further support, none of the instructions under consideration (aes, pmull, sha1 or sha2) have a corresponding node in hw.optional. Detection isn’t necessary! As you’ll see from the above, cryptographic extensions (FeatureCrypto) are available on all Apple arm64 CPUs back to the very first one, the A7. All of these instructions fall under the ARMv8 cryptographic extension (ARM DDI 0487F.c §A2.3). If Apple clang is happy to emit these instructions (even if only via intrinsics without issuing a warning), golang should have no qualms either.

Similarly, even for some of the hw.optional nodes that are available, it’s completely safe to assume the presence of the feature without the need for a run-time check. hw.optional.floatingpoint and hw.optional.neon both appear, but they’re never going to be absent on mac-arm64.

@markmentovai
Copy link

@markmentovai markmentovai commented Nov 23, 2020

…and I’ll put my money where my mouth is: I assume the presence of crc32 in Chromium on mac-arm64.

The only reason I don’t assume pmull is that we don’t currently have any code that uses it, but there’s something in development that I expect to approve.

We’re also assuming ARMv8.3 as a baseline, for example via our use of fjcvtzs.

The clang/llvm references above support all of this, and in fact, if you’re using Apple clang (as opposed to open-source, which hasn’t caught up), you’ll even find the __ARM_ARCH_8_3__ macro defined.

@martisch
Copy link
Contributor Author

@martisch martisch commented Nov 23, 2020

@markmentovai Thanks for the information. I think https://go-review.googlesource.com/c/go/+/271989/7/src/internal/cpu/cpu_arm64_darwin.go that I came up with before your comment aligns with your comments. Go already assumes all arm64 support e.g. asimd and fp. The aes, pmull, sha1 or sha2 are set unconditionally to true on darwin and atomics and crc32 will be set depending on the existing hw.optional value.

I do not think we need to assume crc32 unconditionally as it would only save a syscall at runtime startup and checking for it keeps the property that for each CPU feature Go cares about to detect at runtime if there exist a hw.optional value it is checked.

@clausecker
Copy link

@clausecker clausecker commented Nov 25, 2020

If this lands, similar changes should be made to x/sys/cpu as it suffers from the same issue. Should I open a new issue for that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants
You can’t perform that action at this time.