runtime: using AVX-512 instruction without supporting CPUID flag(s) on MacOS hangs the Go runtime #42649

Open
vsivsi opened this issue Nov 17, 2020 · 4 comments

@vsivsi vsivsi commented Nov 17, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15.5 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

MacOS 10.15.7

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/vsi/Library/Caches/go-build"
GOENV="/Users/vsi/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/vsi/go/pkg/mod"
GONOPROXY="github.com/vsivsi"
GONOSUMDB="github.com/vsivsi"
GOOS="darwin"
GOPATH="/Users/vsi/go"
GOPRIVATE="github.com/vsivsi"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.15.5/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.15.5/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/kp/kjdr0ytx5z9djnq4ysl15x0h0000gn/T/go-build367056703=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Attempted to use the Intel AVX-512 VPOPCNT family of instructions in Go assembler.

What did you expect to see?

Assembly code using these instructions should run properly on processors that support them, and should raise a #UD fault (SIGILL) and terminate when invoked on a CPU without support.

What did you see instead?

The Go runtime hangs forever at 100% CPU utilization upon executing a VPOPCNT(B/W/D/Q) instruction on hardware that doesn't support it. Tested on a Mac Pro (2019) with a 2.7 GHz 24-core Intel Xeon W CPU (Xeon W-3265M).

$ sysctl machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 FDPEO SMEP BMI2 ERMS INVPCID PQM FPU_CSDS MPX PQE AVX512F AVX512DQ RDSEED ADX SMAP CLFSOPT CLWB IPT AVX512CD AVX512BW AVX512VL PKU AVX512VNNI MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD

Note, this processor does not include AVX512_BITALG or AVX512_VPOPCNTDQ, which are required for VPOPCNT(B/W) and VPOPCNT(D/Q) respectively. For a summary of the VPOPCNT support matrix, see: https://github.com/HJLebbink/asm-dude/wiki/VPOPCNT
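
As a rough illustration of the check described above, here is a minimal Go sketch that looks for a feature name in the sysctl line. The helper name hasFeature and the constant reportedFeatures are made up for this example. The spellings AVX512_BITALG and AVX512_VPOPCNTDQ are the CPUID names used in this report; the exact tokens sysctl would print on a CPU that has these features are an assumption, so verify them before relying on this.

```go
package main

import (
	"fmt"
	"strings"
)

// reportedFeatures is the sysctl line copied from this report (Xeon W-3265M).
const reportedFeatures = "machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 FDPEO SMEP BMI2 ERMS INVPCID PQM FPU_CSDS MPX PQE AVX512F AVX512DQ RDSEED ADX SMAP CLFSOPT CLWB IPT AVX512CD AVX512BW AVX512VL PKU AVX512VNNI MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD"

// hasFeature (hypothetical helper) reports whether name appears as a whole
// token in the space-separated feature list. It accepts either the full
// "key: value" sysctl line or just the list of features.
func hasFeature(sysctlLine, name string) bool {
	// Drop the "machdep.cpu.leaf7_features:" key if it is present.
	if i := strings.Index(sysctlLine, ":"); i >= 0 {
		sysctlLine = sysctlLine[i+1:]
	}
	for _, tok := range strings.Fields(sysctlLine) {
		if tok == name {
			return true
		}
	}
	return false
}

func main() {
	// The last two spellings are assumptions about what sysctl would print.
	names := []string{"AVX512F", "AVX512BW", "AVX512_BITALG", "AVX512_VPOPCNTDQ"}
	for _, name := range names {
		fmt.Printf("%-18s %v\n", name, hasFeature(reportedFeatures, name))
	}
}
```

Matching whole tokens rather than substrings keeps AVX512F from being mistaken for any other AVX512* flag. If the assumed spelling is wrong, the check fails in the safe direction and reports the feature as absent.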

The Intel processor documentation says that attempting to run such AVX512 instructions when the supporting feature CPUID flags are not set should result in raising a #UD exception. As expected, directly executing the amd64 UD2 instruction causes the go runtime to abort with SIGILL: illegal instruction. But when unsupported, these AVX512 instructions cause the runtime to hang in a tight loop of some kind, which doesn't seem to be consistent or correct behavior.

Here is a dump from a process sample of the hung go runtime process resulting from the repro below.

/usr/bin/sample Output
Sampling process 40325 for 3 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
Analysis of sampling vpopcntw (pid 40325) every 1 millisecond
Process:         vpopcntw [40325]
Path:            /Users/USER/*/vpopcntw
Load Address:    0x1000000
Identifier:      vpopcntw
Version:         ???
Code Type:       X86-64
Parent Process:  zsh [37987]

Date/Time: 2020-11-16 15:50:45.036 -0800
Launch Time: 2020-11-16 15:50:29.269 -0800
OS Version: Mac OS X 10.15.7 (19H15)
Report Version: 7
Analysis Tool: /usr/bin/sample

Physical footprint: 1732K
Physical footprint (peak): 1732K

Call graph:
2819 Thread_355890 DispatchQueue_1: com.apple.main-thread (serial)
+ 2816 ??? (in ) [0xc00009e7d0]
+ ! 2816 main.popcnt (in vpopcntw) + 0 [0x105c8e0]
+ 3 runtime.main (in vpopcntw) + 521 [0x102ee69]
+ 3 0x0
+ 2 _sigtramp (in libsystem_platform.dylib) + 0 [0x7fff685985e0]
+ 1 _sigtramp (in libsystem_platform.dylib) + 29 [0x7fff685985fd]
+ 1 runtime.sigtramp (in vpopcntw) + 51 [0x105aeb3]
+ 1 ??? (in ) [0xc000000480]
+ 1 runtime.setg (in vpopcntw) + 5 [0x1059405]
2819 Thread_355892
+ 2819 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff6859fb8b]
+ 2819 runtime.mstart_stub (in vpopcntw) + 46 [0x105b14e]
+ 2819 runtime.mstart (in vpopcntw) + 102 [0x10316a6]
+ 2819 runtime.mstart1 (in vpopcntw) + 200 [0x1031788]
+ 2818 runtime.sysmon (in vpopcntw) + 173 [0x10399ed]
+ ! 2818 runtime.usleep (in vpopcntw) + 49 [0x1047dd1]
+ ! 2818 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]
+ ! 2818 runtime.usleep_trampoline (in vpopcntw) + 11 [0x105b02b]
+ ! 2818 usleep (in libsystem_c.dylib) + 53 [0x7fff68466de4]
+ ! 2818 nanosleep (in libsystem_c.dylib) + 196 [0x7fff68466eea]
+ ! 2816 __semwait_signal (in libsystem_kernel.dylib) + 10 [0x7fff684e3756]
+ ! 1 cerror (in libsystem_kernel.dylib) + 20 [0x7fff684e2241]
+ ! : 1 cerror_nocancel (in libsystem_kernel.dylib) + 0 [0x7fff684e1629]
+ ! 1 cerror (in libsystem_kernel.dylib) + 0 [0x7fff684e222d]
+ 1 runtime.sysmon (in vpopcntw) + 433 [0x1039af1]
+ 1 runtime.retake (in vpopcntw) + 518 [0x103a0e6]
+ 1 runtime.preemptone (in vpopcntw) + 165 [0x103a2a5]
+ 1 runtime.preemptM (in vpopcntw) + 135 [0x103f307]
+ 1 runtime.pthread_kill (in vpopcntw) + 49 [0x1047b11]
+ 1 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]
+ 1 runtime.pthread_kill_trampoline (in vpopcntw) + 16 [0x105b330]
+ 1 pthread_kill (in libsystem_pthread.dylib) + 179 [0x7fff685a3d65]
2819 Thread_355893
+ 2819 runtime.mcall (in vpopcntw) + 91 [0x105799b]
+ 2819 runtime.park_m (in vpopcntw) + 157 [0x103571d]
+ 2819 runtime.schedule (in vpopcntw) + 110 [0x1034f0e]
+ 2819 runtime.startlockedm (in vpopcntw) + 133 [0x1033785]
+ 2819 runtime.stopm (in vpopcntw) + 197 [0x1032d25]
+ 2819 runtime.notesleep (in vpopcntw) + 231 [0x1009387]
+ 2819 runtime.semasleep (in vpopcntw) + 141 [0x102972d]
+ 2819 runtime.pthread_cond_wait (in vpopcntw) + 57 [0x1048459]
+ 2819 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]
+ 2819 runtime.pthread_cond_wait_trampoline (in vpopcntw) + 16 [0x105b2b0]
+ 2819 _pthread_cond_wait (in libsystem_pthread.dylib) + 698 [0x7fff685a4425]
+ 2819 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff684e3882]
2819 Thread_355894
+ 2819 runtime.mcall (in vpopcntw) + 91 [0x105799b]
+ 2819 runtime.park_m (in vpopcntw) + 157 [0x103571d]
+ 2819 runtime.schedule (in vpopcntw) + 110 [0x1034f0e]
+ 2819 runtime.startlockedm (in vpopcntw) + 133 [0x1033785]
+ 2819 runtime.stopm (in vpopcntw) + 197 [0x1032d25]
+ 2819 runtime.notesleep (in vpopcntw) + 231 [0x1009387]
+ 2819 runtime.semasleep (in vpopcntw) + 141 [0x102972d]
+ 2819 runtime.pthread_cond_wait (in vpopcntw) + 57 [0x1048459]
+ 2819 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]
+ 2819 runtime.pthread_cond_wait_trampoline (in vpopcntw) + 16 [0x105b2b0]
+ 2819 _pthread_cond_wait (in libsystem_pthread.dylib) + 698 [0x7fff685a4425]
+ 2819 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff684e3882]
2819 Thread_355895
2819 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff6859fb8b]
2819 runtime.mstart_stub (in vpopcntw) + 46 [0x105b14e]
2819 runtime.mstart (in vpopcntw) + 102 [0x10316a6]
2819 runtime.mstart1 (in vpopcntw) + 147 [0x1031753]
2819 runtime.schedule (in vpopcntw) + 727 [0x1035177]
2819 runtime.findrunnable (in vpopcntw) + 2687 [0x10344ff]
2819 runtime.stopm (in vpopcntw) + 197 [0x1032d25]
2819 runtime.notesleep (in vpopcntw) + 231 [0x1009387]
2819 runtime.semasleep (in vpopcntw) + 141 [0x102972d]
2819 runtime.pthread_cond_wait (in vpopcntw) + 57 [0x1048459]
2819 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]
2819 runtime.pthread_cond_wait_trampoline (in vpopcntw) + 16 [0x105b2b0]
2819 _pthread_cond_wait (in libsystem_pthread.dylib) + 698 [0x7fff685a4425]
2819 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff684e3882]

Total number in stack (recursive counted multiple, when >=5):
5 runtime.asmcgocall (in vpopcntw) + 173 [0x10593ed]

Sort by top of stack, same collapsed (when >= 5):
__psynch_cvwait (in libsystem_kernel.dylib) 8457
__semwait_signal (in libsystem_kernel.dylib) 2816
main.popcnt (in vpopcntw) 2816

Binary Images:
0x1000000 - 0x10c61ee +vpopcntw (???) /Users/*/vpopcntw
0xd52f000 - 0xd5c0f47 dyld (750.6) <1D318D60-C9B0-3511-BE9C-82AFD2EF930D> /usr/lib/dyld
0x7fff2a0cc000 - 0x7fff2a0ccfff com.apple.Accelerate (1.11 - Accelerate 1.11) <4F9977AE-DBDB-3A16-A536-AC1F9938DCDD> /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate
0x7fff2a0e4000 - 0x7fff2a73afff com.apple.vImage (8.1 - 524.2.1) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vImage.framework/Versions/A/vImage
0x7fff2a73b000 - 0x7fff2a9a2ff7 libBLAS.dylib (1303.60.1) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
0x7fff2a9a3000 - 0x7fff2ae76fef libBNNS.dylib (144.100.2) <99C61C48-B14C-3DA6-8C31-6BF72DA0A3A9> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBNNS.dylib
0x7fff2ae77000 - 0x7fff2b212fff libLAPACK.dylib (1303.60.1) <5E3E3867-50C3-3E6A-9A2E-007CE77A4641> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
0x7fff2b213000 - 0x7fff2b228fec libLinearAlgebra.dylib (1303.60.1) <3D433800-0099-33E0-8C81-15F83247B2C9> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLinearAlgebra.dylib
0x7fff2b229000 - 0x7fff2b22eff3 libQuadrature.dylib (7) <371F36A7-B12F-363E-8955-F24F7C2048F6> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libQuadrature.dylib
0x7fff2b22f000 - 0x7fff2b29ffff libSparse.dylib (103) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparse.dylib
0x7fff2b2a0000 - 0x7fff2b2b2fef libSparseBLAS.dylib (1303.60.1) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libSparseBLAS.dylib
0x7fff2b2b3000 - 0x7fff2b48afd7 libvDSP.dylib (735.140.1) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvDSP.dylib
0x7fff2b48b000 - 0x7fff2b54dfef libvMisc.dylib (735.140.1) <3601FDE3-B142-398D-987D-8151A51F0A96> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libvMisc.dylib
0x7fff2b54e000 - 0x7fff2b54efff com.apple.Accelerate.vecLib (3.11 - vecLib 3.11) /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/vecLib
0x7fff2ccb4000 - 0x7fff2d043ffa com.apple.CFNetwork (1128.0.1 - 1128.0.1) <07F9CA9C-B954-3EA0-A710-3122BFF9F057> /System/Library/Frameworks/CFNetwork.framework/Versions/A/CFNetwork
0x7fff2e445000 - 0x7fff2e8c4feb com.apple.CoreFoundation (6.9 - 1677.104) /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff2f82d000 - 0x7fff2f82dfff com.apple.CoreServices (1069.24 - 1069.24) /System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices
0x7fff2f82e000 - 0x7fff2f8b3fff com.apple.AE (838.1 - 838.1) <2E5FD5AE-8A7F-353F-9BD1-0241F3586181> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/AE.framework/Versions/A/AE
0x7fff2f8b4000 - 0x7fff2fb95ff7 com.apple.CoreServices.CarbonCore (1217 - 1217) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/CarbonCore.framework/Versions/A/CarbonCore
0x7fff2fb96000 - 0x7fff2fbe3ffd com.apple.DictionaryServices (1.2 - 323.6) <26B70C82-25BC-353A-858F-945B14C803A2> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/DictionaryServices.framework/Versions/A/DictionaryServices
0x7fff2fbe4000 - 0x7fff2fbecff7 com.apple.CoreServices.FSEvents (1268.100.1 - 1268.100.1) /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/FSEvents.framework/Versions/A/FSEvents
0x7fff2fbed000 - 0x7fff2fe27ff6 com.apple.LaunchServices (1069.24 - 1069.24) <9A5359D9-9148-3B18-B868-56A9DA5FB60C> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/LaunchServices
0x7fff2fe28000 - 0x7fff2fec0ff1 com.apple.Metadata (10.7.0 - 2076.7) <0973F7E5-D58C-3574-A3CE-4F12CAC2D4C7> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata
0x7fff2fec1000 - 0x7fff2feeefff com.apple.CoreServices.OSServices (1069.24 - 1069.24) <0E4F48AD-402C-3E9D-9CA9-6DD9479B28F9> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/OSServices.framework/Versions/A/OSServices
0x7fff2feef000 - 0x7fff2ff56fff com.apple.SearchKit (1.4.1 - 1.4.1) <2C5E1D85-E8B1-3DC5-91B9-E3EDB48E9369> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SearchKit.framework/Versions/A/SearchKit
0x7fff2ff57000 - 0x7fff2ff7bff5 com.apple.coreservices.SharedFileList (131.4 - 131.4) <02DE0D56-E371-3EF5-9BC1-FA435451B412> /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/SharedFileList.framework/Versions/A/SharedFileList
0x7fff307c1000 - 0x7fff307c7fff com.apple.DiskArbitration (2.7 - 2.7) <0BBBB6A6-604D-368B-9943-50B8CE75D51D> /System/Library/Frameworks/DiskArbitration.framework/Versions/A/DiskArbitration
0x7fff30b02000 - 0x7fff30ec7fff com.apple.Foundation (6.9 - 1677.104) <7C69F845-F651-3193-8262-5938010EC67D> /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
0x7fff3123b000 - 0x7fff312dfff3 com.apple.framework.IOKit (2.0.2 - 1726.140.1) <14223387-6F81-3976-8605-4BC2F253A93E> /System/Library/Frameworks/IOKit.framework/Versions/A/IOKit
0x7fff34de8000 - 0x7fff34df4ffe com.apple.NetFS (6.0 - 4.0) <4415F027-D36D-3B9C-96BA-AD22B44A4722> /System/Library/Frameworks/NetFS.framework/Versions/A/NetFS
0x7fff379d7000 - 0x7fff379f3fff com.apple.CFOpenDirectory (10.15 - 220.40.1) <7E6C88EB-3DD9-32B0-81FC-179552834FA9> /System/Library/Frameworks/OpenDirectory.framework/Versions/A/Frameworks/CFOpenDirectory.framework/Versions/A/CFOpenDirectory
0x7fff379f4000 - 0x7fff379ffffd com.apple.OpenDirectory (10.15 - 220.40.1) <4A92D8D8-A9E5-3A9C-942F-28576F6BCDF5> /System/Library/Frameworks/OpenDirectory.framework/Versions/A/OpenDirectory
0x7fff3ad9c000 - 0x7fff3b0e5ff1 com.apple.security (7.0 - 59306.140.5) /System/Library/Frameworks/Security.framework/Versions/A/Security
0x7fff3b0e6000 - 0x7fff3b16effb com.apple.securityfoundation (6.0 - 55236.60.1) <7C69DF47-4017-3DF2-B55B-712B481654CB> /System/Library/Frameworks/SecurityFoundation.framework/Versions/A/SecurityFoundation
0x7fff3b19d000 - 0x7fff3b1a1ff8 com.apple.xpc.ServiceManagement (1.0 - 1) <2C62956C-F2D4-3EB0-86C7-EAA06331621A> /System/Library/Frameworks/ServiceManagement.framework/Versions/A/ServiceManagement
0x7fff3be4d000 - 0x7fff3bec7ff7 com.apple.SystemConfiguration (1.19 - 1.19) <84F9B3BB-F7AF-3B7C-8CD0-D3C22D19619F> /System/Library/Frameworks/SystemConfiguration.framework/Versions/A/SystemConfiguration
0x7fff3fe37000 - 0x7fff3fefcfe7 com.apple.APFS (1412.141.1 - 1412.141.1) /System/Library/PrivateFrameworks/APFS.framework/Versions/A/APFS
0x7fff41c07000 - 0x7fff41c16fd7 com.apple.AppleFSCompression (119.100.1 - 1.0) <466ABD77-2E52-36D1-8E39-77AE2CC61611> /System/Library/PrivateFrameworks/AppleFSCompression.framework/Versions/A/AppleFSCompression
0x7fff433d7000 - 0x7fff433e0ff7 com.apple.coreservices.BackgroundTaskManagement (1.0 - 104) /System/Library/PrivateFrameworks/BackgroundTaskManagement.framework/Versions/A/BackgroundTaskManagement
0x7fff461e8000 - 0x7fff461f8ff3 com.apple.CoreEmoji (1.0 - 107.1) <7C2B3259-083B-31B8-BCDB-1BA360529936> /System/Library/PrivateFrameworks/CoreEmoji.framework/Versions/A/CoreEmoji
0x7fff46838000 - 0x7fff468a2ff0 com.apple.CoreNLP (1.0 - 213) /System/Library/PrivateFrameworks/CoreNLP.framework/Versions/A/CoreNLP
0x7fff4771d000 - 0x7fff4774bffd com.apple.CSStore (1069.24 - 1069.24) /System/Library/PrivateFrameworks/CoreServicesStore.framework/Versions/A/CoreServicesStore
0x7fff539a9000 - 0x7fff53a77ffd com.apple.LanguageModeling (1.0 - 215.1) /System/Library/PrivateFrameworks/LanguageModeling.framework/Versions/A/LanguageModeling
0x7fff53a78000 - 0x7fff53ac0fff com.apple.Lexicon-framework (1.0 - 72) <41F208B9-8255-3EC7-9673-FE0925D071D3> /System/Library/PrivateFrameworks/Lexicon.framework/Versions/A/Lexicon
0x7fff53ac7000 - 0x7fff53accff3 com.apple.LinguisticData (1.0 - 353.18) <3B92F249-4602-325F-984B-D2DE61EEE4E1> /System/Library/PrivateFrameworks/LinguisticData.framework/Versions/A/LinguisticData
0x7fff54e35000 - 0x7fff54e81fff com.apple.spotlight.metadata.utilities (1.0 - 2076.7) <0237323B-EC78-3FBF-9FC7-5A1FE2B5CE25> /System/Library/PrivateFrameworks/MetadataUtilities.framework/Versions/A/MetadataUtilities
0x7fff55938000 - 0x7fff55942fff com.apple.NetAuth (6.2 - 6.2) /System/Library/PrivateFrameworks/NetAuth.framework/Versions/A/NetAuth
0x7fff5ebce000 - 0x7fff5ebdeff3 com.apple.TCC (1.0 - 1) <017AB27D-6821-303A-8FD2-6DAC795CC7AA> /System/Library/PrivateFrameworks/TCC.framework/Versions/A/TCC
0x7fff622c1000 - 0x7fff622c3ff3 com.apple.loginsupport (1.0 - 1) <12F77885-27DC-3837-9CE9-A25EBA75F833> /System/Library/PrivateFrameworks/login.framework/Versions/A/Frameworks/loginsupport.framework/Versions/A/loginsupport
0x7fff64de1000 - 0x7fff64e15fff libCRFSuite.dylib (48) <5E5DE3CB-30DD-34DC-AEF8-FE8536A85E96> /usr/lib/libCRFSuite.dylib
0x7fff64e18000 - 0x7fff64e22fff libChineseTokenizer.dylib (34) <7F0DA183-1796-315A-B44A-2C234C7C50BE> /usr/lib/libChineseTokenizer.dylib
0x7fff64eae000 - 0x7fff64eb0ff7 libDiagnosticMessagesClient.dylib (112) /usr/lib/libDiagnosticMessagesClient.dylib
0x7fff65384000 - 0x7fff65385fff libSystem.B.dylib (1281.100.1) <0A6C8BA1-30FD-3D10-83FD-FF29E221AFFE> /usr/lib/libSystem.B.dylib
0x7fff65412000 - 0x7fff65413fff libThaiTokenizer.dylib (3) <4F4ADE99-0D09-3223-B7C0-C407AB6DE8DC> /usr/lib/libThaiTokenizer.dylib
0x7fff6542b000 - 0x7fff65441fff libapple_nghttp2.dylib (1.39.2) <07FEC48A-87CF-32A3-8194-FA70B361713A> /usr/lib/libapple_nghttp2.dylib
0x7fff65476000 - 0x7fff654e8ff7 libarchive.2.dylib (72.140.1) /usr/lib/libarchive.2.dylib
0x7fff65586000 - 0x7fff65586ff3 libauto.dylib (187) /usr/lib/libauto.dylib
0x7fff6564c000 - 0x7fff6565cffb libbsm.0.dylib (60.100.1) <00BFFB9A-2FFE-3C24-896A-251BC61917FD> /usr/lib/libbsm.0.dylib
0x7fff6565d000 - 0x7fff65669fff libbz2.1.0.dylib (44) <14CC4988-B6D4-3879-AFC2-9A0DDC6388DE> /usr/lib/libbz2.1.0.dylib
0x7fff6566a000 - 0x7fff656bcfff libc++.1.dylib (902.1) <59A8239F-C28A-3B59-B8FA-11340DC85EDC> /usr/lib/libc++.1.dylib
0x7fff656bd000 - 0x7fff656d2ffb libc++abi.dylib (902) /usr/lib/libc++abi.dylib
0x7fff656d3000 - 0x7fff656d3fff libcharset.1.dylib (59) <72447768-9244-39AB-8E79-2FA14EC0AD33> /usr/lib/libcharset.1.dylib
0x7fff656d4000 - 0x7fff656e5fff libcmph.dylib (8) /usr/lib/libcmph.dylib
0x7fff656e6000 - 0x7fff656fdfd7 libcompression.dylib (87) <64C91066-586D-38C0-A2F3-3E60A940F859> /usr/lib/libcompression.dylib
0x7fff659d7000 - 0x7fff659edff7 libcoretls.dylib (167) <770A5B96-936E-34E3-B006-B1CEC299B5A5> /usr/lib/libcoretls.dylib
0x7fff659ee000 - 0x7fff659effff libcoretls_cfhelpers.dylib (167) <940BF370-FD0C-30A8-AA05-FF48DA44FA4C> /usr/lib/libcoretls_cfhelpers.dylib
0x7fff66115000 - 0x7fff66115fff libenergytrace.dylib (21) <162DFCC0-8F48-3DD0-914F-FA8653E27B26> /usr/lib/libenergytrace.dylib
0x7fff6613c000 - 0x7fff6613efff libfakelink.dylib (149.1) <36146CB2-E6A5-37BB-9EE8-1B4034D8F3AD> /usr/lib/libfakelink.dylib
0x7fff6614d000 - 0x7fff66152fff libgermantok.dylib (24) /usr/lib/libgermantok.dylib
0x7fff6615d000 - 0x7fff6624dfff libiconv.2.dylib (59) <18311A67-E4EF-3CC7-95B3-C0EDEE3A282F> /usr/lib/libiconv.2.dylib
0x7fff6624e000 - 0x7fff664a5fff libicucore.A.dylib (64260.0.1) <8AC2CB07-E7E0-340D-A849-186FA1F27251> /usr/lib/libicucore.A.dylib
0x7fff664bf000 - 0x7fff664c0fff liblangid.dylib (133) <30CFC08C-EF36-3CF5-8AEA-C1CB070306B7> /usr/lib/liblangid.dylib
0x7fff664c1000 - 0x7fff664d9ff3 liblzma.5.dylib (16) /usr/lib/liblzma.5.dylib
0x7fff664f1000 - 0x7fff66598ff7 libmecab.dylib (883.11) <0D5BFD01-D4A7-3C8D-AA69-C329C1A69792> /usr/lib/libmecab.dylib
0x7fff66599000 - 0x7fff667fbff1 libmecabra.dylib (883.11) /usr/lib/libmecabra.dylib
0x7fff66cc7000 - 0x7fff67143ff5 libnetwork.dylib (1880.120.4) /usr/lib/libnetwork.dylib
0x7fff671e4000 - 0x7fff67217fde libobjc.A.dylib (787.1) <6DF81160-5E7F-3E31-AA1E-C875E3B98AF6> /usr/lib/libobjc.A.dylib
0x7fff6722a000 - 0x7fff6722efff libpam.2.dylib (25.100.1) <0502F395-8EE6-3D2A-9239-06FD5622E19E> /usr/lib/libpam.2.dylib
0x7fff67231000 - 0x7fff67267ff7 libpcap.A.dylib (89.120.1) /usr/lib/libpcap.A.dylib
0x7fff6735f000 - 0x7fff67549ff7 libsqlite3.dylib (308.5) <35A2BD9F-4E33-30DE-A994-4AB585AC3AFE> /usr/lib/libsqlite3.dylib
0x7fff6779a000 - 0x7fff6779dffb libutil.dylib (57) /usr/lib/libutil.dylib
0x7fff6779e000 - 0x7fff677abff7 libxar.1.dylib (425.2) /usr/lib/libxar.1.dylib
0x7fff677b1000 - 0x7fff67893fff libxml2.2.dylib (33.5) /usr/lib/libxml2.2.dylib
0x7fff67897000 - 0x7fff678bffff libxslt.1.dylib (16.9) <34A45627-DA5B-37D2-9609-65B425E0010A> /usr/lib/libxslt.1.dylib
0x7fff678c0000 - 0x7fff678d2ff3 libz.1.dylib (76) <793D9643-CD83-3AAC-8B96-88D548FAB620> /usr/lib/libz.1.dylib
0x7fff68181000 - 0x7fff68186ff3 libcache.dylib (83) /usr/lib/system/libcache.dylib
0x7fff68187000 - 0x7fff68192fff libcommonCrypto.dylib (60165.120.1) /usr/lib/system/libcommonCrypto.dylib
0x7fff68193000 - 0x7fff6819afff libcompiler_rt.dylib (101.2) <49B8F644-5705-3F16-BBE0-6FFF9B17C36E> /usr/lib/system/libcompiler_rt.dylib
0x7fff6819b000 - 0x7fff681a4ff7 libcopyfile.dylib (166.40.1) <3C481225-21E7-370A-A30E-0CCFDD64A92C> /usr/lib/system/libcopyfile.dylib
0x7fff681a5000 - 0x7fff68237fdb libcorecrypto.dylib (866.140.1) <60567BF8-80FA-359A-B2F3-A3BAEFB288FD> /usr/lib/system/libcorecrypto.dylib
0x7fff68344000 - 0x7fff68384ff0 libdispatch.dylib (1173.100.2) /usr/lib/system/libdispatch.dylib
0x7fff68385000 - 0x7fff683bbfff libdyld.dylib (750.6) <789A18C2-8AC7-3C88-813D-CD674376585D> /usr/lib/system/libdyld.dylib
0x7fff683bc000 - 0x7fff683bcffb libkeymgr.dylib (30) /usr/lib/system/libkeymgr.dylib
0x7fff683bd000 - 0x7fff683c9ff3 libkxld.dylib (6153.141.2.2) <30AACC57-2314-3863-94B2-64AB3E002B35> /usr/lib/system/libkxld.dylib
0x7fff683ca000 - 0x7fff683caff7 liblaunch.dylib (1738.140.1) /usr/lib/system/liblaunch.dylib
0x7fff683cb000 - 0x7fff683d0ff7 libmacho.dylib (959.0.1) /usr/lib/system/libmacho.dylib
0x7fff683d1000 - 0x7fff683d3ff3 libquarantine.dylib (110.40.3) /usr/lib/system/libquarantine.dylib
0x7fff683d4000 - 0x7fff683d5ff7 libremovefile.dylib (48) <7C7EFC79-BD24-33EF-B073-06AED234593E> /usr/lib/system/libremovefile.dylib
0x7fff683d6000 - 0x7fff683edff3 libsystem_asl.dylib (377.60.2) <1563EE02-0657-3B78-99BE-A947C24122EF> /usr/lib/system/libsystem_asl.dylib
0x7fff683ee000 - 0x7fff683eeff7 libsystem_blocks.dylib (74) <0D53847E-AF5F-3ACF-B51F-A15DEA4DEC58> /usr/lib/system/libsystem_blocks.dylib
0x7fff683ef000 - 0x7fff68476fff libsystem_c.dylib (1353.100.2) /usr/lib/system/libsystem_c.dylib
0x7fff68477000 - 0x7fff6847affb libsystem_configuration.dylib (1061.141.1) <0EE84C33-64FD-372B-974A-AF7A136F2068> /usr/lib/system/libsystem_configuration.dylib
0x7fff6847b000 - 0x7fff6847efff libsystem_coreservices.dylib (114) /usr/lib/system/libsystem_coreservices.dylib
0x7fff6847f000 - 0x7fff68487fff libsystem_darwin.dylib (1353.100.2) <5B12B5DB-3F30-37C1-8ECC-49A66B1F2864> /usr/lib/system/libsystem_darwin.dylib
0x7fff68488000 - 0x7fff6848ffff libsystem_dnssd.dylib (1096.100.3) /usr/lib/system/libsystem_dnssd.dylib
0x7fff68490000 - 0x7fff68491ffb libsystem_featureflags.dylib (17) <29FD922A-EC2C-3F25-BCCC-B58D716E60EC> /usr/lib/system/libsystem_featureflags.dylib
0x7fff68492000 - 0x7fff684dfff7 libsystem_info.dylib (538) <8A321605-5480-330B-AF9E-64E65DE61747> /usr/lib/system/libsystem_info.dylib
0x7fff684e0000 - 0x7fff6850cff7 libsystem_kernel.dylib (6153.141.2.2) <5CDBBC06-6CA6-3432-9FDA-681047866F3E> /usr/lib/system/libsystem_kernel.dylib
0x7fff6850d000 - 0x7fff68554fff libsystem_m.dylib (3178) <00F331F1-0D09-39B3-8736-1FE90E64E903> /usr/lib/system/libsystem_m.dylib
0x7fff68555000 - 0x7fff6857cfff libsystem_malloc.dylib (283.100.6) <8549294E-4C53-36EB-99F3-584A7393D8D5> /usr/lib/system/libsystem_malloc.dylib
0x7fff6857d000 - 0x7fff6858affb libsystem_networkextension.dylib (1095.140.2) /usr/lib/system/libsystem_networkextension.dylib
0x7fff6858b000 - 0x7fff68594ff7 libsystem_notify.dylib (241.100.2) /usr/lib/system/libsystem_notify.dylib
0x7fff68595000 - 0x7fff6859dfef libsystem_platform.dylib (220.100.1) <009A7C1F-313A-318E-B9F2-30F4C06FEA5C> /usr/lib/system/libsystem_platform.dylib
0x7fff6859e000 - 0x7fff685a8fff libsystem_pthread.dylib (416.100.3) <62CB1A98-0B8F-31E7-A02B-A1139927F61D> /usr/lib/system/libsystem_pthread.dylib
0x7fff685a9000 - 0x7fff685adff3 libsystem_sandbox.dylib (1217.141.2) <051C4018-4345-3034-AC98-6DE42FB8273B> /usr/lib/system/libsystem_sandbox.dylib
0x7fff685ae000 - 0x7fff685b0fff libsystem_secinit.dylib (62.100.2) /usr/lib/system/libsystem_secinit.dylib
0x7fff685b1000 - 0x7fff685b8ffb libsystem_symptoms.dylib (1238.120.1) <5820A2AF-CE72-3AB3-ABCC-273A3419FB55> /usr/lib/system/libsystem_symptoms.dylib
0x7fff685b9000 - 0x7fff685cfff2 libsystem_trace.dylib (1147.120) <04B47629-847B-3D74-8ABE-C05EF9DEEFE4> /usr/lib/system/libsystem_trace.dylib
0x7fff685d1000 - 0x7fff685d6ff7 libunwind.dylib (35.4) <42B7B509-BAFE-365B-893A-72414C92F5BF> /usr/lib/system/libunwind.dylib
0x7fff685d7000 - 0x7fff6860cffe libxpc.dylib (1738.140.1) <3E243A41-030F-38E3-9FD2-7B38C66C35B1> /usr/lib/system/libxpc.dylib
Sample analysis of process 40325 written to file /dev/stdout

Minimal Reproduction

When compiled with go build and executed, the code below should immediately hang at 100% processor utilization on a single thread when run on a CPU missing either the AVX512_BITALG or AVX512_VPOPCNTDQ CPUID feature flag (which, I believe, at the time of writing means all Apple Macs).

main.go

package main

func popcnt()  // assembly stub

func main() {
	popcnt()
}

popcnt_amd64.s

// +build !gccgo,!purego

#include "textflag.h"

// func popcnt()
TEXT ·popcnt(SB), NOSPLIT, $0-0

	// This instruction causes the Go runtime to immediately hang at 100% CPU utilization
	VPOPCNTW Z1, Z0   // Requires AVX512_BITALG

	// Or equivalently, so does this one
	VPOPCNTQ Z1, Z0  // Requires AVX512_VPOPCNTDQ

	RET
Contributor

@randall77 randall77 commented Nov 17, 2020

Strange.

I get the same behavior from C.

main.c:

void foo();
int main(int argc, char *argv[]) {
  foo();
}

main.s:

	.globl _foo
_foo:
	vpopcntw	%zmm1, %zmm0
	ret

This program also hangs. Compile with gcc main.c main.s, run with ./a.out.
So I think this is an OSX bug, not a Go bug.

Nothing obvious when run under a debugger. The debugger runs it forever, and every time I interrupt it it is at the vpopcntw instruction.

Contributor

@randall77 randall77 commented Nov 17, 2020

The same C code generates an illegal instruction fault on Linux, so chances are it isn't the chip (although my Mac and Linux boxes don't have exactly the same chip).

Author

@vsivsi vsivsi commented Nov 17, 2020

The Darwin kernel has a semi-spooky 2-tier AVX512 process "promotion" mechanism that involves trapping AVX512 instruction faults, changing process status to support AVX512, and then rerunning the offending instruction. In theory this scheme should only happen once per process upon encountering the first AVX512 instruction. The purpose is to avoid the large additional process state required for AVX512 (around 2KB) when it is not needed. I would assume that it would only try this promotion procedure once per process, such that if the AVX512 instruction causing the fault still isn't supported after enabling AVX512 in the process state, that fault should revert to the process. But I'm way out over my skis on this kind of stuff... Here's the Darwin reference:

https://github.com/apple/darwin-xnu/blob/0a798f6738bc1db01281fc08ae024145e84df927/osfmk/i386/fpu.c#L176

Contributor

@randall77 randall77 commented Nov 17, 2020

I've submitted a bug to Apple, reference number FB8902463. Their bug reporting tool isn't really public, so I'll report back here if they say anything (which they usually don't, they just silently ignore them).

@randall77 randall77 added the OS-Darwin label Nov 17, 2020
@ALTree ALTree changed the title from "Using AVX-512 instruction without supporting CPUID flag(s) on MacOS hangs go runtime" to "runtime: using AVX-512 instruction without supporting CPUID flag(s) on MacOS hangs the Go runtime" Nov 17, 2020