We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
go version
$ go version go version go1.14.3 linux/arm64
Yes
go env
GO111MODULE="" GOARCH="arm64" GOBIN="" GOCACHE="/home/barak/.cache/go-build" GOENV="/home/barak/.config/go/env" GOEXE="" GOFLAGS="" GOHOSTARCH="arm64" GOHOSTOS="linux" GOINSECURE="" GONOPROXY="" GONOSUMDB="" GOOS="linux" GOPATH="/home/barak/src/popcount" GOPRIVATE="" GOPROXY="https://proxy.golang.org,direct" GOROOT="/usr/lib/go" GOSUMDB="sum.golang.org" GOTMPDIR="" GOTOOLDIR="/usr/lib/go/pkg/tool/linux_arm64" GCCGO="gccgo" AR="ar" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build456776336=/tmp/go-build -gno-record-gcc-switches"
I was working with direct Arm64 assembly instructions and discovered the following minimal reproduction:
main.go:
// +build arm64,!gccgo package main import "fmt" func CountBytes(s []byte) uint64 { if len(s) != 16 { return 0 } return countSixteenBytes(&s[0], uint64(len(s))) } // This function is implemented in sixteen_bytes_arm64.s //go:noescape func countSixteenBytes(src *byte, len uint64) (ret uint64) func main() { bytes := []byte{ 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, 0x03, } x := CountBytes(bytes) fmt.Println(x) // SHOULD PRINT 32, PRINTS 16 }
sixteen_bytes_arm64.s
// +build arm64,!gccgo,!appengine #include "textflag.h" TEXT ·countSixteenBytes(SB), NOSPLIT, $0 MOVD src+0(FP), R1 MOVD len+8(FP), R2 VLD1.P 16(R1), [V16.B16] VCNT V16.B16, V7.B16 WORD $0x6E3038E8 // UADDLV V7.B16, V8.H VMOV V8.H[0], R0 MOVD R0, ret+16(FP) RET
The line VCNT V16.B16, V7.B16 is the troubling one. The assembler accepts it, as it is a valid instruction.
VCNT V16.B16, V7.B16
I expected the CountBytes function to return 32 (as indeed, the population count of the array is 32)
Instead, it returns 16.
Stepping through the debugger, I saw something very odd in the disassembly.
objdump -d repro
000000000009a970 <main.countSixteenBytes>: 9a970: f94007e1 ldr x1, [sp, #8] 9a974: f9400be2 ldr x2, [sp, #16] 9a978: 4cdf7030 ld1 {v16.16b}, [x1], #16 9a97c: 0e205a07 cnt v7.8b, v16.8b 9a980: 6e3038e8 uaddlv h8, v7.16b 9a984: 0e023d00 umov w0, v8.h[0] 9a988: f9000fe0 str x0, [sp, #24] 9a98c: d65f03c0 ret
This is the silent error. I asked for the equivalent of
4e205a07 cnt v7.16b, v16.16b
Which is the extended, arm64, 16-byte vector version of the instruction, and instead got the 8-byte version when assembled. (On page 167 of this ARM64 guide, the Q flag bit is not set in the first instruction byte, when it should be: https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf)
The text was updated successfully, but these errors were encountered:
Thanks for reporting this, I'll send a fix.
Sorry, something went wrong.
Change https://golang.org/cl/236638 mentions this issue: cmd/asm: fix the encoding error of VCNT instruction for arm64
cmd/asm: fix the encoding error of VCNT instruction for arm64
e5e3869
No branches or pull requests
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
I was working with direct Arm64 assembly instructions and discovered the following minimal reproduction:
main.go:
sixteen_bytes_arm64.s
The line
VCNT V16.B16, V7.B16
is the troubling one. The assembler accepts it, as it is a valid instruction.What did you expect to see?
I expected the CountBytes function to return 32 (as indeed, the population count of the array is 32)
Instead, it returns 16.
What did you see instead?
Stepping through the debugger, I saw something very odd in the disassembly.
objdump -d repro
This is the silent error. I asked for the equivalent of
Which is the extended, arm64, 16-byte vector version of the instruction, and instead got the 8-byte version when assembled.
(On page 167 of this ARM64 guide, the Q flag bit is not set in the first instruction byte, when it should be: https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf)
The text was updated successfully, but these errors were encountered: