cmd/asm: can't store float32 in odd-numbered registers s1,s3,s5,... on GOARCH=arm #33379

Open
jpap opened this issue Jul 31, 2019 · 4 comments

Comments

@jpap

commented Jul 31, 2019

The Go assembler provides no way to load a float32 into the odd-numbered single-precision floating-point registers (s1, s3, s5, ...) on GOARCH=arm.

The only way to access the floating-point registers is via the F0, F1, ... register names, and these correspond to the double-precision float64 registers d0, d1, .... If you write a 32-bit value to one of these registers, the assembler emits an opcode that writes to the single-precision register aliasing the low half of the corresponding dX register, namely s(2*X). The high half, s(2*X+1), cannot be named.

The only way to access the odd-numbered sX registers is to write BYTE $0x.... statements and lay out the opcodes directly. This has the unfortunate consequence that virtual registers like FP cannot be used, so extra instructions are required to work around the limitation.
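To make the workaround concrete, here is a minimal Go sketch of how an opcode for `vldr sN, [rM, #offset]` could be assembled by hand. It assumes the standard A32 VLDR single-precision encoding, in which the register number is split into a four-bit Vd field and a one-bit D field; the helper name `vldrSingle` is hypothetical and is not part of any Go tool. For s0 with base r13 and offset 4 it yields 0xed9d0a01, the same word that appears in the disassembly later in this report, and for s1 only the D bit changes, giving 0xeddd0a01.

```go
package main

import "fmt"

// vldrSingle returns the A32 encoding of "vldr s<sreg>, [r<rn>, #<offset>]".
// Hypothetical helper for illustration only. offset must be a multiple of 4
// in the range 0..1020; negative offsets are not handled here.
func vldrSingle(sreg, rn, offset uint32) (uint32, error) {
	if sreg > 31 || rn > 15 || offset%4 != 0 || offset > 1020 {
		return 0, fmt.Errorf("operand out of range: s%d, r%d, #%d", sreg, rn, offset)
	}
	const (
		condAlways = 0xe << 28  // condition field: always execute
		vldrBase   = 0x0d100a00 // VLDR, single-precision, load
		addOffset  = 1 << 23    // U bit: add the offset to the base register
	)
	vd := sreg >> 1 // upper four bits of the register number
	d := sreg & 1   // lowest bit; set only for odd-numbered registers
	return condAlways | vldrBase | addOffset | d<<22 | rn<<16 | vd<<12 | offset/4, nil
}

func main() {
	for _, s := range []uint32{0, 1, 2, 3} {
		word, err := vldrSingle(s, 13, 4) // r13 is SP
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("vldr s%d, [sp, #4] = %#x\n", s, word)
	}
}
```

The base register and offset have to be supplied as raw numbers, which is exactly why FP-relative operands such as `a+0(FP)` are unavailable with this approach.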

What version of Go are you using (go version)?

$ go version
go version go1.12.4 darwin/amd64

Does this issue reproduce with the latest release?

Yes; including tip as of July 25, 2019.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/jpap/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/jpap/Development/go"
GOPROXY=""
GORACE=""
GOROOT="/Users/jpap/Development/go/src/github.com/jpap/go"
GOTMPDIR=""
GOTOOLDIR="/Users/jpap/Development/go/src/github.com/jpap/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/wc/jkc42h512cv0n1tdp51yjp7w0000gq/T/go-build781111158=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

[main.go]

package main

func storeFloat(a, b, c, d, e, f float32)

func main() {
	storeFloat(1, 2, 3, 4, 5, 6)
}

[storeFloat.s]

// +build arm

// storeFloat(a, b, c, d, e, f float32)
TEXT ·storeFloat(SB), $0-24
  MOVF  a+0(FP), F0   // ed9d0a01        vldr    s0, [sp, #4]
  MOVF  b+4(FP), F1   // ed9d1a02        vldr    s2, [sp, #8]
  MOVF  c+8(FP), F2   // ed9d2a03        vldr    s4, [sp, #12]
  MOVF d+12(FP), F3   // ed9d3a04        vldr    s6, [sp, #16]
  MOVF e+16(FP), F4   // ed9d4a05        vldr    s8, [sp, #20]
  MOVF f+20(FP), F5   // ed9d5a06        vldr    s10, [sp, #24]
  RET

Build the project above, for example targeting a Raspberry Pi with VFP support:

env GOOS=linux GOARCH=arm GOARM=6 go build -o demo .

Disassemble the binary:

objdump -d demo

Observe the opcodes of the MOVF instructions in storeFloat: they all reference even-numbered registers s0, s2, s4, s6, s8, s10. I have appended the disassembly to the code above as line-comments.

I also tried to specify the single-precision floating-point registers as S0, S1, S2, S3, ..., but the assembler fails with these errors:

./storeFloat.s:5: illegal or missing addressing mode for symbol S0
./storeFloat.s:6: illegal or missing addressing mode for symbol S1
./storeFloat.s:7: illegal or missing addressing mode for symbol S2
./storeFloat.s:8: illegal or missing addressing mode for symbol S3
./storeFloat.s:9: illegal or missing addressing mode for symbol S4
./storeFloat.s:10: illegal or missing addressing mode for symbol S5
asm: assembly of ./storeFloat.s failed

A quick look at the Go source code shows that only REG_Rx and REG_Fx register types have been defined in src/cmd/internal/obj/arm.

@jpap

Author

commented Aug 2, 2019

Concretely, it would be great if we could:

  • Expand the legal floating point registers that the assembler accepts to F0, ..., F31 so that we can target all 32 single-precision registers.
  • Specify the floating-point precision in a MOV instruction so that the assembler can map FN to dN or sN as appropriate. This is how the arm64 architecture is treated: there is FMOVS for a single-precision load, and FMOVD for the double-precision variant.

@ALTree added this to the Go1.14 milestone Aug 2, 2019

@cherrymui

Contributor

commented Aug 6, 2019

  • Expand the legal floating point registers that the assembler accepts to F0, ..., F31 so that we can target all 32 single-precision registers.

We could consider adding more register support. I would prefer adding Sx registers, either the odd-numbered ones only or all 32 with S0 aliasing F0, S2 aliasing F1, etc., instead of changing the existing semantics of the Fx registers.
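The aliasing behind this suggestion can be spelled out with a short Go sketch. It assumes the VFP register layout in which each double-precision register dN overlays the pair s(2N) and s(2N+1); the Sx names are the ones proposed here and are not currently accepted by the assembler, and `aliasOf` is a hypothetical helper.

```go
package main

import "fmt"

// aliasOf reports which Fx (dN) register a single-precision register sreg
// overlaps, and whether it is the upper 32 bits of that register.
// Hypothetical helper for illustration only.
func aliasOf(sreg int) (freg int, upperHalf bool) {
	return sreg / 2, sreg%2 == 1
}

func main() {
	for s := 0; s < 6; s++ {
		f, upper := aliasOf(s)
		half := "low"
		if upper {
			half = "high"
		}
		fmt.Printf("S%d overlaps the %s half of F%d\n", s, half, f)
	}
}
```

Under this mapping the even-numbered Sx names would be synonyms for registers that are already reachable through Fx, and only the odd-numbered ones would be new.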

What is your use case for these registers? Accessing them in assembly (only)? If you want the Go compiler to use all 32 single-precision registers, it would need more than that.

  • Specify the floating-point precision in a MOV instruction so that the assembler can map FN to dN or sN as appropriate. This is how the arm64 architecture is treated: there is FMOVS for a single-precision load, and FMOVD for the double-precision variant.

We already have: MOVF and MOVD.

@jpap

Author

commented Aug 7, 2019

What is your use case for these registers? Accessing them in assembly (only)?

I only access these registers from assembly, currently via the BYTE workaround. I construct the opcodes in a code generator using a hard-coded FP virtual register to R13 (SP) mapping based on the TEXT stack size.

@jpap

Author

commented Aug 7, 2019

We could consider adding more register support. I would prefer adding Sx registers, either the odd-numbered ones only or all 32 with S0 aliasing F0, S2 aliasing F1, etc., instead of changing the existing semantics of the Fx registers.

It would be nice to have some consistency with arm64, where the Fx virtual register names map to the single- and double-precision aarch64 register sx and dx depending on the mnemonic.
