Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: switch to a register-based calling convention for Go functions #40724

Open
76 of 106 tasks
aclements opened this issue Aug 12, 2020 · 258 comments
Open
76 of 106 tasks

cmd/compile: switch to a register-based calling convention for Go functions #40724

aclements opened this issue Aug 12, 2020 · 258 comments

Comments

@aclements
Copy link
Member

@aclements aclements commented Aug 12, 2020

I propose that we switch the Go internal ABI (used between Go functions) from stack-based to register-based argument and result passing for Go 1.16 1.17.

I lay out the details of our proposal and the work required in this document.

The ABI specification can be found here.

This was previously proposed in #18597, where @dr2chase did some excellent prototyping work that our new proposal builds on. With multiple ABI support, we’re now in a much better position to execute on this without breaking compatibility with the existing body of Go assembly code. I’ve opened this new issue to focus on discussion of our new proposal.

/cc @dr2chase @danscales @thanm @cherrymui @mknyszek @prattmic @randall77

An incomplete and evolving list of tasks:

  • Define ABI specification (CL, @aclements)
  • Add GOEXPERIMENT=regabi and asm macro (CL)
  • cmd/compile: late call lowering (CLs)
  • cmd/compile: implement register arguments (@dr2chase)
    • spill around morestack call (CL)
    • add temporary ABI pragma for testing (CL)
    • add abstract argument registers (WIP CL)
    • implement ABI rules (CL)
    • use abstract argument registers
  • cmd/compile: implement register returns
  • cmd/compile: ABI wrappers (CL, old CL)
    • cmd/link: mangle wrapper names for ELF (CL)
    • cmd/link: mangle wrapper names for PE (CL)
    • implement zeroing of X15 in ABI0->ABIInternal wrappers (CL)
  • cmd/asm: add syntax for runtime packages to reference ABIInternal symbols (CL)
    • Mark runtime.{goexit,asyncPreempt}, reflect.{makeFuncStub,methodValueCall} definitions ABIInternal (CL)
    • Mark reference to runtime.callbackWrap in sys_windows_amd64.s as ABIInternal (CL)
  • runtime: fix assembly closure calls (mcall) (@aclements, CL)
  • reflect: support calling ABIInternal (CL)
  • reflect: support MakeFunc and method value calls (CL)
  • go and defer statements
    • cmd/compile: desugar go/defer to zero-argument (@thanm, CL)
    • cmd/compile: fix defer recover() (@cherrymui, CL)
    • cmd/compile: desugar go/defer with results (@cherrymui, CL)
  • cmd/cgo: decouple cgo callbacks from ABI (CL)
  • runtime: finalizer support (CL)
  • runtime: Windows callback support (CL)
  • cmd/internal/obj: introduce and use RegEntryTmp registers in prologue (@mknyszek, CL)
  • cmd/compile: generate correct ABI0 argument maps for body-less functions (see WriteFuncMap) (@dr2chase, CL)
  • cmd/compile: make cgo_unsafe_args generate an ABI0 function (or an ABIInternal-with-no-registers function) (@cherrymui, CL)
  • runtime: update debug call protocol (CL)
  • Fix ABI0/ABIInternal confusion in cgo (CL)
  • Run signature fuzzer and fix bugs (everyone)
  • Get all.bash to pass on Linux, Windows, and Darwin (everyone)

High-priority non-critical path

  • Export ABI information across package boundaries to eliminate more wrappers (@aclements, CL)
  • Make funcPC always return the "native" PC of a function (maybe also introduce ABIOther) (@cherrymui, CL)
  • Strip unnecessary ABIInternal annotations once funcPC is in
  • Tracebacks
    • Define extended argument metadata format (in progress, @cherrymui)
    • cmd/compile: produce extended argument metadata (@cherrymui, CL)
    • runtime: consume extended argument metadata (@cherrymui, CL)
    • cmd/compile,runtime: traceback metadata for spill slot liveness
  • DWARF
    • cmd/compile: ensure arguments have appropriate DWARF locations
    • Ensure GDB and Delve can find register arguments
    • [ ] cmd/compile: add a DWARF vendor attribute with function argument frame size (context)
  • Wrapper mangling for non-critical arches
    • cmd/link: mangle wrapper names for Mach-O (CL)
    • cmd/link: mangle wrapper names for xcoff
    • cmd/link: mangle wrapper names for Plan 9
  • runtime: simplify go/defer call paths
    • assert that go/defer frames are zero sized (CL)
    • replace defer reflectcall with a direct closure call (CL)
  • cmd/internal/obj: reject splittable ABIInternal functions without morestack spill info (e.g., asm functions) because we can't generate a correct morestack path (CL)
  • cmd/asm: don't reference args_stackmap in ABIInternal functions (because it's for ABI0) (CL)
  • Port performance-critical assembly to ABIInternal (see CL 296372 for valuable functions) (CL 308931, CL 310184)
  • math: fix overhead of assembly->Go calls CL 310331
  • Release notes
    • Mention undocumented behavior we've observed applications depending on that may have changed.
      • Using reflect.ValueOf(fn).Pointer() to get the PC of an assembly function will now return the PC of the ABI wrapper
    • Performance surprise: calling from assembly to Go is slightly more expensive.
    • Performance surprise: calling an assembly function via a closure is slightly more expensive.
    • Traceback format is much improved

Enabling steps

  • Enable regabiwrappers by default on amd64 (Linux, Windows, and Darwin)
  • Enable regabidefer
  • Enable regabireflect
  • Enable regabig
  • Enable regabiargs

Testing

  • Function signature fuzzer (ongoing)
    • Static call, closure call, method call, interface call, reflect.{ValueOf(target),MakeFunc(x, target),Method(x)}.{Call,Interface}, called from defer, called from go, called as a finalizer
    • {Big,Small} {argument,result} in {memory,registers} and {not addressed,addressed and not leaked to heap,addressed and leaked to heap}
    • Use defer in a test function (check arguments, modify results)
    • Pass pointers-to-stack and pointers-to-heap
    • Cause result to move to heap
    • runtime: add MoveStackOnNextCall, assertions for pointer-to-stack/pointer-to-heap, assertions for live/dead (@aclements, CL)
    • Integrate runtime testing hook
    • Clean up fuzzer, and get into tools repo.
  • cmd/compile: revive clobberdead (CL)
  • cmd/compile: add clobberdeadreg (CL)

Post-MVP

  • Move g from TLS to register (CL)
    • Use g register on non-Linux OSes (see CL)
  • Reserve and use X15 as zero register (CL)
    • [ ] Should this be X0 for more compact instruction encoding?
  • Port to other platforms
    • amd64
    • arm64
    • mips
    • ppc
    • s390x
    • risc-v
    • porting guide
  • Eliminate legacy ABI paths for all platforms, even if they're using ABIInternal with 0 registers
  • Eliminate spill slots?
  • Pass pointer-shaped method receiver in context register (#44827)
  • runtime: simplify go/defer call paths
    • simplify go/defer paths to assume zero args/results
    • simplify special _defer allocator and remove special _defer cases from mallocgc (because _defer records will be fixed size).
  • cmd/vet: add syntax and checks for register arguments and results in assembly like x+0(FP)
  • cmd/vet: check argument size for ABIInternal functions and fix runtime asm declarations

Cleanup (can be done later)

  • runtime: port all assembly -> Go calls to ABIInternal?
  • Eliminate ABIAlias support from cmd/compile, cmd/link, and object format
  • remove now-redundant fields from ssa.AuxCall
  • rationalize use of LocalsOffset/FrameOffset (for registers that use a link register, e.g. arm64, powerppc).
  • cmd/compile: revisit abiutils.go and clean up the API (also reduce ABIConfig copying)
  • cmd/internal/obj: generate stack maps in obj so we can better compact the morestack/body maps and reduce subtlety?
  • cmd/compile,runtime: re-enable passing arguments to go in runtime (context)
  • cmd/compile: always attach ir.Func to function ir.Names (context)
  • cmd/compile: support ABIInternal for tail calls (context)
  • cmd/compile: once wrapping of defers is on for all arch/OS combinations, it should be possible to simplify or remove some of the openDefer processing in ssagen (notablly we could get rid of openDeferSave()
  • [ ] runtime: think about an ABI-insensitive debugCall Delve uses DWARF information
  • Fix up named slots & value association
  • fix firstpos in ssa.go – sometimes it's late.
@aclements aclements added this to the Go1.16 milestone Aug 12, 2020
@seebs
Copy link
Contributor

@seebs seebs commented Aug 12, 2020

Once upon a time, I was one of several people whose primary work activity for I think two days was tracking down a weird kernel crash, which was ultimately caused by callee-saved registers. Well, the crash was caused by a [FOO_MAX] array being overrun. But the thing where the result of that overrun was to overwrite a single automatic variable, which never had its address taken, five call frames away? That was caused by callee-saved registers.

Which is to say:

Platform ABIs typically define callee-save registers, which place substantial additional requirements on a garbage collector. There are alternatives to callee-save registers that share many of their benefits, while being much better suited to Go.

+1 +1 +1 +1 +1 +1 [...]

Loading

@ncw
Copy link
Contributor

@ncw ncw commented Aug 13, 2020

If I'm understanding the proposal correctly, assembler code will continue to be written with the current stack based calling conventions.

I understand the excellent reasoning behind doing this, I'll just note that this is surely going to frustrate assembly code writers not being able to pass and receive args in registers as often assembly code fragments are tiny and the overhead of calling them is massive.

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Aug 13, 2020

@ncw This restriction to ABI0 for assembly writers isn't necessarily permanent. See https://go.googlesource.com/proposal/+/master/design/27539-internal-abi.md#proposal.

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Aug 13, 2020

Right now writeBarrier is a special builtin that allows arguments to be passed in registers. Couldn't that same concept be extended to others like memmove and memcpy to avoid call overhead even though they are asm?

Loading

@aclements
Copy link
Member Author

@aclements aclements commented Aug 13, 2020

@laboger , yes, we're considering special-casing just a few assembly functions. The hottest ones by far are memmove, memclrNoHeapPointers, and possibly syscall.Syscall. The write barrier calling convention is very specialized because its context is so unusual and performance-sensitive. We'll probably just switch the others to use the register ABI rather than anything more specialized (and teach the toolchain that those are special even though they're assembly).

Loading

@mknyszek
Copy link
Contributor

@mknyszek mknyszek commented Aug 14, 2020

A comment on the "stack growth" section of the doc, specifically about spilling register state for morestack into the guard space for a goroutine (copied from Gerrit):

I worry a little bit about cases where functions marked nosplit eventually call into a non-nosplit function. If the nosplit frames are large (but don't exceed the guard space) we might find out that we don't have any space left. What should we do in this case? Throwing doesn't seem quite right since this works just fine today (though, you have to be careful). Should we have "guard space" and a subset of that as "no, really, this is very guarded" space? That would require extending the guard space a bit further which isn't great...

One alternative is to manage per-goroutine space for this, but that's not great either, since its more memory used per goroutine.

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Aug 14, 2020

Good point. This is especially true for architectures that have a lot registers. Maybe we could decide that some registers are never live across a call?

Loading

@aclements
Copy link
Member Author

@aclements aclements commented Aug 14, 2020

That's an excellent point. A few possible solutions come to mind:

  1. It would be really sad if we had to make more per-goroutine space for this since you almost always have space on the stack. One possibility is to spill to the stack if there's room, and otherwise we keep a pool of allocated "spill space" objects. For that, we could keep just one pre-allocated per P and if the goroutine needs it, it can pull it off the P, spill into it, and then once it's entered the runtime it can allocate a new one (probably getting it from a global allocation pool) and attach it to the P. On return, it can return it to the allocation pool.

  2. A similar option would be to have just one spill space per P. Again, use the stack if you can, otherwise spill into the P, enter the runtime, grow the stack even if it's just a preemption, and then dump the spilled registers back onto the stack in the right place.

  3. We could revisit the nosplit rules. It's often (though not always) a mistake to call from a nosplit function to a non-nosplit function. I'm not sure exactly what this would look like. Maybe a non-nosplit function called from a nosplit function has to have a special prologue? Related: #21314 (comment). I think the only cases I enumerated in that comment that intentionally call from a nosplit to a non-nosplit function aren't running on a user G stack and thus can't grow it anyway.

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 1, 2020

Change https://golang.org/cl/252258 mentions this issue: cmd/asm: define a macro for GOEXPERIMENT=regabi

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 1, 2020

Change https://golang.org/cl/252257 mentions this issue: cmd/internal/objabi: add regabi GOEXPERIMENT

Loading

gopherbot pushed a commit that referenced this issue Sep 3, 2020
This is the "feature flag" for the register calling convention work
(though since this work is expected to extend over a few releases,
it's not version-prefixed). This will let us develop the register
calling convention on the main branch while maintaining an easy toggle
between the old and new ABIs.

Updates #40724.

Change-Id: I129c8d87d34e6fa0910b6fa43efb35b706021637
Reviewed-on: https://go-review.googlesource.com/c/go/+/252257
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 3, 2020
This defines a macro for the regabi GOEXPERIMENT when assembling
runtime assembly code.

In general, assembly code will be shielded from the calling convention
change, but there is a small amount of runtime assembly that is going
to have to change. By defining a macro, we can easily make the small
necessary changes. The other option is to use build tags, but that
would require duplicating nontrivial amounts of unaffected code,
leading to potential divergence issues. (And unlike Go code, assembly
code can't depend on the compiler optimizing away branches on a
feature constant.) We consider the macro preferable, especially since
this is expected to be temporary as we transition to the new calling
convention.

Updates #40724.

Change-Id: I73984065123968337ec10b47bb12c4a1cbc07dc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/252258
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
@gopherbot
Copy link

@gopherbot gopherbot commented Oct 1, 2020

Change https://golang.org/cl/258938 mentions this issue: runtime,cmd/cgo: simplify C -> Go call path

Loading

@aclements
Copy link
Member Author

@aclements aclements commented Oct 2, 2020

@alexbrainman , I just ran into a bit of a snag on Windows that maybe you can better inform me on. It looks like syscall.NewCallback is going to also need an implementation of the Go internal ABI rules and I'm trying to figure out exactly what that needs to look like. (I pinged you because it looks like you last substantially touched that code, but feel free to pass the ping.)

The documentation says "The argument is expected to be a function with one uintptr-sized result. The function must not have arguments with size larger than the size of uintptr." Does this actually mean the arguments (and result) must be an integral, floating-point, or pointer type? Or does it mean, say, struct { x, y uint16 } or struct { x [4]byte } is allowed, since those are technically no larger than the size of a uintptr? The current implementation permits the latter, though I don't know if it actually works.

If they must be int/float/pointer, then mapping from the Windows ABI to the Go register ABI is relatively straightforward. If compounds are allowed, then this gets much harder.

Loading

@networkimprov
Copy link

@networkimprov networkimprov commented Oct 2, 2020

cc @jstarks re the ABI question.

Loading

@alexbrainman
Copy link
Member

@alexbrainman alexbrainman commented Oct 3, 2020

@aclements I reckon there are very few instances of syscall.NewCallback usage. But they are used a lot.

If you want to build Windows GUI, you have to use syscall.NewCallback, because Windows GUI API requires syscall.NewCallback.

Similarly, if you want to build Windows service, you have to use syscall.NewCallback.

These two pretty much cover syscall.NewCallback usage.

For Windows service, you can look in golang.org/x/sys/windows/svc. It uses this RegisterServiceCtrlHandlerEx WIndows API

https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-registerservicectrlhandlerexw

which requires LPHANDLER_FUNCTION_EX

https://docs.microsoft.com/en-us/windows/win32/api/winsvc/nc-winsvc-lphandler_function_ex

so LPHANDLER_FUNCTION_EX returns DWORD (uint32).

For GUI grep in golang.org/x/exp/shiny - it uses RegisterClass

https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-registerclassw

which uses WNDCLASSW.lpfnWndProc

https://docs.microsoft.com/en-us/windows/win32/api/winuser/ns-winuser-wndclassw

which is WNDPROC

https://docs.microsoft.com/en-us/previous-versions/windows/desktop/legacy/ms633573(v=vs.85)

which returns LRESULT which is LONG_PTR or uintptr (in Go)

https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types

Also I recommend you look at github.com/lxn/walk for their syscall.NewCallback usage. This is an ultimate package to create GUI apps. But, I doubt, you will find different usage than golang.org/x/exp/shine. (I did not look myself)

I reckon, that is about it. For Microsoft APIs (maybe google code for syscall.NewCallback or windows.NewCallback).

But some people might use it in their custom / proprietary interface. I am not sure how these are supported by Go. You will be the judge.

Alex

Loading

@CAFxX
Copy link
Contributor

@CAFxX CAFxX commented Oct 3, 2020

IIUC this example in the microsoft docs shows that compounds are allowed:

struct Struct2 {
   int j, k;    // Struct2 fits in 64 bits, and meets requirements for return by value.
};
Struct2 func4(int a, double b, int c, float d);
// Caller passes a in RCX, b in XMM1, c in R8, and d in XMM3;
// callee returns Struct2 result by value in RAX.

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/262197 mentions this issue: go/analysis/passes/asmdecl: add support for ABI selector clauses

Loading

gopherbot pushed a commit to golang/tools that referenced this issue Oct 14, 2020
Accept ABI selector clauses for function definitions. The assembler
allows these clauses when compiling the runtime, so the checker needs
to be able to digest them as well.

Updates golang/go#40724

Change-Id: I49ea4d87450c498bdfe10a8c441b48fcf70bea7c
Reviewed-on: https://go-review.googlesource.com/c/tools/+/262197
Trust: Than McIntosh <thanm@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/262319 mentions this issue: reflect,runtime: use internal ABI for selected ASM routines

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/262317 mentions this issue: cmd/dist,cmd/go: broaden use of asm macro GOEXPERIMENT_REGABI

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/259445 mentions this issue: cmd/compile,cmd/link: initial support for ABI wrappers

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/260477 mentions this issue: cmd/asm: allow def/ref of func ABI when compiling runtime

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 14, 2020

Change https://golang.org/cl/262318 mentions this issue: cmd: go get golang.org/x/tools@d1624618 && go mod vendor

Loading

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals Oct 14, 2020
@gopherbot
Copy link

@gopherbot gopherbot commented Oct 17, 2020

Change https://golang.org/cl/262117 mentions this issue: cmd/compile: delay expansion of OpArg until expand_calls

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Sep 8, 2021

It appears the CL does fix my problem but now I get the signalling NaN error:

./reflect.test 
--- FAIL: TestSignalingNaNReturn (0.00s)
    abi_test.go:904: signaling NaN not correct: 7fc00001
FAIL

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Sep 8, 2021

What does the 32-bit FP load machine instruction do with signaling NaN? Does it convert to a 64-bit signaling NaN in register, or a quiet NaN? Does it preserve it when store back as 32-bit float?

If the machine instructions preserve signaling NaN, maybe we could write archFloat32FromReg and archFloat32ToReg in assembly. Something like

TEXT	·archFloat32ToReg(SB),4,$0-16
	FMOVS	val+0(FP), F1
	FMOVD	F1, ret+8(FP)
	RET

Loading

@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 8, 2021

It looks like a signaling NaN should round-trip from memory->reg->memory ok. At least, there's a test for that, which I recently fixed.

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Sep 8, 2021

What does the 32-bit FP load machine instruction do with signaling NaN? Does it convert to a 64-bit signaling NaN in register, or a quiet NaN? Does it preserve it when store back as 32-bit float?

It stays as a signaling NaN for loads and stores, but it looks like when storing a float64 in a register as a float32, it does an frsp to round it to a float32 and that changes it to a quiet NaN. So that one could be implemented in assembler to avoid the conversion.

Loading

gopherbot pushed a commit that referenced this issue Sep 9, 2021
Currently on amd64 and arm64, float32 values just live in the bottom 32
bits of the register, so reflect simply places them there in a RegArgs
for reflectcall to load them. This works fine because both of these
platforms don't care what the upper 32 bits are, and have instructions
to operate on float32 values specifically that we use. In sum, the
representation of the float32 in memory is identical to that of the
representation in a register.

On other platforms, however, the representation of FP values differ
depending on whether they're in memory or in a register. For instance,
on ppc64, all floating point values get promoted to a float64 when
loaded to a register (i.e. there's only one set of FP instructions). As
another example, on riscv64, narrow-width floats in registers need to be
NaN-boxed.

What all this means is that for supporting the register ABI on these
platforms, reflect needs to do a little extra work to ensure that the
representation of FP values in a RegArgs matches the representation it
takes on in a register. For this purpose, this change abstracts away the
action of storing values into a RegArgs a little bit and adds a
platform-specific hook which currently does nothing but copy the value.

For #40724.

Change-Id: I65dcc7d86d5602a584f86026ac204564617f4c5a
Reviewed-on: https://go-review.googlesource.com/c/go/+/347566
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
@laboger
Copy link
Contributor

@laboger laboger commented Sep 13, 2021

I tried various assembler instructions to do this conversion but they all changed the NaNs from signaling to quiet. I found that this type of conversion is a common problem discussed in other Go issues, so just added Go code to the basic conversion to check if the quiet bit is not set in the float64 before converting, and if it then remove it from the converted float32.

func archFloat32FromReg(reg uint64) float32 {
        f64 := *(*float64)(unsafe.Pointer(&reg))
        if isSignalingNaN(f64) {
        // Convert float64 to float32 but if it is a signaling NaN don't convert to quiet.
                return math.Float32frombits(math.Float32bits(float32(*(*float64)(unsafe.Pointer(&reg)))) & f32snanbit)
        }
        return float32(f64)
}

This seems to work for reflect tests and pass the signaling NaN tests.

Loading

@randall77
Copy link
Contributor

@randall77 randall77 commented Sep 13, 2021

You should also make sure the NaN payload doesn't get squashed. Or convert -0 -> +0. We really want the identity function on every 32-bit pattern.

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Sep 13, 2021

@laboger I wonder why these don't work

TEXT	·archFloat32ToReg(SB),4,$0-16
	FMOVS	x+0(FP), F1
	FMOVD	F1, ret+8(FP)
	RET

TEXT	·archFloat32FromReg(SB),4,$0-12
	FMOVD	x+0(FP), F1
	FMOVS	F1, ret+8(FP)
	RET

Tried with a simple program and it does seem to preserve signaling NaN.

package main

import "math"
import "fmt"

const snan uint32 = 0x7f800001

func archFloat32ToReg(float32) uint64
func archFloat32FromReg(uint64) float32

func main() {
	x := math.Float32frombits(snan)
	r := archFloat32ToReg(x)
	y := archFloat32FromReg(r)
	fmt.Printf("%x %x\n", snan, math.Float32bits(y))
}

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Sep 14, 2021

@cherrymui Yeah that does work. I had a concern about the float64->float32 conversion being done this way since the generated code for float32(float64) uses the frsp to round it down before storing it. I didn't know if there could be cases where this doesn't get the original float32 answer. But if the float64 is always from a float32->float64 conversion then it shouldn't need rounding.

On a side note, why is the float32 argument 8 bytes but the float32 result 4 bytes?

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Sep 15, 2021

@laboger thanks. Good to know it works. Yeah, for the float32 case it doesn't need rounding because it is always from a float32.

why is the float32 argument 8 bytes but the float32 result 4 bytes?

A float32 is always 4 bytes. But we align to pointer size between arguments and results. So one is 8 + 4; the other is 4, aligned to 8, then add 8.

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Sep 16, 2021

I made some changes to the patches to match the registers described in the PPC64 ABI doc section. I hit a few issues that could affect the doc.

  • I didn't notice this before but F0 is not in the assignable set of F registers in PPC64Ops.go so I'm just going to switch to start with F1. Otherwise I get errors at build time saying it can't assign an F reg. I'd rather not switch this in PPC64Ops.go since there are plenty of F regs anyway the last 4 are not even used now.
  • With the number of Int and Float regs that we decided on, I get split stack overflow errors in some of the race tests. I'm assuming this is due to the extra space allocated for spilling. I trimmed back the number of registers for arguments so there are 12 ints and 12 floats and that avoids the split stack overflow error. Any concerns if I switch the doc back to use 12 ints 12 floats and start the floats at F1? Or is there another way to avoid the split stack overflow.
  • I am getting a failure in fixedbugs/issue45344.go, same as what was in the original issue. I briefly looked at that but don't see why that would fail with my ABI patches but not without. The fix for the issue appears to be there.

Loading

@thanm
Copy link
Contributor

@thanm thanm commented Sep 16, 2021

@laboger one of the tools we used for bringup of the reg abi on amd64 was a function signature fuzzer. If you think it might be helpful to run it for ppc64 bringup, I can send pointers/instructions-- let me know.

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Sep 17, 2021

@laboger

  • Yeah, starting at F1 SGTM. (It didn't use F0 probably because in the old backend we used to reserve F0 as a fixed value, probably 0.)
  • Reducing the number of registers to avoid stack overflow also SGTM. This is the main reason to limit the number, otherwise we could use something like 20 registers each.
  • The last one is harder to tell for now. Maybe when you have the CLs we could take a look. Thanks.

Loading

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Sep 17, 2021

I'd also be a little careful about using too many registers, because our register allocator sometimes gets confused. There's a pretty serious performance regression for one benchmark on AMD64 caused by a return of a bunch of registers that caused it to do a bunch of new spills inside a loop. Longer term that is a bug we need to fix, in the short term, diminishing returns with more and more registers suggests just don't use every register that we "could", so as to avoid these other pathologies (for now, at least).

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 20, 2021

Change https://golang.org/cl/342869 mentions this issue: cmd/compile: document register-based ABI for ppc64

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 20, 2021

Change https://golang.org/cl/351110 mentions this issue: cmd/compile/internal: add ABI register information for ppc64

Loading

gopherbot pushed a commit that referenced this issue Sep 21, 2021
This adds the defines for ABI registers on PPC64. Other changes
will need to be in place before they are enabled.

Updates #40724

Change-Id: Ia6ead140719eda9aa99b99c48afafff684c33039
Reviewed-on: https://go-review.googlesource.com/c/go/+/351110
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Sep 30, 2021

@laboger

For fixedbugs/issue45344.go, in cmd/compile/internal/ssa/gen/PPC64.rules

(Store {t} ptr val mem) && t.Size() == 8 && (is64BitInt(val.Type) || isPtr(val.Type)) => (MOVDstore ptr val mem)

I think we want to change the condition to !is64BitFloat, like all other architectures. Then it builds on PPC64 with GOEXPERIMENT=regabi.

Loading

@laboger
Copy link
Contributor

@laboger laboger commented Oct 18, 2021

The register ABI is now enabled for PPC64. https://go-review.googlesource.com/c/go/+/353969

Loading

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Oct 18, 2021

Nice work!

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 19, 2021

Change https://golang.org/cl/356519 mentions this issue: cmd/compile/internal: add ABI register information for riscv64

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 22, 2021

Change https://golang.org/cl/357974 mentions this issue: cmd/compile: change context register from X20 to X26

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 22, 2021

Change https://golang.org/cl/357976 mentions this issue: runtime,cmd/compile: change duff{zero,copy} register for regabi

Loading

@gopherbot
Copy link

@gopherbot gopherbot commented Oct 28, 2021

Change https://golang.org/cl/359337 mentions this issue: cmd/compile/internal: add ABI register info for riscv64

Loading

@dr2chase
Copy link
Contributor

@dr2chase dr2chase commented Nov 11, 2021

Should I assume riscv register ABI will not make the 1.18 release? That's what it looks like to me, but I want to confirm before bumping this to 1.19.

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Nov 11, 2021

@dr2chase yes.

Loading

@cherrymui cherrymui removed this from the Go1.18 milestone Nov 11, 2021
@cherrymui cherrymui added this to the Go1.19 milestone Nov 11, 2021
@laboger
Copy link
Contributor

@laboger laboger commented Nov 16, 2021

Currently ABIInternal can only be used with assembler functions in the runtime package. Are there any plans to change that in the future so other assembler functions can take advantage of the register ABI?

Loading

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Nov 16, 2021

As the name states, currently ABIInternal is an internal ABI and not a stable one. Therefore it is limited specifically for internal uses. Once we make it (or some revised version of it) a stable ABI, it can be used in assembly code more widely.

(And at that time we'll probably call it ABI1 or something, instead of ABIInternal.)

Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Proposals
Accepted
Linked pull requests

Successfully merging a pull request may close this issue.

None yet