Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: opt: generate conditional comparisons for || and && conditions #71268

Open
haoliu-ampere opened this issue Jan 14, 2025 · 2 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Implementation Issues describing a semantics-preserving change to the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@haoliu-ampere
Copy link

Go version

go1.23.4 linux/arm64

Output of go env in your module/workspace:

GOARCH='arm64'

What did you do?

While investigating the performance of json package, I found unquoteBytes() spends extra time with the if c == '\\' || c == '"' || c < ' ' condition (src/encoding/json/decode.go#L1209), which could be improved by condidional comparisons (such as ARM64 CCMP instruction).

What did you see happen?

Currently, Go compiler can generate conditonal assignments (e.g., CSET, CSEL, CSINC), but it can not generate conditional comparisons (i.e., CCMP), which let you combine the results of multiple comparisons so you can perform a single test at the end. (see more details: The AArch64 processor (aka arm64), part 16: Conditional execution - The Old New Thing)

E.g., following are Go tests (https://godbolt.org/z/q6YMT7Msa) and corresponding C tests (https://godbolt.org/z/4qshde78o, compiled by GCC -O1). Go compiler generates CMP;BEQ insteald of CPM;CCMP for test2():

The Go tests:

func test1(c int) (r int) {
    if c > 0 {
        r = 1
    }
    return
}
// CMP     $0, R0
// CSET    GT, R0
// RET

func test2(a, b, c int) (r int) {
    if c == '\\' || c == '"' || c < ' ' {
        r = a
    } else {
        r = b
    }
    return
}
//   CMP     $92, R2
//   BEQ     pc28
//   CMP     $34, R2
//   BEQ     pc28
//   CMP     $32, R2
//   BLT     pc28
//   MOVD    R1, R0
// pc28:
//   RET     (R30)

The C tests:

int test1(int c) {
    if (c > 0) {
        return 1;
    }
    return 0;
}
// cmp     w0, 0
// cset    w0, gt
// ret

int test2(int a, int b, int c) {
    if (c == '\\' || c == '"' || c < ' ') {
        return a;
    } else {
        return b;
    }
}
// cmp     w2, 92
// mov     w3, 34
// ccmp    w2, w3, 4, ne
// ccmp    w2, 31, 4, ne
// csel    w0, w1, w0, gt
// ret

Measure performance

Conditional comparisons should generally improve the performance of conjunction/disjunction of conditions by &&/|| operators on ARM64 machines.

Following cases are simplified from unquoteBytes. I tested on ARM64 Neoverse-N1 (AmpereComputing Altra and AWS Graviton2 is similar), the C case (GCC -O3 generates CCMP) is much faster than the Go case (go1.23.4 generates CMP): 5.69s vs. 9.23s.

The Go Test:

package main

//go:nosplit
//go:noinline
func unquoteBytes(s []byte, len int) int {
	s = s[1:]
	r := 0

	for r < len {
		c := s[r]
		if c == '\\' || c == '"' || c < ' ' {
			break
		}
		r++
	}

	return r
}

func main() {
	data := []byte(`"hello, world"`)
	len := len(data)

	for i := 0; i < 1000*1000*500; i++ {
		unquoteBytes(data, len)
	}
}

The C Test

#include <string.h>

__attribute__((noinline, noipa))
int unquoteBytes(const char *data, int len) {
  data = data + 1;
  int r = 0;

  while (r < len) {
    char c = data[r];
    if (c == '\\' || c == '"' || c < ' ') {
      break;
    }
    r++;
  }

  return r;
}

int main() {
  const char *data = "\"hello, world\"";
  unsigned len = strlen(data);

  for (int i = 0; i < 1000 * 1000 * 500; i++) {
    unquoteBytes(data, len);
  }
}

As C may have less overhead than Go in function call and main, let's just compare the linux-perf samples of the loop:

Go results:

; 95.89% 35560 mytest mytest [.] main.unquoteBytes
;  3.97%  1473 mytest mytest [.] main.main

7381 :   73260:  add     x0, x0, #0x1  ; loop header
 699 :   73264:  cmp     x3, x0
   0 :   73268:  b.le    73288         ; loop exit
9698 :   7326c:  ldrb    w2, [x1, x0]
   0 :   73270:  cmp     w2, #0x5c
   0 :   73274:  b.eq    73288         ; loop exit
8688 :   73278:  cmp     w2, #0x22
   0 :   7327c:  b.eq    73288         ; loop exit
7908 :   73280:  cmp     w2, #0x20
   0 :   73284:  b.cs    73260
 661 :   73288:  ret

C results:

; 99.76% 22829 simp simp [.] unquoteBytes
;  0.20%    45 simp simp [.] main

12051 :   4006f8: cmp     x6, x2
    0 :   4006fc: b.eq    400724       ; loop exit
  121 :   400700: mov     x2, x4
  431 :   400704: ldrb    w3, [x0, x2]
  547 :   400708: add     x4, x2, #0x1
    0 :   40070c: cmp     w3, #0x5c
  221 :   400710: ccmp    w3, w5, #0x4, ne
 1233 :   400714: ccmp    w3, #0x1f, #0x0, ne
 6381 :   400718: b.hi    4006f8       ; loop header
  991 :   40071c: sub     w0, w2, #0x1
    0 :   400720: ret
    0 :   400724: mov     w0, w1
    0 :   400728: ret

The assembly instructions are much similar except the CMP and CCMP. If we just count samples related to the loop, the C case (CCMP) vs the Go case (CCMP): 21976 vs 35035 (+47%).

Since the input data may affect performance, I also tested data like "\hello, world", so the loop could break at the 1st comparison against \, then CCMP is still faster than CMP (1.00s vs. 1.29s).

What did you expect to see?

Could we enhance Go compiler to generate CCMP?

BTW. I searched and didn't find any issue about conditional instructions (there is just an old issue #6011 about failing to generate conditional move).

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 14, 2025
@gabyhelp gabyhelp added the Implementation Issues describing a semantics-preserving change to the Go implementation. label Jan 14, 2025
@mknyszek mknyszek added this to the Unplanned milestone Jan 14, 2025
@mknyszek
Copy link
Contributor

CC @golang/compiler

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Implementation Issues describing a semantics-preserving change to the Go implementation. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants