Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: switch statements over strings are not optimal #33934

Closed
marigonzes opened this issue Aug 29, 2019 · 4 comments

Comments

@marigonzes
Copy link

commented Aug 29, 2019

What version of Go are you using (go version)?

$ go version
go version devel +3d48ae3 Wed Aug 28 03:23:59 2019 +0000 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What did you do?

I played around with some switch statements over strings to check how the gc would behave.
For example, the switch in the following function has some issues (https://godbolt.org/z/L-Z-TM):

func f(x string) int {
	switch x {
	case "":
		return -1
	case "1", "2", "3":
		return -2
	default:
		return -3
	}
}

What did you expect to see?

I expected the generated assembly code to be relatively simple, with no redundant instructions or jumps.

What did you see instead?

Instead, I can see some redundancies in the output of the compiler (below, I will be referring to the lines in the website mentioned above):

  1. In line 79, the comparison x == "1" uses runtime.cmpstring, while the others compare x directly with an int
  2. In line 18, its checked if len(x) == 0, which is never true, give that the check on line 16 has failed
  3. In line 81, the jump should be to the return -3 block
  4. If I follow some paths of execution, there are a lot of double jumps (jeq -> jeq and jeq -> jne)

@mvdan mvdan added the Performance label Aug 29, 2019

@julieqiu

This comment has been minimized.

Copy link

commented Aug 29, 2019

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2019

I think this is because the binary search is non-optimal when there are multiple strings of the same length.
For instance, in this case we have 4 strings: "", "1", "2", "3". When binary searching, we separate the set of strings into two sets, "","1" and "2","3". That really isn't optimal, because the test between those two sets requires a length and a content test (I think this is the cmpstring call).
We should choose our split point where the length changes, and only use the content if the lengths are all identical.

@gopherbot

This comment has been minimized.

Copy link

commented Sep 12, 2019

Change https://golang.org/cl/195138 mentions this issue: cmd/compile: optimize switch on strings

@gopherbot

This comment has been minimized.

Copy link

commented Sep 14, 2019

Change https://golang.org/cl/195340 mentions this issue: cmd/compile: optimize switch on strings

@gopherbot gopherbot closed this in 85fc765 Sep 18, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.