Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: Switch on four-byte string constants repeats comparisons #53333

Open
greatroar opened this issue Jun 11, 2022 · 4 comments
Open

cmd/compile: Switch on four-byte string constants repeats comparisons #53333

greatroar opened this issue Jun 11, 2022 · 4 comments
Labels
NeedsFix Performance
Milestone

Comments

@greatroar
Copy link

@greatroar greatroar commented Jun 11, 2022

What version of Go are you using (go version)?

$ gotip version
go version devel go1.19-55590f3a2b Fri Jun 10 23:30:35 2022 +0000 linux/amd64

What operating system and processor architecture are you using (go env)?

GOAMD64="v2"
GOARCH="amd64"

What did you do?

func mimetype(ext string) string {
	switch ext {
	case ".htm":
		return "text/html; charset=utf-8"
	case ".eot":
		return "application/vnd.ms-fontobject"
	case ".svg":
		return "image/svg+xml; charset=utf-8"
	case ".ttf":
		return "font/ttf"
	default:
		return ""
	}
}

What did you expect to see?

I expected the switch to be compiled to a check for len(ext) == 4 followed by a binary search that has a single CMPL for each case.

What did you see instead?

A binary search where each case has a length check against four, followed by a CMPL; worse, the binary search is preceded by a runtime.cmpstring against ".htm", so that comparison occurs twice (it also occurs as a CMPL against $1836345390).

The generated code is closer to expected when there is one fewer case. Then it only repeats the length check.

@greatroar greatroar changed the title cmd/compile: Switch one four-byte strings repeats comparisons cmd/compile: Switch on four-byte strings repeats comparisons Jun 11, 2022
@greatroar greatroar changed the title cmd/compile: Switch on four-byte strings repeats comparisons cmd/compile: Switch on four-byte string constants repeats comparisons Jun 11, 2022
@Jorropo
Copy link
Contributor

@Jorropo Jorropo commented Jun 11, 2022

We should also do that with 8 bytes on 64 bits architectures with support for 8 bytes immediates in comparisons.

@greatroar
Copy link
Author

@greatroar greatroar commented Jun 11, 2022

The GOSSAFUNC dump already shows both the literal comparisons and the runtime.cmpstring call in the AST repr:

. . IF-Cond
. . . LE bool tc(1) # sw.go:13:2
. . . . CALLFUNC Walked int tc(1) # sw.go:13:2
. . . . . NAME-runtime.cmpstring Class:PFUNC Offset:0 Used FUNC-func(string, string) int tc(1)
. . . . CALLFUNC-Args
. . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . LITERAL-".htm" string tc(1) # sw.go:14:7
. . . . LITERAL-0 int tc(1) # sw.go:13:2
. . IF-Body
. . . IF # _sw.go:13:2
. . . IF-Cond
. . . . ANDAND bool tc(1) # sw.go:16:2
. . . . . EQ bool tc(1) # sw.go:16:2
. . . . . . LEN int tc(1) # sw.go:16:2
. . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . LITERAL-4 int tc(1) # sw.go:16:2
. . . . . EQ bool tc(1) # sw.go:16:2
. . . . . . LITERAL-1953457454 uint32 tc(1) # sw.go:16:2
. . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . LITERAL-0 int tc(1) # sw.go:16:2
. . . . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . . LITERAL-1 int tc(1) # sw.go:16:2
. . . . . . . . . . LITERAL-8 uint tc(1) # sw.go:16:2
. . . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . LITERAL-2 int tc(1) # sw.go:16:2
. . . . . . . . . LITERAL-16 uint tc(1) # sw.go:16:2
. . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . LITERAL-3 int tc(1) # sw.go:16:2
. . . . . . . . LITERAL-24 uint tc(1) # sw.go:16:2

@seankhliao seankhliao added Performance NeedsInvestigation labels Jun 11, 2022
@randall77
Copy link
Contributor

@randall77 randall77 commented Jun 13, 2022

I think the reason it is this way is that the top clause of the binary search is a <= on strings, whereas the other comparisons are == on strings.

The relevant code is in cmd/compile/internal/walk/compare.go. it says:

// Don't do byte-wise comparisons for <, <=, etc.
// They're fairly complicated.

I think we'd just have to implement that. It will be somewhat complicated I think (e.g. endianness matters), but not too bad.

@randall77 randall77 added NeedsFix and removed NeedsInvestigation labels Jun 13, 2022
@randall77 randall77 added this to the Unplanned milestone Jun 13, 2022
@randall77
Copy link
Contributor

@randall77 randall77 commented Jun 16, 2022

I looked at this a bit more. I originally thought we could just optimize generic string vs. constant string ordered comparisons, and that would fix this issue also. But those comparisons are very tough, mostly because we don't know anything about the length of the nonconstant string. We can't easily read the nonconstant string because we need to check the length to be sure we're still in bounds. ==/!= comparisons avoid that by comparing the length first.
However, in switch statements, we've already determined that the lengths are equal. So we could do something simpler there. It would probably require a new comparison op that's "less than on a string and constant string with equal length".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix Performance
Projects
None yet
Development

No branches or pull requests

4 participants