slices: redundant comparison in BinarySearchFunc #70846
Comments
It's true, the final comparison to test whether the target element is actually present in the sequence is redundant about 50% of the time. It could be avoided by retaining the result of the last comparison performed within the loop: if it was zero, the final comparison isn't needed. This would cost just a register, and an extra conditional branch at the end.

n := len(x)
// Define cmp(x[-1], target) < 0 and cmp(x[n], target) >= 0 .
// Invariant: cmp(x[i - 1], target) < 0, cmp(x[j], target) >= 0.
i, j := 0, n
for i < j {
	h := int(uint(i+j) >> 1) // avoid overflow when computing h
	// i ≤ h < j
	// Idea 1: retain cmp result here
	if cmp(x[h], target) < 0 {
		i = h + 1 // preserves cmp(x[i - 1], target) < 0
	} else {
		// Idea 2: check cmp==0 and return early here
		j = h // preserves cmp(x[j], target) >= 0
	}
}
// i == j, cmp(x[i-1], target) < 0, and cmp(x[j], target) (= cmp(x[i], target)) >= 0 => answer is i.
return i, i < n && cmp(x[i], target) == 0

Equivalently, you could handle the three cases (+ve, -ve, 0) of the comparison within the loop distinctly, returning when zero, though this would add an extra conditional branch to 50% of loop iterations.

However: this algorithm is deceptively hard to get right. Famously it was first published in 1946, but it wasn't till 1962 that the first correct version was published. That's why BinarySearch is among the most heavily commented functions in the entire Go project. Casual tweaks to it should be carefully weighed in light of that fact. |
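For illustration, "Idea 1" from the comment above could be sketched roughly as follows. This is not the standard library's code: `searchRetainLast` and `countingCompare` are hypothetical names, and the sketch assumes that a non-negative result from the final loop iteration can be reused directly, because that iteration sets `j = h` and the loop then exits with `i == h`.

```go
package main

import "fmt"

// searchRetainLast is a hypothetical variant of the loop quoted above.
// It keeps the result of the most recent cmp call so that the trailing
// comparison can be skipped when that result already describes x[i].
func searchRetainLast[S ~[]E, E, T any](x S, target T, cmp func(E, T) int) (int, bool) {
	n := len(x)
	i, j := 0, n
	last := -1 // result of the most recent cmp call; negative means "not reusable"
	for i < j {
		h := int(uint(i+j) >> 1) // avoid overflow when computing h
		last = cmp(x[h], target)
		if last < 0 {
			i = h + 1
		} else {
			j = h
		}
	}
	if last >= 0 {
		// The final iteration set j = h, so i == j == h and last is cmp(x[i], target).
		return i, last == 0
	}
	// The final iteration moved i past h (or the loop never ran), so x[i] may
	// still need to be compared.
	return i, i < n && cmp(x[i], target) == 0
}

// countingCompare returns an int comparison function and a pointer to the
// number of times it has been called.
func countingCompare() (func(a, b int) int, *int) {
	calls := 0
	return func(a, b int) int {
		calls++
		switch {
		case a < b:
			return -1
		case a > b:
			return 1
		}
		return 0
	}, &calls
}

func main() {
	cmp, calls := countingCompare()
	pos, found := searchRetainLast([]int{1, 2}, 1, cmp)
	fmt.Println(pos, found, *calls) // prints: 0 true 2
}
```

The extra state is one integer and one branch after the loop, in line with the cost estimate above. Whether that is worth the added complexity is the question the rest of the thread debates.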
What about:

n := len(x)
// Define cmp(x[-1], target) < 0 and cmp(x[n], target) >= 0 .
// Invariant: cmp(x[i - 1], target) < 0, cmp(x[j], target) >= 0.
i, j, found := 0, n, false
for i < j {
	h := int(uint(i+j) >> 1) // avoid overflow when computing h
	// i ≤ h < j
	// Idea 1: retain cmp result here
	if c := cmp(x[h], target); c < 0 {
		i = h + 1 // preserves cmp(x[i - 1], target) < 0
	} else {
		// Idea 2: check cmp==0 and return early here
		j = h // preserves cmp(x[j], target) >= 0
		found = found || c == 0
	}
}
// i == j, cmp(x[i-1], target) < 0, and cmp(x[j], target) (= cmp(x[i], target)) >= 0 => answer is i.
return i, i < n && found |
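Since the thread stresses how easy it is to get this algorithm wrong, one way to gain some confidence in a variant like the one above is to compare it against `slices.BinarySearchFunc` on every small sorted input. The harness below is only a sketch: `searchFound` wraps the proposed loop in a function, and `compareInts` and `mismatches` are hypothetical helpers. The generated slices include duplicates and gaps so that both present and absent targets are exercised.

```go
package main

import (
	"fmt"
	"slices"
)

// searchFound wraps the proposed loop: it records whether any comparison
// on the "j = h" side returned zero instead of comparing again at the end.
func searchFound[S ~[]E, E, T any](x S, target T, cmp func(E, T) int) (int, bool) {
	n := len(x)
	i, j, found := 0, n, false
	for i < j {
		h := int(uint(i+j) >> 1) // avoid overflow when computing h
		if c := cmp(x[h], target); c < 0 {
			i = h + 1
		} else {
			j = h
			found = found || c == 0
		}
	}
	return i, i < n && found
}

// compareInts is a plain three-way comparison for ints.
func compareInts(a, b int) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// mismatches counts inputs on which searchFound and slices.BinarySearchFunc
// disagree, over all sorted slices of even numbers up to length maxLen.
func mismatches(maxLen int) int {
	bad := 0
	for n := 0; n <= maxLen; n++ {
		for mask := 0; mask < 1<<n; mask++ {
			// Each set bit bumps the running value by 2, so the slice is
			// sorted, may contain duplicates, and never contains odd values.
			x := make([]int, n)
			v := 0
			for k := range x {
				if mask&(1<<k) != 0 {
					v += 2
				}
				x[k] = v
			}
			for target := -1; target <= 2*n+1; target++ {
				wantPos, wantOK := slices.BinarySearchFunc(x, target, compareInts)
				gotPos, gotOK := searchFound(x, target, compareInts)
				if gotPos != wantPos || gotOK != wantOK {
					bad++
				}
			}
		}
	}
	return bad
}

func main() {
	fmt.Println("mismatches:", mismatches(8))
}
```

Agreement on small inputs is evidence rather than proof, and it says nothing about the performance cost of the extra work inside the loop.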
CC @eliben |
I agree with @adonovan that this code is notoriously tricky and is best left alone, doubly so when the issue is about a single O(1) comparison per call, triply so when the "fixes" require O(log(n)) more work. |
For what it's worth, I compiled both versions of the code, and the extra gunk to avoid that last call to compare was a little more than I expected (">" marks branch targets, "+" marks extra code in the loop):
Makes me wonder whether we should get serious about callee-save, at least for not-pointers. |
Go version
go version go1.23.4 windows/amd64
Output of go env in your module/workspace:
What did you do?
Ran this code.
What did you see happen?
Three comparisons being made.
What did you expect to see?
Two comparisons.
BinarySearchFunc always does a final comparison for its boolean return value, even though the slice element in question and the target might already have been compared. Since the comparison function could potentially be costly, this should be avoided.
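The reporter's program is not reproduced in this text, so the following is only a hypothetical reproduction. It counts how often the comparison function is called by `slices.BinarySearchFunc`; the helper name `countCalls` and the input (`[]int{1, 2}` with target `1`) are assumptions chosen for illustration. Tracing the loop quoted earlier in the thread by hand on this input gives two comparisons inside the loop plus the final one.

```go
package main

import (
	"fmt"
	"slices"
)

// countCalls runs slices.BinarySearchFunc on x and reports the result
// together with the number of times the comparison function was invoked.
func countCalls(x []int, target int) (pos int, found bool, calls int) {
	pos, found = slices.BinarySearchFunc(x, target, func(e, t int) int {
		calls++ // count every comparison, including the final presence check
		switch {
		case e < t:
			return -1
		case e > t:
			return 1
		}
		return 0
	})
	return pos, found, calls
}

func main() {
	pos, found, calls := countCalls([]int{1, 2}, 1)
	fmt.Printf("pos=%d found=%v comparisons=%d\n", pos, found, calls)
}
```

With an expensive comparison function, the call count printed here is the cost the issue is concerned with.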