Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use slices.BinarySearch in util.searchToken #408

Merged
merged 4 commits into from
Oct 20, 2023
Merged

Conversation

pr00se
Copy link
Contributor

@pr00se pr00se commented Oct 17, 2023

What this PR does:

The slices package introduces a generic slices.BinarySearch which looks to be faster than sort.Search, especially as the number of items to search increases.

Benchmark:

go test -run='TestSearch' -bench='BenchmarkSearch' -timeout='3h' -benchtime='10000x' -cpu=1 -count=10

name                                   old time/op  new time/op  delta
SearchToken/searchToken_3_instances    89.2ns ± 3%  78.2ns ± 3%  -12.36%  (p=0.000 n=9+8)
SearchToken/searchToken_9_instances     100ns ± 3%    87ns ± 3%  -13.03%  (p=0.000 n=10+10)
SearchToken/searchToken_27_instances    107ns ± 4%    92ns ± 4%  -13.28%  (p=0.000 n=10+9)
SearchToken/searchToken_81_instances    138ns ± 3%   121ns ± 3%  -12.25%  (p=0.000 n=9+10)
SearchToken/searchToken_243_instances   187ns ± 6%   149ns ± 3%  -20.47%  (p=0.000 n=10+9)
SearchToken/searchToken_729_instances   313ns ± 4%   215ns ± 3%  -31.35%  (p=0.000 n=10+9)

Which issue(s) this PR fixes:

(none)

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Copy link
Contributor

@56quarters 56quarters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one question about the test.

@@ -424,3 +425,32 @@ func TestGetTokenDistance(t *testing.T) {
require.Equal(t, testData.expected, distance)
}
}

func TestSearchToken(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you verified that this behavior matches the old implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, but just to triple-check myself since this is a core part of the code, 9cec3d0 rolls back to the old function, and adds a test that compares the results of the new and old functions over 1M random keys. All tests passed here. 1275009 reverts those changes.

Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I checked the actual sort.Search() vs slices.SearchBinary() implementation and I've also written this simple comparison test:

func TestSearchToken_EnsureBinarySearchLogicIsEqualToSortSearch(t *testing.T) {
	tokens := []uint32{1, 5, 6, 8, 10}

	searchOld := func(target uint32) int {
		return sort.Search(len(tokens), func(x int) bool {
			return tokens[x] > target
		})
	}

	searchNew := func(target uint32) int {
		i, found := slices.BinarySearch(tokens, target)
		if found {
			// we want the first token > key, not >= key
			i = i + 1
		}
		return i
	}

	for i := uint32(0); i < tokens[len(tokens)-1]+2; i++ {
		assert.Equal(t, searchOld(i), searchNew(i))
	}
}

@pr00se pr00se merged commit 1aa3bbe into main Oct 20, 2023
3 checks passed
@pr00se pr00se deleted the use-slices-binarysearch branch October 20, 2023 17:52
ying-jeanne pushed a commit that referenced this pull request Nov 2, 2023
* Add test and benchmark for searchToken

* Use slices.BinarySearch in util.searchToken

* Validate that functionality is identical

* Revert "Validate that functionality is identical"

This reverts commit 9cec3d0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants