Decrease test time for TestManyKnnDocs.testLargeSegment #11945
we need to run a Very cool, will test on my 2-core. we may be able to upgrade from |
Updated with tidy! (Oops on failing precommit.) |
Works for me. I was now able to run this monster test in under 10 minutes.
|
Very cool idea (although I have no idea what this does because of my ignorance of KNN). |
This M is ... the length of the "postings list" for a vector. This test codec allows using a larger value, so more data is written per document, but fewer documents are needed to trigger the overflow that we wanted to test for here. |
awesome stuff @jdconrad!!! |
* Improve speed of TestManyKnnDocs
thanks @jdconrad ! |
oh nice plan, thanks everyone |
This change adds an additional test codec allowing a configurable number for max connections per vector when building an HNSW index. By setting the number of connections to 128 as part of TestManyKnnDocs.testLargeSegment, we can reduce the number of indexed vectors to 2088992 and still reproduce the test failure prior to the fix by @benwtrent in #11905. This change reduced the test time for me from ~90 minutes to ~3 minutes locally.
cc @rmuir
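
For readers who want to see why a larger M means fewer documents are needed, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not stated in this thread: that each vector's bottom-level graph entry occupies a 4-byte neighbor count plus 2 * M 4-byte neighbor ids, and that the overflow in question is a signed 32-bit byte offset. The class and method names are hypothetical and are not part of Lucene.

```java
public class OverflowThreshold {

    // ASSUMPTION (not confirmed by the thread): one bottom-level graph entry
    // holds a 4-byte neighbor count followed by 2 * maxConn 4-byte neighbor ids.
    static long bytesPerVector(int maxConn) {
        return (1L + 2L * maxConn) * Integer.BYTES;
    }

    // Smallest number of vectors whose combined entry size no longer fits
    // in a signed 32-bit int, under the assumption above.
    static long minVectorsToOverflow(int maxConn) {
        return Integer.MAX_VALUE / bytesPerVector(maxConn) + 1;
    }

    public static void main(String[] args) {
        // 128 is the value used by the test; 16 is only a smaller comparison value.
        for (int maxConn : new int[] {16, 128}) {
            System.out.println("M=" + maxConn
                + " bytesPerVector=" + bytesPerVector(maxConn)
                + " minVectorsToOverflow=" + minVectorsToOverflow(maxConn));
        }
    }
}
```

Under these assumptions, M = 128 gives 2088992, which is the vector count quoted in the description. Whether the figure was actually derived this way is not confirmed here.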