Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decrease test time for TestManyKnnDocs.testLargeSegment #11945

Merged
merged 8 commits into from
Nov 17, 2022

Conversation

jdconrad
Copy link
Contributor

This change adds an additional test codec allowing a configurable number for max connections per vector when building an hnsw index. By setting the number of connections to 128 as part of TestManyKnnDocs.testLargeSegment we can reduce the number of indexed vectors to 2088992 and still reproduce the test failure prior to the fix by @benwtrent in #11905.

This changed reduced the test time for me from ~90 minutes to ~3 minutes locally.

cc @rmuir

@rmuir
Copy link
Member

rmuir commented Nov 17, 2022

we need to run a ./gradlew tidy and commit/push the results to fix formatting.

Very cool, will test on my 2-core. we may be able to upgrade from @Monster to @Nightly.

@jdconrad
Copy link
Contributor Author

Updated with tidy! (Oops on failing precommit.)

@rmuir
Copy link
Member

rmuir commented Nov 17, 2022

Works for me. I was able to now run this monster test in < 10 minutes time.

<===:lucene:core:test (SUCCESS): 1 test(s)
The slowest tests (exceeding 500 ms) during this run:
  527.29s TestManyKnnDocs.testLargeSegment (:lucene:core)
The slowest suites (exceeding 1s) during this run:
  527.61s TestManyKnnDocs (:lucene:core)

BUILD SUCCESSFUL in 9m 2s
19 actionable tasks: 5 executed, 14 up-to-date

@rmuir rmuir merged commit a18b62d into apache:main Nov 17, 2022
@uschindler
Copy link
Contributor

Very cool idea (although I have no idea wha this does because of my ignorance for KNN).

@rmuir
Copy link
Member

rmuir commented Nov 17, 2022

This M is ... the length of the "postings list" for vector. This test codec allows using a larger value... so more data written per document, but less documents needed to trigger the overflow that we wanted to test for here.

@benwtrent
Copy link
Member

awesome stuff @jdconrad!!!

asfgit pushed a commit that referenced this pull request Nov 17, 2022
@rmuir
Copy link
Member

rmuir commented Nov 17, 2022

thanks @jdconrad !

@msokolov
Copy link
Contributor

oh nice plan, thanks everyone

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants