generated from CDCgov/template
Closed
Labels
Algorithm Development: Tasks related to training, testing, evaluating and improving language models
Description
With the first round of results collected from exact neighbor search, we found the search times to be prohibitively slow (about 1.8 seconds per input). That won't work in production, so we'll need the speedup that approximate neighbor search offers. The scope of this ticket is twofold:
- implement the HNSW approximate neighbor search algorithm using hnswlib, and tie it to sentence transformers in the performance evaluation script
- create a hyperparameter optimization script for HNSW that grid-searches its relevant construction parameters to find the values that give the best trade-off between recall and search time. Recall here is measured against the exact search results: the benchmark for approximate search is not how right it is in a vacuum, but how close it can get to exact search.
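A minimal sketch of how the grid search could look, not the final script. It assumes the embeddings and queries are NumPy arrays (as produced by a sentence-transformers `encode` call) and that `exact_ids` holds the neighbor ids already collected from exact search. The function names, the cosine space, and the grid values for `M` and `ef_construction` are placeholders chosen for illustration, not settings from this ticket.

```python
import itertools
import time


def recall_at_k(approx_ids, exact_ids):
    """Mean fraction of exact neighbors that the approximate search also returned."""
    per_query = [
        len(set(a) & set(e)) / len(e)
        for a, e in zip(approx_ids, exact_ids)
    ]
    return sum(per_query) / len(per_query)


def grid_search_hnsw(embeddings, queries, exact_ids, k=10,
                     m_values=(8, 16, 32),                   # assumed grid
                     ef_construction_values=(100, 200, 400),  # assumed grid
                     ef_search=50):
    """Build one HNSW index per (M, ef_construction) pair and score it."""
    import hnswlib  # imported lazily so recall_at_k works without it

    n, dim = embeddings.shape
    results = []
    for m, ef_c in itertools.product(m_values, ef_construction_values):
        index = hnswlib.Index(space="cosine", dim=dim)  # cosine is an assumption
        index.init_index(max_elements=n, ef_construction=ef_c, M=m)
        index.add_items(embeddings)
        index.set_ef(ef_search)  # query-time parameter; should be >= k

        # Time only the queries, not index construction.
        start = time.perf_counter()
        labels, _ = index.knn_query(queries, k=k)
        seconds_per_query = (time.perf_counter() - start) / len(queries)

        results.append({
            "M": m,
            "ef_construction": ef_c,
            "recall": recall_at_k(labels.tolist(), exact_ids),
            "seconds_per_query": seconds_per_query,
        })
    # Highest recall first; ties broken by faster search.
    return sorted(results, key=lambda r: (-r["recall"], r["seconds_per_query"]))
```

Sorting by recall and then by time is only one way to rank configurations; a combined score or a minimum-recall threshold may suit the production constraint better.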