Description
This issue is related to conda-forge/hnswlib-feedstock#11. @yihming kindly volunteered to help!
Currently the SIMD instruction set is chosen based on compile-time flags, and distance calculation is one of the most performance-critical parts. As a result, a binary distribution either crashes on SSE2-only machines (if it was built with AVX) or underperforms on AVX-capable machines (if it was built with SSE2 only).
We can add an option to compile the AVX instructions into an SSE2 build and call them if AVX is available at runtime (it is safe to assume SSE2 is available on all x64 machines).
To check for AVX availability, something like this could be used: https://stackoverflow.com/questions/6121792/how-to-check-if-a-cpu-supports-the-sse3-instruction-set
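A minimal sketch of such a runtime check, to make the idea concrete. The helper name `cpu_supports_avx` is hypothetical and not part of hnswlib. On GCC/Clang it relies on the `__builtin_cpu_supports` builtin; on MSVC it queries CPUID leaf 1 and XCR0 directly, since both the CPU and the OS have to support AVX. On other compilers or architectures it conservatively reports no AVX.

```cpp
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>     // __cpuid
#include <immintrin.h>  // _xgetbv
#endif

// Hypothetical helper (not part of hnswlib): returns true only when AVX
// can actually be used on the machine the binary is running on.
inline bool cpu_supports_avx() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // GCC/Clang builtins. __builtin_cpu_init() is only strictly required when
    // this runs before static constructors, but calling it is harmless.
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") != 0;
#elif defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];
    __cpuid(regs, 1);  // regs[2] is ECX
    const bool osxsave = (regs[2] & (1 << 27)) != 0;  // OS uses XSAVE/XRSTOR
    const bool avx     = (regs[2] & (1 << 28)) != 0;  // CPU implements AVX
    if (!osxsave || !avx) return false;
    // XCR0 bits 1 and 2: the OS preserves XMM and YMM state on context switch.
    const unsigned long long xcr0 = _xgetbv(0);
    return (xcr0 & 0x6) == 0x6;
#else
    // Unknown compiler or non-x86 target: assume AVX is unavailable.
    return false;
#endif
}
```

The result does not change while the process is running, so it can be computed once (for example when a space object is constructed) and cached.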
SSE2 distance functions are already implemented in https://github.com/nmslib/hnswlib/blob/master/hnswlib/space_l2.h and https://github.com/nmslib/hnswlib/blob/master/hnswlib/space_ip.h (though only a single version is compiled due to preprocessor checks). Selection of the distance function is also implemented (e.g. https://github.com/nmslib/hnswlib/blob/master/hnswlib/space_ip.h#L252), but it is done based only on the vector length.
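A rough sketch of how the selection could be extended to also consider the CPU, under some assumptions: GCC or Clang on x86-64, where `__attribute__((target("avx")))` lets a single function be compiled with AVX enabled inside a build that otherwise targets SSE2. All names here (`DistFunc`, `l2sqr_scalar`, `l2sqr_avx`, `select_l2sqr`) are made up for illustration and do not match the actual hnswlib functions; a scalar loop stands in for the existing SSE2 implementation to keep the example short.

```cpp
#include <cstddef>

#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>
#define SKETCH_HAS_AVX_TARGET 1
#endif

// Hypothetical signature; the real hnswlib distance functions differ.
using DistFunc = float (*)(const float*, const float*, std::size_t);

// Baseline squared L2 distance (stand-in for the existing SSE2 version).
static float l2sqr_scalar(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

#ifdef SKETCH_HAS_AVX_TARGET
// Compiled with AVX enabled even if the rest of the build is SSE2-only.
__attribute__((target("avx")))
static float l2sqr_avx(const float* a, const float* b, std::size_t dim) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= dim; i += 8) {  // 8 floats per iteration
        const __m256 d = _mm256_sub_ps(_mm256_loadu_ps(a + i),
                                       _mm256_loadu_ps(b + i));
        acc = _mm256_add_ps(acc, _mm256_mul_ps(d, d));
    }
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float sum = 0.0f;
    for (float v : lanes) sum += v;  // horizontal sum of the accumulator
    for (; i < dim; ++i) {           // scalar tail when dim % 8 != 0
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
#endif

// Choose the implementation once, based on what the running CPU supports.
static DistFunc select_l2sqr() {
#ifdef SKETCH_HAS_AVX_TARGET
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) return l2sqr_avx;
#endif
    return l2sqr_scalar;
}
```

The returned pointer could be stored where the dimension-based selection stores its choice today, so the per-call cost is one indirect call, as it already is. The AVX function must only be reached after the check succeeds; calling it on a CPU without AVX would raise an illegal-instruction fault.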
Testing: on an AVX-capable machine, the modified version (compiled with SSE2 arch) should outperform the SSE2-only baseline when running search on the same pre-built index of 1-10M elements (random vectors with, say, 128 dimensions).