If you have lots of RAM or the dataset is small, HNSW is the best option: it is a very fast and accurate index. M, with 4 <= M <= 64, is the number of links per vector; higher is more accurate but uses more RAM. The speed-accuracy tradeoff is set via the efSearch parameter. The memory usage is (d * 4 + M * 2 * 4) bytes per vector.
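According to the faiss wiki, HNSW is recommended only when RAM is plentiful or the dataset is small. Going by the figures in the passage above, with M = 16 each vector takes 160 bytes (which matches the formula for d = 8: 8 * 4 + 16 * 2 * 4 = 160). That is 5 times the memory footprint of the current method.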
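To make the arithmetic easy to check, here is a minimal sketch that evaluates the quoted formula. The helper names are hypothetical, d = 8 is an assumption inferred from the 160-byte figure, and the "current method" is assumed to store raw float32 vectors (d * 4 bytes each); none of this comes from the faiss API.

```python
def hnsw_bytes_per_vector(d: int, m: int) -> int:
    """Per-vector estimate from the quoted formula: d * 4 + M * 2 * 4 bytes."""
    return d * 4 + m * 2 * 4


def flat_bytes_per_vector(d: int) -> int:
    """Raw float32 storage: 4 bytes per dimension (assumed baseline)."""
    return d * 4


if __name__ == "__main__":
    m = 16
    # d = 8 is an assumption that reproduces the 160-byte figure;
    # d = 128 is included only to show how the ratio changes with dimension.
    for d in (8, 128):
        hnsw = hnsw_bytes_per_vector(d, m)
        flat = flat_bytes_per_vector(d)
        print(f"d={d}, M={m}: HNSW {hnsw} B, raw {flat} B, ratio {hnsw / flat:.2f}")
```

Under these assumptions the link overhead (M * 2 * 4 = 128 bytes at M = 16) is fixed per vector, so the relative cost of HNSW is largest for low-dimensional data and shrinks as d grows.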
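Advantages: it is an improved graph-based search method with extremely fast queries, reportedly returning results within seconds at the billion-vector scale, and its recall is said to be nearly on par with Flat, reaching as high as 97%. The search time complexity is claimed to be O(log log n), so the number of candidate vectors hardly matters. It also supports adding vectors in batches, which makes it well suited to online serving with millisecond-level latency. (These claims come from unverified online sources.)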
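Disadvantages: building the index is extremely slow, and memory usage is very high (the highest of any index in Faiss, and larger than the memory taken by the original vectors).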