use lru_cache to optimize loading of local vector store #496
Problem

In the current implementation of `local_doc_qa`, every answered question triggers a fresh `FAISS.load_local` to reload the knowledge base, even when the user has selected the same knowledge base. For larger knowledge bases (over 1 GB), this reload adds noticeable latency.

Suggested fix

Use `lru_cache` to cache the result of `FAISS.load_local`, and replace every call site of `FAISS.load_local` with `load_vector_store` so the knowledge base is loaded from the cache. Users can adjust how many knowledge bases are cached by setting the `CACHED_VC_NUM` variable in `model_config.py`. In testing with a 1.5 GB knowledge base, caching clearly reduced the wait before the LLM starts answering. If cross-knowledge-base querying is developed later, the latency reduction would be even more significant.