[BUG] With context association enabled, each embedding search returns one more segment than the previous one #613
Labels: bug
Comments
The root cause is that MyFAISS.py, after looking up the context-associated documents, modifies the cached doc objects held in the docstore. A simple fix is to deepcopy the first document before appending to it (this needs `import copy` at the top of the file):

    for id_seq in id_lists:
        for id in id_seq:
            if id == id_seq[0]:
                _id = self.index_to_docstore_id[id]
                # Copy the cached document so the concatenation below
                # does not modify the object stored in the docstore.
                doc = copy.deepcopy(self.docstore.search(_id))
            else:
                _id0 = self.index_to_docstore_id[id]
                doc0 = self.docstore.search(_id0)
                doc.page_content += " " + doc0.page_content
        if not isinstance(doc, Document):
            raise ValueError(f"Could not find document for id {_id}, got {doc}")
        doc_score = min([scores[0][id] for id in [indices[0].tolist().index(i) for i in id_seq if i in indices[0]]])
        doc.metadata["score"] = int(doc_score)
        docs.append(doc)
    return docs

@imClumsyPanda FYI
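The aliasing behind this bug can be reproduced in isolation. The following is a minimal sketch, not the project's code: `Document`, `FakeDocstore`, `merge_neighbors`, and `run_twice` are hypothetical stand-ins, and the sketch assumes the docstore returns its cached objects by reference, as the comment describes.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Document:
    """Stand-in for the vector store's Document class (assumption)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeDocstore:
    """Hypothetical in-memory docstore that returns cached objects by reference."""

    def __init__(self, docs):
        self._docs = docs

    def search(self, _id):
        return self._docs[_id]


def merge_neighbors(docstore, id_seq, use_deepcopy):
    """Append the later chunks in id_seq onto the first chunk."""
    first = docstore.search(id_seq[0])
    # Without the copy, `doc` is the very object cached in the docstore.
    doc = copy.deepcopy(first) if use_deepcopy else first
    for _id in id_seq[1:]:
        doc.page_content += " " + docstore.search(_id).page_content
    return doc


def run_twice(use_deepcopy):
    """Run the same merge twice against one docstore and return both results."""
    store = FakeDocstore({0: Document("A"), 1: Document("B")})
    first = merge_neighbors(store, [0, 1], use_deepcopy).page_content
    second = merge_neighbors(store, [0, 1], use_deepcopy).page_content
    return first, second


print(run_twice(use_deepcopy=False))  # second result is longer than the first
print(run_twice(use_deepcopy=True))   # both results are identical
```

Only the first document of each sequence needs copying, because it is the only object that gets mutated; the neighbouring chunks are read but never modified.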
Thanks, this is resolved.
imClumsyPanda added a commit that referenced this issue on Jun 14, 2023
Fixed on the master branch using the approach from the comment above. Thanks for the feedback.
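Problem Description
With context association (chunk_conent) enabled, each embedding search returns one more segment than the previous search did.
Steps to Reproduce
Expected Result
Every search returns the same content.
Actual Result
Each search returns one more segment than the one before it, as shown in the screenshot.
Environment Information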