GoogleCloudEnterpriseSearchRetriever consistently returns no results #8219
Comments
Answer generated by a 🤖
Answer
I understand that you're experiencing an issue where the [...] One potential cause of this issue could be the [...] In your provided code, I noticed that the [...]
retriever = GoogleCloudEnterpriseSearchRetriever(
    project_id="my-project-id",
    search_engine_id="my-engine",
    filter="your_filter_expression",
    max_documents=3,
)
Please replace [...] If this doesn't resolve the issue, it could be related to other parameters in the [...]
Please note that this is just one potential solution. If you find that this solution is correct and it's a bug that could impact other users, we encourage you to make a pull request. If this solution doesn't align with your experience, you might want to consider upgrading to the latest version of the software.
This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. |
Adding on Matt's question - no response is returned even when the filter is empty: Can it be the case that GoogleCloudEnterpriseSearchRetriever in langchain.retrievers does not return the document snippets for the engines built on a "website" yet, and only supports engines built on "unstructured data"? |
@mattsday @pepecura — I also ran into this issue. It's definitely returning document results. If you're looking to return web search results, you can extend the original class and override
|
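The suggestion above (extend the retriever class and override its response-conversion step) might look roughly like the following sketch. The method the commenter actually overrode is not preserved in this thread, so the class names, the `convert` method, and the `snippets` / `snippet` keys used here are all assumptions for illustration. The stand-in classes do not import LangChain, and this is not the library's API.

```python
from typing import Any, Dict, List, Sequence


class UnstructuredRetrieverSketch:
    """Hypothetical stand-in for the retriever; only the conversion step is modelled."""

    def convert(self, results: Sequence[Dict[str, Any]]) -> List[str]:
        # Mirrors the unstructured-data path: only extractive segments are read.
        texts: List[str] = []
        for result in results:
            derived = result.get("derived_struct_data") or {}
            for chunk in derived.get("extractive_segments", []):
                texts.append(chunk.get("content", ""))
        return texts


class WebsiteRetrieverSketch(UnstructuredRetrieverSketch):
    """Overrides the conversion step to read website-style snippets instead."""

    def convert(self, results: Sequence[Dict[str, Any]]) -> List[str]:
        texts: List[str] = []
        for result in results:
            derived = result.get("derived_struct_data") or {}
            # Assumed payload shape: a list of {"snippet": "..."} entries.
            for snippet in derived.get("snippets", []):
                texts.append(snippet.get("snippet", ""))
        return texts


# Hypothetical website-style result, already converted to a plain dict.
website_results = [
    {
        "id": "a",
        "derived_struct_data": {
            "link": "https://example.com",
            "snippets": [{"snippet": "hello"}],
        },
    }
]

base_texts = UnstructuredRetrieverSketch().convert(website_results)
site_texts = WebsiteRetrieverSketch().convert(website_results)
```

With this payload the base class finds no extractive segments and returns an empty list, while the subclass picks up the snippet text.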
Thank you @kamikasky - this worked well |
I just found the problem. In this function

def _convert_unstructured_search_response(
    self, results: Sequence[SearchResponse]
) -> List[Document]:
    """Converts a sequence of search results to a list of LangChain documents."""
    from google.protobuf.json_format import MessageToDict

    documents: List[Document] = []

    for result in results:
        document_dict = MessageToDict(
            result.document._pb, preserving_proto_field_name=True
        )
        derived_struct_data = document_dict.get("derived_struct_data", None)
        if not derived_struct_data:
            continue

        doc_metadata = document_dict.get("struct_data", {})
        doc_metadata["id"] = document_dict["id"]

        chunk_type = (
            "extractive_answers"
            if self.get_extractive_answers
            else "extractive_segments"
        )

        if chunk_type not in derived_struct_data:
            continue

        for chunk in derived_struct_data[chunk_type]:
            doc_metadata["source"] = derived_struct_data.get("link", "")

            if chunk_type == "extractive_answers":
                doc_metadata["source"] += f":{chunk.get('pageNumber', '')}"

            documents.append(
                Document(
                    page_content=chunk.get("content", ""), metadata=doc_metadata
                )
            )

    return documents

I can do a PR with the fix. |
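The function quoted above has two `continue` branches that drop a result without raising or logging, which is one way an empty list can come back with no error. The helper below is a minimal debugging sketch, not part of LangChain: `explain_skips` is a hypothetical name, and it assumes each result has already been turned into a plain dict (for example with `MessageToDict(..., preserving_proto_field_name=True)` as in the quoted code). It does not identify the actual bug fixed in the PR. It only reports which results those two branches would skip and why.

```python
from typing import Any, Dict, List, Sequence, Tuple


def explain_skips(
    results: Sequence[Dict[str, Any]], get_extractive_answers: bool = False
) -> List[Tuple[str, str]]:
    """Return (id, reason) for each result the conversion loop would skip."""
    chunk_type = (
        "extractive_answers" if get_extractive_answers else "extractive_segments"
    )
    skipped: List[Tuple[str, str]] = []
    for document_dict in results:
        doc_id = document_dict.get("id", "<no id>")
        derived = document_dict.get("derived_struct_data")
        if not derived:
            # First `continue` branch: no derived data at all.
            skipped.append((doc_id, "no derived_struct_data"))
        elif chunk_type not in derived:
            # Second `continue` branch: the expected chunk key is missing.
            skipped.append(
                (doc_id, f"no {chunk_type}; keys present: {sorted(derived)}")
            )
    return skipped


# Hypothetical payloads for illustration only.
sample_results = [
    {"id": "a"},
    {"id": "b", "derived_struct_data": {"link": "x", "snippets": []}},
    {"id": "c", "derived_struct_data": {"extractive_segments": [{"content": "t"}]}},
]

skips = explain_skips(sample_results)
```

Running something like this against real responses would show whether the results carry the keys the conversion expects, and under which names.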
I'm having the same issue with unstructured data. |
Just made a PR here for those interested! |
- Description: fixed Google Enterprise Search Retriever where it was consistently returning empty results
- Issue: related to [issue 8219](#8219)
- Dependencies: no dependencies
- Tag maintainer: @hwchase17
- Twitter handle: [Tomas Piaggio](https://twitter.com/TomasPiaggio)
Should this issue be marked as fixed after #10546? |
enterprise search was returning no results langchain-ai/langchain#8219
Hi, @mattsday, I'm helping the LangChain team manage their backlog and am marking this issue as stale. The issue you reported regarding the [...]
Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you for your understanding and cooperation. |
System Info
Versions:
With GoogleCloudEnterpriseSearchRetriever, it consistently returns zero results without error.
Workarounds / Validations attempted:
Calling discoveryengine_v1beta.SearchServiceClient() directly works as expected and provides results.
Who can help?
No response
Information
Related Components
Reproduction
Given the following code:
This will always print [] no matter how I tweak the search query etc.
Expected behavior
Would expect at least one search result and the result dictionary not to be empty.