Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes issue #5072 - adds additional support to Weaviate #5085

Merged
merged 4 commits into from
May 23, 2023

Conversation

jettro
Copy link
Contributor

@jettro jettro commented May 22, 2023

Implementation is similar to search_distance and where_filter

adds 'additional' support to Weaviate queries

Fixes #5072

Before submitting

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
@dev2049

Implementation is similar to search_distance and where_filter
@@ -201,6 +201,8 @@ def similarity_search_by_text(
query_obj = self._client.query.get(self._index_name, self._query_attrs)
if kwargs.get("where_filter"):
query_obj = query_obj.with_where(kwargs.get("where_filter"))
if kwargs.get("additional"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this apply to other search methods as well? (like similarity_search_by_vector)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can use the same mechanism in the function similarity_search_by_vector. They have different values that you can use in the additional field. But that is specified in the Weaviate documentation. Not sure if we should replicate that here as well.

I would not use it in the functions similarity_search_with_score and max_marginal_relevance_search_by_vector. They allready use the mechanism with the additional field. They already return a score.

Do you want me to add it to the similarity_search_by_vector method as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that'd be great!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@dev2049 dev2049 added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label May 22, 2023
Add the additional part also to the vector based query
output = docsearch.similarity_search(
"foo",
k=2,
additional=["certainty"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is certainty supposed to be in the output? maybe im misunderstanding what with_additional does

Copy link
Contributor Author

@jettro jettro May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right. I am trying to run the integration tests myself. But I cannot make it work. Now that you mention it, I think the output check should be like this:

assert output == [Document(page_content="foo", metadata={"page": 0, "_additional": {"certainty": 0}})]

I do not know how to get the value for certainty in the test. Of course I can make the change and commit, but I cannot test it myself at the moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in you can't make the integration test pass or you can't make it run at all? to run you can do

pytest tests/integration_tests/vectorstores/test_weaviate.py

you'll want to spin up local weaviate container, instructions for that at top of integration test file

cd tests/integration_tests/vectorstores/docker-compose
docker compose -f weaviate.yml up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, learning about poetry at the same time. I managed to get the integration tests running with your tips. Now it keeps telling me he is missing the OpenAI key:

E ValueError: Error during query: [{'locations': [{'column': 6, 'line': 1}], 'message': 'explorer: get class: vectorize params: vectorize params: vectorize params: vectorize keywords: remote client vectorize: OpenAI API Key: no api key found neither in request header: X-OpenAI-Api-Key nor in environment variable under OPENAI_APIKEY', 'path': ['Get', 'LangChain_47555ec10c224f01abb7b82f7458c0b6']}]

Trying to figure out what to do here. I added the .env file in the integration_tests folder

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the integration test requires OpenAI model. if you have an openai api key just set that in the terminal you're running this in with

export OPENAI_API_KEY="<api_key>"

if you don't have openai key i can just run the test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I really tried everything I could, but I cannot fix it. If you can try the test, then I will try to fix it later for my environment.

@dev2049 dev2049 merged commit b950022 into langchain-ai:master May 23, 2023
12 checks passed
vowelparrot pushed a commit that referenced this pull request May 24, 2023
Implementation is similar to search_distance and where_filter

# adds 'additional' support to Weaviate queries

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
@danielchalef danielchalef mentioned this pull request Jun 5, 2023
This was referenced Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm PR looks good. Use to confirm that a PR is ready for merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to use _additional fields while executing a Weaviate query
2 participants