Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing broken BM25 support with Weaviate - fixes #3720 #3723

Merged
merged 8 commits into from
Dec 19, 2022

Conversation

zoltan-fedor
Copy link
Contributor

Unfortunately the BM25 support with Weaviate got broken with Haystack v1.11.0+, which is getting fixed with this commit.

Please see more details under issue #3720.

Related Issues

Proposed Changes:

The issue was caused by this commit.

I have fixed it by:

  • making the Weaviate document store to be a subclass of KeywordDocumentStore
  • adding the query_batch() method to the Weaviate document store, although there is no batch querying in Weaviate currently, see @laura-ham 's comment from Weaviate, so the query_batch() will be implemented as a loop (just like it is for some other document stores in Haystack which don't support batch querying).
  • adding Weaviate to the BM25 node tests, so this can't happen again

How did you test it?

  1. Expanded the test_retrieval_without_filters test case to include Weaviate with BM25
  2. I also had a local project using BM25 with Weaviate, so I have also tested this change via that project to ensure that it in fact works with real data.

Notes for the reviewer

The BM25 search WITH filters is still not supported in Weaviate, that is coming with Weaviate v1.18.0, at which point additional tests will need to be allowed to run for Weaviate (the ones with "filters")

Checklist

Unfortunately the BM25 support with Weaviate got broken with Haystack v1.11.0+, which is getting fixed with this commit.

Please see more under issue deepset-ai#3720.
@zoltan-fedor zoltan-fedor requested a review from a team as a code owner December 15, 2022 23:43
@zoltan-fedor zoltan-fedor requested review from mayankjobanputra and removed request for a team December 15, 2022 23:43
Mypy forced me to set the signature of the `query` method of the Weaviate document store to the same as its parent, the `KeywordDocumentStore`, where the `query` parame is `Optional`, but has NO default value, so it must be provided (as None) at runtime.
I am not quite sure why the abstract method's `query` param was set without a default value while its type is `Optional`, but I didn't want to change that, so instead I have changed the Weaviate tests.
Copy link
Contributor

@ZanSara ZanSara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for taking care of WeaviateDocumentStore 😊

test/document_stores/test_weaviate.py Show resolved Hide resolved
test/document_stores/test_weaviate.py Show resolved Hide resolved
test/document_stores/test_weaviate.py Show resolved Hide resolved
test/document_stores/test_weaviate.py Show resolved Hide resolved
@ZanSara ZanSara removed the request for review from mayankjobanputra December 19, 2022 10:22
@zoltan-fedor
Copy link
Contributor Author

zoltan-fedor commented Dec 19, 2022

Hi @ZanSara,
The query() method of the WeaviateDocumetnStore no longer has a default value for the query param, as its signature now must match the KeywordDocumentStore's query() abstractmethod's signature.
That is why the query=None has been added to those test. All that is to match the signature of the KeywordDocumentStore's abstractmethods.
That is why the test is breaking now after your commit.

I agree, that the KeywordDocumentStore's query() abstractmethod's signature should probably have None as default value for the query param, but it doesn't have it and I didn't want to modify the abstractmethod. I personally believe that should be changed, but being an abstractmethod I didn't want to go there, as then all other methods which already implemented this would also needed to be changed.

@ZanSara
Copy link
Contributor

ZanSara commented Dec 19, 2022

Hey @zoltan-fedor, you're right, I didn't notice you adjusted the default. Let me revert.

@ZanSara ZanSara merged commit e143f7c into deepset-ai:main Dec 19, 2022
@zoltan-fedor zoltan-fedor deleted the bugfix-bm25-with-weaviate branch December 19, 2022 16:32
ZanSara pushed a commit that referenced this pull request Dec 20, 2022
* Fixing broken BM25 support with Weaviate - fixes #3720

Unfortunately the BM25 support with Weaviate got broken with Haystack v1.11.0+, which is getting fixed with this commit.

Please see more under issue #3720.

* Fixing mypy issue - method signature wasn't matching the base class

* Mypy related test fix

Mypy forced me to set the signature of the `query` method of the Weaviate document store to the same as its parent, the `KeywordDocumentStore`, where the `query` parame is `Optional`, but has NO default value, so it must be provided (as None) at runtime.
I am not quite sure why the abstract method's `query` param was set without a default value while its type is `Optional`, but I didn't want to change that, so instead I have changed the Weaviate tests.

* Adding a note regarding an upcomming fix in Weaviate v1.17.0

* Apply suggestions from code review

* revert

* [EMPTY] Re-trigger CI
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BM25 support with Weaviate got broken with Haystack v1.11.0+
2 participants