Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support embedding dimensions on DeepsetCloudDocumentStore #2995

Merged
merged 14 commits into from
Aug 12, 2022

Conversation

dmigo
Copy link
Member

@dmigo dmigo commented Aug 8, 2022

Related Issues

  • fixes #issue-number

Proposed Changes:

  • Add embedding_dim to dc store
  • Remove similarity and return_embedding from query params, it is not used

How did you test it?

Notes for the reviewer

Checklist

@dmigo dmigo requested a review from a team as a code owner August 8, 2022 14:25
@dmigo dmigo marked this pull request as draft August 8, 2022 14:27
@dmigo dmigo force-pushed the support-dimensions-on-dc-store branch from ef206ea to 7f51ea0 Compare August 9, 2022 09:27
@dmigo dmigo changed the title Support embedding dimensions on DeepsetCloudDocumentStore feat: Support embedding dimensions on DeepsetCloudDocumentStore Aug 9, 2022
@tstadel
Copy link
Member

tstadel commented Aug 9, 2022

Removing return_embedding from get_document is fine for now. In the future we might support that in all document stores: #3007. Currently it's just confusing as it has no effect.

Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes for deepsetcloud look good! However I guess something with your schema generation went wrong. There are multiple types missing that should be there. We first have to fix that before merging. Can you please revert all changes to haystack/json-schemas/haystack-pipeline-master.schema.json? The schema should update automatically during CI...

@anakin87
Copy link
Member

anakin87 commented Aug 9, 2022

Sorry for the intrusion.
I encountered the same problem with schema generation in the past. I add some information to be helpful...

  • the CI shows an error, requesting to locally update and commit the JSON schema
  • you try to generate the schema but the right schema is generated only if:
    • you have a full installation of Haystack (pip install -e '.[all]')
    • every module is working and importable (update_json_schema in generate_json_schema.py somewhere tries to import all the possible nodes and I found out that in my installation audio nodes were not working.)

@dmigo
Copy link
Member Author

dmigo commented Aug 9, 2022

@anakin87 Thanks for the hints! pip install -e '.[all]' is what I try to run at the moment, however some packages fail to install due to M1 probably. So, I'm debugging this behaviour right now.

@dmigo
Copy link
Member Author

dmigo commented Aug 9, 2022

To generate a valid schma I did:

  1. brew install openblas
  2. brew upgrade cmake
  3. Excluded onnx from the list of dependencies
  4. GRPC_PYTHON_BUILD_SYSTEM_ZLIB=true OPENBLAS="$(brew --prefix openblas)" pip install -e '.[all]'
  5. python .github/utils/generate_json_schema.py

@dmigo dmigo marked this pull request as ready for review August 9, 2022 14:38
@dmigo dmigo requested a review from a team as a code owner August 9, 2022 14:38
haystack/document_stores/deepsetcloud.py Outdated Show resolved Hide resolved
haystack/document_stores/deepsetcloud.py Outdated Show resolved Hide resolved
@tstadel tstadel added type:feature New feature or request ignore-for-release-notes PRs with this flag won't be included in the release notes. topic:dc_document_store topic:dc labels Aug 11, 2022
@dmigo dmigo force-pushed the support-dimensions-on-dc-store branch from f9f3489 to 8d8c589 Compare August 11, 2022 18:16
Copy link
Member

@tstadel tstadel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@tstadel tstadel merged commit da7836a into master Aug 12, 2022
@tstadel tstadel deleted the support-dimensions-on-dc-store branch August 12, 2022 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ignore-for-release-notes PRs with this flag won't be included in the release notes. topic:dc_document_store type:feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants