Update docs of DeepsetCloudDocumentStore #2460

Merged: 5 commits, Apr 27, 2022
Changes from 2 commits
10 changes: 6 additions & 4 deletions docs/_src/api/api/document_store.md
@@ -3987,18 +3987,20 @@ See https://haystack.deepset.ai/components/document-store for more information.

  - `api_key`: Secret value of the API key.
  If not specified, will be read from DEEPSET_CLOUD_API_KEY environment variable.
- - `workspace`: workspace in Deepset Cloud
- - `index`: index to access within the Deepset Cloud workspace
+ See docs on how to generate an API key for your workspace: https://docs.cloud.deepset.ai/docs/connect-deepset-cloud-to-your-application
+ - `workspace`: workspace name in Deepset Cloud
+ - `index`: name of the index to access within the Deepset Cloud workspace. This equals typically the name of your pipeline.
+ You can run Pipeline.list_pipelines_on_deepset_cloud() to see all available ones.
  - `duplicate_documents`: Handle duplicates document based on parameter options.
  Parameter options : ( 'skip','overwrite','fail')
  skip: Ignore the duplicates documents
  overwrite: Update any existing documents with the same ID when adding documents.
  fail: an error is raised if the document ID of the document being added already
  exists.
- - `api_endpoint`: The URL of the Deepset Cloud API.
+ - `api_endpoint`: The URL of the Deepset Cloud API. Usually this is: "https://api.cloud.deepset.ai/api/v1".
  If not specified, will be read from DEEPSET_CLOUD_API_ENDPOINT environment variable.
  - `similarity`: The similarity function used to compare document vectors. 'dot_product' is the default since it is
- more performant with DPR embeddings. 'cosine' is recommended if you are using a Sentence BERT model.
+ more performant with DPR embeddings. 'cosine' is recommended if you are using a Sentence Transformer model.
  - `label_index`: index for the evaluation set interface
  - `return_embedding`: To return document embedding.
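
The `api_key` and `api_endpoint` entries above both describe a fallback to an environment variable. A minimal sketch of that lookup, assuming an explicit argument takes precedence over the environment. `resolve_setting` and `USUAL_API_ENDPOINT` are hypothetical names used only for illustration, not Haystack API, and falling back to the usual endpoint when neither is set is an assumption rather than something this diff confirms:

```python
import os
from typing import Mapping, Optional

# The endpoint the docstring quotes as the usual value.
USUAL_API_ENDPOINT = "https://api.cloud.deepset.ai/api/v1"


def resolve_setting(
    explicit: Optional[str],
    env_var: str,
    default: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: explicit value first, then the environment, then a default."""
    # Allow a fake environment to be injected so the lookup is easy to test.
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get(env_var, default)


if __name__ == "__main__":
    fake_env = {"DEEPSET_CLOUD_API_KEY": "secret-from-env"}
    # No explicit key, so the value comes from the (fake) environment.
    print(resolve_setting(None, "DEEPSET_CLOUD_API_KEY", env=fake_env))
    # Neither argument nor variable is set, so the assumed default is used.
    print(resolve_setting(None, "DEEPSET_CLOUD_API_ENDPOINT", USUAL_API_ENDPOINT, env=fake_env))
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.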

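
The three `duplicate_documents` options documented in this file ('skip', 'overwrite', 'fail') can be illustrated with a small stand-alone sketch. It uses a plain dict as a stand-in store; `add_documents` and `DuplicateDocumentError` are hypothetical names for illustration and are not the DeepsetCloudDocumentStore implementation:

```python
from typing import Dict, List


class DuplicateDocumentError(ValueError):
    """Hypothetical error raised for the 'fail' option."""


def add_documents(
    store: Dict[str, str],
    documents: List[Dict[str, str]],
    duplicate_documents: str = "overwrite",
) -> Dict[str, str]:
    """Write documents into a dict keyed by document ID, applying the duplicate policy."""
    if duplicate_documents not in ("skip", "overwrite", "fail"):
        raise ValueError(f"Unknown option: {duplicate_documents!r}")
    for doc in documents:
        doc_id = doc["id"]
        if doc_id in store:
            if duplicate_documents == "skip":
                continue  # keep the existing document untouched
            if duplicate_documents == "fail":
                raise DuplicateDocumentError(f"Document ID already exists: {doc_id}")
        # New ID, or 'overwrite': store the incoming content.
        store[doc_id] = doc["content"]
    return store


if __name__ == "__main__":
    existing = {"doc-1": "old text"}
    incoming = [{"id": "doc-1", "content": "new text"}]
    print(add_documents(dict(existing), incoming, "skip"))
    print(add_documents(dict(existing), incoming, "overwrite"))
```

In this sketch, 'fail' raises as soon as it meets the first duplicate, so documents earlier in the batch have already been written; a real store may behave differently.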
12 changes: 7 additions & 5 deletions haystack/document_stores/deepsetcloud.py
@@ -9,7 +9,7 @@
  from haystack.utils import DeepsetCloud


- DEFAULT_API_ENDPOINT = f"DC_API_PLACEHOLDER/v1" # TODO
+ DEFAULT_API_ENDPOINT = "https://api.cloud.deepset.ai/api/v1"
Member

Looks like this is superfluous. The actual DEFAULT_API_ENDPOINT has moved to haystack/utils/deepsetcloud.py. There it makes sense to adjust it. But this one can be safely removed.

Member Author

Ok, adjusted it


  logger = logging.getLogger(__name__)

@@ -35,18 +35,20 @@ def __init__(

  :param api_key: Secret value of the API key.
  If not specified, will be read from DEEPSET_CLOUD_API_KEY environment variable.
- :param workspace: workspace in Deepset Cloud
- :param index: index to access within the Deepset Cloud workspace
+ See docs on how to generate an API key for your workspace: https://docs.cloud.deepset.ai/docs/connect-deepset-cloud-to-your-application
+ :param workspace: workspace name in Deepset Cloud
+ :param index: name of the index to access within the Deepset Cloud workspace. This equals typically the name of your pipeline.
+ You can run Pipeline.list_pipelines_on_deepset_cloud() to see all available ones.
  :param duplicate_documents: Handle duplicates document based on parameter options.
  Parameter options : ( 'skip','overwrite','fail')
  skip: Ignore the duplicates documents
  overwrite: Update any existing documents with the same ID when adding documents.
  fail: an error is raised if the document ID of the document being added already
  exists.
- :param api_endpoint: The URL of the Deepset Cloud API.
+ :param api_endpoint: The URL of the Deepset Cloud API. Usually this is: "https://api.cloud.deepset.ai/api/v1".
  If not specified, will be read from DEEPSET_CLOUD_API_ENDPOINT environment variable.
tholor marked this conversation as resolved.
  :param similarity: The similarity function used to compare document vectors. 'dot_product' is the default since it is
- more performant with DPR embeddings. 'cosine' is recommended if you are using a Sentence BERT model.
+ more performant with DPR embeddings. 'cosine' is recommended if you are using a Sentence Transformer model.
  :param label_index: index for the evaluation set interface

  :param return_embedding: To return document embedding.
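
The `similarity` parameter above chooses between 'dot_product' and 'cosine'. A minimal sketch of how the two scores differ, written in plain Python on toy vectors; the function names are illustrative, the vectors are not real DPR or Sentence Transformer embeddings, and this is not how the document store computes scores internally:

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Unnormalized score: grows with vector length as well as alignment."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of length-normalized vectors; assumes neither vector is all zeros."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    query = [1.0, 1.0]
    short_doc = [1.0, 1.0]
    long_doc = [3.0, 3.0]  # same direction as short_doc, three times the length
    # Dot product ranks the longer vector higher.
    print(dot_product(query, short_doc), dot_product(query, long_doc))
    # Cosine scores both the same because only direction matters.
    print(cosine(query, short_doc), cosine(query, long_doc))
```

Dot product is sensitive to embedding magnitude, while cosine compares direction only, so the right choice depends on the embedding model in use.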