Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[document_store] Raise warning when labels are overwritten #1257

Merged
merged 12 commits into from
Jul 14, 2021
Merged

[document_store] Raise warning when labels are overwritten #1257

merged 12 commits into from
Jul 14, 2021

Conversation

akkefa
Copy link
Contributor

@akkefa akkefa commented Jul 6, 2021

Proposed changes:

Status (please check what you already did):

  • First draft (up for discussions & feedback)
  • Final code
  • Added tests
  • Updated documentation

@akkefa
Copy link
Contributor Author

akkefa commented Jul 6, 2021

@brandenchan Please review this pull request. labels overwrite warning added for SQL based document store. If I am on right track I will add the same functionality for other stores.

haystack/document_store/sql.py Outdated Show resolved Hide resolved
haystack/document_store/sql.py Outdated Show resolved Hide resolved
haystack/document_store/sql.py Outdated Show resolved Hide resolved
haystack/document_store/sql.py Outdated Show resolved Hide resolved
@akkefa akkefa requested a review from brandenchan July 8, 2021 17:47
@brandenchan
Copy link
Contributor

The latest changes look great! Thanks for those. Excited to see this working for the other document stores too!

@akkefa
Copy link
Contributor Author

akkefa commented Jul 9, 2021

The latest changes look great! Thanks for those. Excited to see this working for the other document stores too!

I will add this functionality to the base class as the get_duplicate_lables() method. All document stores will use that generic method.

@akkefa akkefa marked this pull request as ready for review July 9, 2021 19:40
@akkefa
Copy link
Contributor Author

akkefa commented Jul 9, 2021

@brandenchan Please review my changes for all document stores.

@brandenchan
Copy link
Contributor

Hey @akkefa, I just tried out these changes here are my notes.

I ran this script

from haystack.document_store import ElasticsearchDocumentStore, SQLDocumentStore
from haystack import Label
from haystack.utils import launch_es

launch_es()

labels = [Label(question="", answer="", is_correct_answer=False, is_correct_document=False, origin="asdf", id="1"),
          Label(question="", answer="", is_correct_answer=False, is_correct_document=False, origin="asdf", id="1")]

es_ds = ElasticsearchDocumentStore()
es_ds.write_labels(labels)

sql_ds = SQLDocumentStore()
sql_ds.write_labels(labels)

mem_ds = InMemoryDocumentStore()
mem_ds.write_labels(labels)
  1. ES seems to work
  2. SQL throws an error when there are duplicate ids. Let's leave it like this for now.
  3. Could you add support for the InMemoryDocumentStore too?

Once you address the comments and add support for InMemoryDocumentStore, I think we should be just about ready to merge!

haystack/document_store/base.py Outdated Show resolved Hide resolved
haystack/document_store/base.py Outdated Show resolved Hide resolved
@akkefa
Copy link
Contributor Author

akkefa commented Jul 14, 2021

@brandenchan

Once you address the comments and add support for InMemoryDocumentStore, I think we should be just about ready to merge!

@brandenchan InMemoryDocumentStore changes add. Pull request is ready for merge. 😴

@brandenchan brandenchan self-requested a review July 14, 2021 14:17
Copy link
Contributor

@brandenchan brandenchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thank you for your work on this! LGTM

@brandenchan brandenchan self-requested a review July 14, 2021 14:18
@brandenchan brandenchan merged commit 97c1e2c into deepset-ai:master Jul 14, 2021
@akkefa akkefa deleted the write_labels_warning branch July 15, 2021 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants