Recent breaking changes & Migration guidance #1590

Closed
tholor opened this issue Oct 13, 2021 · 3 comments

tholor commented Oct 13, 2021

In preparation for Haystack 1.0, we will carry out some major refactoring over the next few weeks, including several breaking changes.

We will keep a list of breaking changes here to help you migrate your own Haystack scripts and applications.

Refactoring of primitives (#1398)

We improved the basic datatypes. This may break your code in particular where you:

  • write Documents to the doc store
  • process Answers from a pipeline / reader

Document

  • Rename Document.text -> Document.content
  • Remove Document.question (was only used a while ago in FAQ search cases)
  • Rename text_field -> content_field in ElasticsearchDocumentStore & Weaviate init
  • Remove faq_question_field in ElasticsearchDocumentStore & Weaviate init
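
As a rough illustration of the renames above, the following sketch rewrites old-style document dicts and document store init keyword arguments before they are handed to Haystack. The helper names `migrate_document_dict` and `migrate_store_kwargs` are hypothetical and not part of Haystack; only the field names come from the list above.

```python
from typing import Any, Dict


def migrate_document_dict(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an old-style document dict using the new field names.

    Hypothetical helper, not part of Haystack.
    """
    new = dict(old)
    # Document.text was renamed to Document.content.
    if "text" in new:
        new["content"] = new.pop("text")
    # Document.question was removed.
    new.pop("question", None)
    return new


def migrate_store_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of document store init kwargs using the new names.

    Hypothetical helper, not part of Haystack.
    """
    new = dict(old)
    # text_field was renamed to content_field.
    if "text_field" in new:
        new["content_field"] = new.pop("text_field")
    # faq_question_field was removed.
    new.pop("faq_question_field", None)
    return new
```

The migrated kwargs could then be passed on with `**kwargs` to the store's constructor, assuming the remaining keys are still valid for your Haystack version.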

Label

  • Rename Label.question -> Label.query
  • Make Label.answer an Answer obj rather than plain str
  • Remove Label.document_id (can now be accessed via Label.document.id)
  • Rename Label.model_id -> Label.pipeline_id
  • Remove Label.offset_start_in_doc (can now be accessed via Label.answer.offsets_in_document[0].start)
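
To make the new access paths concrete, here is a minimal sketch that reshapes a flat, old-style label dict into a nested one. Plain dicts stand in for Haystack's Label, Answer and Document objects, and `migrate_label_dict` is a hypothetical helper. It also assumes that the answer span ends where the answer string ends, which the list above does not state.

```python
from typing import Any, Dict


def migrate_label_dict(old: Dict[str, Any]) -> Dict[str, Any]:
    """Reshape a flat old-style label dict into the new nested layout.

    Hypothetical helper; nested dicts stand in for Haystack objects.
    """
    new = dict(old)
    # Label.question -> Label.query
    if "question" in new:
        new["query"] = new.pop("question")
    # Label.model_id -> Label.pipeline_id
    if "model_id" in new:
        new["pipeline_id"] = new.pop("model_id")
    # Label.document_id -> Label.document.id
    if "document_id" in new:
        new["document"] = {"id": new.pop("document_id")}
    # Label.answer becomes a nested object instead of a plain str, and
    # Label.offset_start_in_doc moves to answer.offsets_in_document[0].start.
    start = new.pop("offset_start_in_doc", None)
    answer_text = new.pop("answer", None)
    if answer_text is not None:
        answer: Dict[str, Any] = {"answer": answer_text}
        if start is not None:
            # Assumption: the span covers exactly the answer string.
            answer["offsets_in_document"] = [
                {"start": start, "end": start + len(answer_text)}
            ]
        new["answer"] = answer
    return new
```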

Answer

The reader now returns an Answer object rather than a dict.
It follows this new structure:

In particular, the handling of offsets has changed to be more explicit and to allow for multiple spans (e.g. TableQA):

# Old -> New 

answer["offset_start_in_doc"] -> answer.offsets_in_document[0].start
answer["offset_end_in_doc"] -> answer.offsets_in_document[0].end

answer["offset_start"] -> answer.offsets_in_context[0].start
answer["offset_end"] -> answer.offsets_in_context[0].end
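
The mapping above can be sketched as a small conversion helper. `Span` and `AnswerSketch` below are simplified stand-ins defined for this example only, not Haystack's own classes, and `from_old_answer` is a hypothetical name. The plural attribute name `offsets_in_context` is assumed here for symmetry with `offsets_in_document`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Span:
    """Stand-in for a start/end character span."""
    start: int
    end: int


@dataclass
class AnswerSketch:
    """Simplified stand-in for the new Answer object (illustration only)."""
    answer: str
    offsets_in_document: Optional[List[Span]] = None
    offsets_in_context: Optional[List[Span]] = None


def from_old_answer(old: Dict[str, Any]) -> AnswerSketch:
    """Convert an old-style answer dict into the stand-in object."""

    def spans(start_key: str, end_key: str) -> Optional[List[Span]]:
        start, end = old.get(start_key), old.get(end_key)
        if start is None or end is None:
            return None
        # A single-element list keeps room for multiple spans later.
        return [Span(start, end)]

    return AnswerSketch(
        answer=old["answer"],
        offsets_in_document=spans("offset_start_in_doc", "offset_end_in_doc"),
        offsets_in_context=spans("offset_start", "offset_end"),
    )
```

Storing the offsets as lists, even when there is only one span, mirrors the `[0]` indexing in the mapping and leaves space for the multi-span case mentioned above.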

tholor pinned this issue Oct 13, 2021

pk1130 commented Oct 14, 2021

Hey @tholor! Apologies for commenting on a pinned message. I have recently installed haystack on my local machine using the git+https method. Since I haven't used the repo too much yet, would it work to do a pip reinstall to get the newest version of Haystack working on my machine? I'm specifically using Haystack for building a retriever using DPR.

Thanks a ton!


ZanSara commented Oct 15, 2021

Hey @pk1130, if you installed Haystack with:

pip install git+https://github.com/deepset-ai/haystack.git

then you should reinstall it with pip.

Instead, if you installed it with:

git clone https://github.com/deepset-ai/haystack.git
cd haystack
pip install --editable .

then you should go to the root haystack folder and pull the latest master with:

git pull
pip install dataclasses-json

There is no need to pip reinstall Haystack after that (given that you used the option --editable as above).

Hope it helps!


PS: next time please open a separate issue 😉


ZanSara commented Apr 11, 2022

As we now have a migration guide and this list is becoming obsolete, I will close this issue. Please open a new one if you face migration issues 🙂

ZanSara closed this as completed Apr 11, 2022