
Update LFQA with the latest LFQA seq2seq and retriever models #2210

Merged · 5 commits · Mar 8, 2022

Conversation

vblagoje
Member

@vblagoje vblagoje commented Feb 17, 2022

  • Enables the use of the new LFQA model and its DPR-based context and question encoders.

  • The old LFQA architecture continues to be supported.

  • Haystack users can choose which architecture to use (old vs. new LFQA)

  • Mixing and matching an old/new retriever with an old/new seq2seq generator is also possible

  • The new LFQA architecture has almost 2x better KILT retriever scores on Wikipedia-like text.

  • Empirically, grounding is better in the new architecture (no concrete metrics available yet)

  • Motivation for the new LFQA architecture is detailed in this article

Note: do not merge before we update the documentation. What would you like the new LFQA docs to look like? We could, for example, showcase both architectures in a single notebook. Feedback and suggestions are appreciated.
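As a rough sketch of the new architecture described above (model names, `embedding_dim`, and the Haystack 1.x API are assumptions based on this thread, not taken from the PR diff itself):

```python
# Sketch: new LFQA pipeline with a DPR-based retriever and the new seq2seq generator.
# The vblagoje/* model names and embedding_dim=128 are assumed from this discussion;
# verify them against the merged tutorial before relying on them.
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever, Seq2SeqGenerator
from haystack.pipelines import GenerativeQAPipeline

document_store = FAISSDocumentStore(embedding_dim=128, faiss_index_factory_str="Flat")

retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="vblagoje/dpr-question_encoder-single-lfqa-wiki",
    passage_embedding_model="vblagoje/dpr-ctx_encoder-single-lfqa-wiki",
)
generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")

pipeline = GenerativeQAPipeline(generator, retriever)
# After writing documents and calling document_store.update_embeddings(retriever),
# queries run as usual:
# result = pipeline.run(query="...", params={"Retriever": {"top_k": 3}})
```

Because retriever and generator are independent nodes, the mix-and-match point above falls out for free: either component can be swapped for its old counterpart without touching the other.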

Status (please check what you already did):

  • First draft (up for discussions & feedback)
  • Final code
  • Added tests
  • Updated documentation

@vblagoje
Member Author

Guess we have three options: a) keep the documentation as is, b) update the documentation to use the new LFQA, c) showcase both old and new LFQA in one notebook. Although the last option might be cool from my perspective, it might break Haystack's pattern of small, focused documentation examples. LMK @TuanaCelik

@TuanaCelik
Member

Good question. Let me check with @brandenchan while I am also understanding how we would ideally do these things 👍🏽

@TuanaCelik
Member

Thanks for waiting @vblagoje. Here's how we suggest proceeding:

  • Update the tutorial so that the new and improved LFQA is the default.
  • You could also add a note in the tutorial about how this latest version improves on the previous one, and link to the blog post you wrote if you like.
  • If you want to go further with the LFQA tutorial, you could also add a paragraph at the beginning explaining why this is so exciting, point to some resources about LFQA, and preview what you can achieve by running the notebook.

Branden had a look at our documentation, and there's actually no need for a change there because we don't specify which model it uses. So just the tutorial :)

Hope this helps! Let me know if there's anything else.

@vblagoje
Member Author

Ok, thanks @TuanaCelik. When you say LFQA tutorial, what exactly are you referring to?

@TuanaCelik
Member

TuanaCelik commented Feb 22, 2022

Ah sorry @vblagoje, we were referring to the .py tutorial script here and the .ipynb notebook here.

Just to confirm: I assume that to get the new LFQA to run, we just need to specify a new LFQA model name, yes?

@vblagoje
Member Author

A bit more change is needed - we need to use DPR encoders instead of the EmbeddingRetriever.

@TuanaCelik
Member

Hey @vblagoje, yes you're right. You can go ahead with changing the EmbeddingRetriever to DPR and make the changes so that the new LFQA models are used by default (i.e. when you change the EmbeddingRetriever, also change the generator to use the new LFQA model, so "yjernite/bart_eli5" becomes "vblagoje/bart_lfqa").

If that and the updates from the previous comment are complete, we will make sure on our side to include the relevant info in our documentation here.
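To make the swap concrete, here is a hedged before/after sketch of the tutorial change (the old retriever configuration and all model names except "yjernite/bart_eli5" and "vblagoje/bart_lfqa" are assumptions, not quoted from the PR diff):

```python
from haystack.nodes import DensePassageRetriever, Seq2SeqGenerator

# Old LFQA (previous tutorial default) - assumed configuration:
# retriever = EmbeddingRetriever(
#     document_store=document_store,
#     embedding_model="yjernite/retribert-base-uncased",  # assumed old model
#     model_format="retribert",
# )
# generator = Seq2SeqGenerator(model_name_or_path="yjernite/bart_eli5")

# New LFQA default (this PR) - DPR encoders plus the new generator.
# The DPR encoder names are assumed from this thread:
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="vblagoje/dpr-question_encoder-single-lfqa-wiki",
    passage_embedding_model="vblagoje/dpr-ctx_encoder-single-lfqa-wiki",
)
generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa")
```

Note that swapping retriever types also changes the document embeddings, so re-running `document_store.update_embeddings(retriever)` after the switch would be required.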


@ZanSara
Contributor

ZanSara commented Mar 3, 2022

Just a note: please ignore the merge conflicts for the docs and JSON schemas. Tag me when you are ready to merge and I'll take care of them 🙂

@vblagoje
Member Author

vblagoje commented Mar 7, 2022

@ZanSara this was ready to merge two weeks ago. Apologies, I should have been more explicit about it.

@ZanSara
Contributor

ZanSara commented Mar 8, 2022

@vblagoje Great, thanks for the contribution!

@ZanSara ZanSara self-requested a review March 8, 2022 14:07
@ZanSara ZanSara merged commit 6c0094b into deepset-ai:master Mar 8, 2022
@vblagoje vblagoje deleted the seq2seq_new_lfqa branch February 28, 2023 12:08