Where do I get the document subset of Cord-19 used for covid-qa #182

jdpsen · 2022-09-23T12:16:04Z

The paper mentions "We selected 147 scientific articles mostly related to COVID-19 from the CORD-19". How can I get that subset of documents to create an index?

Timoeller · 2022-09-23T13:18:13Z

You can recover the documents from the QA dataset itself.
You can find the QA dataset here: https://github.com/deepset-ai/COVID-QA/blob/master/data/question-answering/200423_covidQA.json

In this JSON, the fields called "context" contain the document texts.
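
A minimal sketch of that conversion, assuming the file follows the SQuAD-style layout (`data` → `paragraphs` → `context`) and has been downloaded locally. The function names and the local path are hypothetical, chosen only for illustration:

```python
import json
from pathlib import Path


def extract_documents(squad_like: dict) -> list:
    """Collect the unique "context" strings, keeping first-seen order.

    Assumes a SQuAD-style structure:
    {"data": [{"paragraphs": [{"context": "...", "qas": [...]}]}]}
    """
    seen = set()
    documents = []
    for article in squad_like.get("data", []):
        for paragraph in article.get("paragraphs", []):
            context = paragraph.get("context")
            # Skip empty contexts and contexts already collected.
            if context and context not in seen:
                seen.add(context)
                documents.append(context)
    return documents


def load_documents(path: str) -> list:
    """Read a local copy of the QA JSON and return its document texts."""
    with Path(path).open(encoding="utf-8") as f:
        return extract_documents(json.load(f))
```

For example, `load_documents("200423_covidQA.json")` (hypothetical local path) should return one string per distinct document, which can then be written to whatever index you use.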

What do you want to create an index for? Are you using Haystack to create a searchable index?
