MRC Datasets

Xanh Ho edited this page Jan 26, 2021 · 5 revisions

Multi-hop Reasoning Datasets

| Year | Dataset | Task | Size | Source | Paper, Web | Answering type | Created by | Note |
|---|---|---|---|---|---|---|---|---|
| 2020 | 2WikiMultiHopQA | RC | 200K | Wikipedia | paper, web | Span extraction | automated | |
| 2020 | HybridQA | RC | 70K | Wikipedia | paper, web | Span extraction | crowdsourcing | combined with tabular data |
| 2020 | R4C | R4C | 5K | Wikipedia | paper, web | Span extraction | crowdsourcing | |
| 2018 | HotpotQA | RC | 113K | Wikipedia | paper, web | Span extraction | crowdsourcing | |
| 2018 | ComplexWebQuestions | RC | 35K | web snippet | paper, web | Span extraction | automated & crowdsourcing | |
| 2018 | QAngaroo | RC | 50K | Wikipedia, MEDLINE | paper, web | Multiple choice | automated | |

Commonsense Reasoning Datasets

| Year | Dataset | Task | Size | Source | Paper, Web | Answering type | Created by | Note |
|---|---|---|---|---|---|---|---|---|
| 2020 | ProtoQA | RC | 9.8K | web snippet | paper, web | Free answering | crowdsourcing | |
| 2019 | CosmosQA | RC | 36K | narrative | paper, web | Multiple choice | crowdsourcing | |
| 2019 | HellaSwag | RC | 70K | web snippet | paper, web | Multiple choice | language model | |
| 2019 | MCScript 2.0 | RC | 20K | narrative | paper, web | Multiple choice | crowdsourcing | |
| 2019 | SocialIQA | OpenQA | 38K | | paper, web | Multiple choice | crowdsourcing | |
| 2018 | CommonsenseQA | OpenQA | 12K | ConceptNet | paper, web | Multiple choice | crowdsourcing | |
| 2018 | ReCoRD | RC | 120K | news article | paper, web | Span extraction | crowdsourcing | |
| 2018 | OpenBookQA | OpenQA | 6.0K | textbook | paper, web | Multiple choice | crowdsourcing | |
| 2018 | SWAG | RC | 113K | video captions | paper, web | Multiple choice | language model | |
| 2018 | DuoRC | RC | 186K | movie script | paper, web | Span extraction | crowdsourcing | |
| 2018 | MCScript | RC | 30K | written story | paper, web | Multiple choice | crowdsourcing | |

Coreference Reasoning Datasets

Survey paper: Coreference Reasoning in Machine Reading Comprehension

| Year | Dataset | Task | Size | Source | Paper, Web | Answering type | Created by | Note |
|---|---|---|---|---|---|---|---|---|
| 2019 | Quoref | RC | 24K | Wikipedia | paper, web | Span extraction | crowdsourcing | |