obada-jaras/Arabic-QA-Datasets

Arabic-QA-Datasets

Data format (all files follow the SQuAD JSON layout):

file.json
├── "data"
│   └── [i]
│       ├── "paragraphs"
│       │   └── [j]
│       │       ├── "context": "paragraph text"
│       │       └── "qas"
│       │           └── [k]
│       │               ├── "answers"
│       │               │   └── [l]
│       │               │       ├── "answer_start": N
│       │               │       └── "text": "answer"
│       │               ├── "id": "<uuid>"
│       │               └── "question": "paragraph question?"
│       └── "title": "document id"
└── "version": XXX
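As a minimal sketch of how a file in this layout could be read, the Python below walks the nested structure and yields one flat record per question. The function names (`load_squad`, `iter_examples`, `answer_is_aligned`) and the `SAMPLE` document are hypothetical and not part of this repository. The alignment check assumes `answer_start` is a character offset into `context`, as in the original SQuAD files.

```python
import json
from typing import Any, Dict, Iterator


def load_squad(path: str) -> Dict[str, Any]:
    """Load a SQuAD-format JSON file (UTF-8, so Arabic text is preserved)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def iter_examples(dataset: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per question: data[i] -> paragraphs[j] -> qas[k]."""
    for article in dataset["data"]:
        title = article.get("title", "")
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield {
                    "id": qa["id"],
                    "title": title,
                    "question": qa["question"],
                    "context": context,
                    "answers": qa["answers"],
                }


def answer_is_aligned(context: str, answer: Dict[str, Any]) -> bool:
    """Check that answer_start really points at the answer text in the context.

    Assumption: answer_start is a character (code point) offset.
    """
    start = answer["answer_start"]
    text = answer["text"]
    return context[start:start + len(text)] == text


# Hypothetical in-memory document, used only for illustration.
SAMPLE = {
    "version": "1.0",
    "data": [
        {
            "title": "doc-1",
            "paragraphs": [
                {
                    "context": "القاهرة هي عاصمة مصر.",
                    "qas": [
                        {
                            "id": "q-1",
                            "question": "ما هي عاصمة مصر؟",
                            "answers": [{"answer_start": 0, "text": "القاهرة"}],
                        }
                    ],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    for example in iter_examples(SAMPLE):
        ok = all(answer_is_aligned(example["context"], a) for a in example["answers"])
        print(example["id"], example["question"], "aligned" if ok else "misaligned")
```

To run this against one of the dataset files, replace `SAMPLE` with `load_squad("file.json")`. Translated datasets can contain offsets that no longer match the context, so the alignment check may be useful before training.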

Datasets:

| Dataset | Q/A Pairs | Paragraphs | Articles |
|---|---|---|---|
| Arabic-SQuAD | 48,344 | 10,364 | - |
| AAQAD | 17,911 | 3,381 | 299 |
| TyDiQA | 16,425 | 15,726 | - |
| MLQA | 5,852 | 5,085 | 2,627 |
| ARCD | 1,395 | 465 | 155 |
| Total | 89,927 | 35,021 | 19,038 |
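Counts like those in the table could be computed from a SQuAD-format file roughly as follows. This sketch assumes that "Articles" corresponds to entries of `"data"`, "Paragraphs" to entries of `"paragraphs"`, and "Q/A Pairs" to entries of `"qas"`. The repository does not state how its figures were obtained, so that mapping is a guess. The function `count_stats` and the `TOY` document are hypothetical.

```python
from typing import Any, Dict


def count_stats(dataset: Dict[str, Any]) -> Dict[str, int]:
    """Count articles, paragraphs, and Q/A pairs in a SQuAD-format dict.

    Assumed mapping: one entry of "data" = one article,
    one entry of "paragraphs" = one paragraph, one entry of "qas" = one Q/A pair.
    """
    articles = dataset["data"]
    paragraphs = [p for article in articles for p in article["paragraphs"]]
    qa_pairs = sum(len(p["qas"]) for p in paragraphs)
    return {
        "qa_pairs": qa_pairs,
        "paragraphs": len(paragraphs),
        "articles": len(articles),
    }


def _qa(qid: str) -> Dict[str, Any]:
    """Build a placeholder Q/A entry for the toy document below."""
    return {"id": qid, "question": "?", "answers": [{"answer_start": 0, "text": "x"}]}


# Hypothetical toy document: 2 articles, 3 paragraphs, 4 Q/A pairs.
TOY = {
    "version": "1.0",
    "data": [
        {
            "title": "a",
            "paragraphs": [
                {"context": "x y", "qas": [_qa("1"), _qa("2")]},
                {"context": "x z", "qas": [_qa("3")]},
            ],
        },
        {
            "title": "b",
            "paragraphs": [{"context": "x w", "qas": [_qa("4")]}],
        },
    ],
}

if __name__ == "__main__":
    print(count_stats(TOY))
```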