This repository includes the QQ and QH datasets mentioned in the paper: Challenging the Transformer-based models with a Classical Arabic dataset: Quran and Hadith

ShathaTm/Quran_Hadith_Datasets

Quran_Hadith_Datasets

This repository includes the QQ and QH datasets as described in the paper:

Altammami, S., Atwell, E. (2022) 'Challenging the Transformer-based models with a Classical Arabic dataset: Quran and Hadith'. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). Marseille, 20–25 June 2022.

QH_Dataset.csv

Contains a balanced set of 310 related and non-related Quran-verse and Hadith-teaching pairs.

QQ_Ar_training_4072.csv

Contains a balanced set of 4072 related and non-related Quran-verse pairs.

QQ_Ar_tafseer_training_8144.csv

Contains the Arabic Tafseer (exegesis) from Aljlyleen and Almuyaser for each pair in the QQ_Ar_training_4072.csv dataset.

QQ_En_training_20360.csv

Contains five different English translations of the Quran-verse pairs in the QQ_Ar_training_4072.csv dataset.

QQ_Ar_testing_1024.csv

Contains a balanced set of 1024 related and non-related Quran-verse pairs that do not appear in the training dataset QQ_Ar_training_4072.csv.
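
The files above can be loaded and sanity-checked with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper names `read_rows` and `overlapping_pairs` are hypothetical, and the column layout of the CSV files is not documented here, so the sketch assumes that each file starts with a header row and that the two verse texts sit in the first two columns. Check the printed header against the real files before relying on the result.

```python
import csv
from pathlib import Path


def read_rows(path, encoding="utf-8-sig"):
    """Return (header, rows) for a CSV file.

    Assumption: the first row is a header. "utf-8-sig" reads plain UTF-8
    and also tolerates a leading byte-order mark.
    """
    with Path(path).open(newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = [row for row in reader if row]  # skip blank lines
    return header, rows


def overlapping_pairs(train_rows, test_rows, pair_columns=(0, 1)):
    """Return the test rows whose pair also occurs in the training rows.

    Assumption: pair_columns gives the positions of the two verse columns.
    (0, 1) is a guess; adjust it to match the actual header.
    Pairs are compared without regard to order, so (a, b) matches (b, a).
    """
    first, second = pair_columns
    seen = {frozenset((row[first], row[second])) for row in train_rows}
    return [
        row for row in test_rows
        if frozenset((row[first], row[second])) in seen
    ]
```

To use it, call `read_rows("QQ_Ar_training_4072.csv")` and `read_rows("QQ_Ar_testing_1024.csv")`, compare the row counts with the numbers in the file names, and pass the two row lists to `overlapping_pairs`. If the column assumption holds, an empty result is consistent with the statement that the testing pairs do not appear in the training set.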
