TaghreedT/NAH-Corpus

NAH-Hadith-Corpus

Non-authentic Hadith Corpus

  • Arabic Hadith corpus

  • It contains 452,624 words drawn from several lesser-known Hadith books.

  • It also includes several annotated Hadith books. The annotations mark the switch points between the Isnad, the Matan, and the author's comment, which provides a ground truth.

  • Some of these books contain both authentic and non-authentic Hadiths (NAH), while others contain only NAH.

  • The NAH_Contents.csv file lists all the Hadith books in this corpus.

  • Each Hadith in this corpus is annotated with eight primary features:

    1. No.: The Hadith reference number.

    2. Full Hadith: The Hadith as it appears in the book, without annotations.

    3. Isnad: The chain of narrators.

    4. Matan: The act of the Prophet Muhammad.

    5. Authors Comments: The author's remarks on the authenticity of each Hadith.

    6. Hadith Type: The Hadith type (Maqtu` مقطوع, Mawquf موقوف, or Marfoʻ مرفوع) or the Hadith degree (ضعيف "weak", موضوع "fabricated", and so on).

    7. Authenticity: Whether this Hadith is authentic or non-authentic.

    8. Topic: The chapter title.
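The eight features above map naturally onto one record per Hadith. The following is a minimal sketch of reading such records in Python, not the corpus's own tooling. It assumes the annotated books are stored as CSV with a header row, and that the column names match the feature names listed above; both are assumptions, as are the `HadithRecord` and `read_records` names and the placeholder values in the sample row.

```python
import csv
import io
from dataclasses import dataclass

# Assumed column headers, copied from the feature list above.
# The actual headers in the corpus files may differ.
FIELDS = [
    "No.", "Full Hadith", "Isnad", "Matan",
    "Authors Comments", "Hadith Type", "Authenticity", "Topic",
]


@dataclass
class HadithRecord:
    """Hypothetical container for one annotated Hadith."""
    number: str
    full_hadith: str
    isnad: str
    matan: str
    authors_comments: str
    hadith_type: str
    authenticity: str
    topic: str


def read_records(handle):
    """Read annotated Hadiths from an open CSV file-like object."""
    reader = csv.DictReader(handle)
    return [
        # Missing or empty cells become empty strings.
        HadithRecord(*((row.get(name) or "").strip() for name in FIELDS))
        for row in reader
    ]


# Placeholder row for illustration only; it is not taken from the corpus.
sample = io.StringIO(
    "No.,Full Hadith,Isnad,Matan,Authors Comments,Hadith Type,Authenticity,Topic\n"
    "1,<full text>,<isnad>,<matan>,<comment>,موضوع,<label>,<chapter title>\n"
)
records = read_records(sample)
print(records[0].number, records[0].hadith_type)
```

For a real file, pass an open handle instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`. The UTF-8 encoding is also an assumption and should be checked against the actual files.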

If you use the NAH corpus, please cite this paper:

Tarmom T, Atwell E, Alsalka MA. 2020. Non-authentic Hadith Corpus: Design and Methodology. International Journal on Islamic Applications in Computer Science And Technology. 8(3), pp. 13-19. http://eprints.whiterose.ac.uk/155642/
