Overview

This work has two phases. The first builds SPIRAL, a corpus dedicated to detecting and correcting spelling errors in Arabic text. The second measures the impact of the corpus through an experimental study using a Transformer-based Model for Arabic Language Understanding. The F1 scores obtained per error type were: 80.2% for morphology errors, 81.6% for phonetic errors, 73% for physical errors, 78.3% for permutation errors, 64.3% for keyboard errors, 33.7% for deletion errors, 86% for space-issue errors, and 84.5% for tachkil (diacritization) errors.
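For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The snippet below is a minimal sketch of how such a score could be computed for one error type, assuming counts of true positives, false positives, and false negatives are available. The function name `f1_score` and the example counts are illustrative assumptions; they are not taken from the paper or from this repository's evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Compute F1 from raw counts.

    tp: errors correctly detected/corrected
    fp: spurious detections/corrections
    fn: errors that were missed
    Returns 0.0 when all counts are zero, to avoid dividing by zero.
    """
    denom = 2 * tp + fp + fn
    if denom == 0:
        return 0.0
    # Equivalent to 2 * precision * recall / (precision + recall).
    return 2 * tp / denom


# Hypothetical counts for a single error type (not from the paper).
example = f1_score(tp=8, fp=2, fn=2)
print(f"F1 = {example:.1%}")
```

Computing the score separately per error type, as in the figures above, makes it easier to see which categories the model handles poorly.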

Dataset

The dataset is available on request.

Citation

@article{aichaoui2022automatic,
  title={Automatic Building of a Large Arabic Spelling Error Corpus},
  author={Aichaoui, Shaimaa Ben and Hiri, Nawel and Dahou, Abdelhalim Hafedh and Cheragui, Mohamed Amine},
  journal={SN Computer Science},
  volume={4},
  number={2},
  pages={108},
  year={2022},
  publisher={Springer}
}