Overview

This work has two phases. The first builds SPIRAL, a corpus dedicated to the detection and correction of spelling errors in Arabic texts. The second assesses the impact of the corpus through an experimental study using a Transformer-based Model for Arabic Language Understanding.

The F1 scores obtained per error type were: 80.2% for morphology errors, 81.6% for phonetic errors, 73% for physical errors, 78.3% for permutation errors, 64.3% for keyboard errors, 33.7% for delete errors, 86% for space-issue errors, and 84.5% for tachkil errors.
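For readers unfamiliar with the metric, the sketch below shows the standard definition of F1 as the harmonic mean of precision and recall. It is only an illustration: the README does not say how the scores were computed (for example, at token or sentence level), and the function name `f1_score` and the counts in the usage line are hypothetical, not taken from the experiments.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 from true-positive, false-positive and false-negative counts.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if tp == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # share of flagged errors that are real errors
    recall = tp / (tp + fn)     # share of real errors that were flagged
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up counts for a single error type, for demonstration only.
    print(f"F1 = {f1_score(tp=8, fp=2, fn=2):.3f}")
```

Because F1 combines precision and recall, a low score such as the one reported for delete errors could stem from many missed errors, many false alarms, or both; the score alone does not say which.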

Dataset

The dataset is available on request.

Citation

@article{aichaoui2022automatic,
  title={Automatic Building of a Large Arabic Spelling Error Corpus},
  author={Aichaoui, Shaimaa Ben and Hiri, Nawel and Dahou, Abdelhalim Hafedh and Cheragui, Mohamed Amine},
  journal={SN Computer Science},
  volume={4},
  number={2},
  pages={108},
  year={2022},
  publisher={Springer}
}
