ARASTEM.rar
LICENSE
README.md

ARASTEM-corpus

ARASTEM is a new corpus for Arabic stemming. It contains several documents, each grouping words that are semantically and morphologically related. The corpus was built entirely by hand by native Arabic speakers, using texts collected from various Arabic discussion forums. It includes words from Standard Arabic, Dialectal Arabic and Modern Pseudo Arabic.

Contributors: Ibtissem Abainia, Ahmed Kedaya, Chouaib Fellah and Otman Bordjiba