This corpus contains 1166 singular-plural pairs in Moroccan Arabic (Darija), extracted from the Darija Open Dataset (DODa, Outchakoucht and Es-Samaali, 2021). Each entry includes:
- Singular form (in IPA).
- Plural form (in IPA).
- Gloss (English translation).
- Singular and plural patterns (or templates).
- Classification of the plural as "sound" or "broken."
This corpus is designed for linguistic research, particularly in the study of Moroccan Arabic morphology.
The corpus is provided as a .csv file with the following columns:
- singular: Singular form of the noun.
- plural: Plural form of the noun.
- gloss: English translation of the noun.
- singular_pattern: Pattern of the singular form (all consonants indicated by the letter 'C').
- plural_pattern: Pattern of the plural form (all consonants indicated by the letter 'C').
- plural_type: "sound" or "broken."
To use the corpus, download the .csv file and open it with any spreadsheet software or programming language (e.g., Python, R).
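As a starting point in Python, the file can be read with the standard library alone. The sketch below is a minimal example, not an official loader: the function names `load_pairs` and `count_plural_types` are hypothetical, the column names are taken from the list in this README, and UTF-8 encoding is assumed because the forms are given in IPA.

```python
import csv
from collections import Counter

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "singular",
    "plural",
    "gloss",
    "singular_pattern",
    "plural_pattern",
    "plural_type",
]


def load_pairs(path):
    """Read the corpus CSV into a list of dicts, one per singular-plural pair."""
    # Assumption: the file is UTF-8 encoded. "utf-8-sig" also reads plain
    # UTF-8 and simply drops a byte-order mark if a spreadsheet tool added one.
    with open(path, encoding="utf-8-sig", newline="") as f:
        reader = csv.DictReader(f)
        # Fail early if the header does not match the documented columns.
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return list(reader)


def count_plural_types(rows):
    """Count how many pairs are labelled 'sound' and how many 'broken'."""
    return Counter(row["plural_type"].strip().lower() for row in rows)


# Example usage (the file name is a placeholder; use the name of the
# .csv file you downloaded):
#   rows = load_pairs("plurals.csv")
#   print(len(rows))
#   print(count_plural_types(rows))
```

Each row is returned as a plain dictionary keyed by column name, so the same list can be filtered or grouped further, for example by `plural_pattern`, without extra dependencies.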
This corpus is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). You are free to share and adapt the corpus for non-commercial purposes, as long as you provide appropriate attribution.
This corpus is derived from the Darija Open Dataset (DODa, Outchakoucht and Es-Samaali, 2021), which is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
If you use this corpus, please cite both this work and the original DODa dataset:
- Nirheche, A. (2025). Moroccan Arabic Plurals Corpus [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14642330
- Outchakoucht, A. and Es-Samaali, H. (2021). Darija Open Dataset (DODa). github.com/darija-open-dataset
Under the terms of the CC BY-NC 4.0 license:
- You are free to share and adapt the corpus for non-commercial purposes.
- You must give appropriate credit to the original authors.
- You may not use this corpus for commercial purposes without permission from the copyright holders of the original DODa dataset.
For questions or feedback, please contact anirheche@umass.edu.