Dialectal Arabic POS Tagger

Dialectal Arabic POS Tagger is a freeware module developed by the ALT team at Qatar Computing Research Institute (QCRI) to process Dialectal Arabic. The tagger was trained on a collection of dialectal Arabic tweets collected from four regions: Egypt, Gulf, Maghrib and Levantine.

The tagger is implemented using Keras, BiLSTM and ChainCRF.

Requirements

The tagger requires the following packages:

Installation

You can install the Dialectal Arabic POS Tagger by cloning the repo.

Installing Dialectal Arabic POS Tagger from GitHub

Clone the repo from GitHub using the following command:

git clone https://github.com/qcri/dialectal_arabic_pos_tagger

Alternatively, download the project as a compressed file and extract it.

Getting started

Dialectal Arabic POS Tagger reads an input Arabic text file and produces the POS tags, one segment per line. The tagger expects the input file to be encoded in UTF-8. To tag a file:

python arabic_pos_tagger.py -i [in-file] -o [out-file] 

To use a specific model:

python arabic_pos_tagger.py -m [model-dir] -i [in-file] -o [out-file] 
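To tag several files in one go, the commands above can be wrapped in a small script. The following is a minimal sketch, not part of the tagger itself: the function names, the `*.txt` input pattern, and the `.pos` output suffix are assumptions made for illustration. It uses only the `-i`, `-o` and `-m` options shown above and assumes it is run from the repository directory so that `arabic_pos_tagger.py` can be found.

```python
import subprocess
import sys
from pathlib import Path


def build_command(in_file, out_file, model_dir=None, script="arabic_pos_tagger.py"):
    """Build the argument list for one tagger run.

    Only the -i, -o and -m options from the usage lines above are used.
    """
    cmd = [sys.executable, script]
    if model_dir is not None:
        cmd += ["-m", str(model_dir)]
    cmd += ["-i", str(in_file), "-o", str(out_file)]
    return cmd


def tag_directory(in_dir, out_dir, model_dir=None):
    """Run the tagger once per *.txt file in in_dir (hypothetical layout)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for in_file in sorted(Path(in_dir).glob("*.txt")):
        # Hypothetical naming scheme: tweets/a.txt -> tagged/a.pos
        out_file = out_dir / (in_file.stem + ".pos")
        # check=True raises CalledProcessError if the tagger exits with an error
        subprocess.run(build_command(in_file, out_file, model_dir), check=True)
```

For example, `tag_directory("tweets", "tagged")` would tag every `.txt` file in a hypothetical `tweets` folder. Each file is tagged in a separate process, so the model is reloaded for every file; this keeps the sketch simple at the cost of speed.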

For more details see:

python arabic_pos_tagger.py -h
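Because the tagger expects UTF-8 input, it can help to check a file's encoding before tagging it. The sketch below is an illustration only and is independent of the tagger's own code: the function names are hypothetical, and the default source encoding `cp1256` (Windows Arabic) is an assumption that should be replaced with whatever encoding the file actually uses.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding="cp1256"):
    """Re-encode src into dst as UTF-8.

    source_encoding is an assumption: cp1256 is only a common example,
    so pass the encoding the file actually uses.
    """
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

A typical use would be to call `is_utf8` on the input file and run `convert_to_utf8` only when it returns `False`. Note that a clean decode does not prove the file was meant to be UTF-8, and decoding with the wrong `source_encoding` may silently produce wrong characters, so the converted file is worth a quick visual check.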

Publications

Randah Alharbi, Walid Magdy, Kareem Darwish, Ahmed Abdelali and Hamdy Mubarak. (2018) Part-of-Speech Tagging for Arabic Gulf Dialect Using Bi-LSTM. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). May 7-12, 2018. Miyazaki, Japan. Pages 3925-3932.

Kareem Darwish, Hamdy Mubarak, Ahmed Abdelali, Mohamed Eldesouki, Younes Samih, Randah Alharbi, Mohammed Attia, Walid Magdy and Laura Kallmeyer. (2018) Multi-Dialect Arabic POS Tagging: A CRF Approach. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). May 7-12, 2018. Miyazaki, Japan. Pages 93-98.

Support

You can ask questions and join the development discussion:

Bug reports and feature requests (only) can be posted in GitHub issues. Make sure to read our guidelines first.

License

Dialectal Arabic POS Tagger is covered by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
