Assem's Arabic Stemmer

This is an Arabic stemming algorithm written in the Snowball language. It offers light stemming and text normalization.

@article{Chelli2018,
author = "Assem Chelli",
title = "{Assem's Arabic Stemmer}",
year = "2018",
month = "11",
url = "https://figshare.com/articles/Assem_s_Arabic_Stemmer/7295690",
doi = "10.6084/m9.figshare.7295690.v1"
}

This is a sample of results:

| Word | Light Stemmer | Root-Based Stemmer |
|---|---|---|
| طفل | طفل | طفل |
| اطفال | اطفال | طفل |
| الاطفال | اطفال | طفل |
| اطفالكم | اطفال | طفل |
| فأطفالكم | اطفال | طفل |
| اطفالهم | اطفال | طفل |
| والاطفال | اطفال | طفل |
| فاطفالهم | اطفال | طفل |
| وطفل | طفل | طفل |
| الطفولة | طفول | طفل |
| والطفلتين | طفل | طفل |
| طفلتان | طفل | طفل |
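
To give a feel for what light stemming and normalization do, here is a minimal Python sketch. It is not the Snowball implementation from this repository: the affix lists, the `min_len` guard, and the function names `normalize` and `light_stem` are assumptions chosen only for illustration, and the real rules are more extensive.

```python
import re

# Hypothetical affix lists for illustration only (longest candidates first).
PREFIXES = ["وال", "فال", "ال", "و", "ف"]
SUFFIXES = ["كم", "هم", "تين", "تان", "ة"]


def normalize(word):
    """Strip diacritics and tatweel, and unify alef variants."""
    word = re.sub(r"[\u064B-\u0652\u0640]", "", word)
    word = re.sub(r"[أإآ]", "ا", word)
    return word


def light_stem(word, min_len=3):
    """Remove at most one prefix and one suffix, keeping at least min_len letters."""
    word = normalize(word)
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    for w in ["الاطفال", "فأطفالكم", "والطفلتين"]:
        print(w, "->", light_stem(w))
```

On the twelve sample words above, this toy version happens to reproduce the Light Stemmer column. It should not be expected to match the real stemmer in general.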

Requirements:

The dependencies are already attached as git submodules, so just run:

$ git submodule update --init --recursive

Build:

$ make build

Run:

  • Light Stemmer
$ make run
الطالب
طالب
  • Root-Based Stemmer
$ make run_root
الطالب
طلب

Test:

The tests run against the snowball-data Arabic sample and measure speed, grouping factor, and precision.

$ make test
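
The README does not define the grouping factor. As a rough sketch only, the Python below assumes it means the number of distinct input words divided by the number of distinct stems they map to, so a higher value means more word forms are merged. The function name `grouping_factor` and the toy stemmer are hypothetical and are not part of this repository's test suite.

```python
def grouping_factor(words, stem):
    """Assumed definition: distinct words / distinct stems (not taken from the repo)."""
    unique_words = set(words)
    unique_stems = {stem(w) for w in unique_words}
    if not unique_stems:
        return 0.0
    return len(unique_words) / len(unique_stems)


if __name__ == "__main__":
    # Toy stemmer for demonstration: keep only the first three characters.
    sample = ["abcd", "abce", "xyz"]
    print(grouping_factor(sample, lambda w: w[:3]))
```

Under this assumed definition, a stemmer that never changes its input scores exactly 1.0.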

Distributions:

  • Generate light stemmer distributions for the available languages:
$ make dist

About

Assem's Arabic Light Stemmer is a Snowball-based stemming algorithm for Arabic, aimed mainly at improving search.
