An Automated System for Essay Scoring of Online Exams in Arabic based on Stemming Techniques and Levenshtein Edit Operations

Global NIPS Paper Implementation Challenge

This repository implements the paper by following its research methodology.

Original Paper

https://arxiv.org/pdf/1611.02815.pdf

Main Goal

Develop an automated system for scoring essays written in Arabic for online exams, based on stemming techniques and Levenshtein edit operations.

Programming Tool

Python 2.7

Files

Important files and directories:

- heavy_stemming.py: the complete source code for the heavy stemming approach
- light_stemming.py: the complete source code for the light stemming approach
- docs: several text files, such as questions, correct_ans, and student_ans
- prefixes: stores the list of prefixes
- suffixes: stores the list of suffixes
- stopwords: stores the list of stopwords

To Run

To run the program, execute one of the following commands:

Heavy stemming approach: python heavy_stemming.py
Light stemming approach: python light_stemming.py

Methodology

Both approaches (heavy and light stemming) use the following steps. They differ only in the removal of prefixes and suffixes.

Begin Heavy Stemming on both student and correct answers
This initial step has two sub-steps: removing numbers from both answers and removing diacritics from both answers. To remove the diacritics, each answer is first converted to Unicode.
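The clean-up step above can be sketched as follows. This is a minimal illustration written for Python 3, not the code from heavy_stemming.py or light_stemming.py. The function name `normalize_answer` is hypothetical, and treating the range U+064B to U+0652 plus U+0670 as "the diacritics" is an assumption.

```python
import re

# Assumption: the Arabic diacritics (tashkeel) are taken to be the combining
# marks U+064B..U+0652 plus the superscript alef U+0670.
ARABIC_DIACRITICS = re.compile(u"[\u064B-\u0652\u0670]")

# \d with re.UNICODE matches ASCII digits as well as Arabic-Indic digits.
DIGITS = re.compile(r"\d", re.UNICODE)


def normalize_answer(text):
    """Return the answer with digits and Arabic diacritics removed.

    Hypothetical helper: byte strings are decoded as UTF-8 first, mirroring
    the "convert to Unicode, then remove diacritics" order described above.
    """
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    text = DIGITS.sub(u"", text)
    return ARABIC_DIACRITICS.sub(u"", text)
```

For example, calling `normalize_answer` on the word كَتَبَ followed by a number would drop the three fatha marks and the digits, leaving the bare letters and the surrounding whitespace.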
Split each of the two answers into an array of words, processing one word at a time
This step removes stopwords, then removes the prefix if the word length is greater than 3, and removes the suffix if the word length is greater than 3.
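A rough sketch of this word-level step is shown below. The sample lists are small hypothetical stand-ins for the contents of the prefixes, suffixes, and stopwords files, whose format is not described here. Stripping only the longest matching affix is also an assumption, and the sketch does not try to reproduce the difference between the heavy and light variants.

```python
# Hypothetical sample data; the real lists live in the prefixes, suffixes,
# and stopwords files of this repository.
SAMPLE_PREFIXES = [u"ال"]
SAMPLE_SUFFIXES = [u"ون", u"ة"]
SAMPLE_STOPWORDS = {u"في", u"من"}


def strip_affixes(word, prefixes, suffixes):
    """Remove one prefix and one suffix, each only if the word is longer than 3."""
    if len(word) > 3:
        # Assumption: try longer prefixes first and strip at most one.
        for prefix in sorted(prefixes, key=len, reverse=True):
            if word.startswith(prefix):
                word = word[len(prefix):]
                break
    if len(word) > 3:
        # Assumption: try longer suffixes first and strip at most one.
        for suffix in sorted(suffixes, key=len, reverse=True):
            if word.endswith(suffix):
                word = word[:-len(suffix)]
                break
    return word


def stem_answer(text, stopwords, prefixes, suffixes):
    """Split an answer on whitespace, drop stopwords, and strip affixes."""
    return [
        strip_affixes(word, prefixes, suffixes)
        for word in text.split()
        if word not in stopwords
    ]
```

With the sample lists, `stem_answer(u"المعلمون في المدرسة", SAMPLE_STOPWORDS, SAMPLE_PREFIXES, SAMPLE_SUFFIXES)` drops the stopword and reduces the two remaining words to معلم and مدرس.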
Find the similarities by giving a weight to each word in both answers
The weight formula for each word: Word(i) weight = 1 / (total words in correct answer)
For each word in student answer, calculate the similarity with words in correct answer
This step calculates the Levenshtein distance between every word in the student answer and the words in the correct answer, then calculates the similarity score for each of those word pairs.
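The distance and similarity calculation could look like the sketch below. The Levenshtein distance is the standard dynamic-programming version. The similarity formula is not stated in this README, so normalizing the distance by the length of the longer word is an assumption, and both function names are hypothetical.

```python
def levenshtein(a, b):
    """Standard edit distance (insert, delete, substitute), two-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter word as the row to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Assumed similarity: 1 - distance / length of the longer word."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty words are treated as identical
    return 1.0 - levenshtein(a, b) / float(longest)
```

Under this assumed formula, identical words score 1.0 and two four-letter words that differ in one letter score 0.75.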
Calculate the final mark from the similarity scores
These are the rules for calculating the final mark:
- If the similarity between StudentWord(i) and CorrectWord(i) = 1, add the weight to the final mark
- Else if the similarity between StudentWord(i) and CorrectWord(i) is >= 0.96 and < 1, add the weight to the final mark
- Else if the similarity between StudentWord(i) and CorrectWord(i) is >= 0.8 and < 0.96, add half the weight to the final mark
- Else if the similarity between StudentWord(i) and CorrectWord(i) is < 0.8, no weight is added to the final mark
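The marking rules above, together with the per-word weight of 1 / (total words in correct answer), can be sketched as follows. The function names are hypothetical. The sketch also assumes that each student word has already been reduced to a single similarity score, for example its best match among the correct words, which this README does not spell out.

```python
def word_credit(sim, weight):
    """Credit earned by one student word under the thresholds listed above."""
    if sim >= 0.96:
        return weight        # exact or near-exact match: full weight
    if sim >= 0.8:
        return weight / 2.0  # partial match: half the weight
    return 0.0               # below 0.8: no credit


def final_mark(similarities, correct_word_count):
    """Sum the credit over all student words.

    similarities: one score per student word (assumed input shape).
    correct_word_count: number of words in the correct answer.
    """
    weight = 1.0 / correct_word_count
    return sum(word_credit(sim, weight) for sim in similarities)
```

As a worked example, with four words in the correct answer the weight is 0.25, so similarities of 1.0, 0.97, 0.85, and 0.5 contribute 0.25 + 0.25 + 0.125 + 0 = 0.625. The sketch does not deduplicate student words, so an answer that repeats a matching word could exceed a mark of 1.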

Albertus Kelvin
Bandung Institute of Technology

Code was developed on January 21st, 2018
Code was made publicly available on January 31st, 2018


albertusk95/nips-challenge-essay-scoring-arabic
