The camel_arclean utility cleans Arabic text by:
- Deleting characters that are not in Arabic, ASCII, or Latin-1.
- Converting all spacing characters to an ASCII space character.
- Converting Indic digits into Arabic digits.
- Converting extended Arabic letters into basic Arabic letters.
- Converting 1-char presentation forms into simple basic forms.
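
The steps above can be approximated in a few lines of standard-library Python. This is only a rough sketch under stated assumptions, not the mapping that camel_arclean actually applies: the function name `rough_arclean` and the two-entry `EXTENDED_TO_BASIC` table are hypothetical, "Arabic digits" is read here as the ASCII digits 0-9, "Arabic" is taken to mean the U+0600-U+06FF block, and the input is assumed to be a single line of text.

```python
import unicodedata

# Hypothetical, deliberately tiny table of extended -> basic Arabic letters.
# The table used by camel_arclean itself may differ.
EXTENDED_TO_BASIC = {
    "\u06a9": "\u0643",  # KEHEH -> KAF
    "\u06cc": "\u064a",  # FARSI YEH -> YEH
}


def rough_arclean(line: str) -> str:
    """Approximate the cleaning steps listed above for one line of text."""
    cleaned = []
    for ch in line:
        # Any spacing character becomes a plain ASCII space.
        if ch.isspace():
            cleaned.append(" ")
            continue
        # Arabic-Indic and Extended Arabic-Indic digits become ASCII digits.
        if "\u0660" <= ch <= "\u0669" or "\u06f0" <= ch <= "\u06f9":
            cleaned.append(str(unicodedata.digit(ch)))
            continue
        # Presentation forms are folded only when they decompose to exactly
        # one character; the rest are dropped by the final filter below.
        if "\ufb50" <= ch <= "\ufdff" or "\ufe70" <= ch <= "\ufefc":
            folded = unicodedata.normalize("NFKC", ch)
            if len(folded) == 1:
                ch = folded
        # Extended letters are mapped to basic ones (illustrative subset).
        ch = EXTENDED_TO_BASIC.get(ch, ch)
        # Keep only ASCII, Latin-1, and the basic Arabic block.
        if ord(ch) <= 0xFF or 0x0600 <= ord(ch) <= 0x06FF:
            cleaned.append(ch)
    return "".join(cleaned)
```

Folding presentation forms through NFKC is a substitute technique chosen for brevity; it happens to cover the single-character cases but is not claimed to match the tool's own tables.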
Below is the usage information that can be generated by running camel_arclean --help.
Usage:
    camel_arclean [-o OUTPUT | --output=OUTPUT] [FILE]
    camel_arclean (-v | --version)
    camel_arclean (-h | --help)

Options:
    -o OUTPUT --output=OUTPUT
        Output file. If not specified, output will be printed to stdout.
    -h --help
        Show this screen.
    -v --version
        Show version.
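
For callers that want to drive the utility from a script, the usage line above can be turned into a small wrapper. The following is a minimal sketch: the helper names `build_arclean_command` and `run_arclean` are hypothetical, it assumes camel_arclean is installed and on the PATH, and it assumes the tool's output is UTF-8.

```python
import shutil
import subprocess


def build_arclean_command(input_path, output_path=None):
    """Build an argument list following the documented usage line."""
    command = ["camel_arclean"]
    if output_path is not None:
        # Long form of the documented -o OUTPUT option.
        command.append(f"--output={output_path}")
    command.append(input_path)
    return command


def run_arclean(input_path, output_path=None):
    """Run camel_arclean and return the completed process."""
    if shutil.which("camel_arclean") is None:
        raise FileNotFoundError("camel_arclean is not on PATH")
    return subprocess.run(
        build_arclean_command(input_path, output_path),
        check=True,            # raise if the tool exits with an error
        capture_output=True,   # cleaned text is in .stdout when no output file is given
        encoding="utf-8",      # assumption: the tool writes UTF-8
    )
```

Calling `run_arclean("input.txt")` with a placeholder file name would leave the cleaned text in the `stdout` attribute of the returned object, since output goes to stdout when no output file is specified.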