Contains the CharMapper
class for mapping characters in a Unicode string to other strings.
camel_tools.utils.charmap.CharMapper
camel_tools.utils.charmap.InvalidCharMapKeyError
camel_tools.utils.charmap.BuiltinCharMapNotFoundError
JSON files to be used with CharMapper
should have the following format:
{
"default": "",
"charMap": {
"a": "z",
"b-g": "",
"x-z": null
}
}
The root object in the file should be a dictionary with two keys: 'default' and 'charMap'. These correspond to and follow the same restrictions as the respective input parameters to the CharMapper
constructor (with null in the JSON file corresponding to None in Python).
Below is a listing of built-in mappings:
- ar2bw Transliterates Arabic text to Buckwalter scheme.
- ar2safebw Transliterates Arabic text to Safe Buckwalter scheme.
- ar2xmlbw Transliterates Arabic text to XML Buckwalter scheme.
- ar2hsb Transliterates Arabic text to Habash-Soudi-Buckwalter scheme.
- bw2ar Transliterates Buckwalter scheme text to Arabic.
- bw2safebw Transliterates Buckwalter scheme text to Safe Buckwalter scheme.
- bw2xmlbw Transliterates Buckwalter scheme text to XML Buckwalter scheme.
- bw2hsb Transliterates Buckwalter scheme text to Habash-Soudi-Buckwalter scheme.
- safebw2ar Transliterates Safe Buckwalter scheme text to Arabic.
- safebw2bw Transliterates Safe Buckwalter scheme text to Buckwalter scheme.
- safebw2xmlbw Transliterates Safe Buckwalter scheme text to XML Buckwalter scheme.
- safebw2hsb Transliterates Safe Buckwalter scheme text to Habash-Soudi-Buckwalter scheme.
- xmlbw2ar Transliterates XML Buckwalter Scheme text to Arabic.
- xmlbw2bw Transliterates XML Buckwalter Scheme text to Buckwalter scheme.
- xmlbw2safebw Transliterates XML Buckwalter Scheme text to Safe Buckwalter scheme.
- xmlbw2hsb Transliterates XML Buckwalter Scheme text to Habash-Soudi-Buckwalter scheme.
- hsb2ar Transliterates Habash-Soudi-Buckwalter scheme text to Arabic.
- hsb2bw Transliterates Habash-Soudi-Buckwalter scheme text to Buckwalter scheme.
- hsb2safebw Transliterates Habash-Soudi-Buckwalter scheme text to Safe Buckwalter scheme.
- hsb2xmlbw Transliterates Habash-Soudi-Buckwalter scheme text to XML Buckwalter scheme.
See ../../reference/encoding_schemes
for more information on Arabic encoding schemes.
arclean Cleans Arabic text by
- Deleting characters that are not in Arabic, ASCII, or Latin-1.
- Converting all spacing characters to an ASCII space character.
- Converting Indic digits into Arabic digits.
- Converting extended Arabic letters into basic Arabic letters.
- Converting 1-char presentation froms into simple basic forms.