Convert Unicode text into safe ASCII strings
Emacs Lisp
README.org
unidecode-chars.el
unidecode-test.el
unidecode.el

README.org

TODOs

  • Optimize unidecode-chars.el, probably the same way Python Unidecode does it
  • In the unidecode-unidecode function I could also use (map 'string …) instead of (apply #'concat (mapcar …)), but I'm not sure how to require cl-extra.el

How this package was made

This section explains how the data of the Python package was converted to Emacs Lisp.

The following Python script was run in Unidecode's source directory to load its data into the final_data list.

import os

xs = os.listdir('.')
xs.remove('__init__.py')
xs.sort()

final_data = []
for filename in xs:
    module = __import__(filename[:-3])
    module_data = list(module.data)

    # some modules have fewer than 256 entries (e.g. 255); pad them to 256
    # so that every block takes the same number of slots in final_data
    if len(module_data) < 256:
        module_data += [''] * (256 - len(module_data))

    final_data.extend(module_data)

The following script was used to dump final_data to a JSON file.

import json

# the name f is used because open() returns a file object, not a filename
with open('unidecode.json', 'w') as f:
    json.dump(final_data, f)
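
Before loading the file into Emacs, it can be worth checking that the dump is intact. The following is a minimal sketch of such a check, not part of the original process: the function name dump_and_check and the sample table are hypothetical, and the check relies only on the fact stated above that every block was padded to 256 entries.

```python
import json
import os
import tempfile


def dump_and_check(table, path):
    """Write table as JSON, read it back, and verify the round trip."""
    with open(path, 'w') as f:
        json.dump(table, f)
    with open(path) as f:
        loaded = json.load(f)
    # Every block was padded to 256 entries, so the total length
    # must be a multiple of 256.
    if len(loaded) % 256 != 0:
        raise ValueError('table length %d is not a multiple of 256' % len(loaded))
    if loaded != table:
        raise ValueError('JSON round trip changed the table')
    return loaded


# Hypothetical stand-in for final_data: a single block of 256 entries.
sample = ['a'] * 255 + ['']
path = os.path.join(tempfile.mkdtemp(), 'unidecode.json')
loaded = dump_and_check(sample, path)
```

With the real data, final_data would take the place of sample.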

The following command was used to load the JSON data as an Emacs Lisp vector, after installing the json package in Emacs:

(json-read-file "unidecode.json")

The resulting vector was then stored verbatim in unidecode-chars.el.
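
Padding every block to 256 entries is what makes a flat table convenient: a character's replacement can be found by position alone. The sketch below illustrates the idea in Python. It is not the code in unidecode.el, and it assumes the table starts at block 0 and contains every following block without gaps. If the source directory skips some blocks, the positions shift and a real lookup would have to account for that. The function name transliterate and the toy table are hypothetical.

```python
def transliterate(text, table):
    """Replace each character with table[code point].

    Characters whose code point lies beyond the table are dropped.
    Assumes the table has no missing blocks, so index == code point.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        out.append(table[cp] if cp < len(table) else '')
    return ''.join(out)


# Hypothetical one-block table: ASCII maps to itself, everything else
# to the empty string, except U+00E9 (e with acute accent) -> 'e'.
table = [chr(i) if i < 128 else '' for i in range(256)]
table[0xE9] = 'e'

result = transliterate('caf\u00e9', table)
```

Here result is the string cafe, since U+00E9 maps to e and the ASCII letters map to themselves.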