Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
sanitize - convert Unicode text into safe ASCII strings - Emacs Lisp package
Emacs Lisp
Branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
README.org
unidecode-chars.el
unidecode-test.el
unidecode.el

README.org

TODOs

  • Optimize unidecode-chars.el, probably the same way, as in Python Unidecode
  • In unidecode-unidecode function i can also use (map ‘string …) instead of (apply #’concat (mapcar …)), but i’m not sure how to require cl-extra.el

How this package was made

The following is the explanation of the process of converting the Python package to Emacs Lisp.

The following Python script was run in Unidecode’s source directory to load its data to final_data list.

import os, sys

xs = os.listdir('.')
xs.remove('__init__.py')
xs.sort()

final_data = []
for filename in xs:
    module = __import__(filename[:-3])
    module_data = list(module.data)

    # some modules have data of 255 entries, fill them up to 256
    if len(module_data) < 256:
        module_data += [''] * (256 - len(module_data))

    final_data.extend(module_data)

The following script was used to dump final_data to a JSON file.

import json

with open('unidecode.json', 'w') as filename:
    json.dump(final_data, filename)

The following command was used to load JSON data as Emacs Lisp vector, after installing json package in Emacs:

(json-read-file "unidecode.json")

After that the resulting vector was just stored verbatim in unidecode-chars.el.

Something went wrong with that request. Please try again.