GitHub - behdad/ucdn: Unicode Database and Normalization

UCDN - Unicode Database and Normalization

UCDN is a Unicode support library. Currently, it provides access
to basic character properties contained in the Unicode Character
Database and low-level normalization functions (pairwise canonical
composition/decomposition and compatibility decomposition). More
functionality might be provided in the future, such as additional
properties, string normalization and encoding conversion.
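
To illustrate what "pairwise" canonical composition and decomposition
mean, here is a minimal, self-contained sketch of the arithmetic Hangul
syllable algorithm from the Unicode Standard. It is not UCDN's code and
does not use its tables. The helper names hangul_decompose_pair and
hangul_compose_pair are hypothetical and chosen only for this example.
Each call handles exactly one step: one code point is split into two, or
two are merged into one.

```c
#include <stdint.h>

/* Constants of the Hangul syllable algorithm (Unicode Standard, ch. 3). */
#define SBASE  0xAC00u  /* first precomposed syllable */
#define LBASE  0x1100u  /* first leading consonant */
#define VBASE  0x1161u  /* first vowel */
#define TBASE  0x11A7u  /* one before the first trailing consonant */
#define LCOUNT 19u
#define VCOUNT 21u
#define TCOUNT 28u
#define NCOUNT (VCOUNT * TCOUNT)
#define SCOUNT (LCOUNT * NCOUNT)

/* Hypothetical helper: split one Hangul syllable into a canonical pair.
 * Returns 1 and fills *a and *b on success, or 0 if code is not a
 * precomposed Hangul syllable. */
static int hangul_decompose_pair(uint32_t code, uint32_t *a, uint32_t *b)
{
    uint32_t si;

    if (code < SBASE || code >= SBASE + SCOUNT)
        return 0;
    si = code - SBASE;
    if (si % TCOUNT != 0) {
        /* LVT syllable: LV syllable followed by a trailing consonant. */
        *a = SBASE + (si / TCOUNT) * TCOUNT;
        *b = TBASE + si % TCOUNT;
    } else {
        /* LV syllable: leading consonant followed by a vowel. */
        *a = LBASE + si / NCOUNT;
        *b = VBASE + (si % NCOUNT) / TCOUNT;
    }
    return 1;
}

/* Hypothetical helper: merge a canonical pair into one Hangul syllable.
 * Returns 1 and fills *code on success, or 0 if the pair does not compose. */
static int hangul_compose_pair(uint32_t *code, uint32_t a, uint32_t b)
{
    /* Leading consonant + vowel gives an LV syllable. */
    if (a >= LBASE && a < LBASE + LCOUNT &&
        b >= VBASE && b < VBASE + VCOUNT) {
        *code = SBASE + ((a - LBASE) * VCOUNT + (b - VBASE)) * TCOUNT;
        return 1;
    }
    /* LV syllable + trailing consonant gives an LVT syllable. */
    if (a >= SBASE && a < SBASE + SCOUNT && (a - SBASE) % TCOUNT == 0 &&
        b > TBASE && b < TBASE + TCOUNT) {
        *code = a + (b - TBASE);
        return 1;
    }
    return 0;
}
```

For example, U+D55C splits into the pair U+D558, U+11AB, and U+D558 in
turn splits into U+1112, U+1161. A full decomposition therefore means
applying the pairwise step repeatedly, and composition runs the same
steps in reverse.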

UCDN uses standard C89 with no particular dependencies or requirements
except for stdint.h, and can be easily integrated into existing
projects. However, it can also be used as a standalone library,
and a CMake build script is provided for this. The first motivation
behind UCDN development was to provide a standalone set of Unicode
functions for the HarfBuzz OpenType shaping library. For this purpose,
a HarfBuzz-specific wrapper is shipped along with it (hb-ucdn.h).

UCDN is published under the ISC license; see the license header in
the C source code for details. The makeunicodedata.py script, which
parses the Unicode database files, is licensed under the PSF license;
see PYTHON-LICENSE for details.

UCDN was written by Grigori Goronzy <greg@kinoho.net>.

How to Use

Add ucdn.c, ucdn.h and unicodedata_db.h to your project, then call
the functions documented in ucdn.h.

In some cases, it might be necessary to regenerate the Unicode
database file. The script makeunicodedata.py (Python 3.x required)
fetches the appropriate files and dumps the compressed database into
unicodedata_db.h.