my attempt at implementing Unicode
C++ Shell
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
include/libuni
src
test
.gitignore
CMakeLists.txt
Makefile
README.org
configure

README.org

libuni

libuni is my attempt to implement parts of the Unicode Standard. It is written in C++ (with the use of new C++11 features).

Currently libuni provides support for handling

  • basic UTF-8, UTF-16 (missing LE/BE, encoding), UTF-32 (missing LE/BE)
  • normalization (isNF*, toNFD, toNFKD, no working toNFC/toNFKC atm)
  • case mapping (at the moment only general case mapping (1-1) and no SpecialCasing.txt support, yet)
  • segmentation (Word Boundaries only) (UAX#29)

The implementation is partially inspired by Python’s Unicode implementation. See ICU (Boost.Locale provides a nice wrapper) or libunistring for a full Unicode implementation.

Building

Requirements

  • CMake
  • C++-Compiler with C++11 support. libuni is developed on Ubuntu 11.10 with GCC 4.6.
  • Boost: Only required for compilation (header only) and unit tests (Boost.Test).
  • UCD (see next section)

UCD

libuni generated the Unicode Database tables during compile time. It requires the ”Unicode Character Database” (UCD, see UAX#44) files which can be found at http://www.unicode.org/Public/UNIDATA/. Please read the Terms of Use before installing and using them. To make the UCD available to libuni, download the file http://www.unicode.org/Public/UNIDATA/UCD.zip and extract it into a directory called UCD in the libuni source directory. Alternatively you can set the CMake variables UCD_PATH and UCD_VERSION to point to your UCD installation.

cd libuni # move to libuni source directory
mkdir UCD
cd UCD
wget http://www.unicode.org/Public/UNIDATA/UCD.zip
unzip UCD.zip
cd ..

Compilation

libuni uses CMake but comes with a handy autotools like wrapper and can therefore be build similar to other Unix/Linux software

./configure
make
sudo make install

With make test the unit tests are build and run (requires Boost.Test).

Usage

libuni is not complete at the moment and heavily under development!

#include <iostream>
#include <libuni/utf8.hpp>
#include <libuni/normalization.hpp>

int main() {
  std::string s = "libüni";
  std::cout << std::boolalpha << libuni::is_nfd(s) << std::endl;
  std::cout << toNFD(s) << std::endl;
}