D equivalent of Posix wcwidth / wcswidth

Relevant issues

Manifest

strwidth.d	// main implementations.
benchmark.d	// benchmarking utility.
compileWidth.d	// optional utility for extracting needed width data
		// from Unicode data file.
(widthtbl.d)	// file generated by compileWidth from Unicode data

Benchmarking

To compile the benchmarking tool, if you have SCons, just run:

scons

and it will compile the benchmarking tool (as well as the compileWidth tool, see below). You may have to edit SConstruct to point to the location of dmd on your machine.

Otherwise:

dmd -unittest -O benchmark.d strwidth.d widthtbl.d

Running the benchmark utility will test the various widthX() implementations alongside various baselines:

walkLength, which is very fast but incorrect because it does not take grapheme clusters into account;
byGraphemeWalk and graphemeStrideWalk, which do take grapheme clusters into account but are also incorrect because they don't account for East Asian Width and zero-width characters. They're also pretty slow because grapheme segmentation is expensive.
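
To illustrate why these baselines disagree with the displayed width, here is a minimal sketch using only Phobos (std.range.walkLength, std.uni.byGrapheme, std.uni.graphemeStride). The function names codePointCount, graphemeCount and graphemeStrideCount are made up for this example; they are a guess at what such baselines look like, not the code in benchmark.d.

```d
import std.range : walkLength;
import std.uni : byGrapheme, graphemeStride;

/// Counts code points: walkLength over a string decodes UTF-8 as it goes.
size_t codePointCount(string s)
{
    return s.walkLength;
}

/// Counts grapheme clusters by iterating with std.uni.byGrapheme.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

/// Counts grapheme clusters by skipping graphemeStride code units at a time.
size_t graphemeStrideCount(string s)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i += graphemeStride(s, i))
        n++;
    return n;
}

unittest
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT: one cluster, two code points.
    assert(codePointCount("e\u0301") == 2);
    assert(graphemeCount("e\u0301") == 1);
    assert(graphemeStrideCount("e\u0301") == 1);

    // Two CJK ideographs (U+4E16, U+754C): both counts say 2, although wide
    // characters like these usually occupy two terminal columns each.
    assert(codePointCount("\u4E16\u754C") == 2);
    assert(graphemeCount("\u4E16\u754C") == 2);
    assert(graphemeStrideCount("\u4E16\u754C") == 2);
}
```

Saved to a file of its own, this can be checked with `dmd -unittest -main -run <file>.d`. The second group of assertions is the case the widthX() functions are meant to handle: neither code point counting nor grapheme counting sees that the characters are wide.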

These are all tested against randomly-generated strings of various lengths and contents (ASCII-only, or a random mixture of ASCII and non-ASCII Unicode characters).

The benchmark utility takes an optional command-line argument specifying the number of strings to test per function per string type. The default number is 10000.

Generating lookup tables

The lookup tables used by isWide() and the widthX() implementations are, in part, generated from the EastAsianWidth.txt file published by the Unicode Consortium:

ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt

The code for parsing this file is in compileWidth.d.
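
For readers unfamiliar with the file format, the sketch below shows one way a single line of EastAsianWidth.txt could be parsed. Each data line holds a code point or an inclusive range (`XXXX` or `XXXX..YYYY`), a semicolon, the width property, and an optional `#` comment. The WidthEntry struct and parseLine function are hypothetical names used only for illustration; this is not necessarily how compileWidth.d does it.

```d
import std.algorithm.searching : findSplit;
import std.conv : to;
import std.string : strip;

/// Hypothetical record: an inclusive code point range and its width property.
struct WidthEntry
{
    dchar lo, hi;
    string prop;    // e.g. "W", "F", "Na", "A", "H", "N"
}

/// Parses one line of the data file. Returns false for blank lines and
/// comment-only lines, true (and fills `entry`) for data lines.
bool parseLine(string line, out WidthEntry entry)
{
    // Drop the trailing comment, if any, then surrounding whitespace.
    auto content = line.findSplit("#")[0].strip;
    if (content.length == 0)
        return false;

    // Split "range ; property"; spaces around the fields are tolerated.
    auto fields = content.findSplit(";");
    auto range = fields[0].strip;
    entry.prop = fields[2].strip;

    // A range is either a single code point or "lo..hi", both in hex.
    auto bounds = range.findSplit("..");
    entry.lo = cast(dchar)(bounds[0].to!uint(16));
    entry.hi = bounds[1].length ? cast(dchar)(bounds[2].to!uint(16)) : entry.lo;
    return true;
}

unittest
{
    WidthEntry e;
    assert(parseLine("1100..115F;W  # Lo  [96] HANGUL CHOSEONG KIYEOK..FILLER", e));
    assert(e.lo == 0x1100 && e.hi == 0x115F && e.prop == "W");

    assert(parseLine("3000;F  # Zs  IDEOGRAPHIC SPACE", e));
    assert(e.lo == 0x3000 && e.hi == 0x3000 && e.prop == "F");

    assert(!parseLine("# a comment-only line", e));
    assert(!parseLine("", e));
}
```

The comment is stripped before the range is split, so a `..` inside the comment text does not confuse the range parsing.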

To run compileWidth:

dmd -unittest -O compileWidth.d   # or just `scons` if you have it
./compileWidth <path_to_widths_file>

The precompiled Trie data in widthtbl.d is generated by:

./compileWidth -f trie > widthtbl.d

Note that EastAsianWidth.txt is a required input for generating this table. Do not overwrite widthtbl.d unless you have this file at hand; otherwise the widthX() functions will not work correctly.
