Monospace Unicode character width in Ruby
Latest commit 3ddf5e6 Sep 7, 2016 @janlelis Release 1.1.1

Unicode::DisplayWidth

Determines the monospace display width of a string in Ruby. The implementation is based on EastAsianWidth.txt and other data, and is written 100% in Ruby. Unlike wcwidth(), which fulfills a similar purpose, it does not rely on the OS vendor to provide an up-to-date method for measuring string width.

Unicode version: 9.0.0

Introduction to Character Widths

Guessing the correct space a character will consume on terminals is not easy. There is no single standard. Most implementations combine data from East Asian Width, some General Categories, and hand-picked adjustments.
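
As a small illustration of why a dedicated width calculation is needed, the following sketch uses only core Ruby (no gem required) to show that neither the character count nor the byte count of a string matches the number of terminal columns it usually occupies. The column counts mentioned in the comments describe typical terminal behavior and are not computed by this snippet.

```ruby
# Core Ruby only: String#length counts characters and String#bytesize
# counts bytes. Neither one is the number of terminal columns.
ideograph = "\u4E00"  # "一", usually rendered two columns wide
accented  = "e\u0301" # "e" + COMBINING ACUTE ACCENT, usually one column

puts ideograph.length   # 1 character
puts ideograph.bytesize # 3 bytes in UTF-8
puts accented.length    # 2 characters
```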

How this Library Handles Widths

Rows nearer the top have higher precedence. Please expect changes to this algorithm with every MINOR version update (the X in 1.X.0)!

| Width  | Characters | Comment |
|--------|------------|---------|
| X      | (user defined) | Overwrites any other values |
| -1     | "\b" | Backspace (total width never below 0) |
| 0      | "\0", "\x05", "\a", "\n", "\v", "\f", "\r", "\x0E", "\x0F" | C0 control codes that do not change horizontal width |
| 1      | "\u{00AD}" | SOFT HYPHEN |
| 2      | "\u{2E3A}" | TWO-EM DASH |
| 3      | "\u{2E3B}" | THREE-EM DASH |
| 0      | General Categories: Mn, Me, Cf (non-arabic) | Excludes ARABIC format characters |
| 0      | "\u{1160}".."\u{11FF}" | HANGUL JUNGSEONG |
| 2      | East Asian Width: F, W | Full-width characters |
| 1 or 2 | East Asian Width: A | Ambiguous characters, user defined, default: 1 |
| 1      | All other codepoints | - |
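
The precedence order above can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: the method name `sketch_width`, the constants, and the tiny hand-picked codepoint ranges are all assumptions made for this example, and they cover only a fraction of the real Unicode data. It takes a plain hash as the override argument.

```ruby
# Illustrative sketch only. All names and ranges here are hypothetical
# stand-ins for the full Unicode data a real implementation would use.
SPECIAL_WIDTHS = {
  0x08   => -1, # "\b" backspace
  0x00   => 0,  # "\0"
  0x0A   => 0,  # "\n"
  0x0D   => 0,  # "\r"
  0x00AD => 1,  # SOFT HYPHEN
  0x2E3A => 2,  # TWO-EM DASH
  0x2E3B => 3   # THREE-EM DASH
}.freeze

# Combining diacritical marks (General Category Mn) and HANGUL JUNGSEONG.
ZERO_WIDTH_RANGES = [0x0300..0x036F, 0x1160..0x11FF].freeze
# CJK unified ideographs (East Asian Width W) and fullwidth forms (F).
WIDE_RANGES = [0x4E00..0x9FFF, 0xFF01..0xFF60].freeze
# MIDDLE DOT, one example of East Asian Width A.
AMBIGUOUS_CODEPOINTS = [0x00B7].freeze

def sketch_width(string, ambiguous = 1, overwrite = {})
  total = string.each_codepoint.sum do |cp|
    # Branches are checked in the same order as the rows of the table.
    if overwrite.key?(cp)
      overwrite[cp]
    elsif SPECIAL_WIDTHS.key?(cp)
      SPECIAL_WIDTHS[cp]
    elsif ZERO_WIDTH_RANGES.any? { |range| range.cover?(cp) }
      0
    elsif WIDE_RANGES.any? { |range| range.cover?(cp) }
      2
    elsif AMBIGUOUS_CODEPOINTS.include?(cp)
      ambiguous
    else
      1
    end
  end
  [total, 0].max # the total width never drops below 0
end

sketch_width("\u4E00")                   # => 2
sketch_width("\u00B7", 2)                # => 2
sketch_width("a\tb", 1, { 0x09 => 10 })  # => 12
sketch_width("\b")                       # => 0
```

Checking the branches in table order is what gives the user-defined overwrite priority over every built-in rule.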


Install the gem with:

gem install unicode-display_width

Or add to your Gemfile:

gem 'unicode-display_width'


require 'unicode/display_width'

Unicode::DisplayWidth.of("⚀") # => 1
Unicode::DisplayWidth.of("一") # => 2

Ambiguous Characters

The second parameter sets the width assigned to characters classified as ambiguous:

Unicode::DisplayWidth.of("·", 1) # => 1
Unicode::DisplayWidth.of("·", 2) # => 2

Custom Overwrites

You can overwrite how specific code points are handled by passing a hash (or even a proc) as the third parameter:

Unicode::DisplayWidth.of("a\tb", 1, 0x09 => 10) # => 12
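
One plausible reason a hash and a proc can be used interchangeably here is that both respond to `[]` in Ruby: `Hash#[]` looks up a key, while `Proc#[]` calls the proc. The sketch below demonstrates that with a hypothetical helper, `lookup_override`, which is not part of the gem. Whether the gem resolves overwrites exactly this way is an assumption.

```ruby
# Hypothetical helper, not part of the gem: returns the override width for
# a codepoint, or nil when no override applies.
def lookup_override(overwrite, codepoint)
  overwrite[codepoint] # Hash#[] looks up the key; Proc#[] calls the proc
end

tab_as_ten       = { 0x09 => 10 }           # hash: one specific codepoint
controls_as_zero = ->(cp) { 0 if cp < 0x20 } # proc: a whole range of codepoints

lookup_override(tab_as_ten, 0x09)       # => 10
lookup_override(tab_as_ten, 0x61)       # => nil
lookup_override(controls_as_zero, 0x09) # => 0
lookup_override(controls_as_zero, 0x61) # => nil
```

A proc is convenient when the rule covers many codepoints, since it avoids listing each one as a hash key.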

Usage with String Extension

Activated by default. Will be deactivated in version 2.0:

require 'unicode/display_width/string_ext'

"⚀".display_width #=> 1
'一'.display_width #=> 2

You can explicitly opt out of the string extension with: require 'unicode/display_width/no_string_ext'

Usage From the CLI

Use this one-liner to print out display widths for strings from the command-line:

$ gem install unicode-display_width
$ ruby -r unicode/display_width -e 'puts Unicode::DisplayWidth.of $*[0]' -- "一"

Replace "一" with the actual string to measure.

Other Implementations & Discussion

See unicode-x for more Unicode related micro libraries.

Copyright & Info