
Note that codepoint != grapheme

1 parent 2545ebf commit 6a69e763aef77bf239e156ad718e055b27fc2e02 @candlerb committed Jan 29, 2011
Showing with 5 additions and 0 deletions.
  1. +5 −0 soapbox.rb
5 soapbox.rb
@@ -215,6 +215,11 @@ def append(str)
I can only think of one occasion where I've ever had to do this. Maybe
other people do this all the time.
+ (In any case, this gives you N codepoints, which is not necessarily the
+ same as N graphemes or "visible characters". Some codepoints are invisible,
+ and some combinations of codepoints form a single grapheme, so you could
+ end up splitting a grapheme)
* You can write regular expressions to match against UTF-8 strings. Of
course, ruby 1.8 can do that, by the much simpler approach of tagging the
regexp as UTF-8, rather than every other string object in the system.
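
A rough illustration of the note added in this commit, as a sketch only. It assumes Ruby 2.5 or later (for `String#grapheme_clusters`), which postdates this commit; `first_codepoints` and `first_graphemes` are hypothetical helper names and are not part of soapbox.rb.

```ruby
# Hypothetical helpers, not part of soapbox.rb.
# Assumes Ruby 2.5 or later for String#grapheme_clusters.

# First n codepoints: what str[0, n] returns for a UTF-8 string.
def first_codepoints(str, n)
  str[0, n]
end

# First n graphemes: keeps a base character together with its combining marks.
def first_graphemes(str, n)
  str.grapheme_clusters.first(n).join
end

# "e" followed by U+0301 COMBINING ACUTE ACCENT is displayed as a single "é".
word = "cafe\u0301"

puts word.length                    # number of codepoints
puts word.grapheme_clusters.length  # number of graphemes
p first_codepoints(word, 4)         # the combining accent is cut off
p first_graphemes(word, 4)          # the combining accent stays attached

# U+200B ZERO WIDTH SPACE is invisible but still counts as a codepoint.
puts "a\u200Bb".length
```

Taking the first four codepoints of `word` returns a plain "cafe" and drops the accent, whereas taking the first four graphemes keeps the accented "e" whole.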

0 comments on commit 6a69e76
