Note that codepoint != grapheme

Jan 29, 2011
I can only think of one occasion where I've ever had to do this. Maybe
other people do this all the time.
+ (In any case, this gives you N codepoints, which is not necessarily the
+ same as N graphemes or "visible characters". Some codepoints are invisible,
+ and some sequences of codepoints combine into a single grapheme, so you
+ could end up splitting a grapheme in two.)
* You can write regular expressions to match against UTF-8 strings. Of
course, Ruby 1.8 can do that too, by the much simpler approach of tagging
the regexp as UTF-8 rather than tagging every other string object in the
system.
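
Both points can be sketched in a few lines. This is a minimal illustration, not code from the original text: the sample strings and variable names are made up, and the `\u` escapes and `\X` pattern assume Ruby 1.9 or later (only the `/u` flag itself is shared with 1.8).

```ruby
# encoding: utf-8
# Sketch only: sample strings and variable names are illustrative.

# "étude" written as "e" followed by U+0301 COMBINING ACUTE ACCENT.
s = "e\u0301tude"

codepoints = s.codepoints.to_a  # one entry per codepoint
graphemes  = s.scan(/\X/)       # \X matches one grapheme cluster at a time

# Taking the first "character" takes the first codepoint only, so the
# accent is cut away from its base letter.
truncated = s[0, 1]

# The /u flag tags the regexp itself as UTF-8. In Ruby 1.8 that is what
# lets "." match a whole multibyte character; in 1.9+ it fixes the
# regexp's encoding to UTF-8.
m = /caf(.)/u.match("caf\u00e9")
last_char = m[1]

puts codepoints.length
puts graphemes.length
puts truncated
puts last_char
```

Here the string has more codepoints than graphemes, and `truncated` holds a bare "e" with the accent dropped, which is the grapheme-splitting problem described above.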

