Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
tree: bfbdd33091
Fetching contributors…

Cannot retrieve contributors at this time

83 lines (63 sloc) 3.288 kb
++PHP UTF-8++
Version 0.5
Documentation in progress in ./docs dir
Important Note: DO NOT use these functions without understanding WHY
you are using them. In particular, do not blindly replace all use of PHP's
string functions which functions found here - most of the time you will
not need to, and you will be introducing a significant performance
overhead to your application. You can get a good idea of when to use what
from reading:
Important Note: For sake of performance most of the functions here are
not "defensive" (e.g. there is not extensive parameter checking, well
formed UTF-8 is assumed). This is particularily relevant when is comes to
catching badly formed UTF-8 - you should screen input on the "outer
perimeter" with help from functions in the utf8_validation.php and
utf8_bad.php files.
Important Note: this library treats ALL ASCII characters as valid, including ASCII control characters. But if you use some ASCII control characters in XML, it will render the XML ill-formed. Don't be a bozo:
Please report bugs to:
- if you are able, please submit a failing unit test
( with your bug report.
For feature requests / faster implementation of functions found here,
please drop them in via the RFE tracker:
Particularily interested in faster implementations!
For general support / help, use:
In the VERY WORST case, you can email me: hfuecks gmail com - I tend to be slow to respond though so be warned.
Important Note: when reporting bugs, please provide the following
PHP version, whether the iconv extension is loaded (in PHP5 it's
there by default), whether the mbstring extension is loaded. The
following PHP script can be used to determine this information;
print "PHP Version: " .phpversion()."<br>";
if ( extension_loaded('mbstring') ) {
print "mbstring available<br>";
} else {
print "mbstring not available<br>";
if ( extension_loaded('iconv') ) {
print "iconv available<br>";
} else {
print "iconv not available<br>";
Parts of the code in this library come from other places, under different
The authors involved have been contacted (see below). Attribution for
which code came from elsewhere can be found in the source code itself.
+Andreas Gohr / Chris Smith - Dokuwiki
There is a fair degree of collaboration / exchange of ideas and code
beteen Dokuwiki's UTF-8 library;
and phputf8. Although Dokuwiki is released under GPL, its UTF-8
library is released under LGPL, hence no conflict with phputf8
+Henri Sivonen ( / has also given permission for his
code to be released under the terms of the LGPL. He ported a Unicode / UTF-8
converter from the Mozilla codebase to PHP, which is re-used in phputf8
Jump to Line
Something went wrong with that request. Please try again.