- Fixed bug #52981 (Unicode casing table was out-of-date).
Updated with UnicodeData-6.0.0d7.txt and included the source of the generator program with the distribution.
#The replaced tables, generated circa 2002, seem to reflect
#Unicode 3.2. I was unable to generate the same property
#offsets with Unicode 3.2 data, but all the tests I made
#indicate php_unicode_is_prop() is returning the correct
#values. The replaced file merely says it used a "modified
#version" of ucgendat, which is not very helpful. The results
#I got were not significantly different, only slightly higher
#offsets at two properties, which were carried over to the
#subsequent properties.
#I was, however, able to replicate precisely the casing table.
#The extent of the "modifications" besides omitting most of
#the tables, a slightly different layout and the casing table
#offsets having been multiplied by 3 is unclear.
#The test suite showed no regressions; however, it's very poor
#in testing the modified portion of the extension.
Showing with 6,277 additions and 2,735 deletions.
|@@ -0,0 +1,23 @@|
|+Bug #52981 (Unicode properties are outdated (from Unicode 3.2))|
|+<?php extension_loaded('mbstring') or die('skip mbstring not available'); ?>|
|+ $upper = mb_strtoupper($str, 'UTF-8');|
|+ $len = strlen($upper);|
|+ for ($i = 0; $i < $len; ++$i) echo dechex(ord($upper[$i])) . ' ';|
|+ echo "\n";|
|+test("\xF0\x90\x90\xB8");// U+10438 DESERET SMALL LETTER H (added in 3.1.0, March 2001)|
|+// not OK|
|+test("\xE2\xB0\xB0"); // U+2C30 GLAGOLITIC SMALL LETTER AZU (added in 4.1.0, March 2005)|
|+test("\xD4\xA5"); // U+0525 CYRILLIC SMALL LETTER PE WITH DESCENDER (added in 5.2.0, October 2009)|
|+f0 90 90 90|
|+e2 b0 80|
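
The code points named in the test comments can be checked against the escaped byte strings by hand. The following is a minimal sketch, not part of the commit: `utf8_to_codepoint()` is a hypothetical helper written for illustration, it decodes only the first UTF-8 sequence of its argument, and it does not reject overlong or surrogate encodings.

```php
<?php
// Hypothetical helper (not from the commit): decode the first UTF-8
// sequence in $bytes and return its code point as an integer.
function utf8_to_codepoint($bytes)
{
    if ($bytes === '') {
        throw new InvalidArgumentException('empty input');
    }
    $b0 = ord($bytes[0]);
    if ($b0 < 0x80) {
        return $b0; // single-byte (ASCII) sequence
    }
    if (($b0 & 0xE0) === 0xC0) {
        $len = 2; $cp = $b0 & 0x1F;
    } elseif (($b0 & 0xF0) === 0xE0) {
        $len = 3; $cp = $b0 & 0x0F;
    } elseif (($b0 & 0xF8) === 0xF0) {
        $len = 4; $cp = $b0 & 0x07;
    } else {
        throw new InvalidArgumentException('invalid lead byte');
    }
    if (strlen($bytes) < $len) {
        throw new InvalidArgumentException('truncated sequence');
    }
    for ($i = 1; $i < $len; ++$i) {
        $b = ord($bytes[$i]);
        if (($b & 0xC0) !== 0x80) {
            throw new InvalidArgumentException('invalid continuation byte');
        }
        // Each continuation byte contributes its low six bits.
        $cp = ($cp << 6) | ($b & 0x3F);
    }
    return $cp;
}

// Inputs used by the test.
printf("U+%04X\n", utf8_to_codepoint("\xF0\x90\x90\xB8")); // U+10438
printf("U+%04X\n", utf8_to_codepoint("\xE2\xB0\xB0"));     // U+2C30
printf("U+%04X\n", utf8_to_codepoint("\xD4\xA5"));         // U+0525

// Expected output bytes shown in the diff.
printf("U+%04X\n", utf8_to_codepoint("\xF0\x90\x90\x90")); // U+10410
printf("U+%04X\n", utf8_to_codepoint("\xE2\xB0\x80"));     // U+2C00
```

The two expected byte sequences decode to U+10410 and U+2C00, which are the capital counterparts of the first two inputs (DESERET CAPITAL LETTER H and GLAGOLITIC CAPITAL LETTER AZU).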