This branch makes symbols support UTF8 internally, which means that Unicode is supported properly at the perl level. So ${"\xff"} will give you the same scalar, regardless of the internal encoding of the string.

Also, many parts of the core are now nul-clean, too, as a result of the UTF8 changes, which means that ‘$m = "a\0b"; foo->$m’ will try to call the method named "a\0b", instead of just "a".

Details follow.

• New API functions:
  Many of these take a _flags parameter, which accepts the SVf_UTF8 flag.
  • HvNAMELEN
  • HvNAMEUTF8
  • HvENAMELEN
  • HvENAMEUTF8
  • gv_init_pv(n)/sv
  • gv_fetchmeth_pv(n)/sv
  • gv_fetchmeth_pv(n)/sv_autoload
  • gv_fetchmethod_pv(n)/sv_flags (may change)
  • gv_autoload_pv(n)/sv
  • newGVgen_flags
  • sv_derived_from_pv(n)/sv
  • sv_does_pv(n)/sv
  • whichsig_pv(n)/sv
• New internal functions:
  • GvNAMEUTF8
  • GvENAMELEN
  • GvENAME_HEK
  • CopSTASH_flags
  • CopSTASH_flags_set
  • PmopSTASH_flags
  • PmopSTASH_flags_set
  • sv_sethek
• Parts of Perl that handle Unicode symbol names correctly:
  • Method names (including those passed to ‘use overload’)
  • Typeglob names (including variable and filehandle names)
  • Package names
  • Constant subroutine names (not nul-clean yet)
  • goto
  • Symbolic dereferencing
  • Second argument to bless() and tie()
  • Return value of ref()
  • Package names returned by caller()
  • Subroutine prototypes
  • Attributes
  • Warnings and error messages that mention filehandles, packages, methods, variables, constant values, subroutines, symbolic references, format names and subroutine prototypes
• Parts of Perl that now handle embedded nuls correctly:
  • Method names
  • Typeglob names (including filehandle names)
  • Package names
  • Autoloading
  • Return value of ref()
  • Package names returned by caller()
  • Filehandle warnings
  • Typeglob elements (*foo{"THING\0stuff"})
  • Signal names
  • Warnings and error messages that mention (yes, it’s the same list as above) filehandles, packages, methods, variables, constant values, subroutines, symbolic references, format names and subroutine prototypes
• Other bug fixes
  • *{é} now treats é as the name of the glob (the usual implicit quoting), instead of treating it as a bareword (strict-unsafe) or function call. *{é} used to be equivalent to *{+é}, in other words.
• Modified modules:
  • constant has been modified not to apply the workaround for the bug that this branch fixes, if that workaround does not apply.
  • attributes has been modified as part of making Unicode attributes work.
  • XS::APItest
  • mro, as part of making method lookup account for Unicode.
• Side effects
  • Blessing into "\0" no longer causes ref() to return false.
  • *{"*é::..."} is now equivalent to *{"é::..."}, just as *{"*e::..."} is equivalent to *{"e::..."}. Previously, the * was only stripped if followed by [A-Za-z].
  • $é is now subject to ‘Used only once’ warnings. It used to be exempt, as the code that checked the name considered it a punctuation variable.
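
The perl-level behaviour described above can be checked with a short script. This is a minimal sketch, not part of the branch's own test suite: it assumes a perl that includes these changes (believed to be 5.16 or later), and the package name Foo, the sub bodies and the variable names are all made up for illustration.

```perl
use strict;
use warnings;

# 1. Symbol lookup goes by characters, not by internal encoding.
my $downgraded = "main::\xff";
my $upgraded   = "main::\xff";
utf8::upgrade($upgraded);    # same characters, now stored as UTF-8 internally
my $same_scalar;
{
    no strict 'refs';
    ${$downgraded} = "shared";
    # Read the scalar back through the differently encoded name.
    $same_scalar = defined(${$upgraded}) && ${$upgraded} eq "shared";
}

# 2. Method names are nul-clean: "a\0b" is not truncated to "a".
{
    no strict 'refs';
    *{"Foo::a"}    = sub { "a" };          # hypothetical method "a"
    *{"Foo::a\0b"} = sub { "a-nul-b" };    # hypothetical method "a\0b"
}
my $m      = "a\0b";
my $called = "Foo"->$m;

# 3. ref() returns a Unicode package name as characters.
my $class          = "\x{3b1}";            # GREEK SMALL LETTER ALPHA
my $obj            = bless {}, $class;
my $ref_roundtrips = ref($obj) eq $class;

print "same scalar:    ", ($same_scalar          ? "ok" : "not ok"), "\n";
print "nul method:     ", ($called eq "a-nul-b"  ? "ok" : "not ok"), "\n";
print "ref round-trip: ", ($ref_roundtrips       ? "ok" : "not ok"), "\n";
```

On a perl without these changes, one or more of the checks would be expected to report "not ok", since the upgraded name, the nul-containing method name and the wide-character package name were then handled as raw bytes or truncated.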
Showing 114 changed files with 8,841 additions and 674 deletions.