YAML::PP fails to load valid YAML files that have plain scalars starting with a printable Unicode character in the range \u0080 through \u00FF. Printable ASCII characters work as the first character, as do characters at \u0100 or higher. The affected characters also work fine when they are not the first character of the scalar.
I suspect the common Perl "Unicode bug" in the regexes that handle plain scalars, but I wasn't able to easily identify a fix within YAML::PP. Quoting the plain scalar is a sufficient workaround.
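For readers unfamiliar with the "Unicode bug", here is a minimal sketch of the general behaviour in plain Perl. It does not use YAML::PP and is not taken from its source; it only shows why a character in the \u0080 through \u00FF range can behave differently from one at \u0100 or above. The helper name `starts_with_word_char` is hypothetical, and whether this is the actual cause in YAML::PP is only my suspicion.

```perl
use strict;
use warnings;
no feature 'unicode_strings';   # make the default (legacy) regex rules explicit

# Hypothetical helper: does the string start with a word character?
# With $unicode_rules true, the match is forced to Unicode semantics via /u.
sub starts_with_word_char {
    my ($string, $unicode_rules) = @_;
    if ($unicode_rules) {
        return $string =~ /^\w/u ? 1 : 0;
    }
    return $string =~ /^\w/ ? 1 : 0;
}

my $native = "\xC9ric";          # E-acute stored as one byte, UTF8 flag off
my $upgraded = $native;
utf8::upgrade($upgraded);        # same characters, UTF8 flag on

# Same characters, different answers under the default rules:
print "native, default rules:   ", starts_with_word_char($native, 0), "\n";   # 0
print "upgraded, default rules: ", starts_with_word_char($upgraded, 0), "\n"; # 1
print "native, /u rules:        ", starts_with_word_char($native, 1), "\n";   # 1
```

A string containing a character at \u0100 or above is always stored with the UTF8 flag on, which would be consistent with the snowman case working while É fails.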
YAML::PP is version 0.009, as installed from App::Cpanminus under perlbrew using perl 5.28.0.
Output
The bug manifests in a Perl exception that generates output similar to:
$ perl test2.pl
Line : 5
Column : 15
Expected : ALIAS DOUBLEQUOTE FLOWMAP_START FLOWSEQ_START FOLDED LITERAL PLAIN SINGLEQUOTE
Got : Invalid plain scalar
Where : perl-5.28.0/lib/site_perl/5.28.0/YAML/PP/Parser.pm line 516
YAML : "\x{c9}ric Bischoff\n"
at perl-5.28.0/lib/site_perl/5.28.0/YAML/PP/Loader.pm line 60.
The "\x{c9}ric Bischoff" is Éric Bischoff in the source YAML file. The É is \u00c9, and its exact UTF-8 bytes in the source file are c3 89.
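To double-check the byte values quoted above, a small sketch using the core Encode module can print the UTF-8 encoding of a code point. The helper name `utf8_hex` is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 encoding of a string as
# space-separated hex bytes.
sub utf8_hex {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

print utf8_hex("\x{c9}"), "\n";     # c3 89     (the failing first character)
print utf8_hex("\x{2603}"), "\n";   # e2 98 83  (the snowman)
```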
Troubleshooting Performed
I have confirmed the source file is both valid UTF-8 (using iconv) and valid YAML (using various online validators). The YAML 1.2 spec appears to say this should work with all printable Unicode characters that aren't "indicators" or otherwise confusable with other YAML syntax.
I mentioned that higher Unicode code points are unaffected. In fact a name of ☃ric Bischoff (snowman as first character, \u2603) works perfectly.
The workaround we identified, for those who can't change their name so easily, is to quote the name, which YAML::PP parses fine.
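If the YAML is generated by a script, the quoting workaround could be applied with something like the sketch below. The helper `yaml_single_quote` is hypothetical (it is not part of YAML::PP) and assumes single-line values; in a YAML single-quoted scalar the only escape needed is doubling an embedded single quote.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a single-line value in YAML single quotes,
# doubling any embedded single quote as YAML requires.
sub yaml_single_quote {
    my ($value) = @_;
    (my $escaped = $value) =~ s/'/''/g;
    return "'$escaped'";
}

binmode STDOUT, ':encoding(UTF-8)';
print "- displayname: ", yaml_single_quote("\x{c9}ric Bischoff"), "\n";
```

This emits the same quoted form that the test case below produces when it is run with an argument.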
Test case
This test case reproduces the bug and prints out characters which cause YAML::PP to fail to load in the range noted.
use 5.014;
use YAML::PP qw(Load);
use feature 'unicode_strings';

# Allow Perl to spit out UTF-8 to STDOUT
binmode STDOUT, ':encoding(UTF-8)';

my $base = "description: Foo\nmembers:\n- displayname: ";

# Toggle single-quoting or plain scalar testcase
$base .= @ARGV ? "'Xric Bischoff'" : "Xric Bischoff";

my $index = index ($base, 'X');

say "$base\n\n---------\nReplacing 'X' with other printable chars:";

for my $char (0x21 .. 0x110) {
    my $str = $base;

    # Unprintable chars are not valid parts of a plain scalar
    my $replacement = chr($char);
    next if $replacement !~ /[[:print:]]/;

    substr ($str, $index, 1, $replacement);
    my $data = eval { Load($str) };
    say sprintf ("\\x%X (%s)", $char, $replacement), " doesn't work" if ($@);
}