"Malformed UTF-8 character (unexpected end of string)" on a tainted string in 5.20 #13948
From Mark.Martinec@ijs.si

Created by Mark.Martinec@ijs.si

Under perl 5.20.0 the following program fails (or warns) on:

    Malformed UTF-8 character (unexpected end of string)

if the character string is tainted. Leaving out the -T

    #!/usr/bin/perl -T
    my $taint = substr($ENV{PATH}, 0,0); # tainted empty string
    # just a convenient way to represent some real (spam) subject text:
    $chars =~ s{ ([\x00-\x1f\x7f\\]) }{ sprintf('\\u%.4X',ord($1)) }xgse;
    binmode(STDOUT,':utf8');

Perl Info
From @khwilliamson

On 06/20/2014 07:53 PM, Mark Martinec (via RT) wrote:

I bisected this to:

Stop pos() from being confused by changing utf8ness

The value of pos() is stored as a byte offset. If it is stored on a

    $ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, a;

So pos() should be stored as a character offset.

The regular expression engine expects byte offsets always, so allow it

This does result in more complexity than I should like, but the alter-

:100644 100644 01a9e8b77bbb01605b78d988ccc0e83f6d826c74
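To illustrate the byte-versus-character distinction that the commit message describes, here is a minimal sketch. It is not taken from the ticket; the sample string and the variable names are made up for illustration. It shows that one position in a string is a different number depending on whether it is counted in characters or in UTF-8 bytes:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Hypothetical sample string: U+1000 takes three bytes in UTF-8.
my $str = "\x{1000}abc";

# A successful /g match in scalar context leaves pos() just past the match.
$str =~ /a/g or die "no match";
my $char_pos = pos($str);    # offset counted in characters

# The same position counted in UTF-8 bytes.
my $byte_pos = length(encode_utf8(substr($str, 0, $char_pos)));

print "character offset: $char_pos, byte offset: $byte_pos\n";
```

Any code that stores one kind of offset and later reads it back as the other will land in the wrong place as soon as a multi-byte character precedes the position.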
The RT System itself - Status changed from 'new' to 'open'
From @iabyn

On Sat, Jun 21, 2014 at 01:06:42PM -0600, Karl Williamson wrote:

I can reduce the demo code to the following:

    $ p -Twe '$_ = "XXXX\x{1000}aaaaaaaaaaaaaaaaaXX" . $^X; s/X/"xxxxxx"/ge'

I haven't looked into it any further yet.
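For anyone wanting to check a given perl against the reduced case, a small driver script along these lines may help. It is only a sketch and assumes a POSIX-like shell for the backtick quoting. The names `$code`, `$perl`, and `$out`, and the trailing `print "done\n"` marker, are additions for illustration and not part of the original report.

```perl
use strict;
use warnings;

# The reduced case from the ticket, plus a marker print (an addition)
# so we can tell that the child perl ran to completion.
my $code = q{$_ = "XXXX\x{1000}aaaaaaaaaaaaaaaaaXX" . $^X; s/X/"xxxxxx"/ge; print "done\n"};

my $perl = $^X;    # the perl binary running this script

# Run the snippet in a child perl with taint mode (-T) and warnings (-w),
# capturing stderr together with stdout. Assumes a POSIX-like shell.
my $out = qx{"$perl" -Twe '$code' 2>&1};

if ($out =~ /Malformed UTF-8 character/) {
    print "warning reproduced\n";
} else {
    print "no malformed-UTF-8 warning seen\n";
}
```

Running the child with `-T` matters here because, per the report, the problem only shows up when the string is tainted; `$^X` supplies the tainted part in the reduced case.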
From @iabyn

On Tue, Jun 24, 2014 at 11:53:02AM +0100, Dave Mitchell wrote:

Now fixed with the following. A good candidate for 5.20.1

commit cda67c9

s///e on tainted utf8 strings got pos() messed up
@iabyn - Status changed from 'open' to 'pending release'
From @khwilliamson

Thanks for submitting this ticket.

The issue should be resolved with the release today of Perl v5.22. If you find that the problem persists, feel free to reopen this ticket.
@khwilliamson - Status changed from 'pending release' to 'resolved'
Migrated from rt.perl.org#122148 (status was 'resolved')
Searchable as RT122148$