regexp engine reads 1 beyond the string #10230
Created by @nwc10
The regexp engine often reads 1 character beyond the end of the string,
This can be seen as a bug, or can be seen as wishlist. It's also old, and
$ valgrind /home/nick/Sandpit/snap5.9.x-v5.11.5-59-g801ed99/bin/perl5.11.5 -MFile::Map -e 'File::Map::map_anonymous($a, 4096); $a =~ /\0+/'
It's arguably "wishlist" because strictly, the scalar is not well formed,
You can see the structure of the SVs that File::Map produces with
$ /home/nick/Sandpit/snap5.9.x-v5.11.5-59-g801ed99/bin/perl5.11.5 -MDevel::Peek -MFile::Map -e 'File::Map::map_anonymous($a, 16); Dump($a)'
and the "problem" again, as dump.c tries to access the byte beyond:
$ valgrind /home/nick/Sandpit/snap5.9.x-v5.11.5-59-g801ed99/bin/perl5.11.5 -MDevel::Peek -MFile::Map -e 'File::Map::map_anonymous($a, 4096); Dump($a)'
(sort of can't fix that one).
It would be good to change the regexp code in question, which currently
/* Note that nextchr is a byte even in UTF */
The "quicker" fix looks to be to set nextchr to 0 if locinput >= PL_regeol
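
A minimal sketch of that guarded read, as standalone C rather than the actual regexec.c code. The helper name safe_nextchr and its parameter regeol (standing in for PL_regeol) are hypothetical, chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from regexec.c: return the byte at
 * locinput, or 0 when locinput is at or past the end pointer regeol,
 * so the byte beyond the buffer is never dereferenced. */
static unsigned char
safe_nextchr(const char *locinput, const char *regeol)
{
    assert(locinput != NULL && regeol != NULL);
    return (locinput >= regeol) ? 0 : (unsigned char)*locinput;
}
```

Note that a returned 0 is ambiguous in this sketch: it can mean either "end of string" or a genuine zero byte inside the string, so a caller would still need to compare locinput against the end pointer to tell the two apart.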
It looks like/I assume that the code retains the basic structure of Henry
Nicholas Clark (via RT) wrote:
My guess is that it won't properly match a string that contains a NUL.
On Sun, Mar 14, 2010 at 12:37:28PM -0600, karl williamson wrote:
That is my suspicion too, but I don't have any test cases.