Invalid and tainted utf-8 char crashes perl 5.10.1 in regexp evaluation #9922
Comments
From Mark.Martinec@ijs.si
Created by Mark.Martinec@ijs.si

Tracking down a reason for crashes of a perl process while processing
This is happening on a FreeBSD 7.2, using perl as installed from ports
Reducing the actual crashing application to a small test case,

    #!/usr/bin/perl -T
    # Here is a HTML snippet from a malicious/obfuscated mail message.
    $t =~ s/&#(\d+)/chr($1)/ge;  # convert HTML entities to UTF8
    # show character codes in the resulting string
    # The following regexp evaluation crashes perl 5.10.1 on FreeBSD.
    $t =~ /( |\b)(http:|www\.)/i;

and here is the result (hand wrapped):

    60, 97, 62, 65, 116, 116, 101, 110, 116, 105, 111, 110, 32, 72, 111,

Here is a backtrace as obtained from a core dump

    $ gdb -c perl5.10.1.core /usr/local/bin/perl5.10.1
    (gdb) bt
    (gdb)

And lastly, here is a perl debug output using the -Dr command line option:

    Compiling REx "( |\b)(http:|www\.)"
    EXECUTING...
    [...]

Perl Info
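For readers following the test case, here is a minimal sketch of what the entity-conversion step does and how the character codes can be listed. The input string and the helper names `decode_numeric_entities` and `char_codes` are hypothetical: the original HTML snippet is not reproduced in this report, and this sketch is not expected to trigger the crash.

```perl
use strict;
use warnings;

# Hypothetical helper: replace numeric HTML entities such as "&#60"
# with the corresponding character, using the substitution from the report.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+)/chr($1)/ge;
    return $text;
}

# Hypothetical helper: return the character codes of a string as a list.
sub char_codes {
    my ($text) = @_;
    return map { ord } split //, $text;
}

# Illustrative input only, not the snippet from the original mail message.
my $t = decode_numeric_entities('&#60a&#62Attention');
print join(', ', char_codes($t)), "\n";
```

Any entity with a code above 255 makes `chr` return a utf8-flagged string, which appears to be the situation the report is concerned with.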
From Mark.Martinec@ijs.si
Some additional information on non-vulnerable systems, provided
From @demerphq
2009/10/22 Mark Martinec <perlbug-followup@perl.org>:
Unfortunately this is just masking the cause, I'm pretty sure the
You would have ended up in this code:
case trie_utf8_fold: \
I'm guessing in the second clause, probably in to_uni_fold().
Thanks, your report is very complete.
I think the regex engine is the only place that uses the unicode
cheers,
--
The RT System itself - Status changed from 'new' to 'open'
From perl@profvince.com
Bisected down to 8902bb0
Author: Slaven Rezic <slaven@rezic.de>
Another regexp failure with utf8-flagged string and byte-flagged
Vincent.
From @demerphq
2009/10/23 Vincent Pit <perl@profvince.com>:
thanks - that helps a lot.
--
From @demerphq
2009/10/23 Vincent Pit <perl@profvince.com>:
The simple fix is to add a guard to the if clause to prevent looking
The thing is the original patch sorta hides a deeper problem. It may
For instance in old perls:
use Test::More;
Should match. In TRIE'd perls it won't. As in unicode rules these rules
00B5; C; 03BC; # MICRO SIGN
I suppose any non-unicode pattern that doesn't use these can still be
Hrmph.
Cheers,
--
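The case-folding rule quoted above (U+00B5 MICRO SIGN folds to U+03BC) can be probed with a small sketch like the one below. It is an illustration only, not the test from the original comment, which is not reproduced here. The variable names are hypothetical, and whether the match succeeds on a particular perl depends on the version, which is the question this thread is about.

```perl
use strict;
use warnings;

# U+03BC GREEK SMALL LETTER MU: a code point above 255, so the target
# string is utf8-flagged.
my $greek_mu = "\x{3BC}";

# The pattern holds U+00B5 MICRO SIGN, which fits in a single byte.
# Under Unicode folding rules U+00B5 folds to U+03BC, so a
# case-insensitive match between the two is expected to succeed.
my $matched = ($greek_mu =~ /\x{B5}/i) ? 1 : 0;

print "mu matches micro sign case-insensitively: ",
    ($matched ? "yes" : "no"), "\n";
```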
From @demerphq
Resolved by: commit 0abd0d7 disable non-unicode case insensitive trie matching
@demerphq - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#69973 (status was 'resolved')
Searchable as RT69973$