Browse files

avoid exponential regex match for java and objc function names

In the old regex

^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

you can backtrack arbitrarily from [A-Za-z_0-9]* into [A-Za-z_], thus
causing an exponential number of backtracks.  Ironically it also causes
the regex not to work as intended; for example "catch" can match the
underlined part of the regex, the first repetition matching "c" and
the second matching "atch".

The replacement regex avoids this problem, because it makes sure that
at least a space/tab is eaten on each repetition.  In other words,
a suffix of a repetition can never be a prefix of the next repetition.

Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
  • Loading branch information...
1 parent 9b7dc71 commit 959e2e64a594e2fb8de2585078e31b07a8da6fc9 @bonzini bonzini committed with gitster Jun 17, 2009
Showing with 3 additions and 2 deletions.
  1. +3 −2 userdiff.c
View
5 userdiff.c
@@ -13,7 +13,8 @@ PATTERNS("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$",
"[^<>= \t]+|[^[:space:]]|[\x80-\xff]+"),
PATTERNS("java",
"!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
- "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$",
+ "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
+ /* -- */
"[a-zA-Z_][a-zA-Z0-9_]*"
"|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
"|[-+*/<>%&^|=!]="
@@ -25,7 +26,7 @@ PATTERNS("objc",
/* Objective-C methods */
"^[ \t]*([-+][ \t]*\\([ \t]*[A-Za-z_][A-Za-z_0-9* \t]*\\)[ \t]*[A-Za-z_].*)$\n"
/* C functions */
- "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$\n"
+ "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$\n"
/* Objective-C class/protocol definitions */
"^(@(implementation|interface|protocol)[ \t].*)$",
/* -- */

0 comments on commit 959e2e6

Please sign in to comment.