Fix misaligned match results with perl backend #1

Merged
jwiegley merged 1 commit into jwiegley:master from purcell/patch-1 on Nov 4, 2013

purcell commented Jul 20, 2012

Consider this source text:

98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /news/8603-exclusive-jack-johnson\xD3sets2.looktothestars.org/photo/1303-oxfam/tiny_square.jpg?1263992329 HTTP/1.1" 404 322 "http://www.looktothestars.org/?lang=it" "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /cause/5-educat-tweet HTTP/1.1" 301 113 "http://www.looktothestars.org/?lang=en" "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /2064 HTTP/1.1" 404 322 "http://www.looktothestars.org/?lang=hu" "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /news/2601-dionne-warwicisteneradata, HTTP/1.1" 301 154 "http://www.facebook.com/plugins/like.php?href=http://www.looktothestars.org/news/2601-dionne-warwick-and-sinbad-headline-charity-event&send=false&layout=button_count&width=125&show_faces=false&action=recommend&colorscheme=light&font&height=21" "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /celebrity/tweet HTTP/1.1" 404 9 "http://www.looktothestars.org/?lang=fil" "Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"

and the regex:

(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) \S+ \S+ \[.*?\] "GET [^? ]*(?:\\x|\/\.\.?\/|[=,:\}\{\(\)])\S*? HTTP\/1\.1"

The second highlighted match (on line 4) does not start at the beginning of the line, as it should. When matching against a large text, the highlights become progressively more misaligned further into the text.

This commit fixes the issue.
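
For reference, the expected match positions can be checked independently of the editor. The following is only a sketch in Python, not the perl backend itself: it assumes Python's `re` module treats this particular pattern the same way Perl does (it uses no Perl-only constructs), the helper name `find_matches` is made up for illustration, and the sample lines are trimmed after the status and size fields, which the regex never reaches.

```python
import re

# Pattern copied verbatim from the report, split over two raw strings.
PATTERN = re.compile(
    r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) \S+ \S+ \[.*?\] '
    r'"GET [^? ]*(?:\\x|\/\.\.?\/|[=,:\}\{\(\)])\S*? HTTP\/1\.1"'
)

# The five log lines from the report, with the referrer and user-agent
# fields dropped (an abbreviation made for this example only).
SAMPLE = "\n".join([
    r'98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /news/8603-exclusive-jack-johnson\xD3sets2.looktothestars.org/photo/1303-oxfam/tiny_square.jpg?1263992329 HTTP/1.1" 404 322',
    r'98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /cause/5-educat-tweet HTTP/1.1" 301 113',
    r'98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /2064 HTTP/1.1" 404 322',
    r'98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /news/2601-dionne-warwicisteneradata, HTTP/1.1" 301 154',
    r'98.249.190.144 - - [19/Jul/2012:20:33:04 +0200] "GET /celebrity/tweet HTTP/1.1" 404 9',
])


def find_matches(text):
    """Return (line number, column, captured IP) for every match in text."""
    results = []
    for m in PATTERN.finditer(text):
        # Offset of the first character of the line containing the match.
        line_start = text.rfind("\n", 0, m.start()) + 1
        # 1-based line number: newlines before the match, plus one.
        line_no = text.count("\n", 0, m.start()) + 1
        results.append((line_no, m.start() - line_start, m.group(1)))
    return results


if __name__ == "__main__":
    for line_no, column, ip in find_matches(SAMPLE):
        print(f"line {line_no}, column {column}: {ip}")
```

Under those assumptions, only lines 1 and 4 match, and both matches begin at column 0, which is where the highlights should start.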

@purcell purcell Fix misaligned match results with perl backend
87ad09a

purcell commented Nov 2, 2013

Bump. :-)

@jwiegley jwiegley added a commit that referenced this pull request Nov 4, 2013

@jwiegley jwiegley Merge pull request #1 from purcell/patch-1
Fix misaligned match results with perl backend
62b292d

@jwiegley jwiegley merged commit 62b292d into jwiegley:master Nov 4, 2013
