Commit

NUTCH-2035 urlfilter-regex case insensitive rules
sebastian-nagel committed Dec 15, 2017
1 parent 0e3036b commit df14c8a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion conf/regex-urlfilter.txt.template
@@ -27,7 +27,7 @@

# skip image and other suffixes we can't yet parse
# for a more extensive coverage use the urlfilter-suffix plugin
--\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF|zip|ZIP|ppt|PPT|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$
+-(?i)\.(gif|jpg|png|ico|css|sit|eps|wmf|zip|ppt|mpg|xls|gz|rpm|tgz|mov|exe|jpeg|bmp|js)$

# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
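
The effect of the change can be checked with a small sketch. It is only an illustration: Nutch evaluates these rules with Java regular expressions, and Python's `re` module stands in here on the assumption that `(?i)`, alternation, and `$` behave the same way for these patterns. It also assumes a rule applies when the pattern is found anywhere in the URL. The helper name `is_skipped` and the example URLs are hypothetical. The leading `-` of each rule is left out because it marks the rule as an exclusion and is not part of the regex.

```python
import re

# Old rule body: each suffix is spelled out in lowercase and uppercase only.
OLD_RULE = re.compile(
    r"\.(gif|GIF|jpg|JPG|png|PNG|ico|ICO|css|CSS|sit|SIT|eps|EPS|wmf|WMF"
    r"|zip|ZIP|ppt|PPT|mpg|MPG|xls|XLS|gz|GZ|rpm|RPM|tgz|TGZ|mov|MOV"
    r"|exe|EXE|jpeg|JPEG|bmp|BMP|js|JS)$"
)

# New rule body: the inline (?i) flag makes the whole pattern case-insensitive.
NEW_RULE = re.compile(
    r"(?i)\.(gif|jpg|png|ico|css|sit|eps|wmf|zip|ppt|mpg|xls|gz|rpm|tgz"
    r"|mov|exe|jpeg|bmp|js)$"
)


def is_skipped(rule, url):
    """Return True if the exclusion rule matches somewhere in the URL.

    Assumption: a rule applies when the pattern is found anywhere in the
    URL (search semantics), not only when it matches the whole string.
    """
    return rule.search(url) is not None


if __name__ == "__main__":
    # Hypothetical URLs, chosen to cover lower, upper, and mixed case.
    for url in (
        "http://example.com/logo.gif",
        "http://example.com/LOGO.PNG",
        "http://example.com/photo.Jpg",
        "http://example.com/index.html",
    ):
        print(url, is_skipped(OLD_RULE, url), is_skipped(NEW_RULE, url))
```

Under these assumptions the two rules agree on all-lowercase and all-uppercase suffixes, and the new rule additionally skips mixed-case suffixes such as `.Jpg`, which the old rule let through.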
