Commit

added hint to the regular expression tester
Orbiter committed Aug 27, 2014
1 parent c252111 commit f642cfb
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions htroot/CrawlStartExpert.html
@@ -308,6 +308,7 @@ <h2>Expert Crawl Start</h2>
The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>.
Example: to allow only urls that contain the word 'science', set the must-match filter to '.*science.*'.
You can also use an automatic domain-restriction to fully crawl a single domain.
Attention: you can test your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy.
</span></span>
<table border="0">
<tr><td width="110"><img src="env/grafics/plus.gif"> must-match</td><td></td></tr>
@@ -346,6 +347,7 @@ <h2>Expert Crawl Start</h2>
<span class="info" style="float:right"><img src="env/grafics/i16.gif" width="16" height="16" alt="info"/><span style="right:0px;">
The filter is a <b><a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" target="_blank">regular expression</a></b>
that <b>must not match</b> the URL for the content of that URL to be indexed.
Attention: you can test your regular expressions using the <a href="/RegexTest.html">Regular Expression Tester</a> within YaCy.
</span></span>
<table border="0">
<tr><td width="110"><img src="env/grafics/plus.gif"> must-match</td><td><input name="indexmustmatch" id="indexmustmatch" type="text" size="55" maxlength="100000" value="#[indexmustmatch]#" onblur="if (this.value=='') this.value='.*';"/></td><td>(must not be empty)</td></tr>
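
The help text in this diff points to `java.util.regex.Pattern` and gives `.*science.*` as a must-match filter. The surrounding `.*` suggests the filter is compared against the whole URL rather than searched for inside it. A minimal sketch of that behaviour, assuming full-match semantics (`Matcher.matches()`); the class `CrawlFilterDemo`, the method `passesMustMatch`, and the example URLs are hypothetical and not taken from YaCy:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of YaCy: it only illustrates
// how a must-match filter behaves under full-match semantics.
final class CrawlFilterDemo {

    // Returns true when the entire URL matches the must-match pattern.
    static boolean passesMustMatch(String url, String mustMatch) {
        return Pattern.compile(mustMatch).matcher(url).matches();
    }

    public static void main(String[] args) {
        String filter = ".*science.*";

        // URL contains 'science', so the whole string matches.
        System.out.println(passesMustMatch("http://example.org/science/news.html", filter));

        // URL does not contain 'science'.
        System.out.println(passesMustMatch("http://example.org/sports/news.html", filter));

        // Without the surrounding '.*' the pattern must equal the whole URL.
        System.out.println(passesMustMatch("http://example.org/science/news.html", "science"));
    }
}
```

Under that assumption, a bare `science` filter would reject every real URL, which is why the example wraps the word in `.*`. The Regular Expression Tester linked in the hint is the place to confirm how YaCy itself evaluates a given pattern.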
