forked from apache/nutch
Admin Crawl
mbauhardt edited this page Sep 13, 2010 · 9 revisions
With this plugin you can start a crawl. Before you start a crawl, upload URLs with the Admin Url Upload plugin. You can create more than one crawl folder. Each crawl folder has its own crawl database, link database, and index.
Click the play icon to start a crawl.
You can start a crawl with a depth (the number of shards/segments to generate) and a topN parameter. topN is the maximum number of URLs generated per shard, so a crawl fetches at most depth × topN URLs.
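As a rough illustration of how depth and topN bound the size of a crawl, here is a minimal Python sketch of the generate loop. It is not Nutch code: the function name `simulate_crawl` and the toy URL list are assumptions made for this example, and scoring and link discovery are left out.

```python
def simulate_crawl(candidates, depth, top_n):
    """Toy model: each round creates one shard with at most top_n URLs.

    `candidates` stands in for the URLs known to the crawl database
    (hypothetical input; real Nutch also scores URLs and adds newly
    discovered links between rounds, which this sketch omits).
    """
    remaining = list(candidates)
    shards = []
    for _ in range(depth):
        # Take the next top_n URLs for this shard, keep the rest for later.
        shard, remaining = remaining[:top_n], remaining[top_n:]
        if not shard:
            break  # nothing left to generate
        shards.append(shard)
    return shards


urls = [f"http://example.com/page{i}" for i in range(10)]
shards = simulate_crawl(urls, depth=3, top_n=4)
print([len(s) for s in shards])  # [4, 4, 2]
```

With 10 known URLs, a depth of 3, and a topN of 4, the sketch produces three shards of 4, 4, and 2 URLs. No run can exceed depth × topN URLs in total.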
After the crawl has finished, you can view details about it, such as how many shards were generated.
If you click the host statistics link, you can see how many URLs from each host are in the crawldb and in the current shard.
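The per-host counts described above can be pictured with a small Python sketch. This is only an illustration of the idea, not how the plugin computes its statistics: the function name `host_statistics` and both URL lists are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_statistics(crawldb_urls, shard_urls):
    """Return {host: (count in crawldb, count in current shard)}.

    Both arguments are plain lists of URL strings (an assumption for
    this sketch; the plugin reads its data from the crawl folder).
    """
    in_crawldb = Counter(urlsplit(u).hostname for u in crawldb_urls)
    in_shard = Counter(urlsplit(u).hostname for u in shard_urls)
    # A Counter yields 0 for hosts that do not appear in the shard.
    return {host: (n, in_shard[host]) for host, n in in_crawldb.items()}


crawldb = ["http://a.example/1", "http://a.example/2", "http://b.example/"]
shard = ["http://a.example/1"]
stats = host_statistics(crawldb, shard)
for host, (db_count, shard_count) in sorted(stats.items()):
    print(host, db_count, shard_count)
```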


