Allow for fetching limited range of webpages, to avoid massive files
pmyteh committed Jul 22, 2012
1 parent f0e99df commit 52d95b7
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions CONFIG_db_example.php
@@ -21,6 +21,10 @@
 $whitelistdomainlevel=2;
 //list of domains starting, ending, and separated with :
 $whitelistdomainlist=":gov.uk:.org.uk:";
+//Fetch only first part of each page, to avoid huge files?
+$fetchrangeonly=true;
+// If $fetchrangeonly=true, what range to fetch? Here, the first 100KB is specified.
+$fetchrange="0-99999";
 
 // Set spider penetration depth. If 0 crawl only pages in database.
 $MAX_PENETRATION = 5;
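This commit only adds the two settings; the code that consumes them is not shown. As a rough sketch of how `$fetchrangeonly` and `$fetchrange` could translate into an HTTP `Range` request header, the hypothetical helper below (`build_fetch_context_options` is an illustrative name, not part of the repository) builds PHP stream-context options from them. The spider's actual fetch code may work quite differently.

```php
<?php
// Hypothetical helper, not taken from the commit: turns the two config
// settings into options for PHP's HTTP stream wrapper.
function build_fetch_context_options(bool $fetchrangeonly, string $fetchrange): array
{
    $http = ['method' => 'GET'];
    if ($fetchrangeonly) {
        // "0-99999" becomes "Range: bytes=0-99999" (inclusive byte offsets,
        // i.e. the first 100,000 bytes of the response body).
        $http['header'] = "Range: bytes=" . $fetchrange . "\r\n";
    }
    return ['http' => $http];
}

// Values as set in CONFIG_db_example.php.
$fetchrangeonly = true;
$fetchrange = "0-99999";

$options = build_fetch_context_options($fetchrangeonly, $fetchrange);
$context = stream_context_create($options);

// A real fetch could then look like this (the URL is a placeholder):
// $body = file_get_contents("http://www.example.gov.uk/", false, $context);
```

A server that does not support range requests may ignore the header and return the whole page, so a caller relying on this to avoid huge files would probably still want to cap the number of bytes it reads.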
