Alex Osborne edited this page Jul 4, 2018 · 4 revisions

You can create a new job using one of three options:

1. Based on an existing job
This option creates a new job from an existing job, regardless of whether the existing job has been crawled. It is useful for repeating a crawl or for recovering a crawl that had problems. See Recovery of Frontier State and frontier.recover.gz. To base a new job on an existing job, use the "copy" button on the job page of the parent job.
2. Based on a profile
This option creates a new job from an existing profile. To create a job based on a profile, use the "copy" button on the profile page.
3. With defaults
This option creates a crawl job from the default profile. To create a job from the default profile, enter a job name on the Main Console screen and click the "create" button.
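
The options above are described in terms of the web console. As a rough sketch only, the same two actions ("create" and "copy") can be expressed as HTTP form posts, assuming a Heritrix 3 engine with its REST interface at https://localhost:8443/engine. The address, the form field names (action, createpath, copyTo, asProfile) and the helper function names are assumptions that do not come from this page and should be checked against your installation. The code only builds the requests; it does not send them or handle authentication.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed address of a local Heritrix 3 engine; adjust for your setup.
ENGINE_URL = "https://localhost:8443/engine"


def create_job_request(job_name, engine_url=ENGINE_URL):
    """Build (but do not send) a POST that creates a job from the default profile.

    The field names 'action' and 'createpath' are assumed, not taken from this page.
    """
    body = urlencode({"action": "create", "createpath": job_name}).encode("ascii")
    return Request(engine_url, data=body, method="POST")


def copy_job_request(source, new_name, as_profile=False, engine_url=ENGINE_URL):
    """Build (but do not send) a POST that copies an existing job or profile.

    The field names 'copyTo' and 'asProfile' are assumed, not taken from this page.
    """
    fields = {"copyTo": new_name}
    if as_profile:
        fields["asProfile"] = "on"
    body = urlencode(fields).encode("ascii")
    return Request(f"{engine_url}/job/{source}", data=body, method="POST")


if __name__ == "__main__":
    req = copy_job_request("weekly-crawl", "weekly-crawl-rerun")
    print(req.get_method(), req.full_url, req.data)
```

Sending such a request would additionally need credentials and certificate handling suited to your engine, which are left out here on purpose.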

Note

  • Changes made to a profile or job after it has been copied do not affect jobs already derived from it.

  • Jobs based on the default profile provided with Heritrix are not ready to run as is. The metadata.operatorContactUrl property must first be set to a valid value.
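
As an illustration of the second note, the following sketch checks whether a block of key=value configuration text sets metadata.operatorContactUrl to something that looks like a usable URL. The assumptions here are that the property appears as a plain key=value line, and that "valid" can be approximated as an http or https URL with a host. This page does not state the exact rule Heritrix applies, and both function names are hypothetical.

```python
from urllib.parse import urlparse

PROPERTY = "metadata.operatorContactUrl"


def operator_contact_url(overrides_text):
    """Return the value assigned to the property in key=value text, or None."""
    for line in overrides_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == PROPERTY:
            return value.strip()
    return None


def looks_like_valid_contact_url(value):
    """Approximate validity check (an assumption): http(s) scheme plus a host."""
    if not value:
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    sample = "metadata.operatorContactUrl=http://example.org/crawl-info\n"
    value = operator_contact_url(sample)
    print(value, looks_like_valid_contact_url(value))
```

A check like this could be run against a new job's configuration before launching it, so that an unedited placeholder value is caught early.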

