A crawl job is the scope of parsing work performed by a spider. All requests and parsed items are linked to a crawl job.
Each job is assigned a job id.
For web, job management is linked to the actual crawl jobs created in cluster.
Consider using Oban to manage persistent crawls. Crawl jobs in cluster are not persistent, since cluster has no persistence layer. Web job management, on the other hand, should have a persistence layer to enable additional functionality, such as storing job history.
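A minimal sketch of how an Oban worker could wrap a cluster crawl to get persistence, retries, and history on the web side. The queue name, `Cluster.start_crawl/2`, and the args shape are assumptions for illustration, not part of this project yet.

```elixir
defmodule Web.CrawlJobWorker do
  # Assumed queue name :crawls; Oban persists the job row, so the crawl
  # survives restarts and leaves a history record even though cluster
  # itself has no persistence layer.
  use Oban.Worker, queue: :crawls, max_attempts: 3

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"spider" => spider, "urls" => urls}}) do
    # Hypothetical entry point: delegate the actual crawl to cluster.
    Cluster.start_crawl(spider, urls)
  end
end

# Enqueueing a persistent crawl from web:
# %{spider: "MySpider", urls: ["https://example.com"]}
# |> Web.CrawlJobWorker.new()
# |> Oban.insert()
```

Oban's `unique` and `schedule_in` options would also map naturally onto the v2 items below (restarting, timeouts, scheduling).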
v1:
see all running jobs
start/stop a job
v2:
see stats for a job
historical introspection
restarting
arguments
long running jobs
timeouts
scheduling
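The v1 surface above could be sketched as a thin web-side module delegating to cluster. Module and function names here are assumptions for illustration only.

```elixir
defmodule Web.CrawlJobs do
  @doc "v1: see all running jobs, tagged with their job ids."
  def list_running, do: Cluster.list_jobs(status: :running)

  @doc "v1: start a job for the given spider; returns {:ok, job_id}."
  def start(spider), do: Cluster.start_crawl(spider)

  @doc "v1: stop a running job by its job id."
  def stop(job_id), do: Cluster.stop_crawl(job_id)
end
```

With a persistence layer in place, the v2 items (per-job stats, historical introspection) would read from stored job rows rather than live cluster state.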