web: can initialize immediate crawl jobs #4

Ziinc · 2022-04-20T22:12:03Z

A crawl job is the scope of parsing work performed by a spider. All requests and parsed items will be linked to a crawl job.
Each job is given a job id.
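
As a rough illustration of the relationship described above, the sketch below models a crawl job that carries an id and collects the requests and items tied to it. Every name here (`CrawlJob`, `Request`, `Item`, `record_request`, `record_item`) is hypothetical and not taken from the project. Python is used only to keep the example short.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Request:
    url: str
    job_id: str  # every request points back to its crawl job


@dataclass
class Item:
    data: dict
    job_id: str  # every parsed item points back to its crawl job


@dataclass
class CrawlJob:
    # Hypothetical model: one job per spider run, identified by a generated id.
    spider: str
    id: str = field(default_factory=lambda: uuid4().hex)
    requests: list = field(default_factory=list)
    items: list = field(default_factory=list)

    def record_request(self, url: str) -> Request:
        request = Request(url=url, job_id=self.id)
        self.requests.append(request)
        return request

    def record_item(self, data: dict) -> Item:
        item = Item(data=data, job_id=self.id)
        self.items.append(item)
        return item
```

With this shape, anything produced during a crawl can be traced back to its job through `job_id`, which is the property the later stats and history features would rely on.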

For web, job management is linked to the actual crawl jobs created in the cluster.

Consider using Oban to manage persistent crawls. Crawl jobs on the cluster are not persistent, because the cluster has no persistence layer. Web management, on the other hand, should have a persistence layer to allow more functionality, such as storing job history.

v1:

see all running jobs
start/stop a job
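
The v1 scope could be sketched as a small in-memory registry, assuming no persistence at all (which mirrors how jobs on the cluster behave). `JobRegistry` and its methods are hypothetical names for illustration, not part of the project.

```python
from uuid import uuid4


class JobRegistry:
    """In-memory only: all state is lost on restart."""

    def __init__(self):
        self._running = {}  # job id -> spider name

    def start(self, spider: str) -> str:
        # Register a new running job and return its generated id.
        job_id = uuid4().hex
        self._running[job_id] = spider
        return job_id

    def stop(self, job_id: str) -> bool:
        # True if the job was running and has now been removed.
        return self._running.pop(job_id, None) is not None

    def running(self) -> dict:
        # Return a copy so callers cannot mutate internal state.
        return dict(self._running)
```

A persistent variant would swap the dictionary for a database-backed store, which is where a job queue with its own storage would come in.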

v2:

see stats for a job
historical introspection
restarting
arguments
long running jobs
timeouts
scheduling

Ziinc · 2022-05-23T01:08:34Z

Jobber has been implemented in #11. Web now needs to be able to connect to the db and schedule the jobs with Oban.

Ziinc mentioned this issue Apr 20, 2022

app/requestor: Can receive a Request from Web and begin crawling #5

Closed

7 tasks

Ziinc changed the title ~~app/web: can initialize immediate crawl jobs~~ web: can initialize immediate crawl jobs May 23, 2022

Ziinc mentioned this issue May 23, 2022

app/web: Can create a spider #3

Closed

Ziinc added this to Backlog in Dev May 23, 2022

Ziinc moved this from Backlog to Todo in Dev May 23, 2022
