web: can initialize immediate crawl jobs #4

Open
2 tasks done
Ziinc opened this issue Apr 20, 2022 · 1 comment

@Ziinc
Owner

Ziinc commented Apr 20, 2022

  • A crawl job is the scope of parsing work performed by a spider. All requests and parsed items will be linked to a crawl job.
  • Each job is given a job id.
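
As a rough illustration of the two points above, the relationship could be modelled as follows. This is a Python sketch for discussion, not the project's code: `CrawlJob`, `Request`, `ParsedItem`, and all of their fields are hypothetical names, and using a random hex string as the job id is an assumption.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Request:
    url: str
    job_id: str  # every request is linked back to its crawl job


@dataclass
class ParsedItem:
    data: dict
    job_id: str  # every parsed item is linked back to its crawl job


@dataclass
class CrawlJob:
    spider: str
    # Assumption: the job id is a random hex string generated on creation.
    id: str = field(default_factory=lambda: uuid4().hex)
    requests: list = field(default_factory=list)
    items: list = field(default_factory=list)

    def add_request(self, url: str) -> Request:
        """Record a request made by the spider as part of this job."""
        req = Request(url=url, job_id=self.id)
        self.requests.append(req)
        return req

    def add_item(self, data: dict) -> ParsedItem:
        """Record an item parsed by the spider as part of this job."""
        item = ParsedItem(data=data, job_id=self.id)
        self.items.append(item)
        return item
```

The job id is the only link between a job and its requests and items, so anything that stores them separately can still group them by `job_id`.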

For web, job management is linked to the actual crawl jobs created in cluster.

Consider using Oban to manage persistent crawls. Crawl jobs on cluster are not persistent, since cluster has no persistence layer. Web management, on the other hand, should have a persistence layer to enable more functionality, such as storing job history.

v1:

  • see all running jobs
  • start/stop a job
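
The v1 operations could be sketched as a small in-memory registry. This is only a Python illustration of the intended surface: `JobRegistry`, `Status`, and the method names are hypothetical, integer job ids are an assumption, and nothing is persisted here.

```python
from enum import Enum
from itertools import count


class Status(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class JobRegistry:
    """In-memory stand-in for job management; state is lost on restart."""

    def __init__(self):
        self._ids = count(1)  # assumption: sequential integer job ids
        self._jobs = {}       # job id -> (spider name, Status)

    def start(self, spider: str) -> int:
        """Start a job for the given spider and return its job id."""
        job_id = next(self._ids)
        self._jobs[job_id] = (spider, Status.RUNNING)
        return job_id

    def stop(self, job_id: int) -> None:
        """Mark a job as stopped; raises KeyError for an unknown id."""
        spider, _ = self._jobs[job_id]
        self._jobs[job_id] = (spider, Status.STOPPED)

    def running(self) -> list:
        """Return (job id, spider name) pairs for all running jobs."""
        return [
            (job_id, spider)
            for job_id, (spider, status) in self._jobs.items()
            if status is Status.RUNNING
        ]
```

Because the registry keeps everything in memory, it shares the limitation described above for cluster: history and the v2 features would need a real persistence layer behind the same operations.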

v2:

  • see stats for a job
  • historical introspection
  • restarting
  • arguments
  • long running jobs
  • timeouts
  • scheduling
@Ziinc
Owner Author

Ziinc commented May 23, 2022

Jobber has been implemented in #11; web needs to be able to connect to the database and schedule the jobs with Oban.

@Ziinc Ziinc changed the title from "app/web: can initialize immediate crawl jobs" to "web: can initialize immediate crawl jobs" May 23, 2022
@Ziinc Ziinc added this to Backlog in Dev May 23, 2022
@Ziinc Ziinc moved this from Backlog to Todo in Dev May 23, 2022