[RFC] Pull-based jobs dispatch for integration with custom compute backends #1526

Open
ZIJ opened this issue May 29, 2024 · 0 comments
ZIJ commented May 29, 2024

Why is this needed?

Currently, Digger orchestrator can only dispatch jobs to the CI backend (e.g. Actions) in a "push" fashion. With Actions, it calls the .../{workflow_id}/dispatches endpoint directly. This has the following drawbacks:

  • Integrating a different compute backend requires implementing a new CIService in the backend codebase. There is currently no way to integrate a "generic" CI backend. One way to solve it is the pull-based API proposed below.
  • Queuing and concurrency management logic has to live on the Digger side. The max_concurrency option solves this to some extent, but it is not sufficient for complex setups that may require their own job scheduling logic.

Proposed solution: expose queuing API

The Digger orchestrator already uses Postgres as a queue for managing concurrency when the max_concurrency option is set. This takes advantage of Postgres reliability without introducing a separate moving part like Redis or Kafka. Given that Digger is most often used in a self-hosted scenario, we are unlikely to ever hit the performance or scalability limits of Postgres.

The API is similar to Google's Pub/Sub API.

Create a subscription

POST /subscriptions/
body:

{
    "topic": "jobs_myapp_prod"
}

returns 200:

{
    "id": "mySubId"
}
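
A minimal client-side sketch of this call, in Python with only the standard library. The base URL, the function names, and the JSON content type are assumptions made for illustration; the proposal itself only fixes the path, the `topic` field in the body, and the `id` field in the response.

```python
import json
import urllib.request


def build_create_subscription_request(base_url: str, topic: str) -> urllib.request.Request:
    """Build (but do not send) the POST /subscriptions/ request."""
    body = json.dumps({"topic": topic}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/subscriptions/",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed, not stated in the RFC
        method="POST",
    )


def parse_subscription_id(response_body: bytes) -> str:
    """Extract the subscription id from the 200 response body."""
    return json.loads(response_body)["id"]


# Hypothetical orchestrator address; nothing is sent over the network here.
request = build_create_subscription_request("https://digger.example.com", "jobs_myapp_prod")
print(request.get_method(), request.full_url)
print(parse_subscription_id(b'{"id": "mySubId"}'))
```

Passing `request` to `urllib.request.urlopen` would perform the actual call against a running orchestrator.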

Pull 1 job from a subscription

PATCH /subscriptions/{sub_id}

<empty body>

returns 200:

{
    "jobSpec": <Digger Job Spec JSON to be passed to the CLI>
}
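
An executor built on this endpoint would likely be a simple pull loop. The sketch below shows one possible shape in Python, with the HTTP call injected as a function so the loop can be exercised without a server. The function names, the `max_jobs` cap, and the job spec contents are hypothetical. It also assumes that an empty queue yields an empty body or a body without `jobSpec`, which the proposal does not specify.

```python
import json
from typing import Callable, Optional


def parse_pull_response(response_body: bytes) -> Optional[dict]:
    """Return the job spec from a pull response, or None if there is none.

    Assumption: an empty queue yields an empty body or a body without
    `jobSpec`; the RFC does not say what is returned in that case.
    """
    if not response_body:
        return None
    return json.loads(response_body).get("jobSpec")


def drain(pull: Callable[[], bytes], run_job: Callable[[dict], None], max_jobs: int) -> int:
    """Pull and run jobs one at a time until the queue is empty or max_jobs is hit."""
    done = 0
    while done < max_jobs:
        spec = parse_pull_response(pull())
        if spec is None:
            break
        run_job(spec)
        done += 1
    return done


# Stand-in for PATCH /subscriptions/{sub_id}: two queued jobs, then an empty body.
# The job spec contents are placeholders, not the real Digger job spec format.
responses = iter([b'{"jobSpec": {"job": 1}}', b'{"jobSpec": {"job": 2}}', b""])
executed = []
count = drain(lambda: next(responses), executed.append, max_jobs=10)
print(count, executed)
```

In a real executor, `pull` would issue the PATCH request and `run_job` would hand the spec to the Digger CLI. Because each worker decides when to pull, scheduling and concurrency limits can live on the executor side.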

Why not simply /jobs/pull?

  • We need some way to narrow the scope of a subscription. At the very least, prod and non-prod executor pools may run on different infra, so we need a way to differentiate "kinds" of jobs. Hence the concept of topics.
  • Now that we have topics, "pulling" actually modifies some kind of resource. But not the topic! The topic stays the same; what changes is the "pool" of jobs relevant to the topic. Hence the concept of a subscription (same as in Google's Pub/Sub).
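
The distinction between the two resources can be illustrated with a toy in-memory model in Python. This is not how the orchestrator would store anything (the proposal relies on Postgres). The class, its method names, and the staging topic name are hypothetical, and delivering a job to every subscription on its topic is an assumption borrowed from Pub/Sub semantics.

```python
from collections import deque


class Broker:
    """Toy model: topics are fixed labels, subscriptions hold pending jobs."""

    def __init__(self):
        self._subs = {}  # subscription id -> (topic, deque of pending job specs)
        self._next_id = 0

    def subscribe(self, topic):
        self._next_id += 1
        sub_id = f"sub-{self._next_id}"
        self._subs[sub_id] = (topic, deque())
        return sub_id

    def publish(self, topic, job_spec):
        # Assumed fan-out: the job lands in every subscription on its topic.
        for sub_topic, pending in self._subs.values():
            if sub_topic == topic:
                pending.append(job_spec)

    def pull(self, sub_id):
        # Pulling mutates the subscription's pool; the topic is untouched.
        _, pending = self._subs[sub_id]
        return pending.popleft() if pending else None


broker = Broker()
prod = broker.subscribe("jobs_myapp_prod")
staging = broker.subscribe("jobs_myapp_staging")  # hypothetical second topic
broker.publish("jobs_myapp_prod", {"job": "plan"})
print(broker.pull(staging))  # the staging pool never sees prod jobs
print(broker.pull(prod))     # the prod pool hands out the job once
print(broker.pull(prod))     # and is empty afterwards
```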