Preparing Orchestrator for the After (NSF) Times #233
Comments
Yep, this was part of the motivation for https://github.com/yuvipanda/pangeo-forge-cloud-federation/ (although I think there have been other bits like that too)
@yuvipanda, as we discussed yesterday (and you suggested), I believe a natural way to prototype all of this is by refactoring |
Charles and I just discussed this, and I'm 100% on board with the idea that we should work to simplify, decouple, and make stateless the Pangeo Forge service.
We had a great discussion of this issue in today's Coordination Meeting. Minutes in the linked doc; copying here for reference:
Thinking aloud about what a minimal prototype of this new system would require:
Here is a sequence diagram outlining the proposed new architecture. Note the main differences with the existing system are:
sequenceDiagram
Feedstock Repo->>GitHub App:event webhook
GitHub App-->>Feedstock Repo:creates check run (queued)
GitHub App->>Bakery Agent:notifies: event
Bakery Agent->>Beam Runner:deploys job
Bakery Agent-->>GitHub App:notifies: job deployed
GitHub App-->>Feedstock Repo:updates check run (in progress)
Beam Runner->>Beam Runner Status Monitor:notifies: job complete
Beam Runner Status Monitor-->>Bakery Agent:notifies: job complete
Bakery Agent-->>GitHub App:notifies: job complete
GitHub App-->>Feedstock Repo:updates check run (completed)
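As a rough illustration of the check run lifecycle in the diagram above, the GitHub App's side could be reduced to a small mapping from Bakery Agent notifications to check run updates. This is only a sketch under stated assumptions: the notification kinds (`event_received`, `job_deployed`, `job_complete`) and the `succeeded` field are hypothetical names, not part of any existing interface. The `status` and `conclusion` values are the ones used by the GitHub Checks API.

```python
from typing import Any, Dict


def check_run_update(notification: Dict[str, Any]) -> Dict[str, str]:
    """Translate a (hypothetical) Bakery Agent notification into check run fields."""
    kind = notification["kind"]
    if kind == "event_received":
        # Webhook received; the job has not been deployed yet.
        return {"status": "queued"}
    if kind == "job_deployed":
        # The Bakery Agent has handed the job to the Beam Runner.
        return {"status": "in_progress"}
    if kind == "job_complete":
        # "succeeded" is an assumed field carried by the completion notification.
        conclusion = "success" if notification.get("succeeded") else "failure"
        return {"status": "completed", "conclusion": conclusion}
    raise ValueError(f"unknown notification kind: {kind!r}")
```

Because each update depends only on the incoming notification, the GitHub App would not need to persist any job state of its own between steps.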
I am not 100% decided on the best interface between the GitHub App and the Bakery Agent. Following the example above, the GitHub App could simply pass along some parsed version of the event payload, leaving it up to the Agent to decide what action to take in response to the event:
sequenceDiagram
Feedstock Repo->>GitHub App:event webhook
GitHub App->>Bakery Agent:notifies: event
Alternatively (and I think I'm a bit more partial to this approach), the GitHub App could generate the appropriate pangeo-forge-runner command for the given event type, and then pass it along to the Bakery Agent:
sequenceDiagram
Feedstock Repo->>GitHub App:event webhook
GitHub App-->>GitHub App:generates `pangeo-forge-runner` cmd for event
GitHub App->>Bakery Agent:POSTs cmd
This latter approach allows the Bakery Agent to (mostly) just be an invoker of pangeo-forge-runner commands, and it can have much less knowledge of GitHub Events. This feels like the right separation of concerns to me: GitHub App handles GitHub-y things, and insulates Bakery Agent from that layer. In either approach, the GitHub App should pass the name of the GitHub |
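A minimal sketch of the second approach, in which the GitHub App turns a webhook event into a `pangeo-forge-runner` command and wraps it in a request body for the Bakery Agent. Several things here are assumptions: the function names and the `{"cmd": [...]}` body shape are hypothetical, and the `bake` subcommand with `--repo`, `--ref`, and `--json` should be checked against the `pangeo-forge-runner` documentation before relying on it. The payload fields read from the event follow GitHub's `pull_request` and `push` webhook payloads.

```python
import json
from typing import Any, Dict, List


def build_runner_cmd(event_type: str, payload: Dict[str, Any]) -> List[str]:
    """Map a GitHub webhook event to a pangeo-forge-runner argv list."""
    if event_type == "pull_request":
        head = payload["pull_request"]["head"]
        repo, ref = head["repo"]["html_url"], head["sha"]
    elif event_type == "push":
        repo, ref = payload["repository"]["html_url"], payload["after"]
    else:
        raise ValueError(f"unsupported event type: {event_type!r}")
    # Subcommand and flags are assumptions; verify against the runner's docs.
    return ["pangeo-forge-runner", "bake", "--repo", repo, "--ref", ref, "--json"]


def build_agent_request_body(cmd: List[str]) -> str:
    """Serialize the command for POSTing to the Bakery Agent (hypothetical schema)."""
    return json.dumps({"cmd": cmd})
```

With this split, the Agent only needs to validate and execute the argv list it receives, and all knowledge of event payload structure stays in the GitHub App.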
The NSF Award supporting the current phase of Pangeo Forge development has a little over a year remaining in it. So this is an opportune moment to strategize about where we (and I, with my remaining full-time funded effort) want to get Pangeo Forge Orchestrator by the conclusion of this funding cycle. Here is a first pass at some priorities (with big 🙏 to @yuvipanda for so much thoughtful brainstorming on these points):
🔸 State-minimization
Minimizing state managed by Orchestrator reduces the operational and maintenance burden of Pangeo Forge Cloud. A basic assumption here is that the future operator of Pangeo Forge Cloud will value the tradeoff of reduced maintenance and greater ease of federated participation in exchange for some performance cost of decentralization.
Here are forms of state we currently manage and/or deploy from this repo, which we probably can drop:
Dropping the database has implications for the frontend site https://pangeo-forge.org/, but these can be worked around by either re-imagining what that site is used for (perhaps it does not need a dashboard) and/or having that site populated with data directly from the GitHub API.
Factoring bakery config into separate repos: in earlier iterations, bakery repositories were defined for generic bakery types (AWS, GCP, etc.). As the definition of bakeries shifts to "just a Beam runner", so can the way in which bakery repos are formatted. In order for a bakery repo to be pluggable with Pangeo Forge Cloud, I think it needs to define some contractual set of config that Pangeo Forge Cloud can use to deploy a job to that bakery.
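To make the idea of a "contractual set of config" a little more concrete, here is a sketch of how Pangeo Forge Cloud might check a bakery repo's config before deploying to it. The required keys (`beam_runner`, `target_storage`) and the function name are purely hypothetical placeholders; what the contract should actually contain is the open question.

```python
from typing import Any, Dict, List

# Hypothetical contract: keys every pluggable bakery repo would have to define.
REQUIRED_BAKERY_KEYS = ("beam_runner", "target_storage")


def missing_bakery_keys(config: Dict[str, Any]) -> List[str]:
    """Return the contract keys absent from a bakery repo's config, sorted."""
    return sorted(key for key in REQUIRED_BAKERY_KEYS if key not in config)
```

An empty result would mean the bakery satisfies the contract; anything else could be reported back on the feedstock's check run rather than failing at deploy time.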
This points to the possibility that bakeries as defined in `meta.yaml` would become simply the name of a GitHub repo.
🔹 What is orchestrator (i.e. Pangeo Forge Cloud)?
In this paradigm, orchestrator, or the core of Pangeo Forge Cloud, becomes "just" a GitHub App which understands a `meta.yaml` spec and can forward appropriately-formatted requests to the bakery "agent" service.
cc @rabernat
Edit: This is a very rough sketch, but I wanted to get something out in the open to start a conversation. I'll continue to follow up below as more thoughts develop.