Capture rationale for the 'Archetype' resource #102
Comments
I can start with a feature requirement that probably led us towards the introduction of Archetypes: developers should not have to pass the entire specification of the revision they deploy. For example, developers should be able to create a new revision that differs from the previous one by a single environment variable, without having to re-deploy new source or a new image. Similarly, a developer should be able to deploy new source and carry over previously created environment variables. |
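A minimal Go sketch of that requirement, assuming a hypothetical, simplified `RevisionSpec` type (the `WithEnv` and `WithImage` helpers and all field names are illustrative and not part of any actual API): a new spec is derived from a base one by changing only what differs, and the base is never mutated.

```go
package main

import "fmt"

// RevisionSpec is a hypothetical, simplified revision specification.
type RevisionSpec struct {
	Image string
	Env   map[string]string
}

// copyEnv returns an independent copy of an environment map.
func copyEnv(env map[string]string) map[string]string {
	out := make(map[string]string, len(env)+1)
	for k, v := range env {
		out[k] = v
	}
	return out
}

// WithEnv returns a copy of base that differs by one environment variable.
// The image is carried over and base itself is left untouched.
func WithEnv(base RevisionSpec, key, value string) RevisionSpec {
	env := copyEnv(base.Env)
	env[key] = value
	return RevisionSpec{Image: base.Image, Env: env}
}

// WithImage returns a copy of base that uses a new image and carries over
// every previously set environment variable.
func WithImage(base RevisionSpec, image string) RevisionSpec {
	return RevisionSpec{Image: image, Env: copyEnv(base.Env)}
}

func main() {
	base := RevisionSpec{
		Image: "example/app:v1",
		Env:   map[string]string{"LOG_LEVEL": "info"},
	}
	debug := WithEnv(base, "LOG_LEVEL", "debug")
	next := WithImage(debug, "example/app:v2")
	fmt.Println(base.Env["LOG_LEVEL"], debug.Env["LOG_LEVEL"], next.Image)
}
```

Whether the base comes from a previous Revision or from a separate template resource is exactly the design question discussed in this thread; the sketch only illustrates the workflow, not where the base lives.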
Now let me capture an API design that was suggested earlier in the design process:
The UI or CLI would be in charge of picking the right base revision (for example, the latest successfully deployed one), but users could decide to base on another one. Creating an Once created, depending on its traffic configuration, the Service would either automatically point to it, or let the client set a new traffic configuration, potentially pointing 0% of the traffic to this newly created revision. Note that this proposal does not prevent arbitrary traffic splits or gradual rollouts. |
You don't have to pass the entire spec, you can always just patch: Does this not meet your requirement stated above? |
Yes, #102 (comment) was not about the availability of Patch semantics, but a developer workflow requirement. |
It's a smart question, and I'm not sure I'm the best person to answer it. But playing it back, what is the difference in developer experience between authoring a RevisionTemplate vs authoring a base Revision that is used as a template? Since the user isn't expected to directly poke at Revisions, just RevisionTemplates, it's still just one resource for the user to wrap their head around. And one advantage to a separate RevisionTemplate is that it can contain additional fields beyond what the Revision can. Another advantage is that it allows Revisions to be immutable and always system generated. Personally, I found it much easier to process the mental model once I began calling it a Configuration instead. Now, if you edit either your Configuration or your source code, then the system will produce and run a new immutable Revision based on those changes. If you want to create a whole new branch of that code (for experiments, for prod vs qa, whatever), just add another Configuration, and a new Revision chain will magically appear. (Then you can Route traffic to those Revisions as you see fit.) Something like: Code + Configuration == Revision. (Or maybe more literally, Δcode + Δconfiguration == new immutable revision.) I think I'd be fairly comfortable dealing with that model on a daily basis, but we should run it through some UX testing, too. |
A few thoughts on this top-of-mind:
This puts a significantly higher burden on the client, and eliminates an API resource that essentially summarizes exactly what you are after. Juggling List, Sort, Pick vs. Get I believe that by having the To anticipate your counter-argument: "What if I don't want the Archetype's Revision spec because it failed." Ok, so two Gets instead:
    arch := Archetype.Get(name)
    rev := Revision.Get(arch.Status.LatestReadyRevision)
Currently Did you have something else in mind? If so, concrete examples might help focus the discussion. |
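To make the comparison above concrete, here is a rough Go sketch contrasting the client-side "List, Sort, Pick" approach with the two-Get approach. It uses in-memory maps and slices instead of a real API client, and every type, field, and function name is a hypothetical stand-in (only `LatestReadyRevision` echoes the snippet above).

```go
package main

import (
	"fmt"
	"sort"
)

// Revision and Archetype are hypothetical, simplified stand-ins for the
// API resources discussed in this thread.
type Revision struct {
	Name    string
	Created int // creation order; a larger value means newer
	Ready   bool
}

type Archetype struct {
	Name                string
	LatestReadyRevision string // mirrors arch.Status.LatestReadyRevision
}

// pickLatestReady is the client-side approach: list every revision,
// sort newest first, then pick the first one that is Ready.
func pickLatestReady(revs []Revision) (Revision, bool) {
	sorted := append([]Revision(nil), revs...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Created > sorted[j].Created })
	for _, r := range sorted {
		if r.Ready {
			return r, true
		}
	}
	return Revision{}, false
}

// getLatestReady is the two-Get approach: read the Archetype, then read
// the revision that its status already points at.
func getLatestReady(archs map[string]Archetype, revs map[string]Revision, name string) (Revision, bool) {
	arch, ok := archs[name]
	if !ok {
		return Revision{}, false
	}
	rev, ok := revs[arch.LatestReadyRevision]
	return rev, ok
}

func main() {
	revs := []Revision{{"r1", 1, true}, {"r2", 2, true}, {"r3", 3, false}}
	byName := map[string]Revision{}
	for _, r := range revs {
		byName[r.Name] = r
	}
	archs := map[string]Archetype{"app": {Name: "app", LatestReadyRevision: "r2"}}

	picked, _ := pickLatestReady(revs)
	fetched, _ := getLatestReady(archs, byName, "app")
	fmt.Println(picked.Name, fetched.Name)
}
```

Both paths arrive at the same revision here; the difference is that the first makes every client re-implement the ordering and readiness logic, while the second relies on the server to maintain the pointer.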
Thanks, I wish we could close this issue by capturing:
|
A few more details on why we converged on this solution (also note @evankanderson posted a detailed doc that also summarizes the rationale of the decision as part of issue #63): Initially we did start with just 2 resources: Service and Revision (though they had different names at the time, and probably will be renamed again per issue #65). However, per @mattmoor's comment above, we recognized early on that we wanted Revisions to be immutable snapshots of code + config, and therefore there should be another resource responsible for minting new Revisions. Benefits of that include:
Initially that one resource was the Service... the initial API design had the Service include a RevisionSpec of the 'next' revision, and updating the Service (i.e. PATCHing the revisionSpec) would result in minting a new Revision that would be rolled out. While simple, this had several drawbacks, including the inability to handle more complex traffic scenarios such as splitting traffic over a release track and a separately maintained experiment track[1], or arbitrary n-way splits between named Revisions. Additionally, the Service combined too many concerns - namely routing traffic and being the source of truth for a single train of Revisions. Now consider the scenario below with the proposed model of a separate Archetype resource (from issue #63):
In this scenario, a manual rollout is incrementally shifting traffic from named revision 123 to 456. In addition, the Archetype is referenced to have a floating addressable 'head' revision (that serves 0% traffic). The act of creating a new Revision is done through a PATCH to the Archetype. The act of manually rolling it out is done through a PATCH to the Service. From an authorization perspective, these may be different user roles, which are reflected by the different resource types that embody these different concerns. In this advanced Service configuration, a user with a role to update the Archetype can still create new revisions and test them through the 'head' subdomain, but to manually roll out that new revision to serve customer traffic, a user with a more restrictive role to update the Service is necessary. This becomes challenging if these two concerns are mixed in the same object.
[1] Examples of the 1:N experiment use case (also from issue #63):
This simple configuration lets you have a separate release track and experiment track, automatically rolling out newly minted Revisions from each to a specified percentage. This would not be possible without separating Archetype from Service, enabling the multiple cardinality relationship. @dewitt's illustration (from issue #63, using the names "Configuration" and "Route" rather than "Archetype" and "Service") helps illustrate this example: |
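The manual rollout described above (shifting traffic from revision 123 to 456 while a 'head' target serves 0%) could be modeled roughly as follows. This is only a sketch under assumptions: `TrafficTarget`, `Validate`, and `Shift` are hypothetical names, and the real Service resource may represent traffic quite differently.

```go
package main

import (
	"errors"
	"fmt"
)

// TrafficTarget is a hypothetical entry in a Service's traffic block.
type TrafficTarget struct {
	Revision string // a named revision, or "head" for the floating Archetype reference
	Percent  int
}

// Validate checks that no percentage is negative and that they sum to 100.
func Validate(targets []TrafficTarget) error {
	total := 0
	for _, t := range targets {
		if t.Percent < 0 {
			return fmt.Errorf("target %q has negative percent %d", t.Revision, t.Percent)
		}
		total += t.Percent
	}
	if total != 100 {
		return fmt.Errorf("percentages sum to %d, want 100", total)
	}
	return nil
}

// Shift moves step percentage points from one target to another and
// returns a new slice; the input slice is not modified.
func Shift(targets []TrafficTarget, from, to string, step int) ([]TrafficTarget, error) {
	out := append([]TrafficTarget(nil), targets...)
	fromIdx, toIdx := -1, -1
	for i, t := range out {
		switch t.Revision {
		case from:
			fromIdx = i
		case to:
			toIdx = i
		}
	}
	if fromIdx < 0 || toIdx < 0 {
		return nil, errors.New("both targets must exist and be distinct")
	}
	if step < 0 || step > out[fromIdx].Percent {
		return nil, errors.New("step must be between 0 and the source percentage")
	}
	out[fromIdx].Percent -= step
	out[toIdx].Percent += step
	if err := Validate(out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	traffic := []TrafficTarget{{"123", 90}, {"456", 10}, {"head", 0}}
	next, err := Shift(traffic, "123", "456", 40)
	fmt.Println(next, err)
}
```

In this model, minting a new revision never touches the traffic list; only an explicit `Shift` (the analogue of a PATCH to the Service) changes what customers see, which is the separation of roles the comment argues for.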
Today, Google App Engine does not have the concept of tracks, but only the concept of traffic split. This means that users are responsible for deploying revisions and then changing the traffic configuration to point to the new revisions. They are responsible for maintaining tracks by manipulating base resources (traffic and revisions).
I am not sure this is correct. The following solution would provide arbitrary n-way traffic split:
In #102 (comment), I propose that:
This is still compatible with having different roles. |
It seems to me that the existence of the Archetype is the consequence of two things:
|
Status update: I'm working on a longer "why we did this" description, which will close out this issue. A rough outline of the content, which is based on experience building and running the GAE control plane API as well as discussions with the kubernetes team: A common problem between App Engine and Kubernetes today is that the yaml artifacts are local details that users shouldn't need to manage and maintain. The next update will include alternative designs considered, along with problems which led them to be rejected. |
The current API configuration (Routes, Configuration, and Revisions) was built based on experience with 12 factor and with existing products, particularly Google App Engine and Google Cloud Functions, which have two different models for managing resources with different tradeoffs.
Google App Engine
In App Engine, most configuration is performed via an Having configuration checked in to source is compelling from a mature operations point of view, but has several limitations:
Google Cloud Functions
In Cloud Functions (similar to AWS Lambda and other Function-as-a-Service frameworks), there is no
Configuration (aka Archetype) as a solution
We designed the Having a single head also allows us to present the history of Revisions as a (time-ordered) stream of deltas, rather than as a forking/branching history à la revision control. This simplifies both UI and user mental models over a general forking/history model. Without an explicit head to stamp out history from, it becomes easy for client tools to accidentally generate forks as users choose different base revisions: In particular, it becomes difficult to determine what the correct (user expected) behavior is when R4 and R5 become
Failure and rollback
In the case that the most recent Revision has failed to become Ready, tools should be able to detect this and suggest that the user reset the Configuration to the state of the last Ready Revision. Users may choose to "roll forward" off the current configuration, or replace the Configuration with the last working config. Obviously, some tools may not do this or be able to do this (e.g. a CI system may have no way to prompt the user).
Separation of duties and flexibility
In early designs, we had a single
See this document for a high-level overview of the other API options we considered. |
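The failure-and-rollback behavior described in the write-up above might be sketched in Go as below. This is an illustration under assumptions, not the actual controller or tooling logic: the `State` values, the `Revision` type, and `RollbackTarget` are all hypothetical, and the history is assumed to be the single time-ordered stream produced by one Configuration.

```go
package main

import "fmt"

// State is a hypothetical, simplified readiness state for a Revision.
type State int

const (
	Pending State = iota
	Ready
	Failed
)

// Revision is a hypothetical entry in a Configuration's history.
type Revision struct {
	Name  string
	State State
}

// RollbackTarget inspects a time-ordered history (oldest first). If the most
// recent Revision has failed, it returns the last Ready Revision so that a
// tool can suggest resetting the Configuration to that state. It returns
// false when the newest Revision has not failed or no Ready Revision exists.
func RollbackTarget(history []Revision) (Revision, bool) {
	if len(history) == 0 || history[len(history)-1].State != Failed {
		return Revision{}, false
	}
	for i := len(history) - 2; i >= 0; i-- {
		if history[i].State == Ready {
			return history[i], true
		}
	}
	return Revision{}, false
}

func main() {
	history := []Revision{{"r1", Ready}, {"r2", Ready}, {"r3", Failed}}
	if target, ok := RollbackTarget(history); ok {
		fmt.Printf("latest revision failed; consider resetting to %s\n", target.Name)
	}
}
```

The function only produces a suggestion; as the write-up notes, the user may still prefer to roll forward, and non-interactive tools such as a CI system may have no way to act on it.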
With the above writeup, I'm closing this issue. I expect that any future desire to revisit the resource architecture will include a writeup which addresses both the conceptual model and a number of specific customer scenarios (aka "Critical User Journeys"). The document which contains the current mapping of user journeys to sequences of API events is in the process of being converted from an internal Google Doc to a public set of markdown resources, and should form a basis for comparison with future proposals. |
Co-authored-by: Reto Lehmann <retocode@icloud.com>
By Archetype, I mean the resource that contains a base specification for new Revisions but that is not a Revision by itself.
This issue is independent from #63, where we discuss the N:M mapping between the various resources.
I suggest capturing here the rationale behind the decision to introduce the Archetype resource and the alternatives that have been considered.
My underlying goal is to understand whether we could find a valid solution without introducing such a concept as a resource.