Eliminate Phase and simplify Conditions #7856
Forked from #6951.
This is also discussed in #1899 (comment) and elsewhere.
Users and developers apparently think of phases as states in a state machine, regardless of how much we try to dissuade them:
This interpretation conflicts with the declarative, level-based, observation-driven design, and it limits evolution. In particular, the phase should be derivable by observing other state and shouldn't require durable persistence for correctness.
In my experience, the danger to system extensibility is significant, and results from assumptions baked into clients.
Phases aren't themselves properties or conditions of their objects, but clients inevitably infer properties from them.
For example, a client might try to determine whether a pod is active, or whether it has reached a permanent, terminal state, such as with a switch statement, as follows:
However, let's say someone wanted to add a new Phase where the pod would also be active (containers running/restarting, etc.), as in #6951. That would mean that every client/component that made decisions based on the phase would need to also then consider the new phase.
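To make the problem concrete, here is a minimal sketch of the kind of client-side switch described above. The type and constant names are assumptions chosen to match the phase names used in this thread, and `isActive` is a hypothetical helper, not an existing API.

```go
package main

// PodPhase is an assumed enum-style type mirroring the phases named in this thread.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
	PodUnknown   PodPhase = "Unknown"
)

// isActive is a hypothetical client helper that infers "active" from the phase.
func isActive(phase PodPhase) bool {
	switch phase {
	case PodRunning:
		return true
	case PodPending, PodSucceeded, PodFailed, PodUnknown:
		return false
	default:
		// A phase added after this client was written lands here and is
		// silently treated as inactive, even if its containers are running.
		return false
	}
}
```

With this client, a later-added phase in which the pod is active (say, a hypothetical `"Restarting"`) is misclassified until every such switch is found and updated.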
Enums aren't extensible. Every addition is a breaking, non-backward-compatible API change. We could create a new major API version for every such addition, but that would be very expensive and disruptive. At some point, it will just become too painful to add new phases, and then we'll be left with an unprincipled distinction between phases and non-phases. Creating all imaginable phases up front would not only create a lot of complexity prematurely, but would also almost certainly not be correct.
Conditions are more extensible since the addition of new conditions doesn't (shouldn't) invalidate decisions based on existing conditions, and are also better suited to reflect conditions/properties that may toggle back and forth and/or that may not be mutually exclusive.
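By contrast, a condition-based check might look like the sketch below. The `Condition` struct and the `hasCondition` helper are assumptions for illustration; the only names taken from this thread are the condition types themselves (e.g. Active).

```go
package main

// ConditionStatus is an assumed string type for a condition's value.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

// Condition is a simplified, assumed shape for one entry in an object's condition list.
type Condition struct {
	Type   string
	Status ConditionStatus
	Reason string
}

// hasCondition reports whether the named condition is present and True.
// Conditions the client does not know about are simply ignored.
func hasCondition(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == ConditionTrue
		}
	}
	return false
}
```

A client calling `hasCondition(conds, "Active")` gets the same answer whether or not the server later starts reporting additional condition types, which is the extensibility property argued for above.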
Rather than distributing logic that combines multiple conditions or other properties for common inferences, such as whether the scheduler may schedule new pods to a node, we should expose those properties/conditions explicitly, for similar reasons as why we explicitly return fields containing default values.
I've mentioned my opinion on this before, but to recap...
I don't have a problem with what is being suggested here, but I think we should recognize the tradeoff. Everything I've ever read about job management, or heard from users, suggests that users are going to reason about the state of their pods (and jobs) as being part of a state machine. So, we have a choice: we can make the state machine explicit, or make the user synthesize one from the Conditions, or do something in between (e.g. provide the user with a library that synthesizes states from the Conditions, or have kubectl do it).
Brian described good arguments for pushing this onto the user. The counter-argument is that we're infrastructure providers, and the way we provide value is by deciding what abstractions will be useful for users, and more generally tackling these kinds of difficult problems instead of pushing them onto the user. If we're completely unopinionated about everything (how to reason about your job state, how to deal with workers of a batch job, etc.), then we're not providing as much value as we could.
I think the sweet spot might be the layered approach, i.e. the server works the way Brian described, and then we build a library that synthesizes states (and use that library inside kubectl to print states, as well as allowing users to use it). But I definitely feel that users will want a state machine abstraction.
I know I am contrarian here, but I don't think we will ever prevent users
I agree that enumerated states are not extensible, but I still argue that
Anyway, orthogonal Conditions is really what we agreed on months ago.
On Wed, May 6, 2015 at 3:20 PM, Daniel Smith firstname.lastname@example.org
Quick writeup of something Leo (another Borg person) put on the table.
Instead of something like a single field "Phase" being one-of (Pending,
The nugget of clever here is that mostly people want an answer to a
I suppose this could be modeled in Conditions. E.g. scheduler patches
There's a nugget of clever here...
On Wed, May 6, 2015 at 5:34 PM, Daniel Smith email@example.com
I meant that these would work WITH conditions not instead of, in which case
On Wed, May 6, 2015 at 8:09 PM, Brian Grant firstname.lastname@example.org
Have you worked out the mapping between Unknown, Pending, Running, Failed,
I still don't feel like I understand how to handle the case of a Service
On Wed, May 6, 2015 at 8:54 PM, Brian Grant email@example.com
Running would be replaced by the Active condition (True/False): #7868 (comment).
Above I proposed a Terminated condition. The Reason associated with that condition would indicate Succeeded vs. Failed, as described above.
Pending could be derived from not Active and not Terminated for older API versions. I'm not sure we need a more explicit condition for it, since most scenarios check for Active or Terminated (or their complements), though I suggested Initialized (if we explicitly registered initializers) and Instantiated (essentially !Pending) here: #6951 (comment)
I'm also not sure we need Unknown. That would just mean the desired condition (whichever that is) is not present. I don't think we should ever be returning Unknown currently.
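As a rough sketch of the mapping described in this comment, an older API version (or a client library) could derive the legacy phase from conditions along these lines. The `Condition` struct, the helper names, and the exact Reason strings (`"Succeeded"` here) are assumptions; the thread only says that the Reason would distinguish Succeeded from Failed.

```go
package main

// ConditionStatus is an assumed string type for a condition's value.
type ConditionStatus string

const (
	ConditionTrue  ConditionStatus = "True"
	ConditionFalse ConditionStatus = "False"
)

// Condition is a simplified, assumed shape for one condition entry.
type Condition struct {
	Type   string
	Status ConditionStatus
	Reason string
}

// findCondition returns the first condition of the given type, if any.
func findCondition(conds []Condition, condType string) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

// derivePhase maps conditions back to a legacy phase string.
// Unknown is never produced: an absent condition is treated as not True.
func derivePhase(conds []Condition) string {
	if c, ok := findCondition(conds, "Terminated"); ok && c.Status == ConditionTrue {
		if c.Reason == "Succeeded" { // assumed Reason value
			return "Succeeded"
		}
		return "Failed"
	}
	if c, ok := findCondition(conds, "Active"); ok && c.Status == ConditionTrue {
		return "Running"
	}
	// Neither Active nor Terminated: report Pending.
	return "Pending"
}
```

Because the phase is computed from observed conditions, it does not need to be persisted, and adding new conditions leaves this derivation unchanged.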
I'll look at the LB case again.
Deleting would be analogous to Vacating. I'm not opposed to a condition for that, though as discussed on #6951 it could be inferred uniformly for all objects from metadata.
Whether we want to make at least certain conditions uniform for all objects is another issue. As discussed in #1899 (comment), #6487, and elsewhere, orchestrators on top of Kubernetes want certain uniform conditions, such as fully initialized (so the resource can be read), fully active (so creation can be considered successful and/or dependent entities can be created), and terminated (so entities they depend on can be deleted and/or entities dependent on them can be GCed).
I assume that "Active" is amalgamated from some number of other Conditions (scheduled, installed, ...)? As a consumer of this API, some of the things I will want to do are "wait until pod is running" or "wait until pod is terminated". If we publish a small set of very core conditions, these questions could be answered by blocking until the conditions exist. But this is sort of a trap - blocking until a pod is Active is wrong - it might never become active. I need to block until it is Active or Terminated. This is a place where outlooks are a better API. I can tell just by looking at the outlook field for Active whether I should keep waiting.
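The "Active or Terminated" point can be illustrated with a small polling sketch. Everything here is hypothetical: the `get` callback stands in for however a client fetches the pod's conditions, and the function does not model outlooks, only the two conditions named in this thread.

```go
package main

import (
	"errors"
	"time"
)

// Condition is a simplified, assumed shape for one condition entry.
type Condition struct {
	Type   string
	Status string // "True" or "False"
}

var errGaveUp = errors.New("pod was neither Active nor Terminated before polling gave up")

// isTrue reports whether the named condition is present with Status "True".
func isTrue(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

// waitForActiveOrTerminated polls until the pod is Active (returns true) or
// Terminated (returns false). Waiting on Active alone could block forever
// for a pod that terminates without ever becoming active.
func waitForActiveOrTerminated(get func() []Condition, interval time.Duration, maxPolls int) (bool, error) {
	for i := 0; i < maxPolls; i++ {
		conds := get()
		if isTrue(conds, "Terminated") {
			return false, nil
		}
		if isTrue(conds, "Active") {
			return true, nil
		}
		time.Sleep(interval)
	}
	return false, errGaveUp
}
```

The caller still has to know that Terminated is the condition that makes further waiting pointless, which is the trap described above.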
I've come around to getting rid of Phase, but I want to comprehend the details of the replacement.
@bgrant0607 the pages you referenced in the issue description have moved to