Eliminate Phase and simplify Conditions #7856
Comments
This is part of 1.0 though. |
Not part of 1.0, you mean. |
Just thinking out loud: could we auto-populate this from events filed about the object? |
@lavalamp I thought from prior discussions we did not want to use events as a reliable messaging system to derive status |
@derekwaynecarr I think it's a bit different if we're doing it in the master as opposed to joining events & objects externally. |
I've mentioned my opinion on this before, but to recap... I don't have a problem with what is being suggested here, but I think we should recognize the tradeoff. Everything I've ever read about job management, or heard from users, suggests that users are going to reason about the state of their pods (and jobs) as being part of a state machine. So, we have a choice: we can make the state machine explicit, or make the user synthesize one from the Conditions, or do something in between (e.g. provide the user with a library that synthesizes states from the Conditions, or have kubectl do it). Brian described good arguments for pushing this onto the user. The counter-argument is that we're infrastructure providers, and the way we provide value is by deciding what abstractions will be useful for users, and more generally by tackling these kinds of difficult problems instead of pushing them onto the user. If we're completely unopinionated about everything (how to reason about your job state, how to deal with workers of a batch job, etc.), then we're not providing as much value as we could. I think the sweet spot might be the layered approach, i.e. the server works the way Brian described, and then we build a library that synthesizes states (and use that library inside kubectl to print states, as well as allowing users to use it). But I definitely feel that users will want a state machine abstraction. |
I know I am contrarian here, but I don't think we will ever prevent users I agree that enumerated states are not extensible, but I still argue that Anyway, orthogonal Conditions is really what we agreed on months ago.
|
Since we're enumerating cons, let me add one additional reason to retain some sort of pre-cached state in status: it makes it easy for clients to set a fieldSelector and watch for their desired state. |
Quick writeup of something Leo (another Borg person) put on the table. Instead of something like a single field "Phase" being one-of (Pending, The nugget of clever here is that mostly people want an answer to a I suppose this could be modeled in Conditions. E.g. scheduler patches There's a nugget of clever here...
|
A "state" abstraction pushes the decision logic onto the user, as shown by my switch statement. |
@thockin Yes, the 4 fields is similar to my conditions proposal, just more ad hoc and with less information. |
I proposed more possible Conditions, analogous to Leo's proposal, here: #6951 (comment) |
I meant that these would work WITH conditions not instead of, in which case
|
Conditions aren't annotations. They can't be added by just any entity. They should be curated. |
And conditions are for programmatic consumption. For instance, the Ready condition is used to remove pods from Endpoints. |
Have you worked out the mapping between Unknown, Pending, Running, Failed, I still don't feel like I understand how to handle the case of a Service
|
Running would be replaced by the Active condition (True/False): #7868 (comment). Above I proposed a Terminated condition. The Reason associated with that condition would indicate Succeeded vs. Failed, as described above. Pending could be derived from not Active and not Terminated for older API versions. I'm not sure we need a more explicit condition for it, since most scenarios check for Active or Terminated (or their complements), though I suggested Initialized (if we explicitly registered initializers) and Instantiated (essentially !Pending) here: #6951 (comment) I'm also not sure we need Unknown. That would just mean that the desired condition (whichever that is) is not present. We shouldn't ever be returning Unknown currently, I don't think. I'll look at the LB case again. |
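To make the mapping above concrete, here is a minimal sketch of how an older API version might derive a legacy phase from the proposed conditions. The `Condition` struct, the `legacyPhase` helper, and the exact Reason string `"Succeeded"` are assumptions made for illustration; the thread only says that the Reason would distinguish Succeeded from Failed.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string
	Status string // "True" or "False"
	Reason string
}

// find returns the first condition of the given type, if present.
func find(conds []Condition, condType string) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

// legacyPhase derives a phase for older API versions:
// Terminated wins, then Active, and anything else is Pending.
func legacyPhase(conds []Condition) string {
	if c, ok := find(conds, "Terminated"); ok && c.Status == "True" {
		// Assumption: the Reason literally reads "Succeeded" on success;
		// every other reason is treated as a failure.
		if c.Reason == "Succeeded" {
			return "Succeeded"
		}
		return "Failed"
	}
	if c, ok := find(conds, "Active"); ok && c.Status == "True" {
		return "Running"
	}
	return "Pending" // not Active and not Terminated
}

func main() {
	fmt.Println(legacyPhase(nil))
	fmt.Println(legacyPhase([]Condition{{Type: "Active", Status: "True"}}))
	fmt.Println(legacyPhase([]Condition{{Type: "Terminated", Status: "True", Reason: "Succeeded"}}))
}
```

Note that no Unknown phase is produced: a missing condition is simply read as "not yet observed", which matches the suggestion that Unknown may not be needed.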
Why can't the service just have a Ready condition? Because no controller is responsible for updating its status currently? Pods have multiple preconditions from multiple components: they must be scheduled, and their images must be pulled. |
Can we get a "deleting" (or deleted) condition? Can we get it for all objects? |
Deleting would be analogous to Vacating. I'm not opposed to a condition for that, though as discussed on #6951 it could be inferred uniformly for all objects from metadata. Whether we want to make at least certain conditions uniform for all objects is another issue. As discussed in #1899 (comment), #6487, and elsewhere, orchestrators on top of Kubernetes want certain uniform conditions, such as fully initialized (so the resource can be read), fully active (so creation can be considered successful and/or dependent entities can be created), and terminated (so entities they depend on can be deleted and/or entities dependent on them can be GCed). |
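As a rough illustration of inferring "deleting" uniformly from metadata rather than from a per-object condition, here is a sketch that assumes the metadata carries a deletion timestamp (modeled on Kubernetes' `metadata.deletionTimestamp`). The `ObjectMeta` struct and both helpers are local, hypothetical definitions, not the real API types.

```go
package main

import (
	"fmt"
	"time"
)

// ObjectMeta is a simplified, hypothetical stand-in for shared object metadata.
type ObjectMeta struct {
	Name              string
	DeletionTimestamp *time.Time // nil until deletion has been requested
}

// isDeleting works for any object kind because it only looks at metadata.
func isDeleting(m ObjectMeta) bool {
	return m.DeletionTimestamp != nil
}

// markDeletingNow returns a copy of m with deletion requested at the current time.
func markDeletingNow(m ObjectMeta) ObjectMeta {
	now := time.Now()
	m.DeletionTimestamp = &now
	return m
}

func main() {
	pod := ObjectMeta{Name: "web-1"}
	fmt.Println(isDeleting(pod))                  // deletion not requested yet
	fmt.Println(isDeleting(markDeletingNow(pod))) // deletion requested
}
```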
An example where the current PodPhase Running was found to be confusing: #7868. An example where "Pending" was confusing, or at least insufficiently informative on its own: openshift/origin#2048. |
I assume that "Active" is amalgamated from some number of other Conditions (scheduled, installed, ...)? As a consumer of this API, some of the things I will want to do are "wait until pod is running" or "wait until pod is terminated". If we publish a small set of very core conditions, these questions could be answered by blocking until the conditions exist. But this is sort of a trap - blocking until a pod is Active is wrong - it might never become active. I need to block until it is Active or Terminated. This is a place where outlooks are a better API. I can tell just by looking at the outlook field for Active whether I should keep waiting. I've come around to getting rid of Phase, but I want to comprehend the details of the replacement. |
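The trap described above can be sketched as follows. This is only an illustration under stated assumptions: the `Condition` struct, the channel of status snapshots standing in for a watch, and the `waitForActive` helper are all hypothetical. The point is that the wait has to end on Active or Terminated, not on Active alone.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string
	Status string // "True" or "False"
}

var (
	ErrTerminated  = errors.New("pod terminated before becoming active")
	ErrWatchClosed = errors.New("watch closed before pod became active or terminated")
)

// isTrue reports whether a condition of the given type is present and True.
func isTrue(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

// waitForActive consumes status snapshots until the pod is Active (nil error)
// or Terminated (ErrTerminated). Waiting on Active alone could block forever.
func waitForActive(updates <-chan []Condition) error {
	for conds := range updates {
		switch {
		case isTrue(conds, "Terminated"):
			return ErrTerminated
		case isTrue(conds, "Active"):
			return nil
		}
	}
	return ErrWatchClosed
}

func main() {
	updates := make(chan []Condition, 2)
	updates <- []Condition{{Type: "Active", Status: "False"}}
	updates <- []Condition{{Type: "Terminated", Status: "True"}}
	close(updates)
	fmt.Println(waitForActive(updates))
}
```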
Hello, I was following #7856 (comment) and wondered whether conditions are still meant to be removed at some point, or whether we can bet on them as a durable public interface to consume. Thanks, |
Conditions are not going to be removed. I think the best interface around this is a two-step process:
The latter step is not implemented consistently anywhere. |
@lavalamp what are the next steps? |
If Just simple stuff like trying to
|
@clux The plan is to use serverside apply. (hopefully beta in this upcoming release) @vllry regularizing our current API would be a monumental undertaking :/ so, I am not sure if there is a practical next step other than "make new things do it the right way, wait for the mythical v2 to fix existing things" :/ |
I am closing this issue:
|
Forked from #6951.
This is also discussed in #1899 (comment) and elsewhere.
Users and developers apparently think of phases as states in a state machine, regardless of how much we try to dissuade them:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/api-conventions.md#spec-and-status
This interpretation conflicts with the declarative, level-based, observation-driven design, as well as limiting evolution. In particular, the phase should be derivable by observing other state and shouldn't require durable persistence for correctness.
In my experience, the danger to system extensibility is significant, and results from assumptions baked into clients.
Phases aren't themselves properties or conditions of their objects, but clients inevitably infer properties from them.
For example, a client might try to determine whether a pod is active, or whether it has reached a permanent, terminal state, such as with a switch statement, as follows:
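The snippet itself is not reproduced here. As a stand-in, the following is a minimal sketch of the kind of client code being described, not the original example: the `PodPhase` type and its values are local definitions modeled on the phases named in this thread, and `classify` and `"SomeNewPhase"` are hypothetical.

```go
package main

import "fmt"

// PodPhase and its values are illustrative local definitions,
// not imported from Kubernetes.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// classify is a hypothetical client helper that infers two properties
// (active, terminal) from the phase enum.
func classify(phase PodPhase) (active, terminal bool) {
	switch phase {
	case PodRunning:
		return true, false
	case PodSucceeded, PodFailed:
		return false, true
	default:
		// Pending lands here, and so does any phase added after this
		// client was written, even one in which the pod is active.
		return false, false
	}
}

func main() {
	for _, p := range []PodPhase{PodPending, PodRunning, PodFailed, "SomeNewPhase"} {
		active, terminal := classify(p)
		fmt.Printf("%s: active=%t terminal=%t\n", p, active, terminal)
	}
}
```

Every client with a switch like this encodes its own guess about what each phase implies, and that guess silently goes stale when the enum grows.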
However, let's say someone wanted to add a new Phase where the pod would also be active (containers running/restarting, etc.), as in #6951. That would mean that every client/component that made decisions based on the phase would need to also then consider the new phase.
Enums aren't extensible. Every addition is a breaking, non-backward-compatible API change. We could create a new major API version for every such addition, but that would be very expensive and disruptive. At some point, it will just become too painful to add new phases, and then we'll be left with an unprincipled distinction between phases and non-phases. Creating all imaginable phases up front would not only create a lot of complexity prematurely, but would also almost certainly not be correct.
Conditions are more extensible since the addition of new conditions doesn't (shouldn't) invalidate decisions based on existing conditions, and are also better suited to reflect conditions/properties that may toggle back and forth and/or that may not be mutually exclusive.
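A small sketch of that extensibility argument, under stated assumptions: the `Condition` and `Pod` structs and the `isReady` and `readyNames` helpers are hypothetical, and `"SomeNewCondition"` stands for any condition type introduced after the client was written. A client that looks up only the condition it cares about (here Ready, which the thread notes is used to remove pods from Endpoints) gives the same answer whether or not new conditions appear.

```go
package main

import "fmt"

// Condition and Pod are simplified, hypothetical stand-ins for the API types.
type Condition struct {
	Type   string
	Status string // "True" or "False"
}

type Pod struct {
	Name       string
	Conditions []Condition
}

// isReady looks up one condition by type and ignores all others,
// so condition types added later cannot change its answer.
func isReady(p Pod) bool {
	for _, c := range p.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false // an absent condition is treated as not ready
}

// readyNames keeps only the pods that should stay in an endpoints list.
func readyNames(pods []Pod) []string {
	var names []string
	for _, p := range pods {
		if isReady(p) {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	pods := []Pod{
		{Name: "a", Conditions: []Condition{{Type: "Ready", Status: "True"}}},
		{Name: "b", Conditions: []Condition{{Type: "Ready", Status: "False"}}},
		// A condition type this client has never heard of does not matter.
		{Name: "c", Conditions: []Condition{{Type: "SomeNewCondition", Status: "True"}, {Type: "Ready", Status: "True"}}},
	}
	fmt.Println(readyNames(pods))
}
```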
Rather than distributing logic that combines multiple conditions or other properties for common inferences, such as whether the scheduler may schedule new pods to a node, we should expose those properties/conditions explicitly, for much the same reason that we explicitly return fields containing default values.
Proposal:
cc @smarterclayton @derekwaynecarr @thockin @timothysc @davidopp @markturansky