Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

controller history #594

Merged
merged 1 commit into from
May 24, 2017
Merged

controller history #594

merged 1 commit into from
May 24, 2017

Conversation

kow3ns
Copy link
Member

@kow3ns kow3ns commented May 2, 2017

Controller history proposal as per SIG Apps discussion.
This contribution is meant as a common sub-feature to be leveraged by the 1.7 StatefulSet update and DaemonSet update features.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 2, 2017
@kow3ns
Copy link
Member Author

kow3ns commented May 2, 2017

@kubernetes/sig-api-machinery-api-reviews
@kubernetes/sig-apps-api-reviews
@smarterclayton
@Kargakis
@janetkuo
@enisoc
@foxish
@erictune
@bgrant0607

@k8s-ci-robot
Copy link
Contributor

@kow3ns: These labels do not exist in this repository: sig/api-machinery, sig/apps.

In response to this:

@kubernetes/sig-api-machinery-api-reviews
@kubernetes/sig-apps-api-reviews
@smarterclayton
@Kargakis
@janetkuo
@enisoc
@foxish
@erictune
@bgrant0607

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

## Abstract
In Kubernetes, in order to update and rollback the configuration and binary
images of controller managed Pods, users mutate DaemonSet, StatefulSet,
and ReplicaSet (via Deployment) Objects, and the corresponding controllers
Copy link
Contributor

@0xmichalis 0xmichalis May 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't mutate ReplicaSets[*], we create new copies when the Deployment is mutated.

[*] apart from syncing annotations, labels, and spec.MinReadySeconds between the two objects

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is not clear. You mutate Deployments which causes replicaset's to be created, deleted, and updated. It is a more complicated workflow. I will update to reflect this.

of versions of an Object to reconstruct the Object's history.
1. History respects causality. The Object type used to store point in time
snapshots must be strictly ordered with respect to creation. CreationTimestamp
should not be used, as this is susceptible to clock skew.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is clock skew between multiple masters in this case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple API servers is my concern yes.

- This may not necessarily be all fields of Object.
- The target Object state is the subset of the target state necessary to create
a Objects of the generated type(s).
- To make the this concrete, for the StatefulSet controller:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

### Version Tracking
Controllers must track the version of the target Object state that corresponds
to their generated Objects. This information is necessary to determine which
versions are live, and to track which Objects need to be updated during a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "and to track which Objects need to be updated during a target state update or rollback"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

History objects are immutable, pods are created or deleted, parent objects are updated by users.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I have 10 Pods, and five of them are at a semantically equivalent version to my current target I only want to restart the 5 that are not. The other 5 may need to be relabeled.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think my questions is answered, you mean revision not apiversion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, clearer now.

will require testing each Object for membership in the `OwnerReferences`
of all AnyStates.

At the cost of the complexity of implementing both labeling and ownership,
Copy link
Contributor

@0xmichalis 0xmichalis May 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ownership is based on labeling. If an owner cannot match a child, the child is orphaned. If an owner matches an object that is not a child, the object is adopted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has special semantics and, afiak, you can't select based on it can you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current way our controllers work is:

  1. select all the generated objects in the namespace for a parent: set A
  2. from set A, select all objects that match the parent selector: set B
  3. from set B, filter out all objects with an owner ref other than the current parent: set C
  4. from set C, adopt all objects that have no owner ref
  5. from set A, release all objects whose owner ref matches the parent but their labels don't match the parent selector anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been putting some more thought into this. I did not consider adoption initially because I was coming from the viewpoint that controller history is not adoptable. I still have not convinced myself that it makes sense to adopt the history of another object, but I haven't convinced myself that this is an aweful idea either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming is one use case, handing off to a different workload controller type is another (eg. moving away from Deployments into a TPR).

1. The controller may deserialize the values of the AnyStates representing their
target Object state and perform a deep, semantic equality test. Here all
differences that do not constitute a mutation to the target Object state are
disregarded during the equivalence test.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here all differences that do not constitute a mutation to the target Object state are disregarded during the equivalence test.

How is this going to be done? Or are we purely based on defaulting filling out any new fields that may have been introduced since we serialized the template?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deserialize each object and write a long, Java style, object equivalence comparison. Every time a new field gets added, update the imperative code that performs the comparison. As indicated below, this might get us off the ground but should probably not be the long term solution. Strategic merge patch is the way to go in the long run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every time a new field gets added, update the imperative code that performs the comparison.

Oh, it's unlike that this will be maintainable. It's going to skew at some point. Maybe a reflective test can help us but I am not sure why we want this over a simple deep equality check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep equality will return false positives that can cause unintentional upgrades. My largest concern is that a new field with a default value causes the a rolling update for all StatefulSet's and DaemonSet's during/after a Master upgrade. As long as rolling update works well it shouldn't actually result in a loss of availability, but it could be the worst kind of surprising.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a new field with a default value causes the a rolling update for all StatefulSet's and DaemonSet's during/after a Master upgrade.

Why is that the case? Wouldn't the new default affect both the DaemonSet/StatefulSet and AnyState templates?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on how the state is stored in the CR. After talking with Daniel its probably not as likely to occur as I originally thought. Default values on PodTemplateSpec doesn't really mean anything anyway. If we need defaults we're better of to add them to the PodSpec. Defaulting a specification object really doesn't make much sense. We should add defaults to the object that models an actual thing (i.e. Pod).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't have a particular preference on the matter but default values today won't cause extra rollouts (because both the StatefulSet and the CR will get them). If we ever change that, then I agree that defaulting should happen only for Pods.

We propose the following command.

```bash
> kubectl history <controller-kind> <controller-name>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a kubectl rollout history command.

extension of the [above](#viewing-history ).

```bash
> kubectl rollback <controller-kind> <controller-name> --revision <any-state-name>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have a kubectl rollout undo command.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and kubectl rollout undo --to-revision

We can use our [hash function](#hashsing) and
[collision resolution](#collision-resolution) scheme to generate a system
wide unique identifier for an Object based on a deterministic non-unique prefix
and a serialized representation of the Object. Kuberentes Object's `.Names` must
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean .Name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it is the plural possessive of the field.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe reword for clarity? Something like:

Values in the .Name field of Kubernetes Objects must conform to ...

conform to a DNS subdomain. Therefore, the total length of the unique identifier
must not exceed 255, and in practice 253, characters. We can generate a unique
identifier that meets this constraint by selecting a hash function such that the
output length is equal to `253-len(prefix)` and applying our [hash](#hashing)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is handling names for history resources. In the case of Deployments, the name of the history object (ReplicaSet) is used in generating the name of the Pods. So I guess for Deployments, this needs to differ slightly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that Deployments should use the same concept where possible to deal with ReplicaSets, but, in this case, and in others, they will likely need special handling. Also, this is just to demonstrate that the approach is plausible, we will need to think about the naming implementation, and it will have to be based on the new hash function that will be used for deployment.

Objects along with its status.
- For ReplicaSet, DaemonSet, and StatefulSet, the current state of the
corresponding controllers can be derived from Pods they contain and the
ReplicasSetStatus, DaemonSetStatus, and StatefulSetStatus objects

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For history, we do care about the status fields ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only is so much as the current current and update revisions are stored in the Status Kind for the object.

// Once an AnyState has been successfully created, it can not be updated. The
// API Server will fail validation of all update requests that attempt to mutate
// the Data field. AnyStates may, however, be deleted.
type AnyState struct {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you intend to use it for purposes other than history ? AnyState is too generic a name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have other purpose right now, but the object is named after what it is. Its a generic, immutable snapshot of state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not in favor of this, as it's too broad. We already have ConfigMap for "any data". Creating configmap with another name seems wrong.

[equivalence](#version-equivalence) with all other versions in the Object's
revision history.
- If the current version is semantically equivalent to its immediate
predecessor no update to the Object's target state has been performed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you determine the immediate predecessor ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorting based on the Generation.


1. The controller will [create a snapshot](#version-snapshot-creation) of the
current target Object state.
1. The controller will [reconstruct the history](#history-reconstruction) of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this history reconstruction will not always happen , only when there exists no history right ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you will reconstruct the history during every execution of the main control loop. In practice controllers benefit from a shared cache that mediates interaction with API server.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns maybe its just wording, but reconstruction means fixing a corrupted history or creating one from scratch. For every loop of the controller, we would just add a new history if the spec changed, isnt it ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reconstructing means to reassemble the discreet snapshots into and ordered set as described in the referenced subsection

- If the current version is not equivalent to any prior version, this
indicates an update or a roll forward.
- Controllers should use their status objects for book keeping with respect
to current and prior revisions.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean revision numbers should be kept in status fields ? Currently revisions are annotations in deployments

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

names.
- e.g `PodTemplateSpecs` should be stored at the `"Template"` key.
1. The controller will store the specification type Objects `.Generation` at the
`"ControllerGeneration"` key.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you proposing we use the generation number of the objects as revision number and their monotonically increasing sequence as a way to determine the point in time in history . Just clarifying.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why we would go so generic here. If we have to define a schema on Data, we might as well specialize the object and define the schema on that instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the ConfigMap like Object is avoid assumptions about the state data that third parties and operators might use. We will use Template and VolumeClaimsTemplate. An Operator might use an arbitrary field. Even tracking Generation as a well defined type won't work at this point for TPRs. We might want to consider going with a Revision field though. Controllers, that have a working Generation can use this. Deployments, Operators, TPC, can use the incremental Revision technique that Deployment currently uses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the argument, but still uncomfortable with the idea of an api object that looks a lot like config map that is explicitly described as "generic key value store". There has to be a really strong reason to go that generic. Also, map vs array comes into play here - I'd rather an array with more structured data about each bit of state, especially if these correlate to actual fields.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With an array, how would we map back to the origin field?

The strong reason for this construction is that you can use the JSON encoded version of the Object to merge patch a delta to an arbitrary other Object. You can use it to save a delta that can be reapplied later, and it will work for an arbitrary Controller-like Object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array is:

Templates []History

type History struct {
  Purpose/Name string // i.e. "podTemplate"
  TypeMeta // the api and version of the historical snippet for use by the client
  Value runtime.RawExtension // the snippet
}

Purpose/Name would map back (same as the key in the map).

Another point - a pod template is serializable but a volume claim template is not (no type metadata). So there has to be additional versioning data attached if you want to store things that are not top level objects . You could alternatively require the controller to store a full object PersistentVolumeClaim instead of VolumeClaimTemplate and reconstitute the sub object from the full, or you could create a new VolumeClaimTemplate and attach object metadata (which is not used?). Ideally we would only serialize exactly what we need, but we would need to specify a group and version for the data we specify.

If serialized state can span two api groups/versions, we need to have distinct group versions. If the state must be intrinsic, the controller revision may store a single type/version (and arbitrary snippets that it can understand for that version). I kind of prefer the latter, which means you can serialize subsets of the object.

In proto serialization, do you want to store JSON?

To reconstruct the history of a specification type Object, a controller will do
the following.

1. Select all AnyState Objects labeled as described

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these namespaced , sorry i missed that part , if you already discussed that earlier

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

1. Sort the AnyStates by the value associated with the
`"ControllerGeneration"` key.
1. This produces a strictly ordered set of AnyStates that comprises the ordered
revision history of the specification type Object.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good to clarify, at what point/cases is the history construction necessary

equal to the configured `RevisionHistoryLimit`.

### Version Tracking
Controllers must track the version of the target Object state that corresponds

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By Version of the target, you mean revision of the target or APIVersion ?

record all Objects that are generated from target Object state version
represented by the AnyState as its owners.
- A revision is considered to be live while any generated Object that owns
it is live.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we are saying here is that amongst all AnyState objects for lets say StatefulSets, only the live one have ownership set the StatefulSet ? In terms of ownership , i think all AnyState which represent history should have their ownerreference pointing to the object whose history they represent. This would help in gc'ng history when the object is deleted as well

controllers that implement one method are not bound to use the same method
in the future.

### Target Object State Reconciliation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this more ? Is this a new step needed to maintain the history ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to re lable your product to match the specification that it is equivalent to, update the product in place, or recall and redistribute the product.

Object state.
1. Controllers can reconstruct their revision history.
1. Controllers can't update an AnyState's `.Data`.
1. Controllers can delete an AnyState to maintain their history with respect to

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can a user delete the history using kubectl ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we set the revisionhistorylimit as 0 and hence the controllers dont generate AnyState at all ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we set the revisionhistorylimit as 0 and hence the controllers dont generate AnyState at all ?

revisionhistorylimit=0 doesn't mean that the controllers are not generating history, eg. a DaemonSet in the middle of a rollout would need both the old and the new history object. revisionHistoryLimit should take effect only for history objects with no links (no running pods in the system generated by them).

@krmayankk
Copy link

@kow3ns few other questions: what chnages are needed to consolidate with deployment revisions that are currently stored in ReplicaSets. Will they coexist ? Can you manually delete history ? Will pods created manually (without using rs) also have history ?

@0xmichalis
Copy link
Contributor

@kow3ns few other questions: what chnages are needed to consolidate with deployment revisions that are currently stored in ReplicaSets. Will they coexist ?

I don't think we are going to make any changes in Deployments in the short term. I see no issue with coexistence since this will be initially implemented for DaemonSets and StatefulSets.

Can you manually delete history ?

You should be able but you shouldn't do it. You probably want to use revisionHistoryLimit.

Will pods created manually (without using rs) also have history ?

I am not sure we are going to add this for ReplicaSets since they don't support upgrades although I could imagine squashing ReplicaSets into Deployments in the future (after careful evaluation of the option). I don't think we are going to use history for bare pods.

@0xmichalis
Copy link
Contributor

@mfojtik @tnozicka @soltysh fyi

## Requirements

1. History is a collection of points in time, and each point in time must be
represented by its own Object. While it is tempting to aggregate all of an
Copy link
Contributor

@lukaszo lukaszo May 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Event if the points in time are the same we will have separate Objects for them? I'm talking about situation, when changing image: nginx -> httpd -> nginx

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in such a case the old object is reused.

serialized representation of the Object's data. The unique hash and integer can
be combined to produce a unique suffix for the Object's `.Name`.

1. We must also ensure that unique name does contain any bad words.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope you meant "doesn't"? :)

The API Server must support the creation and deletion of ControllerRevision
objects. As we have no mechanism for declarative immutability, the API server
must fail any update request that updates the `.Data` field of a
ControllerRevision Object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, you can mention that by making it immutable now, in the future we can relax immutability when that construct is introduced.

Object, a controller will do the following.

1. The controller will serialize all the Object's target object state and store
the serialized representation in the ControllerRevision's `.Data`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to call out specifically that we will trim / reduce / remove data that is not related to state (i.e. clear metadata fields, clear status, etc)? Would be good to explicitly call out what you want to persist here (I assume it's statefulset - metadata - status - other fields that don't clearly map) and maybe provide a concrete example in the section below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this general on purpose so that client code, StatefulSet controller, DaemonSet controller, operators, etc, is not constrained. There are examples of what the target Object state is above, (i.e it is called out that for ReplicaSet, DaemonSet, and StatefulSet this include the PodTemplateSpec, and that this includes the VolumeClaimsTemplate for StatefulSet). I'd prefer to defer the specifics of capturing the state to the StatefulSet update proposal and DaemonSet update proposal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@smarterclayton
Copy link
Contributor

Generally LGTM, some structural things and I'll defer to others on the details of hash / label / collisions.

// Data contains the serialized state.
Data runtime.RawExtension
// Revision indicates the revision of the state represented by Data.
Revision int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data and Revision should be inside a Spec block, probably.

Revision is going to be a confusing field given that metadata already has both ResourceVersion and Generation. :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm - I'm not sure they should be a spec block, unless they have status. This is closer to a secret, config map, or user.

@erictune
Copy link
Member

erictune commented May 15, 2017 via email

1. The controller will serialize all the Object's target object state and store
the serialized representation in the ControllerRevision's `.Data`.
1. The controller will store a unique, monotonically increasing
[revision number](#revision-number-selection) in the Revision field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this be the same as the controller's metadata.generation? If not, why not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generation is recorded by the api server, revision is a generation materialized by a controller.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is answered below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revision is a generation materialized by a controller.

But not the actual generation number. Revision essentially tracks observed pod template updates.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kargakis could you help understand. I earlier thought revision was populated from generation of the object. Any time an object changes spec, apiserver increments its generation, which sounds exactly like revision of a spec.

@smarterclayton
Copy link
Contributor

smarterclayton commented May 15, 2017 via email

@0xmichalis
Copy link
Contributor

Could we make the field Revision be "ObservedGeneration" or similar? Since generation is on ObjectMeta i don't feel terrible about reusing it.

Revision is updated on observed pod template updates. ObservedGeneration as we use it everywhere today is for tracking spec changes so by doing this we are conflating two different things.

@kow3ns
Copy link
Member Author

kow3ns commented May 15, 2017

@lavalamp I would think ControllerRevision's would have neither a .Spec nor a .Status. They represent a snapshot and not a specification that the controller is attempting to realize.

the current state of the system to the new declared target state.

To facilitate update and rollback for these controllers, and to provide a
primitive that third party controllers can build on, we propose a mechanism

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns are we proposing that this revision tracking can be used in TPR based operators to track TPR Versions ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually

// Data contains the serialized state.
Data runtime.RawExtension
// Revision indicates the revision of the state represented by Data.
Revision int64

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns where is the metadata to tell if its StatefulSets history or DaemonSet's ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can store the metadata inside the object. In general ControllerRefs and labeling should prevent overlap.


1. The controller will [create a snapshot](#version-snapshot-creation) of the
current target Object state.
1. The controller will [reconstruct the history](#history-reconstruction) of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns maybe its just wording, but reconstruction means fixing a corrupted history or creating one from scratch. For every loop of the controller, we would just add a new history if the spec changed, isnt it ?

1. The controller will serialize all the Object's target object state and store
the serialized representation in the ControllerRevision's `.Data`.
1. The controller will store a unique, monotonically increasing
[revision number](#revision-number-selection) in the Revision field.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kargakis could you help understand. I earlier thought revision was populated from generation of the object. Any time an object changes spec, apiserver increments its generation, which sounds exactly like revision of a spec.


1. Select all ControllerRevision Objects labeled as described
[above](#version-snapshot-creation).
1. Filter any ControllerRevisions that do not have a ControllerRef in their

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you probably mean filter out any ControllerRevisions ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter, as a verb, is used correctly here


At the cost of the complexity of implementing both labeling and ownership,
controllers may use a combination of both approaches to mitigate the
deficiencies of each.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns do we need both. why not just start with one and see if we need the other as well ? I feel like ownerReferences gives you garbage collection as well. So we can ignore the labelling of the owning objects until we find a good reason to do it ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to the owner reference to ensure that overlap does not occur. We have been bitten in the past by not ensuring the providence of generated objects.

The controller should deserialize the values of the ControllerRevisions
representing their target Object state and perform a deep, semantic equality
test. Here all differences that do not constitute a mutation to the target
Object state are

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns may be mention that we would do this deep semantic equality only when the hash is same

1. The controller will compute the [hash](#hashing) of the
ControllerRevision's `.Data`.
1. The controller will attach a label to the ControllerRevision so that it is
selectable with a low probability of overlap.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kow3ns still cant make out from the sentence, but are we storing the hash of the .Data inside the ControllerRevision or not. I think that will be useful

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. As per previous SIG discussion the hash is used to name the object.

necessary to determine if the current target Object state is equivalent to a
prior version.

Since we [track the version of](#version-tracking) of generated Objects, this

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lot of places , you use generated objects, do you mean the Revision Object or the StatefulSet/DaemonSet ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pods. The terminology section makes this explicit.

1. Controllers can reconstruct their revision history.
1. Controllers can't update a ControllerRevision's `.Data`.
1. Controllers can delete a ControllerRevision to maintain their history with
respect to the configured `RevisionHistoryLimit`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would we allow creating ControllerRevision in apiserver directly ? We should likely not allow that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the object is namespaced the default RBAC policies should prevent one user from clobbering the revisions created by another, unless they share the same namespace. If more fine grained control is needed RBAC can be adjusted to accommodate this.

- This may not necessarily be all fields of Object.
- The target Object state is the subset of the target state necessary to create
Objects of the generated type(s).
- To make the this concrete, for the StatefulSet controller:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@0xmichalis
Copy link
Contributor

Had some final comments, lgtm overall

`.Spec.Template` and `.Spec.VolumeClaims`.
- The target Object state is the subset of the target state necessary to create
Objects of the generated type(s).
- To make the this concrete, for the StatefulSet controller `.Spec.Template`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/To make the this concrete/To make this concrete/

Objects along with its status.
- For ReplicaSet, DaemonSet, and StatefulSet, the current state of the
corresponding controllers can be derived from Pods they contain and the
ReplicasSetStatus, DaemonSetStatus, and StatefulSetStatus objects

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the current state of a StatefulSet controller also include the PersistentVolumeClaims generated by the controller? Or can that always be derived from its pods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target Object state for StatefulSet contains the .Spec.VolumeClaimsTemplate as well. Though these are currently immutable, and the semantics of mutation are not well defined atm.

This section is presented as a generalization of how an arbitrary controller
can use ControllerRevision to persist a history of revisions to its
specification type Objects. The technique is applicable, without loss of
generality, to the existing Kuberentes controllers that have Pod as a generated

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: typo s/Kuberentes/Kubernetes/

We can use our [hash function](#hashsing) and
[collision resolution](#collision-resolution) scheme to generate a system
wide unique identifier for an Object based on a deterministic non-unique prefix
and a serialized representation of the Object. Kuberentes Object's `.Names` must

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/Kuberentes/Kubernetes/

We can use our [hash function](#hashsing) and
[collision resolution](#collision-resolution) scheme to generate a system
wide unique identifier for an Object based on a deterministic non-unique prefix
and a serialized representation of the Object. Kuberentes Object's `.Names` must

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe reword for clarity? Something like:

Values in the .Name field of Kubernetes Objects must conform to ...

- A revision is considered to be live while any generated Object labeled
with its `.Name` is live.
- This method has the benefit of providing visibility, via the label, to
users with respect to the historical providence of a generated Object.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think instead of providence you mean provenance? Used a couple of other places here as well.

and deciding if a generated Object corresponds to a particular revision
will require testing each Object for membership in the `OwnerReferences`
of all ControllerRevisions.
1. Note that, since we are labeling the generated Objects to indicate their

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this belong as a third bullet point? I guess it applies to both 1 (labeling) and 2 (owner references). The introductory paragraph for this section refers to two methods, so having this here as a third bullet point seems odd. Maybe move it to a note at the bottom of the section?

restart of every Pod in every StatefulSet and DaemonSet in the system when
additions are made to PodTemplateSpec during a master upgrade. It is therefore
necessary to determine if the current target Object state is equivalent to a
prior version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, you care whether the target Object state is equivalent to the current version, not any prior version, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current version, in this context, is the state extracted from the specification object. All other state is prior.

necessary to determine if the current target Object state is equivalent to a
prior version.

Since we [track the version of](#version-tracking) of generated Objects, this

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed it, but I didn't see a mention of the behavior when the the controller revision is missing for a generated object -- e.g., after an upgrade from a k8s version that didn't support history.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is deferred to the individual proposals for history pertaining to each controller. This behavior is likely not going to be uniform across all controller implementations.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I'd suggest including a note to that effect somewhere within this proposal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a section

- If the current version is equivalent to a version prior to its immediate
predecessor, this indicates a rollback.
- If the current version is not equivalent to any prior version, this
indicates an update or a roll forward.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this (determination of a rollback or an update) just informational, or is the controller expected to treat the two cases differently?

Also, detecting rollback is limited to the controller's RevisionHistoryLimit, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StatefulSet cares about rollback vs rollforward. Deployment and DaemonSet probably do not.

@smarterclayton
Copy link
Contributor

This is really close to LGTM. Can reviewers drive any additional questions and resolution in by EOD so we can merge?

continue to (re)create generated Objects based on the current target Object
state.
1. The history should be initialized on the first mutation to the specification
type Object for which the history will be generated.
1. After the history has been initialized, any generated Objects that have no
indication of the revision from which the were generated may be treated as if
indication of the revision from which they were generated may be treated as if
the have a nil revision. That is, without respect to the method of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "as if they have a nil revision."

@mikegrass
Copy link

Thanks for the updates and responses! Lgtm

@smarterclayton
Copy link
Contributor

LGTM as well. Please squash and we'll merge.

@mfojtik
Copy link
Contributor

mfojtik commented May 24, 2017

LGTM

@kow3ns kow3ns merged commit d8a25a8 into kubernetes:master May 24, 2017
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Jun 3, 2017
Automatic merge from submit-queue

Implement Daemonset history

~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged)

Ref kubernetes/community#527 and kubernetes/community#594

@kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis 

---

TODOs:
- [x] API changes
  - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback 
- [x] deployment controller 
  - [x] controller revision
    - [x] owner ref (claim & adoption)
    - [x] history reconstruct (put revision number, hash collision avoidance)
    - [x] de-dup history and relabel pods
    - [x] compare ds template with history 
  - [x] hash labels (put it in controller revision, pods, and maybe deployment)
  - [x] clean up old history 
  - [x] Rename status.uniquifier when we reach consensus in #44774 
- [x] e2e tests 
- [x] unit tests 
  - [x] daemoncontroller_test.go 
  - [x] update_test.go 
  - [x] ~(maybe) storage_test.go // if we do server side rollback~

kubectl part is in #46144

--- 

**Release note**:

```release-note
```
perotinus pushed a commit to kubernetes-retired/cluster-registry that referenced this pull request Sep 2, 2017
Automatic merge from submit-queue

Implement Daemonset history

~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged)

Ref kubernetes/community#527 and kubernetes/community#594

@kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis 

---

TODOs:
- [x] API changes
  - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback 
- [x] deployment controller 
  - [x] controller revision
    - [x] owner ref (claim & adoption)
    - [x] history reconstruct (put revision number, hash collision avoidance)
    - [x] de-dup history and relabel pods
    - [x] compare ds template with history 
  - [x] hash labels (put it in controller revision, pods, and maybe deployment)
  - [x] clean up old history 
  - [x] Rename status.uniquifier when we reach consensus in #44774 
- [x] e2e tests 
- [x] unit tests 
  - [x] daemoncontroller_test.go 
  - [x] update_test.go 
  - [x] ~(maybe) storage_test.go // if we do server side rollback~

kubectl part is in #46144

--- 

**Release note**:

```release-note
```
danehans pushed a commit to danehans/community that referenced this pull request Jul 18, 2023
This reverts commit d42823d05e8a82c4cac50060c95048535702ac79.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet