Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding cascading deletion support to federated namespaces #34648

Merged
merged 1 commit into from
Oct 27, 2016

Conversation

nikhiljindal
Copy link
Contributor

@nikhiljindal nikhiljindal commented Oct 12, 2016

Ref #33612

With this change, whenever a federated namespace is deleted with DeleteOptions.OrphanDependents = false, then federation namespace controller first deletes the corresponding namespaces from all underlying clusters before deleting the federated namespace.

cc @kubernetes/sig-cluster-federation @caesarxuchao

Adding support for DeleteOptions.OrphanDependents for federated namespaces. Setting it to false while deleting a federated namespace also deletes the corresponding namespace from all registered clusters.

This change is Reviewable

@nikhiljindal
Copy link
Contributor Author

I am working on updating our e2e test

@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Oct 12, 2016
@nikhiljindal
Copy link
Contributor Author

Updated the PR to include e2e changes. Verified that test/e2e/federated-namespace passes

@nikhiljindal nikhiljindal added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-label-needed labels Oct 13, 2016
Copy link
Member

@caesarxuchao caesarxuchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not familiar with the federation code base. I asked two question, they might be silly. I'll take another pass tomorrow.

// removeFinalizerFunc
func(obj runtime.Object, finalizer string) (runtime.Object, error) {
namespace := obj.(*api_v1.Namespace)
finalizerSet := sets.NewString()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why you need to make the Set. Why not just make []api_v1.FinalizerName directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thx

@@ -371,6 +416,13 @@ func (nc *NamespaceController) delete(namespace *api_v1.Namespace) error {
return fmt.Errorf("failed to delete events list from namespace: %v", err)
}

// Delete the namespace from all underlying clusters.
updatedNamespaceObj, err := nc.deletionHelper.ProcessDeletion(updatedNamespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought ProcessDeletion should take a fed_v1.Namespace, why is it taking a regular namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no special fed_v1.Namespace. We use the same v1.Namespace object in federation

@nikhiljindal
Copy link
Contributor Author

Updated the code as per comments. Ready for another look

return err
}

// Remove kube_api.FinalizerKubernetes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The federated namespace is deleted before the underlying cluster namespaces are deleted from etcd, right? This is consistent with the mode of the garbage collector, i.e., asynchronous, but different from deletion of a cluster namespace, which is synchronous. I'm not sure if this is what you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. federated namespace will not be deleted unless the namespaces have been deleted in all underlying clusters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it blocked? It seems the UpdateOnError is not blocking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it was blocked until the Delete request succeeded for all clusters but it wasnt waiting for all resources to go away. Updated the code to wait.

@nikhiljindal
Copy link
Contributor Author

Updated the controller to add a DeleteFederatedDependents finalizer before creating any dependent resources in sub clusters. This now ensures that the federation resource is not deleted unless all dependents have been removed.
PTAL.

@nikhiljindal
Copy link
Contributor Author

Updated the code as discussed. I am now creating the finalizer in ObjectMeta.Finalizers instead of putting it in NamespaceSpec.Finalizers.
Verified that federated-namespace e2e test passes :)

@nikhiljindal nikhiljindal added this to the v1.5 milestone Oct 19, 2016
@k8s-github-robot k8s-github-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Oct 20, 2016
@nikhiljindal
Copy link
Contributor Author

Retrying unit test
@k8s-bot unit test this

1 similar comment
@nikhiljindal
Copy link
Contributor Author

Retrying unit test
@k8s-bot unit test this

@@ -255,6 +350,21 @@ func (nc *NamespaceController) reconcileNamespace(namespace string) {
return
}

glog.V(3).Infof("Ensuring delete dependents finalizer for namespace: %s", baseNamespace.Name)
// Add the DeleteFederatedDependents finalizer before creating a namespace in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just ideas: DeleteUnderlyingClusterObjects, DeleteDependentsInUnderlyingClusters. Not sure if DeleteFederatedDependents is intuitive. Your call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to DeleteFromUnderlyingClusters

}

updatedNamespace = updatedNamespaceObj.(*api_v1.Namespace)
err = nc.federatedApiClient.Core().Namespaces().Delete(updatedNamespace.Name, &api_v1.DeleteOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to pass an empty deleteOptions? Can it be nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes nil should work as well. Will try :)

// Right now there is just 5 types of objects: ReplicaSet, Secret, Ingress, Events and Service.
if nc.hasFinalizerFuncInSpec(updatedNamespace, api_v1.FinalizerKubernetes) {
// Delete resources in this namespace.
updatedNamespace, err = nc.removeKubernetesFinalizer(updatedNamespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is deleteNamespaceResourcesAndRemoveNamespaceSpecFinalizer :) Maybe split it to two functions? "KubernetesFinalizer" is not clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can rename it to deleteNamespaceResourcesAndRemoveNamespaceSpecFinalizer but thats a bit long ;)
I think the current name is fine (its a private func).
I dont want to change that code much.
There is a TODO to make it consistent with the kubernetes code. We should be able to reuse the same code in kubernetes and federation to delete namespace resources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I still think "NamespaceSpecFinalizer" is more clear than "KubernetesFinalizer". Your call, as it's a private func.

There is a TODO to make it consistent with the kubernetes code. We should be able to reuse the same code in kubernetes and federation to delete namespace resources.

Not sure I follow this part. What kubernetes code are you referring to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I named it removeKubernetesFinalizer because it removes the finalizer called FinalizerKubernetes.

Not sure I follow this part. What kubernetes code are you referring to?

kubernetes namespace controller also has the same code that deletes all resources in a namespace before removing the finalizer - same as what this function does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I named it removeKubernetesFinalizer because it removes the finalizer called FinalizerKubernetes.

I think FinalizerKubernetes is a bad name :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes now it is. It is clear now that we should have been more specific. I guess the thinking at that time was that this is the finalizer that kubernetes is adding to do its work. Now we have so many so its better to be specific.

@@ -108,11 +114,19 @@ func TestNamespaceController(t *testing.T) {

// Test add federated namespace.
namespaceWatch.Add(&ns1)
// Verify that the DeleteFederatedDependents finalizer is added to the namespace.
// Note: finalize invokes the create action in Fake client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does sound like a bug? Could you open an issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing in #35478

}
if len(operations) > 0 {
// We have deleted a bunch of resources.
// Wait for the store to observe all the deletions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the "wait" is implemented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait is implemented by not returning success here.
Since len(operations) > 0 we return here without removing the finalizer. Next time this function is called, we will again first compute if we want to do any operations in underlying clusters. If at any time, len(operation) = 0, then we proceed. Else we wait till len(operations) == 0.
Caller is expected to keep calling HandleObjectInUnderlyingClusters until it returns success (err == nil).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I didn't notice you called deliverNamespace to process the namespace again in the future. So it looks good.

@@ -36,12 +36,13 @@ const (
)

// Create/delete ingress api objects
var _ = framework.KubeDescribe("Federation namespace [Feature:Federation]", func() {
var _ = framework.KubeDescribe("Federation namespace [Feature:Federation12]", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is Federation12?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix before committing.

@@ -2210,6 +2210,7 @@ func DumpEventsInNamespace(eventsLister EventsLister, namespace string) {
events, err := eventsLister(v1.ListOptions{}, namespace)
Expect(err).NotTo(HaveOccurred())

By(fmt.Sprintf("Found %d events.", len(events.Items)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to check in this code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes helps in debugging.

return obj, nil
}
namespace.ObjectMeta.Finalizers = newFinalizers
namespace, err := nc.federatedApiClient.Core().Namespaces().Finalize(namespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call Update? Calling Finalize looks like it's delete the namespace.spec.finalizer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?
My expectation is that if only the list of finalizers are being modified, then we should use Finalize(). Else use update().
Is that not correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Finalize is a special method that only exists in the namespace client. So I think we should call Finalize() only when we update the namespace.spec.finalizer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ok. Hadnt realized the spec.Finalizer and ObjectMeta.Finalizer difference. You are right. Will update

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}
// Remove kube_api.FinalizerKubernetes
if len(namespace.Spec.Finalizers) != 0 {
return nc.removeFinalizerFromSpec(namespace, api_v1.FinalizerKubernetes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should check if all resources are actually deleted to make the deletion synchronous. Cluster namespace deletion will wait for all resources are deleted. Perhaps this will be easier to implement after you apply the deleteHelper pattern to other resources?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should check if all resources are actually deleted to make the deletion synchronous.

Yes this is not required right now since all deletions are immediate (there is no cascading deletion). Will need to update this before enabling cascading deletion for any of these resources.
The plan is to refactor kubernetes namespace controller code to make the namespace deletion code reusable and then use it here.
Leaving as is in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Thanks for the explanation.

)

const (
FinalizerDeleteFederatedDependents string = "federation.kubernetes.io/delete-federated-dependents"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment? Also mention the OrphanFinalizer will take precedence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thx

@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 25, 2016
@nikhiljindal
Copy link
Contributor Author

Verified that federation namespace e2e test is passing

@nikhiljindal nikhiljindal force-pushed the NSCasDel branch 2 times, most recently from c286004 to b95fac1 Compare October 25, 2016 22:10
@nikhiljindal
Copy link
Contributor Author

Replaced Finalize() by Update() as per the comment above and squashed commits.
PTAL.

@caesarxuchao
Copy link
Member

LGTM. Thanks @nikhiljindal !

@caesarxuchao caesarxuchao added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins verification failed for commit b95fac161531a5df07b034d52047b2351b0bf5f9. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2016
@nikhiljindal
Copy link
Contributor Author

Trying to satisfy bazel and fix unit tests

@nikhiljindal nikhiljindal added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins GCE e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins Kubemark GCE e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GKE smoke e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot cvm gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GKE smoke e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCI GCE e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE etcd3 e2e failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@nikhiljindal nikhiljindal removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 26, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins unit/integration failed for commit ebbbd02abc6ef6b4da535e39fc1493444771cb7d. Full PR test history.

The magic incantation to run this job again is @k8s-bot unit test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@nikhiljindal
Copy link
Contributor Author

All tests passed. Adding back LGTM

@nikhiljindal nikhiljindal added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 27, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit dcdbf27 into kubernetes:master Oct 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants