stacks: multiple namespace support #1311

Merged: 4 commits into crossplane:master on Mar 11, 2020

Conversation

@displague (Member) commented Mar 1, 2020

Description of your changes

Fixes #1051

When the Stack controller reconciles a Stack, it looks for the expected CRDs. If any of those CRDs are not yet present, the reconcile is requeued.
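
A minimal sketch of that requeue behavior, assuming controller-runtime; the resultFor helper and the 30-second interval are illustrative, not this PR's actual code:

import (
	"time"

	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// resultFor requeues the Stack when expected CRDs are still missing, rather
// than treating their absence as a hard error.
func resultFor(missingCRDs []string) (reconcile.Result, error) {
	if len(missingCRDs) > 0 {
		// The CRDs may still be unpacking; try again shortly.
		return reconcile.Result{RequeueAfter: 30 * time.Second}, nil
	}
	return reconcile.Result{}, nil
}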

Any matching CRDs that carry the app.kubernetes.io/managed-by: stack-manager label (which the StackInstall controller's unpack step provides) will receive additional labels:

  • namespace.crossplane.io/{ns}: "true", an existing label now deferred from unpack to Stack reconciliation
  • parent.stacks.crossplane.io/{ns}-{name}: "true", a new label supporting multiple parents of a CRD

When a Stack is installed into a namespace, the existing CRDs gain additional instances of these labels, one for each namespace and for each Stack within that namespace.

When a Stack is deleted from a namespace, the Stack controller removes the corresponding parent.stacks.crossplane.io/{ns}-{name} labels from the CRDs. The namespace.crossplane.io/{ns} label is also removed if no parent.stacks.crossplane.io/{ns}* labels remain.
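
A sketch of this label bookkeeping as plain map manipulation; the parentLabel, namespaceLabel, and removeParent helpers are illustrative names, not the PR's exact code:

import (
	"fmt"
	"strings"
)

func parentLabel(ns, name string) string {
	return fmt.Sprintf("parent.stacks.crossplane.io/%s-%s", ns, name)
}

func namespaceLabel(ns string) string {
	return fmt.Sprintf("namespace.crossplane.io/%s", ns)
}

// removeParent deletes a Stack's parent label from a CRD's label map and,
// if no parent label for that namespace remains, removes the namespace
// discoverability label as well.
func removeParent(labels map[string]string, ns, name string) {
	delete(labels, parentLabel(ns, name))
	prefix := fmt.Sprintf("parent.stacks.crossplane.io/%s-", ns)
	for k := range labels {
		if strings.HasPrefix(k, prefix) {
			return // another Stack in this namespace still parents the CRD
		}
	}
	delete(labels, namespaceLabel(ns))
}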

When a Stack is deleted and a CRD with the app.kubernetes.io/managed-by: stack-manager label carries neither of these two label types, the CRD will be deleted by the StackInstall controller.

Honoring the existing namespace.crossplane.io/{ns}: "true" discoverability labels prevents this PR from deleting existing Stack CRDs.

We may also want to revisit the need for single-parent labels, which are currently used for detecting and deleting owned resources that cross the namespace/cluster scope boundary.

(Updated comment) Single parent labeling is modified in this PR. CRDs no longer receive these labels on unpack. When a StackInstall is removed, any CRD parent labels referring to that StackInstall are also removed from any matching CRDs.

If two StackInstalls sharing a common CRD are installed and deleted within a reconciliation window, a race could cause the newly reused CRD to be deleted before the new Stack can label it.

How has this code been tested?

Upgrading the Stack Manager

  1. Install the Wordpress and GCP Stacks before this version of the Stack Manager is applied.
  2. Update the Stack Manager to this version.
    • observe existing CRDs are not deleted

Multiple Stacks

  1. Install this version of the Stack Manager

  2. Install Stack GCP into crossplane-system

  3. Install Stack Wordpress into two namespaces

    • observe CRDs are given the labels mentioned in the description above
    $ kubectl get crd -o jsonpath='{.metadata.labels}' wordpressinstances.wordpress.samples.stacks.crossplane.io ; echo 
    map[app.kubernetes.io/managed-by:stack-manager crossplane.io/scope:namespace namespace.crossplane.io/default:true namespace.crossplane.io/precipice:true parent.stacks.crossplane.io/default-stack-wordpress:true parent.stacks.crossplane.io/precipice-stack-wordpress:true]
    
  4. Delete one of these StackInstalls

    • observe CRDs lose the appropriate labels
    $ kubectl delete stackinstall -n default stack-wordpress
    $ kubectl get crd -o jsonpath='{.metadata.labels}' wordpressinstances.wordpress.samples.stacks.crossplane.io ; echo 
    map[app.kubernetes.io/managed-by:stack-manager crossplane.io/scope:namespace namespace.crossplane.io/precipice:true parent.stacks.crossplane.io/precipice-stack-wordpress:true]
    
  5. Delete the other StackInstall

    • observe that the CRD has been deleted (see the example check below)
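    A hypothetical final check (the NotFound message below is kubectl's standard shape, not output captured from this run):
    $ kubectl get crd wordpressinstances.wordpress.samples.stacks.crossplane.io
    Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io "wordpressinstances.wordpress.samples.stacks.crossplane.io" not found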

Checklist

I have:

  • Run make reviewable to ensure this PR is ready for review.
  • Ensured this PR contains a neat, self-documenting set of commits.
  • Updated any relevant documentation and examples.
  • Reported all new error conditions into the log or as an event, as
    appropriate.

For more about what we believe makes a pull request complete, see our
definition of done.

@displague force-pushed the stack-to-stack branch 3 times, most recently from 673b74f to c0e95c7, on March 5, 2020 15:54
@displague marked this pull request as ready for review on March 5, 2020 16:22
@displague (Member, Author) commented:

I'm opening this up for review. I'll be adding some more tests and docs but the code is in the expected final state (pending review feedback).

@displague force-pushed the stack-to-stack branch 6 times, most recently from 3db90c7 to 6bc3a78, on March 7, 2020 03:07
@displague (Member, Author) commented Mar 8, 2020

Griping about "GoLangCI: Processing Timeout." here: golangci/golangci-worker#56 (comment)

Rebasing on latest master for a force push to trigger the rebuild.

@jbw976 (Member) left a comment


looks pretty good overall, just some minor comments to discuss!

pkg/controller/stacks/install/installjob.go (outdated review thread, resolved)
pkg/controller/stacks/install/stackinstall.go (outdated review thread, resolved)

func assertNoKubernetesObject(t *testing.T, g *GomegaWithT, got runtime.Object, unwanted metav1.Object, kube client.Client) {
	n := types.NamespacedName{Name: unwanted.GetName(), Namespace: unwanted.GetNamespace()}
	g.Expect(kube.Get(ctx, n, got)).To(HaveOccurred())
}
Member:

should this check specifically for a "not found error", instead of "any error" that it appears to be checking for?

@displague (Member, Author) replied on Mar 9, 2020:

I took a stab at this, but within this function I can't replicate the parameters needed to match kerrors.NewNotFound() (specifically, I can't get the schema.GroupResource{} Resource -- "customresourcedefinitions" in this case).

After looking into how to accomplish this with Gomega, I started wondering why we are introducing Gomega at all.

This assertNoKubernetesObject mirrors assertKubernetesObject. Both are already present between stackinstall_test and stack_test (I don't recall which received assertKubernetesObject first).

In a future PR I think we can replace this with a function that doesn't use Gomega. The semantics and detailed reporting don't provide much advantage over what we already get with cmp.Diff. Without the Gomega semantics I would just check whether the error is a 404 (which I could still do with Gomega, but I'd rather avoid building this function up).

I can revisit this within this PR when the other problems are addressed.
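
For illustration, a Gomega-free version could look like this minimal sketch (assuming apimachinery's kerrors package; kerrors.IsNotFound matches on the error's status reason, so the schema.GroupResource problem above doesn't arise):

import (
	"context"
	"testing"

	kerrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func assertNoKubernetesObject(ctx context.Context, t *testing.T, got runtime.Object, unwanted metav1.Object, kube client.Client) {
	n := types.NamespacedName{Name: unwanted.GetName(), Namespace: unwanted.GetNamespace()}
	// Only a NotFound error proves the object is gone; any other error
	// (or no error at all) is a test failure.
	if err := kube.Get(ctx, n, got); !kerrors.IsNotFound(err) {
		t.Errorf("expected %v to be absent; kube.Get returned: %v", n, err)
	}
}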

Member:

I am always in support of removing Gomega usage.

pkg/controller/stacks/stack/stack.go (three outdated review threads, resolved)

return func(ctx context.Context, crds []apiextensions.CustomResourceDefinition) error {
	for persona := range roleVerbs {
Member:

This func looks like a lot of new logic from the diff, but I think that's probably not the case and it's mostly moved/refactored logic. This section of permissions management is the one I'd be most concerned about for regression risk, so hopefully we have done the testing validation to ensure this still works as expected.

@displague (Member, Author) replied on Mar 9, 2020:

I agree that the refactor here is hard to follow by diff.

Before:

  • create calls processRBAC
  • processRBAC calls createPersonaClusterRoles
  • createPersonaClusterRoles fetches all CRDs using crdsFromStack before doing role-specific work
  • crdsFromStack calls crdListsDiffer, which is a naive len(fetched) == len(stack.spec.crds) check (caused #1290)

After:

  • create calls processRBAC (no change)
  • create calls processCRDs (new)
  • processRBAC no longer calls createPersonaClusterRoles
  • processCRDs fetches all CRDs using crdsFromStack (improved for multi-parented and unmanaged CRDs; fixes #1290)
  • crdsFromStack now verifies that the expected version of each CRD exists in the fetched CRD
  • crdsFromStack is now usable for deletion, where some CRDs could already be missing

processCRDs walks a list of handlers which include:

  • createPersonaClusterRoles (inner crd-loop is unmodified, outer crd fetch + loop + len-check is not needed)
  • multi-parent labeller (new)
  • namespace discovery labeller (moved here away from unpack, because the managed crd may already exist)

processCRDs uses crdListFulfilled (replacing crdListsDiffer) to verify that all expected CRDs are present (managed or not) (hmm, this could be just another "handler" in the list)
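
For illustration, the handler-walking shape described above could look like this minimal sketch; the crdHandler type and the processCRDs signature are assumptions drawn from this comment (crdsFromStack and the stackHandler receiver are the surrounding PR's names), not the exact code:

import (
	"context"

	apiextensions "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
)

// crdHandler is one step applied to the Stack's fetched CRDs, e.g.
// createPersonaClusterRoles, the multi-parent labeller, or the namespace
// discovery labeller.
type crdHandler func(ctx context.Context, crds []apiextensions.CustomResourceDefinition) error

// processCRDs fetches the Stack's expected CRDs once, then passes them
// through each handler in order, stopping on the first error.
func (h *stackHandler) processCRDs(ctx context.Context, handlers ...crdHandler) error {
	crds, err := h.crdsFromStack(ctx)
	if err != nil {
		return err
	}
	for _, handle := range handlers {
		if err := handle(ctx, crds); err != nil {
			return err
		}
	}
	return nil
}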

pkg/controller/stacks/stack/stack_test.go (outdated review thread, resolved)
pkg/controller/stacks/stack/stack_test.go (review thread, resolved)
@jbw976 merged commit 063505e into crossplane:master on Mar 11, 2020
@displague added the sm label on Mar 12, 2020
@displague deleted the stack-to-stack branch on March 13, 2020
@prasek added this to the v0.9 milestone on Mar 25, 2020
Successfully merging this pull request may close these issues.

Support multiple instances of the same StackInstall / Stack in different namespaces