Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper handling of cluster-scoped resources in operators #1306

Merged
merged 6 commits into from
Jan 28, 2020

Conversation

kensipe
Copy link
Member

@kensipe kensipe commented Jan 24, 2020

What this PR does / why we need it:

Summary:

  • proper applying and re-applying of cluster-scoped resources in the task_apply.go
  • ownership is NOT set for cluster-scoped resources to the Instances

Fixes #1288
Fixes #1265

Signed-off-by: Ken Sipe kensipe@gmail.com

Signed-off-by: Ken Sipe <kensipe@gmail.com>
pkg/engine/task/task_apply.go Show resolved Hide resolved
pkg/engine/task/task_apply.go Show resolved Hide resolved
pkg/engine/task/task_apply.go Show resolved Hide resolved

// just a small sanity check
assert.True(t, got[schema.GroupVersionKind{Group: "apps", Version: "v1", Kind: "ReplicaSet"}])
assert.False(t, got[schema.GroupVersionKind{Group: "apiextensions.k8s.io", Version: "v1", Kind: "CustomResourceDefinition"}])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we test if a custom resource for both a Namespaced CRD and a Cluster scoped CRD also work here, just so we have it in the event of any regressions (and we're talking, in Kubernetes terms, about two different API servers here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean a CR (not CRD), right? Because CRDs are always cluster-scoped. CRs might be not.

@kensipe
Copy link
Member Author

kensipe commented Jan 27, 2020

tests fail... this test was added as a unit test but it looks like an integration test

--- FAIL: Test_namespacedGVKResources (0.00s)
    task_apply_test.go:196: 
        	Error Trace:	task_apply_test.go:196
        	Error:      	Received unexpected error:
        	            	failed to fetch cluster REST config: could not locate a kubeconfig
        	Test:       	Test_namespacedGVKResources
    task_apply_test.go:199: 
        	Error Trace:	task_apply_test.go:199
        	Error:      	Should be true
        	Test:       	Test_namespacedGVKResources

@kensipe
Copy link
Member Author

kensipe commented Jan 27, 2020

I'm also concerned... what happens if the resource isn't in the map?

@gerred
Copy link
Member

gerred commented Jan 27, 2020

I'm also concerned... what happens if the resource isn't in the map?

an explicit failure with an Event would be good here - this would be the case if a CRD doesn't exist.

@zen-dog
Copy link
Contributor

zen-dog commented Jan 27, 2020

I'm also concerned... what happens if the resource isn't in the map?

Good point. currently, it will be false but... an explicit error might be better? 🤔

@zen-dog
Copy link
Contributor

zen-dog commented Jan 27, 2020

this would be the case if a CRD doesn't exist.

what do you mean? We don't check for specific CRDs, just if GVK GroupVersionKind{Group: "apiextensions.k8s.io", Version: "v1", Kind: "CustomResourceDefinition"} as such is available.

@gerred
Copy link
Member

gerred commented Jan 27, 2020

@zen-dog If the Prometheus Operator ServiceMonitor CRD isn't created but a ServiceMonitor CR is in a plan, this would fail, wouldn't it?

@gerred
Copy link
Member

gerred commented Jan 27, 2020

(or some cluster-scoped form of ServiceMonitor, that is.. so maybe something from Istio?)

@zen-dog
Copy link
Contributor

zen-dog commented Jan 27, 2020

If the Prometheus Operator ServiceMonitor CRD isn't created but a ServiceMonitor CR is in a plan, this would fail, wouldn't it?

this would fail either way (independent of namespaced vs cluster-scoped) because KUDO would fail to apply an unknown resource Kind, right?

@gerred
Copy link
Member

gerred commented Jan 27, 2020

Oh yeah, because nothing happens right now when no match is discovered in the GVK map. ObjectKeyFromObject happens before the actual apply though - so I suppose to bounce off of @kensipe, do we want to catch it in the path for ObjectKeyFromObject before we actually try to apply?

@zen-dog
Copy link
Contributor

zen-dog commented Jan 27, 2020

do we want to catch it in the path for ObjectKeyFromObject before we actually try to apply?

should be a one-liner. I can add it. Though, there are multiple other places where we "let it run and fail" because figuring it out upfront is too much work.

@zen-dog zen-dog changed the title Cluster Resources in Plans Work Proper handling of cluster-scoped resources in operators Jan 27, 2020
Copy link
Contributor

@zen-dog zen-dog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved the "simple fix" presented in the first commit. / cc @kensipe We'll work on a proper one in a followup PR

func namespacedGVKResources() (map[schema.GroupVersionKind]bool, error) {
namespaced := map[schema.GroupVersionKind]bool{}

restCfg, err := config.GetConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should probably inject the config rather than load it here. You would have to find a good place from which to pass it in though, the most upper level for this is manager.GetConfig in main.go and then carry it through controller

…t out

Signed-off-by: Ken Sipe <kensipe@gmail.com>
@kensipe kensipe merged commit 7343faa into master Jan 28, 2020
@kensipe kensipe deleted the ken/cluster-resource-apply branch January 28, 2020 19:01
ANeumann82 pushed a commit that referenced this pull request Feb 13, 2020
Co-authored-by: Aleksey Dukhovniy <adukhovniy@mesosphere.io>
Signed-off-by: Ken Sipe <kensipe@gmail.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
runyontr pushed a commit that referenced this pull request Mar 11, 2020
Co-authored-by: Aleksey Dukhovniy <adukhovniy@mesosphere.io>
Signed-off-by: Ken Sipe <kensipe@gmail.com>
Signed-off-by: Thomas Runyon <runyontr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cluster Scoped Resource Fails in Plans [Spark Operator] Deploy plan hangs due to a transient error
4 participants