
Make it easier to understand when the target type is not present in all federated clusters #314

Closed
pmorie opened this issue Oct 8, 2018 · 13 comments

pmorie commented Oct 8, 2018

Currently there is no easy way to determine when a target type is not present in one or more federated clusters. Example controller logs:

I1008 13:33:49.657514       1 controller.go:212] Running reconcile FederatedTypeConfig for "federation-system/bars.example.io"
I1008 13:33:49.668942       1 controller.go:125] Starting sync controller for "FederatedBar"
I1008 13:33:49.668992       1 controller.go:290] Started sync controller for "FederatedBar"
I1008 13:33:49.669008       1 controller.go:212] Running reconcile FederatedTypeConfig for "federation-system/bars.example.io"
I1008 13:33:49.689848       1 federated_informer.go:211] Cluster federation-system/cluster1 is ready
I1008 13:33:49.697323       1 federated_informer.go:211] Cluster federation-system/cluster2 is ready
E1008 13:33:49.700713       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)
E1008 13:33:50.703301       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)
E1008 13:33:51.710217       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)
E1008 13:33:52.712783       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)
E1008 13:33:53.715131       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)
E1008 13:33:54.775163       1 reflector.go:205] github.com/kubernetes-sigs/federation-v2/pkg/controller/util/federated_informer.go:430: Failed to list <nil>: the server could not find the requested resource (get bars.example.io)

Should this be an element of status in FederatedTypeConfig, perhaps?

gyliu513 commented Oct 9, 2018

I think this should be a sub-requirement of #246?

/cc @shashidharatd

pmorie commented Oct 9, 2018 via email

gyliu513 commented Oct 9, 2018

@pmorie do you mean that the status of FederatedTypeConfig should indicate whether the CRD resources exist on each member cluster?

pmorie commented Oct 15, 2018

@gyliu513 yep, that's what I had meant.

gyliu513 commented

@pmorie Can we also treat whether the CRD exists as part of FederatedTypeConfig status? The federation controller would just query each member cluster, determine whether the CRD exists, and fill in the status of FederatedTypeConfig.

pmorie commented Oct 19, 2018

@gyliu513 Yep, that's exactly what I had in mind. Writing up some thoughts from a discussion with @sohankunkerkar here. We propose the following:

Changes to FederatedTypeConfigStatus:

type FederatedTypeConfigStatus struct {
	TargetTypeStatus        TargetTypeOverallStatus
	TargetTypeClusterStatus []TargetTypeClusterStatus
}

type TargetTypeOverallStatus string

const (
	TargetTypeOverallStatusPresentInAllClusters    TargetTypeOverallStatus = "PresentInAllClusters"
	TargetTypeOverallStatusNotPresentInAllClusters TargetTypeOverallStatus = "NotPresentInAllClusters"
)

type TargetTypeClusterStatus struct {
	ClusterName string
	Status      TargetTypeStatus
}

type TargetTypeStatus string

const (
	TargetTypeStatusPresent    TargetTypeStatus = "Present"
	TargetTypeStatusNotPresent TargetTypeStatus = "NotPresent"
)

Changes to sync controller

The sync controller will be responsible for maintaining the new fields of status defined above:

  • The sync controller will determine whether the target type is present in each FederatedCluster in the namespace
  • The overall status is derived from the per-cluster statuses: if the target type is not present in one or more clusters, the overall status is NotPresentInAllClusters (a sketch of this aggregation follows the list)
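
For illustration, a minimal sketch of how the overall status could be derived from the per-cluster statuses, using the types proposed above (the helper name aggregateTargetTypeStatus is hypothetical, not part of the proposal):

func aggregateTargetTypeStatus(clusterStatuses []TargetTypeClusterStatus) TargetTypeOverallStatus {
	for _, cs := range clusterStatuses {
		// A single cluster missing the target type is enough to make
		// the overall status NotPresentInAllClusters.
		if cs.Status == TargetTypeStatusNotPresent {
			return TargetTypeOverallStatusNotPresentInAllClusters
		}
	}
	return TargetTypeOverallStatusPresentInAllClusters
}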

There are three categories of target types:

  1. Types from the core Kubernetes API that are built into the Kubernetes API server
  2. Types from aggregated APIs that are deployed by users/operators in each cluster
  3. Custom types created by CRDs

The sync controller, in its reconciliation loop, will use the discovery endpoint to determine whether the target type is present in a particular cluster, as sketched below.
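
As an example, here is a hedged sketch of such a discovery check using client-go; the function targetTypePresent and its parameters are illustrative, not the actual controller code:

import (
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/client-go/discovery"
)

// targetTypePresent reports whether a cluster's API server serves the given
// kind in the given group/version, by querying the discovery endpoint.
func targetTypePresent(client discovery.DiscoveryInterface, groupVersion, kind string) (bool, error) {
	resources, err := client.ServerResourcesForGroupVersion(groupVersion)
	if err != nil {
		if apierrors.IsNotFound(err) {
			// The group/version itself is not served, so the kind
			// cannot be present in this cluster.
			return false, nil
		}
		return false, err
	}
	for _, r := range resources.APIResources {
		if r.Kind == kind {
			return true, nil
		}
	}
	return false, nil
}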

Examples

Target type not present in one cluster:

# rest of resource omitted
status:
  targetTypeOverallStatus: "NotPresentInAllClusters"
  targetTypeStatus:
  - clusterName: cluster1
    status: "Present"
  - clusterName: cluster2
    status: "NotPresent"

Target type present in all clusters:

# rest of resource omitted
status:
  targetTypeOverallStatus: "PresentInAllClusters"
  targetTypeStatus:
  - clusterName: cluster1
    status: "Present"
  - clusterName: cluster2
    status: "Present"

pmorie commented Oct 19, 2018

^ the field names above are just suggestions; happy to change them if folks come up with better ones.

gyliu513 commented

@pmorie the design looks good to me. I also found that @marun is now working on propagating cluster-scoped resources; once that work is finished, we can leverage it to make sure cluster-scoped resources exist on the specified cluster.

pmorie commented Nov 16, 2018

@font is going to pick up this work now that #403 is merged.

We were brainstorming today, and @font observed that the status should also include whether the federated APIs referenced from FederatedTypeConfig exist in the host cluster.

@marun marun added this to the v1beta1 milestone Apr 10, 2019
@marun marun added kind/feature Categorizes issue or PR as related to a new feature. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Apr 10, 2019
@marun marun modified the milestones: v1beta1, v1beta2 May 1, 2019
fejta-bot commented

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 30, 2019
fejta-bot commented

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 29, 2019
fejta-bot commented

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot commented

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

