Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Surface "stale" GroupVersions from AggregatedDiscovery #116145

Merged
merged 1 commit into from Mar 8, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
44 changes: 35 additions & 9 deletions staging/src/k8s.io/client-go/discovery/aggregated_discovery.go
Expand Up @@ -24,43 +24,69 @@ import (
"k8s.io/apimachinery/pkg/runtime/schema"
)

// StaleGroupVersionError encasulates failed GroupVersion marked "stale"
// in the returned AggregatedDiscovery format.
type StaleGroupVersionError struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse the existing struct? https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L358

I'd imagine we may have clients who already cast to that type already and introducing a new type might be a bit confusing? Logically, stale is equivalent to discovery failed.

Copy link
Contributor Author

@seans3 seans3 Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ErrGroupDiscoveryFailed error is supposed to encasulate all the Group/Versions which failed, which is why it stores a map[schema.GroupVersion]error. So it would be awkward (especially printing the error) if there were a hierarchy of these errors. BTW, the ErrGroupDiscoveryFailed error is what gets returned to the discovery interface user storing the individual errors in the map (as an example, see ServerGroupsAndResources).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

gv schema.GroupVersion
}

func (s StaleGroupVersionError) Error() string {
return fmt.Sprintf("stale GroupVersion discovery: %v", s.gv)
}

// SplitGroupsAndResources transforms "aggregated" discovery top-level structure into
// the previous "unaggregated" discovery groups and resources.
func SplitGroupsAndResources(aggregatedGroups apidiscovery.APIGroupDiscoveryList) (*metav1.APIGroupList, map[schema.GroupVersion]*metav1.APIResourceList) {
func SplitGroupsAndResources(aggregatedGroups apidiscovery.APIGroupDiscoveryList) (
*metav1.APIGroupList,
map[schema.GroupVersion]*metav1.APIResourceList,
map[schema.GroupVersion]error) {
// Aggregated group list will contain the entirety of discovery, including
// groups, versions, and resources.
// groups, versions, and resources. GroupVersions marked "stale" are failed.
groups := []*metav1.APIGroup{}
failedGVs := map[schema.GroupVersion]error{}
resourcesByGV := map[schema.GroupVersion]*metav1.APIResourceList{}
for _, aggGroup := range aggregatedGroups.Items {
group, resources := convertAPIGroup(aggGroup)
group, resources, failed := convertAPIGroup(aggGroup)
groups = append(groups, group)
for gv, resourceList := range resources {
resourcesByGV[gv] = resourceList
}
for gv, err := range failed {
failedGVs[gv] = err
}
}
// Transform slice of groups to group list before returning.
groupList := &metav1.APIGroupList{}
groupList.Groups = make([]metav1.APIGroup, 0, len(groups))
for _, group := range groups {
groupList.Groups = append(groupList.Groups, *group)
}
return groupList, resourcesByGV
return groupList, resourcesByGV, failedGVs
}

// convertAPIGroup tranforms an "aggregated" APIGroupDiscovery to an "legacy" APIGroup,
// also returning the map of APIResourceList for resources within GroupVersions.
func convertAPIGroup(g apidiscovery.APIGroupDiscovery) (*metav1.APIGroup, map[schema.GroupVersion]*metav1.APIResourceList) {
func convertAPIGroup(g apidiscovery.APIGroupDiscovery) (
*metav1.APIGroup,
map[schema.GroupVersion]*metav1.APIResourceList,
map[schema.GroupVersion]error) {
// Iterate through versions to convert to group and resources.
group := &metav1.APIGroup{}
gvResources := map[schema.GroupVersion]*metav1.APIResourceList{}
failedGVs := map[schema.GroupVersion]error{}
group.Name = g.ObjectMeta.Name
for i, v := range g.Versions {
version := metav1.GroupVersionForDiscovery{}
for _, v := range g.Versions {
gv := schema.GroupVersion{Group: g.Name, Version: v.Version}
if v.Freshness == apidiscovery.DiscoveryFreshnessStale {
failedGVs[gv] = StaleGroupVersionError{gv: gv}
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the legacy scenario, we return something in the error &ErrorGroupDiscoveryFailed{Groups: failedGroups} https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L428 for any gvs that were unreachable. Can we propagate this information to ServerGroupsAndResources() in the aggregated discovery as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed. Lots of plumbing complexity. Let me know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to return the resources even if the gv is stale? I noticed that's what we do currently https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L531, though not really sure about the use case for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question still stands but I don't know what the answer should be. Mostly pertains to aggregated apiservers being unavailable since local apiservers almost always will not encounter this. cc @deads2k

Copy link
Contributor Author

@seans3 seans3 Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think unaggregated discovery is returning resources from failed GV's. The resources are only added if they are non-nil here: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L529. But if the GV failed, then the resource list is always nil here: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L345. We can discount the 404, core/v1 toleration here: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L342. The comment here, I believe is just wrong now.

					// even in case of error, some fallback might have been returned
					groupVersionResources[groupVersion] = apiResourceList

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the resource list will always be nil in https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L529, thanks for confirming! Yeah I agree the comment is wrong, we shouldn't ever hit that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not having resources for stale discovery information seems valid since otherwise we will not converge with kube-apiservers that have restarted. If we eventually have a persisted set of resources so stale resources would be consistent among all kube-apiservers, I'd be in favor of doing so.

}
version := metav1.GroupVersionForDiscovery{}
version.GroupVersion = gv.String()
version.Version = v.Version
group.Versions = append(group.Versions, version)
if i == 0 {
// PreferredVersion is first non-stale Version
Copy link
Member

@Jefftree Jefftree Mar 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? I'd imagine the preferred gv wouldn't change based on stale and should be based on what was registered with the server?

Copy link
Contributor Author

@seans3 seans3 Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question; I'd like to hear from others. I'm not sure it would be possible to have the PreferredVersion be a failed GroupVersion. The ServerPreferredResources and ServerPreferredNamespacedResources would be returning resources from failed GV's. In fact, those resource lists would probably be empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the current aggregated functionality of not allowing a failed GroupVersion be the PreferredVersion is correct. It appears to be the same as the unaggregated functionality. The following code shows that failed GroupVersion discovery requests return a nil for the resource list: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/discovery/discovery_client.go#L345. So a failed GroupVersion is not a PreferredVersion in the unaggregated case.

if group.PreferredVersion == (metav1.GroupVersionForDiscovery{}) {
group.PreferredVersion = version
}
resourceList := &metav1.APIResourceList{}
Expand All @@ -76,7 +102,7 @@ func convertAPIGroup(g apidiscovery.APIGroupDiscovery) (*metav1.APIGroup, map[sc
}
gvResources[gv] = resourceList
}
return group, gvResources
return group, gvResources, failedGVs
}

// convertAPIResource tranforms a APIResourceDiscovery to an APIResource.
Expand Down
182 changes: 181 additions & 1 deletion staging/src/k8s.io/client-go/discovery/aggregated_discovery_test.go
Expand Up @@ -31,6 +31,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
agg apidiscovery.APIGroupDiscoveryList
expectedGroups metav1.APIGroupList
expectedGVResources map[schema.GroupVersion]*metav1.APIResourceList
expectedFailedGVs map[schema.GroupVersion]error
}{
{
name: "Aggregated discovery: core/v1 group and pod resource",
Expand Down Expand Up @@ -90,6 +91,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery: 1 group/1 resources at /api, 1 group/2 versions/1 resources at /apis",
Expand Down Expand Up @@ -179,6 +181,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery: 1 group/2 resources at /api, 1 group/2 resources at /apis",
Expand Down Expand Up @@ -313,6 +316,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery: multiple groups with cluster-scoped resources",
Expand Down Expand Up @@ -447,6 +451,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery with single subresource",
Expand Down Expand Up @@ -534,6 +539,7 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery with multiple subresources",
Expand Down Expand Up @@ -633,11 +639,185 @@ func TestSplitGroupsAndResources(t *testing.T) {
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{},
},
{
name: "Aggregated discovery: single failed GV at /api",
agg: apidiscovery.APIGroupDiscoveryList{
Items: []apidiscovery.APIGroupDiscovery{
{
Versions: []apidiscovery.APIVersionDiscovery{
{
Version: "v1",
Resources: []apidiscovery.APIResourceDiscovery{
{
Resource: "pods",
ResponseKind: &metav1.GroupVersionKind{
Group: "",
Version: "v1",
Kind: "Pod",
},
Scope: apidiscovery.ScopeNamespace,
},
{
Resource: "services",
ResponseKind: &metav1.GroupVersionKind{
Group: "",
Version: "v1",
Kind: "Service",
},
Scope: apidiscovery.ScopeNamespace,
},
},
Freshness: apidiscovery.DiscoveryFreshnessStale,
},
},
},
},
},
// Single core Group/Version is stale, so no Version within Group.
expectedGroups: metav1.APIGroupList{
Groups: []metav1.APIGroup{{Name: ""}},
},
// Single core Group/Version is stale, so there are no expected resources.
expectedGVResources: map[schema.GroupVersion]*metav1.APIResourceList{},
expectedFailedGVs: map[schema.GroupVersion]error{
{Group: "", Version: "v1"}: StaleGroupVersionError{gv: schema.GroupVersion{Group: "", Version: "v1"}},
},
},
{
name: "Aggregated discovery: single failed GV at /apis",
agg: apidiscovery.APIGroupDiscoveryList{
Items: []apidiscovery.APIGroupDiscovery{
{
ObjectMeta: metav1.ObjectMeta{
Name: "apps",
},
Versions: []apidiscovery.APIVersionDiscovery{
{
Version: "v1",
Resources: []apidiscovery.APIResourceDiscovery{
{
Resource: "deployments",
ResponseKind: &metav1.GroupVersionKind{
Group: "apps",
Version: "v1",
Kind: "Deployment",
},
Scope: apidiscovery.ScopeNamespace,
},
{
Resource: "statefulsets",
ResponseKind: &metav1.GroupVersionKind{
Group: "apps",
Version: "v1",
Kind: "StatefulSets",
},
Scope: apidiscovery.ScopeNamespace,
},
},
Freshness: apidiscovery.DiscoveryFreshnessStale,
},
},
},
},
},
// Single apps/v1 Group/Version is stale, so no Version within Group.
expectedGroups: metav1.APIGroupList{
Groups: []metav1.APIGroup{{Name: "apps"}},
},
// Single apps/v1 Group/Version is stale, so there are no expected resources.
expectedGVResources: map[schema.GroupVersion]*metav1.APIResourceList{},
expectedFailedGVs: map[schema.GroupVersion]error{
{Group: "apps", Version: "v1"}: StaleGroupVersionError{gv: schema.GroupVersion{Group: "apps", Version: "v1"}},
},
},
{
name: "Aggregated discovery: 1 group/2 versions/1 failed GV at /apis",
agg: apidiscovery.APIGroupDiscoveryList{
Items: []apidiscovery.APIGroupDiscovery{
{
ObjectMeta: metav1.ObjectMeta{
Name: "apps",
},
Versions: []apidiscovery.APIVersionDiscovery{
// Stale v2 should report failed GV.
{
Version: "v2",
Resources: []apidiscovery.APIResourceDiscovery{
{
Resource: "daemonsets",
ResponseKind: &metav1.GroupVersionKind{
Group: "apps",
Version: "v2",
Kind: "DaemonSets",
},
Scope: apidiscovery.ScopeNamespace,
},
},
Freshness: apidiscovery.DiscoveryFreshnessStale,
},
{
Version: "v1",
Resources: []apidiscovery.APIResourceDiscovery{
{
Resource: "deployments",
ResponseKind: &metav1.GroupVersionKind{
Group: "apps",
Version: "v1",
Kind: "Deployment",
},
Scope: apidiscovery.ScopeNamespace,
},
},
},
},
},
},
},
// Only apps/v1 is non-stale expected Group/Version
expectedGroups: metav1.APIGroupList{
Groups: []metav1.APIGroup{
{
Name: "apps",
Versions: []metav1.GroupVersionForDiscovery{
{
GroupVersion: "apps/v1",
Version: "v1",
},
},
// PreferredVersion must be apps/v1
PreferredVersion: metav1.GroupVersionForDiscovery{
GroupVersion: "apps/v1",
Version: "v1",
},
},
},
},
// Only apps/v1 resources expected.
expectedGVResources: map[schema.GroupVersion]*metav1.APIResourceList{
{Group: "apps", Version: "v1"}: {
GroupVersion: "apps/v1",
APIResources: []metav1.APIResource{
{
Name: "deployments",
Namespaced: true,
Group: "apps",
Version: "v1",
Kind: "Deployment",
},
},
},
},
expectedFailedGVs: map[schema.GroupVersion]error{
{Group: "apps", Version: "v2"}: StaleGroupVersionError{gv: schema.GroupVersion{Group: "apps", Version: "v2"}},
},
},
}

for _, test := range tests {
apiGroups, resourcesByGV := SplitGroupsAndResources(test.agg)
apiGroups, resourcesByGV, failedGVs := SplitGroupsAndResources(test.agg)
assert.Equal(t, test.expectedFailedGVs, failedGVs)
assert.Equal(t, test.expectedGroups, *apiGroups)
assert.Equal(t, test.expectedGVResources, resourcesByGV)
}
Expand Down
25 changes: 18 additions & 7 deletions staging/src/k8s.io/client-go/discovery/cached/memory/memcache.go
Expand Up @@ -136,32 +136,38 @@ func (d *memCacheClient) ServerGroupsAndResources() ([]*metav1.APIGroup, []*meta
// GroupsAndMaybeResources returns the list of APIGroups, and possibly the map of group/version
// to resources. The returned groups will never be nil, but the resources map can be nil
// if there are no cached resources.
func (d *memCacheClient) GroupsAndMaybeResources() (*metav1.APIGroupList, map[schema.GroupVersion]*metav1.APIResourceList, error) {
func (d *memCacheClient) GroupsAndMaybeResources() (*metav1.APIGroupList, map[schema.GroupVersion]*metav1.APIResourceList, map[schema.GroupVersion]error, error) {
d.lock.Lock()
defer d.lock.Unlock()

if !d.cacheValid {
if err := d.refreshLocked(); err != nil {
return nil, nil, err
return nil, nil, nil, err
}
}
// Build the resourceList from the cache?
var resourcesMap map[schema.GroupVersion]*metav1.APIResourceList
var failedGVs map[schema.GroupVersion]error
if d.receivedAggregatedDiscovery && len(d.groupToServerResources) > 0 {
resourcesMap = map[schema.GroupVersion]*metav1.APIResourceList{}
failedGVs = map[schema.GroupVersion]error{}
for gv, cacheEntry := range d.groupToServerResources {
groupVersion, err := schema.ParseGroupVersion(gv)
if err != nil {
return nil, nil, fmt.Errorf("failed to parse group version (%v): %v", gv, err)
return nil, nil, nil, fmt.Errorf("failed to parse group version (%v): %v", gv, err)
}
if cacheEntry.err != nil {
failedGVs[groupVersion] = cacheEntry.err
} else {
resourcesMap[groupVersion] = cacheEntry.resourceList
}
resourcesMap[groupVersion] = cacheEntry.resourceList
}
}
return d.groupList, resourcesMap, nil
return d.groupList, resourcesMap, failedGVs, nil
}

func (d *memCacheClient) ServerGroups() (*metav1.APIGroupList, error) {
groups, _, err := d.GroupsAndMaybeResources()
groups, _, _, err := d.GroupsAndMaybeResources()
if err != nil {
return nil, err
}
Expand Down Expand Up @@ -235,14 +241,19 @@ func (d *memCacheClient) refreshLocked() error {

if ad, ok := d.delegate.(discovery.AggregatedDiscoveryInterface); ok {
var resources map[schema.GroupVersion]*metav1.APIResourceList
gl, resources, err = ad.GroupsAndMaybeResources()
var failedGVs map[schema.GroupVersion]error
gl, resources, failedGVs, err = ad.GroupsAndMaybeResources()
if resources != nil && err == nil {
// Cache the resources.
d.groupToServerResources = map[string]*cacheEntry{}
d.groupList = gl
for gv, resources := range resources {
d.groupToServerResources[gv.String()] = &cacheEntry{resources, nil}
}
// Cache GroupVersion discovery errors
for gv, err := range failedGVs {
d.groupToServerResources[gv.String()] = &cacheEntry{nil, err}
}
d.receivedAggregatedDiscovery = true
d.cacheValid = true
return nil
Expand Down