Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support both v1beta1 and v1 admission control webhooks #124

Merged
merged 9 commits into from
Sep 19, 2021

Conversation

adamwg
Copy link
Contributor

@adamwg adamwg commented Sep 16, 2021

We have a number of checks that operate on admission control webhook configuration. Older clusters support only v1beta1 of admission control, while newer clusters support v1. Currently clusterlint fails to run on these older clusters because we can't fetch v1 admission control objects from them.

This PR makes two changes to fix this:

  1. When listing objects, ignore "not found" errors, which mean the cluster doesn't support the resource we're trying to list.
  2. Duplicate our existing admission control webhook checks for v1beta1, so that older clusters get the same checks as newer clusters.

While we're here, enhance the errors we return when listing objects fails so that we can tell which resource we failed to list.

Client auth plugins are already loaded in objects.go, so no need for the import
in object_filter.go.
If a resource version doesn't exist in a cluster, there can't be any objects of
that kind for us to check, so ignore the error when trying to list them.

While we're here, enhance the errors we get when listing objects so that we can
tell which object kind failed.
In order to support both older clusters (which have v1beta1) and newer
clusters (which have v1), fetch both versions of webhook configurations. A
subsequent commit will add checks for the v1beta1 versions.
Add v1beta1 versions of our webhook checks, so that they can be run against
older clusters that don't support admission control v1.
Since we now ignore not found errors, it's possible for some object lists to be
nil. Fill them in with empty lists so checks can retain their existing
non-nilness assumptions.
In clusters that support both v1beta1 and v1 admission control, webhook
configurations will be returned twice: once in each version. Skip the v1beta1
tests if v1 objects exist, so that we don't end up with duplicate diagnostics.
return
})
g.Go(func() (err error) {
objects.SystemNamespace, err = client.Namespaces().Get(gCtx, metav1.NamespaceSystem, metav1.GetOptions{})
if err != nil {
err = fmt.Errorf("failed to fetch namespace %q: %s", metav1.NamespaceSystem, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we annotate this one as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I get it, there's no generic kind here (or rather, we're already using it for the equivalent List() call).

return fmt.Errorf("failed to fetch %s: %s", kind, err)
}

func objectsWithoutNils(objects *Objects) *Objects {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now becomes a bit of a maintenance burden since we always have to remember to extend the function when a new Kubernetes resource is supported.

As an alternative suggestion, we could move the initialization of the non-nil *Lists into a new NewObjects() factory method and then adjust the individual goroutines in FetchObjects() like this:

g.Go(func() error {
    objs, err := client.PersistentVolumes().List(gCtx, opts)
    if aerr := annotateFetchError("PersistentVolumes", err); aerr != nil {
        return aerr
    }
    objects.PersistentVolumes = objs
    return nil
})

It's a bit more verbose, but to me it feels more natural to work with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, the above doesn't work because we need to distinguish between no errors at all, ignorable errors where we do should not set the objects (not found), and true errors. This will blow up the code by quite a bit, so I think your original approach has the better trade-offs.

}

func objectsWithoutNils(objects *Objects) *Objects {
if objects.Nodes == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I delete the content of this function except for the return statement, the tests still pass.

We should probably add a test to validate that we are handling not-found resources appropriately.

return objectsWithoutNils(objects), nil
}

func annotateFetchError(kind string, err error) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test for this one wouldn't hurt I think.

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the tests myself real quick to save one roundtrip.

@timoreimann timoreimann merged commit 5eeabb8 into master Sep 19, 2021
@timoreimann timoreimann deleted the awg/old-clusters branch September 19, 2021 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants