Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PodTemplate semantic validation for Elasticsearch #3020

Merged
merged 5 commits into from
May 5, 2020

Conversation

barkbay
Copy link
Contributor

@barkbay barkbay commented Apr 30, 2020

This PR adds a semantic validation for the content of the podTemplate field of the Elasticsearch resources.

It is achieved by performing a dry-run request which attempts to create a Pod reusing the content of the podTemplate. The validation is only performed when a StatefulSet is created or updated.

Fixes #2266

@barkbay barkbay added >enhancement Enhancement of existing functionality v1.2.0 labels Apr 30, 2020
@barkbay
Copy link
Contributor Author

barkbay commented May 4, 2020

The validation fails with the following error on GKE 1.14: Internal error occurred: admission webhook "pod-ready.common-webhooks.networking.gke.io" does not support dry run

I think it means that the validation proposed here should only be a "best effort" validation: as long as the error is not explicitly a metav1.StatusReasonInvalid (422) we should not block the reconciliation loop.

Copy link
Contributor

@sebgl sebgl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

pkg/controller/elasticsearch/sset/validation.go Outdated Show resolved Hide resolved
test/e2e/es/podtemplate_test.go Outdated Show resolved Hide resolved
// Mutate Elasticsearch with an invalid PodTemplate
b.UpgradeTestSteps(k),
).WithStep(
test.Step{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation feels a bit off here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed but this file is formatted with go fmt 🤷‍♂️

) error {
var statusErr *k8serrors.StatusError
if !errors.As(err, &statusErr) {
// Not a Kubernetes API error (e.g. timeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we could always reach a timeout in some situations (eg. particular proxy configuration). Hard to know for sure, so I'm fine moving on with this.
Should we add some debug logs here just in case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we reach a timeout here it is either:

  • a transient error, in this case it is fair to try again in the next reconciliation loop
  • a permanent error, in which case the StatefulSet update/creation that should happen right after will also fail

The error is already reported in the controller log, I'm not sure we should duplicate it.

@barkbay
Copy link
Contributor Author

barkbay commented May 4, 2020

To explain a little bit more the issue with the admission webhook: as explained in the documentation:

If the request would trigger an admission controller which would have side effects, the request will be failed rather than risk an unwanted side effect. [...] admission webhooks can declare in their configuration object that they do not have side effects by setting the sideEffects field to “None”. If a webhook actually does have side effects, then the sideEffects field should be set to “NoneOnDryRun”

In order for an admission webhook to state if it supports dry-run its sideEffects field must be set accordingly. If this field is not set the default value is Unknown. In this case the request fails with something like this: admission webhook "iam-for-pods.amazonaws.com" does not support dry run

$ k get MutatingWebhookConfiguration -o yaml
apiVersion: v1
items:
- apiVersion: admissionregistration.k8s.io/v1
  kind: MutatingWebhookConfiguration
  metadata:
    name: pod-identity-webhook
  webhooks:
  - admissionReviewVersions:
    - v1beta1
    failurePolicy: Ignore
    matchPolicy: Exact
    name: iam-for-pods.amazonaws.com
     ....
    sideEffects: Unknown

Unfortunately on EKS it seems that the pod-identity-webhook has not been updated to support dry run request (tested on EKS 1.16 ).

Regarding GKE it seems that built-in admission webhooks are compatible since 1.15.

I will update the e2e test to ensure that it is only run on GKE for K8S > 1.15.

TL;DR: even if dry run requests have been promoted to beta in K8S 1.13 the admission webhooks are not all compatible with that feature 😢

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement Enhancement of existing functionality v1.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reconciliation is blocked when a Pod can't be created
3 participants