Skip to content

Webhooks sometimes fail with certificate errors in e2e #5690

Open
@nojnhuh

Description

@nojnhuh

Sometimes in e2e when a workload cluster is created, the CAPZ webhooks reject the requests with errors like this:

Internal error occurred: failed calling webhook "default.azurecluster.infrastructure.cluster.x-k8s.io": failed to call webhook: Post "https://capz-webhook-service.capz-system.svc:443/mutate-infrastructure-cluster-x-k8s-io-v1beta1-azurecluster?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority, Internal error occurred: failed calling webhook "default.azuremachinetemplate.infrastructure.cluster.x-k8s.io": failed to call webhook: Post "https://capz-webhook-service.capz-system.svc:443/mutate-infrastructure-cluster-x-k8s-io-v1beta1-azuremachinetemplate?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority

I don't remember ever seeing a similar message for any CAPI or other provider webhooks.

Which jobs are flaky:
https://storage.googleapis.com/k8s-triage/index.html?text=Internal%20error%20occurred%3A%20failed%20calling%20webhook&job=cluster-api-provider-azure

Which tests are flaky:

Testgrid link:

Reason for failure (if possible):

Anything else we need to know:

/kind flake

[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/flakeCategorizes issue or PR as related to a flaky test.

    Type

    No type

    Projects

    Status

    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions