
[BUG] Controller giving "context deadline exceeded" #127

Closed
jpohjolainen opened this issue Sep 29, 2020 · 10 comments
Labels
bug Something isn't working

Comments

@jpohjolainen

Describe the bug
Last week everything worked well with the controller and injector, but this week I created a new cluster and I'm having trouble with the controller (a quick test of the Injector gave the same error). It gets "context deadline exceeded" errors and doesn't sync.

To Reproduce
Install controller 1.1.0 on AKS

Expected behavior
It gets secrets synced from AKV

Logs

time="2020-09-29T09:55:44Z" level=info msg="Log level set to 'debug'"
W0929 09:55:44.490018       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
time="2020-09-29T09:55:44Z" level=info msg="Creating event broadcaster"
time="2020-09-29T09:55:44Z" level=info msg="Setting up event handlers"
time="2020-09-29T09:55:44Z" level=info msg="Starting AzureKeyVaultSecret controller"
time="2020-09-29T09:55:44Z" level=info msg="Waiting for informer caches to sync"
time="2020-09-29T09:55:44Z" level=info msg="Starting workers"
time="2020-09-29T09:55:44Z" level=info msg="Started workers"
time="2020-09-29T09:55:44Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:55:44Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:55:44Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:55:44Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:56:12Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' added. Adding to queue."
time="2020-09-29T09:56:12Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:56:14Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' added to Azure queue to check if changed in Azure."
time="2020-09-29T09:56:42Z" level=error msg="error syncing 'default/secret-sync': failed to get secret from Azure Key Vault for secret 'default'/'secret-sync', error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded, requeuing"
time="2020-09-29T09:56:42Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:56:42Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:56:44Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' added to Azure queue to check if changed in Azure."
time="2020-09-29T09:57:12Z" level=debug msg="Handling 'default/secret-sync' in Azure queue..."
time="2020-09-29T09:57:12Z" level=debug msg="Checking state for default/secret-sync in Azure"
time="2020-09-29T09:57:12Z" level=debug msg="Getting secret value for default/secret-sync in Azure"
time="2020-09-29T09:57:12Z" level=error msg="error syncing 'default/secret-sync': failed to get secret from Azure Key Vault for secret 'default'/'secret-sync', error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded, requeuing"
time="2020-09-29T09:57:12Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:57:12Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:57:14Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' added to Azure queue to check if changed in Azure."
time="2020-09-29T09:57:42Z" level=error msg="failed to get secret value for 'default/secret-sync' from Azure Key vault 'xxxxxxxx' using object name 'my-secret', error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded"
time="2020-09-29T09:57:42Z" level=error msg="error syncing 'default/secret-sync': Failed to get secret for 'secret-sync' from Azure Key Vault 'xxxxxxxxx', requeuing"
time="2020-09-29T09:57:42Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:57:42Z" level=error msg="error syncing 'default/secret-sync': failed to get secret from Azure Key Vault for secret 'default'/'secret-sync', error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded, requeuing"
time="2020-09-29T09:57:42Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:57:42Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:57:44Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' added to Azure queue to check if changed in Azure."
time="2020-09-29T09:57:49Z" level=debug msg="AzureKeyVaultSecret 'secret-sync' deleted. Adding to delete queue."
time="2020-09-29T09:58:12Z" level=error msg="error syncing 'default/secret-sync': failed to get secret from Azure Key Vault for secret 'default'/'secret-sync', error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded, requeuing"
time="2020-09-29T09:58:12Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:58:12Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:58:12Z" level=debug msg="Handling error for 'default/secret-sync' in AzureKeyVaultSecret: azurekeyvaultsecret.spv.no \"secret-sync\" not found"
time="2020-09-29T09:58:12Z" level=debug msg="Error for 'default/secret-sync' was 'Not Found'"
E0929 09:58:12.133783       1 handler.go:299] AzureKeyVaultSecret 'default/secret-sync' in work queue no longer exists
time="2020-09-29T09:58:12Z" level=info msg="Successfully synced AzureKeyVaultSecret 'default/secret-sync' with Kubernetes Secret"
time="2020-09-29T09:58:12Z" level=debug msg="Processing next work item in queue..."
time="2020-09-29T09:58:12Z" level=debug msg="Handling 'default/secret-sync' in default queue..."
time="2020-09-29T09:58:12Z" level=debug msg="Handling error for 'default/secret-sync' in AzureKeyVaultSecret: azurekeyvaultsecret.spv.no \"secret-sync\" not found"
time="2020-09-29T09:58:12Z" level=debug msg="Error for 'default/secret-sync' was 'Not Found'"
E0929 09:58:12.294025       1 handler.go:299] AzureKeyVaultSecret 'default/secret-sync' in work queue no longer exists
time="2020-09-29T09:58:12Z" level=info msg="Successfully synced AzureKeyVaultSecret 'default/secret-sync' with Kubernetes Secret"
time="2020-09-29T09:58:12Z" level=debug msg="Processing next work item in queue..."

Additional context
I tried this on 1.16.13 and 1.17.9 AKS clusters. We use Terraform to create our clusters, so this one was created the same way as last week's.

I don't remember seeing the W0929 warning last week, and I'm wondering whether Azure has changed something.

@jpohjolainen jpohjolainen added the bug Something isn't working label Sep 29, 2020
@jpohjolainen jpohjolainen changed the title [BUG] [BUG] Controller giving "context deadline exceeded" Sep 29, 2020
@torresdal
Collaborator

torresdal commented Sep 29, 2020

@jpohjolainen Are you sure outgoing requests work as expected in your cluster? The errors basically say they time out when talking to AKV...

Though they seem to be eventually consistent...

@jpohjolainen
Author

@torresdal At least I can launch an ubuntu image in the cluster and connect to the outside from it. DNS and everything works there.

@torresdal
Collaborator

@jpohjolainen The only time we've seen issues like this in Azure is when using the Basic load balancer. Switching to the Standard SKU load balancer fixed a lot of outgoing-request issues we had...

The default timeout for accessing AKV is 30 seconds... We could make this configurable, but 30 seconds already seems like a lot...

@jpohjolainen
Author

jpohjolainen commented Sep 29, 2020

@torresdal LB SKU is Standard

Our test deployment also works. It uses cert-manager to get certificates, and that worked as usual. The app is also accessible through the LB.

@torresdal
Collaborator

torresdal commented Sep 29, 2020

But to answer your question directly: I doubt this is a bug. We have not seen this in any of our clusters running the latest stable version of akv2k8s, and no one else has reported it. The most likely cause is something in your cluster/Azure... But feel free to keep this issue open and report any findings. We're happy to help if you find anything pointing to an issue with akv2k8s.

@jpohjolainen
Author

This was my own silly thing. I forgot to change the name of the key vault in the azurekeyvaultsecret resource 🤦

Sometimes one can be so blind to one's own changes.

Sorry about this.

@torresdal
Collaborator

@jpohjolainen No problem 👍

@skayred

skayred commented Mar 15, 2022

If someone faces the issue while secret and vault names are correct - check that you have ca-certificates installed in the container where you try to inject the secrets. I had this problem with a scratch container.
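(Editor's note: besides mounting the bundle at runtime, the CA certificates can also be baked into a scratch image at build time with a multi-stage build. A hedged sketch; the base image, binary name, and paths are illustrative:)

```dockerfile
# Build stage: any image that ships a CA bundle
FROM alpine:3 AS certs
RUN apk add --no-cache ca-certificates

# Final stage: scratch image with the CA bundle copied in, so TLS
# clients (e.g. talking to Azure Key Vault) can verify server certs
FROM scratch
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY my-app /my-app
ENTRYPOINT ["/my-app"]
```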

@paulissoft

I can confirm the comment by @skayred and it helped me to find an elegant solution.

I use the PostgREST container image to expose a PostgreSQL REST API, and that container is based on a scratch image. The solution is to start an init container that populates the /etc/ssl/certs directory using the kalaksi/ca-certificates image, then mount that directory into the PostgREST container.

Here is an excerpt of my deployment YAML:

      initContainers:
      - name: ca-certificates
        image: kalaksi/ca-certificates
        volumeMounts:
        - name: ca-certificates-directory
          mountPath: /etc/ssl/certs        
      containers:
      - image: postgrest/postgrest:${postgrest_version}
        name: postgrest
        env:
        - name: PGRST_ADMIN_SERVER_PORT
          valueFrom:
            configMapKeyRef:
              key: PGRST_ADMIN_SERVER_PORT
              name: ${postgrest_configmap}
        - name: PGRST_DB_ANON_ROLE
          valueFrom:
            configMapKeyRef:
              key: PGRST_DB_ANON_ROLE
              name: ${postgrest_configmap}
        - name: PGRST_DB_SCHEMA
          valueFrom:
            configMapKeyRef:
              key: PGRST_DB_SCHEMA
              name: ${postgrest_configmap}
        - name: PGRST_DB_URI
          value: ${PGRST_DB_URI}
        - name: PGRST_JWT_SECRET
          valueFrom:
            configMapKeyRef:
              key: jwks
              name: aad
        - name: PGRST_JWT_ROLE_CLAIM_KEY
          valueFrom:
            configMapKeyRef:
              key: PGRST_JWT_ROLE_CLAIM_KEY
              name: ${postgrest_configmap}            
        - name: PGRST_LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              key: PGRST_LOG_LEVEL
              name: ${postgrest_configmap}
        - name: PGRST_OPENAPI_MODE
          valueFrom:
            configMapKeyRef:
              key: PGRST_OPENAPI_MODE
              name: ${postgrest_configmap}
        - name: PGRST_OPENAPI_SERVER_PROXY_URI
          valueFrom:
            configMapKeyRef:
              key: PGRST_OPENAPI_SERVER_PROXY_URI
              name: ${postgrest_configmap}
        ports:
        - containerPort: 3000
        - containerPort: ${PGRST_ADMIN_SERVER_PORT}
        volumeMounts:
        - name: ca-certificates-directory  
          mountPath: /etc/ssl/certs
      volumes:
      - name: ca-certificates-directory  
        emptyDir: {}

It works like a charm.

@arynaq

arynaq commented Jun 7, 2022

Thank you @paulissoft

You saved me a headache. I was using mysql:5.7 and hit the exact same issue; it was resolved by bootstrapping kalaksi/ca-certificates into the pod with an init container.
