Context Canceled with Azure Managed Identity #3296

Closed
alenesho116 opened this issue Jun 27, 2022 · 9 comments
Labels
bug (Something isn't working), stale (All issues that are marked as stale due to inactivity)

Comments

@alenesho116

alenesho116 commented Jun 27, 2022

We upgraded the aad-pod-identity Helm chart on our cluster to 4.1.10 and are now getting the following error when trying to use an Azure Queue ScaledJob with KEDA:

2022-06-27T10:49:21-07:00 1.6563521617735345e+09	ERROR	azure_queue_scaler	error)	{"error": "-> github.com/Azure/azure-pipeline-go/pipeline.NewError, /go/pkg/mod/github.com/!azure/azure-pipeline-go@v0.2.3/pipeline/error.go:157\nHTTP request failed\n\nGet \"https://usdevstorage.queue.core.windows.net/queue1/messages?numofmessages=32&peekonly=true&timeout=61\": context canceled\n"}
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scalers.(*azureQueueScaler).IsActive
2022-06-27T10:49:21-07:00 	/workspace/pkg/scalers/azure_queue_scaler.go:160
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).getScaledJobMetrics
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/cache/scalers_cache.go:257
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).IsScaledJobActive
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/cache/scalers_cache.go:124
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/scale_handler.go:286
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/scale_handler.go:149
2022-06-27T10:49:21-07:00 1.656352161774509e+09	ERROR	azure_queue_scaler	error)	{"error": "Get \"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F\": context canceled"}
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scalers.(*azureQueueScaler).IsActive
2022-06-27T10:49:21-07:00 	/workspace/pkg/scalers/azure_queue_scaler.go:160
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).getScaledJobMetrics
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/cache/scalers_cache.go:262
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).IsScaledJobActive
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/cache/scalers_cache.go:124
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).checkScalers
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/scale_handler.go:286
2022-06-27T10:49:21-07:00 github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).startScaleLoop
2022-06-27T10:49:21-07:00 	/workspace/pkg/scaling/scale_handler.go:149
2022-06-27T10:49:21-07:00 1.65635216177456e+09	DEBUG	scalehandler	Error getting scaler.IsActive, but continue	{"ScaledJob": "azure-queue-scaledjob", "Scaler": "cache.ScalerBuilder:", "Error": "Get \"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F\": context canceled"}

The same managed identity with the same permissions worked before the upgrade; it seems like it might be a token or permission timeout, but we are not sure.
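
For reference, this is roughly how the KEDA operator is bound to the managed identity via the chart's podIdentity values (a minimal sketch; the identity name below is a placeholder):

```yaml
# values.yaml for the kedacore/keda Helm chart (sketch)
podIdentity:
  activeDirectory:
    # Sets the aadpodidbinding label on the KEDA operator pods so that
    # aad-pod-identity's AzureIdentityBinding selects them
    identity: keda-autoscaler-identity   # placeholder identity name
```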

Expected Behavior

The KEDA operator should be able to read the Azure queue and scale up our application pods to handle the messages in the queue.

Actual Behavior

The KEDA operator fails with a context canceled error when running the ScaledJob.

Steps to Reproduce the Problem

  1. Create an Azure managed identity with queue contributor permissions on the storage account
  2. Deploy the KEDA Helm chart with podIdentity.activeDirectory.identity set on an AKS cluster
  3. Create an Azure queue ScaledJob and add one item to the queue (a sketch of the manifests is below)
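
A minimal sketch of the manifests from step 3, assuming pod identity for the scaler authentication (names and the worker image are placeholders; the queue and account values are taken from the failing request above, and the job spec is trimmed):

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-queue-auth              # placeholder name
spec:
  podIdentity:
    provider: azure                   # scaler uses aad-pod-identity for storage access
---
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: azure-queue-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: worker
            image: example.azurecr.io/queue-worker:latest   # placeholder image
        restartPolicy: Never
  pollingInterval: 30
  triggers:
    - type: azure-queue
      metadata:
        accountName: usdevstorage     # from the failing request above
        queueName: queue1             # from the failing request above
        queueLength: "1"
      authenticationRef:
        name: azure-queue-auth
```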

Specifications

  • KEDA Version: 2.7.2
  • Platform & Version: AKS
  • Kubernetes Version: 1.22
  • Scaler(s): Azure Storage Queue Scaler
alenesho116 added the bug label Jun 27, 2022
tomkerkhove transferred this issue from kedacore/charts Jun 28, 2022
@brainslush

What happens when you downgrade the aad-pod-identity helm chart?

We have been seeing similar but distinct issues since last Friday (24 June 2022) that are unrelated to Azure Managed Identities.

@alenesho116
Author

With the aad-pod-identity Helm chart at version 4.1.8 this works fine, but the issue began with chart version 4.1.10.

Others are also reporting timeout issues during token retrieval with pod identity when using a setup similar to KEDA with Azure: Azure/aad-pod-identity#1287

@alenesho116
Author

Just tested a downgrade of the aad-pod-identity Helm chart and I am still getting the same error.

@brainslush

Let's assume for a second that this error is unrelated to the Helm chart. When did you see this error for the first time? I'm trying to pinpoint the source of the issue, and I get the feeling that Azure might be the source.

@JorTurFer
Member

Are there any errors in aad-pod-identity pods?

@stale

stale bot commented Sep 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Sep 3, 2022
@stale

stale bot commented Sep 10, 2022

This issue has been automatically closed due to inactivity.

stale bot closed this as completed Sep 10, 2022
v-shenoy reopened this Sep 11, 2022
stale bot removed the stale label Sep 11, 2022
@stale

stale bot commented Nov 10, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Nov 10, 2022
@stale

stale bot commented Nov 17, 2022

This issue has been automatically closed due to inactivity.

stale bot closed this as completed Nov 17, 2022