Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: set context timeout for access to kv #152

Merged
merged 2 commits into from
Jul 16, 2020
Merged

Conversation

aramase
Copy link
Member

@aramase aramase commented Jul 13, 2020

What this PR does / why we need it:

  • Sets context timeout to 1m50s for access to kv. This ensures the request is terminated and error is returned when time to access kv takes > 2m. The default volume timeout for kubelet is 2m3s and this change honors the value. If the context in provider is not set, then the driver will terminate the process based on the context, but the error from provider will not be surfaced to the user.

Sample error when network policies deny outbound traffic in cluster -

time="2020-07-09T19:07:35Z" level=info msg="time=\"2020-07-09T18:57:35Z\" level=info msg=\"using pod identity to access keyvault\"\ntime=\"2020-07-09T18:57:35Z\" level=info msg=\"mounting secrets store object content for kube-system/nginx-secrets-store-env-var\"\ntime=\"2020-07-09T18:57:35Z\" level=info msg=\"objects: array:\\n  - |\\n    objectName: kv\\n    objectType: secret           # object types: secret, key or cert\\n    objectVersion: \\\"\\\"            # [OPTIONAL] object versions, default to latest if empty\\n\"\ntime=\"2020-07-09T18:57:35Z\" level=info msg=\"unmarshaled keyVaultObjects: [{kv   secret}]\"\ntime=\"2020-07-09T18:57:35Z\" level=info msg=\"keyVaultObjects len: 1\"\ntime=\"2020-07-09T18:57:35Z\" level=info msg=\"azure: using pod identity to retrieve token\"\ntime=\"2020-07-09T18:58:05Z\" level=info msg=\"accesstoken: eyJ0##### REDACTED #####l20w\"\ntime=\"2020-07-09T18:58:05Z\" level=info msg=\"clientid: ee32##### REDACTED #####0e49\"\n"
time="2020-07-09T19:07:35Z" level=error msg="error invoking provider, err: exit status 1, output: time=\"2020-07-09T19:07:35Z\" level=fatal msg=\"[error] : failed to get objectType:secret, objectName:kv, objectVersion:: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: Get https://kv.vault.azure.net/secrets/kv/?api-version=2016-10-01: dial tcp: i/o timeout\"\n for pod: c1f448b7-b752-431e-ad33-521c2abb8204, ns: kube-system"
time="2020-07-09T19:07:35Z" level=error msg="GRPC error: error mounting secret time=\"2020-07-09T19:07:35Z\" level=fatal msg=\"[error] : failed to get objectType:secret, objectName:kv, objectVersion:: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: Get https://kv.vault.azure.net/secrets/kv/?api-version=2016-10-01: dial tcp: i/o timeout\"\n for pod: c1f448b7-b752-431e-ad33-521c2abb8204, ns: kube-system"

The request only times out after 10m as there is no context defined

Which issue(s) this PR fixes (optional, using fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when the PR gets merged):
Fixes #

Special notes for your reviewer:

@aramase
Copy link
Member Author

aramase commented Jul 13, 2020

/azp run pr-e2e-azure

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@aramase aramase requested a review from ritazh July 14, 2020 00:59
cmd/main.go Outdated
// setting the context to 1m50s will ensure request is terminated if taking longer/unable to establish connection due to underlying network error and
// the correct error is returned back to the driver
// Value is set to 1m50s to provide enough time for driver to complete outstanding operations
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Minute+50*time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make this configurable and default to 1m50s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ritazh Good idea! We can add a context-timeout flag in the provider and default that to 1m50s for the time being. For addition of this flag as mandatory in the driver, I'll add an item in the agenda next week so we can discuss how we want to do this? This would require an update in all the providers to support the flag before we add to driver.

Copy link
Member

@ritazh ritazh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@aramase aramase merged commit 5f8ce21 into Azure:master Jul 16, 2020
@aramase aramase deleted the context branch July 16, 2020 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants