-
Notifications
You must be signed in to change notification settings - Fork 986
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS IRSA S3 Support #2373
AWS IRSA S3 Support #2373
Conversation
Thanks @matty-rose !! This is great! /cc @surajkota |
@yuzisun: GitHub didn't allow me to request PR reviews from the following users: surajkota. Note that only kserve members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for the PR, I will support reviewing this |
9cb0a35
to
7d05583
Compare
Hi @surajkota @yuzisun I think this is ready for review now |
"s3SecretAccessKeyName": "AWS_SECRET_ACCESS_KEY", | ||
"s3Endpoint": "s3.amazonaws.com", | ||
"s3UseHttps": "1", | ||
"s3Region": "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unsure about having empty values here since I couldn't see any other place in the configmap where this was done, but it makes sense to me to have them here so that users know all the possible configuration options without needing to look through the source code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, maybe add a comment with the AWS/S3 reference you had in the other place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have added the quick links above the credentials entry in configmap
func BuildS3EnvVars(annotations map[string]string, s3Config *S3Config) []v1.EnvVar { | ||
envs := []v1.EnvVar{} | ||
|
||
if s3Endpoint, ok := annotations[InferenceServiceS3SecretEndpointAnnotation]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this s3 endpoint if/else logic block is unchanged, just moved files from s3_secret.go
to here since I use it in both s3_secret.go
and s3_service_account.go
/* | ||
For a quick reference about AWS ENV variables: | ||
AWS Cli: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html | ||
Boto: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the reference!
Thanks @matty-rose for the awesome work with detailed descriptions!! The PR looks great! Would appreciate if you can help update the website doc for the IAM Role support here |
7d05583
to
a1d0e23
Compare
@yuzisun have opened a website PR here: kserve/website#178 |
@surajkota Do you have comments here? |
Apologies, this slipped my list because of a customer request. Will take a look tomorrow, I hope that is okay |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patience, great work @matty-rose. I have left a few questions on the PR. And I have a one more:
What is the difference between the S3 download code in the following locations:
- https://github.com/kserve/kserve/blob/master/python/kserve/kserve/storage.py#L147 -this is used by the storage initializer
- https://github.com/kserve/kserve/blob/master/pkg/agent/storage/utils.go#L124-L153 - when does this code get executed?
@@ -143,14 +143,24 @@ data: | |||
"cpuLimit": "1", | |||
"storageSpecSecretName": "storage-config" | |||
} | |||
# For a quick reference about AWS ENV variables: | |||
# AWS Cli: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html | |||
# Boto: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables | |||
credentials: |- | |||
{ | |||
"gcs": { | |||
"gcsCredentialFileName": "gcloud-application-credentials.json" | |||
}, | |||
"s3": { | |||
"s3AccessKeyIDName": "AWS_ACCESS_KEY_ID", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few questions:
- there are 2 variables used to construct the service endpoint URL:
s3Endpoint
ands3UseHttps
. Depending on the value of useHttps the final AWS_ENDPOINT_URL is set. @yuzisun Is there a particular reason to do it this way? Just for my information - how is s3VerifySSL used?
- In Test B: Deploy S3 Example using only service account (no secret) configuring all values inside inferenceservice configmap, does the user still need to keep "s3AccessKeyIDName": "AWS_ACCESS_KEY_ID" and "s3SecretAccessKeyName": "AWS_SECRET_ACCESS_KEY", in configmap? It can be a little confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for 1) I think this is mostly for backwards compatibility and integration with Kubeflow environment, for example tfserving was using S3_ENDPOINT, I am not sure if this is still the case currently.
for 2) Same as other setting, when s3VerifySSL
is set to false it then does not verify the server side certificates which is mostly for testing purpose.
for 3) I do not think these are needed if we are using service account without secret, @matty-rose maybe we can add a comment in the configmap to state which of the settings are only needed for static credentials?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep for 3. those fields are not required in the configmap if not using static credentials, have added a comment to make this clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- One change I would recommend based on storage-initializer: Let boto3 decide the endpoint for S3 #2377 is to remove the default value for all the fields except access and secret key name, especially s3Endpoint. This PR lgtm once this change is made.
- Thanks for adding the comment in this file. I think this should go to the website PR as well and also add information about default values used by each or atleast some of these like https, endpoint, region in website. I will try to review that one today.
- I could not find any reference for
S3_VERIFY_SSL
variable in boto3 or aws go sdk docs. So, I am not sure how its being used. Dont want to block the PR for this though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the change removing the endpoint/use https default values and will try find some time this week to make further updates to the website PR to describe the use of these better. Thanks for your review @surajkota
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@surajkota S3_VERIFY_SSL
/S3_ENDPOINT
/S3_USE_HTTPS
were originally inherited from tensorflow/tfserving(see https://docs.w3cub.com/tensorflow~guide/deploy/s3). I think we may no longer need these as we always use storage initializer to download models to a local volume and tfserving runtime never directly use these environment variables to get models from s3 in kserve. I created an issue to track this to clean up these variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, got it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved the PR, maybe we should remove s3VerifySSL
from this configmap since it will not be used or not talk about it on the website for now
For 1) It is used for single model serving storage initializer where we inject the init container to download models before starting the model server. For historical reasons they are implemented differently, we are planning to merge these implementation with ModelMesh model puller. |
a1d0e23
to
58916f4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the responses, One change request, after that this PR lgtm
@@ -143,14 +143,24 @@ data: | |||
"cpuLimit": "1", | |||
"storageSpecSecretName": "storage-config" | |||
} | |||
# For a quick reference about AWS ENV variables: | |||
# AWS Cli: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html | |||
# Boto: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables | |||
credentials: |- | |||
{ | |||
"gcs": { | |||
"gcsCredentialFileName": "gcloud-application-credentials.json" | |||
}, | |||
"s3": { | |||
"s3AccessKeyIDName": "AWS_ACCESS_KEY_ID", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- One change I would recommend based on storage-initializer: Let boto3 decide the endpoint for S3 #2377 is to remove the default value for all the fields except access and secret key name, especially s3Endpoint. This PR lgtm once this change is made.
- Thanks for adding the comment in this file. I think this should go to the website PR as well and also add information about default values used by each or atleast some of these like https, endpoint, region in website. I will try to review that one today.
- I could not find any reference for
S3_VERIFY_SSL
variable in boto3 or aws go sdk docs. So, I am not sure how its being used. Dont want to block the PR for this though
Adds S3 configuration options through the following - `inferenceservice` configmap (global options) - k8s service account annotations for AWS IRSA - k8s secret annotations attached to k8s service account for static credentials Signed-off-by: Matthew Rose <matthew.rose@maxkelsen.com>
58916f4
to
f41e964
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work and explanation! Once we take it to the website, I think it would make users happy
Thanks @matty-rose for the great work and @surajkota for the detailed review! /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: matty-rose, surajkota, yuzisun The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Adds S3 configuration options through the following - `inferenceservice` configmap (global options) - k8s service account annotations for AWS IRSA - k8s secret annotations attached to k8s service account for static credentials Signed-off-by: Matthew Rose <matthew.rose@maxkelsen.com> Signed-off-by: Matthew Rose <matthew.rose@maxkelsen.com> Signed-off-by: alexagriffith <agriffith96@gmail.com>
What this PR does / why we need it:
At the moment to use AWS IRSA functionality for storage initializer, dummy kubernetes secrets need to be created to get kserve to correctly parse s3 annotations and configure storage init container correctly with environment variables (see #2003 (comment))
This PR allows configuration of S3 storage init three ways now:
inferenceservice
configmap so multiple k8s service accounts/IRSA can be configured without needing to repeat annotationsWhich issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #2113
Type of changes
Please delete options that are not relevant.
Feature/Issue validation/testing:
Please describe the tests that you ran to verify your changes and relevent result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
For both tests I followed the developer guide, using
istioctl
1.14.3 to deploy Istio, deploying Knative serving 1.6, and then running themake deploy-dev
command on this branch.Describing pod with
kubectl describe pod -n default -l serving.kserve.io/inferenceservice=mnist-s3
gives the following configuration on storage-init containerinferenceservice
configmapDescribing pod with
kubectl describe pod -n default -l serving.kserve.io/inferenceservice=mnist-s3-configmap
gives following configuration on storage init containerSpecial notes for your reviewer:
Checklist:
Release note: