-
Notifications
You must be signed in to change notification settings - Fork 991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Google Cloud Storage support to Storage Spec #3495
Open
tjandy98
wants to merge
9
commits into
kserve:master
Choose a base branch
from
tjandy98:storage-spec-gcs
base: master
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
ee9f271
Add gcs storage spec support
tjandy98 98c58bf
Add test for gs storage type
tjandy98 7e72cbe
Merge branch 'master' into storage-spec-gcs
tjandy98 1187eb9
Merge branch 'master' into storage-spec-gcs
tjandy98 1b99800
Fix formatting
tjandy98 d73d26a
Merge branch 'storage-spec-gcs' of https://github.com/tjandy98/kserve…
tjandy98 f917fa1
Flake8
tjandy98 36a0b80
Use `from_service_account_info` for loading service account
tjandy98 7041a31
Trigger CI
tjandy98 File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,7 @@ | |
# limitations under the License. | ||
|
||
import base64 | ||
import binascii | ||
import glob | ||
import gzip | ||
import json | ||
|
@@ -36,6 +37,7 @@ | |
from botocore import UNSIGNED | ||
from botocore.client import Config | ||
from google.auth import exceptions | ||
from google.oauth2 import service_account | ||
from google.cloud import storage | ||
|
||
MODEL_MOUNT_DIRS = "/mnt/models" | ||
|
@@ -147,6 +149,24 @@ def _update_with_storage_spec(): | |
f.write(value) | ||
f.flush() | ||
|
||
if storage_secret_json.get("type", "") == "gs": | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we add a test? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe move gs and other supported types to constants? |
||
if storage_secret_json.get("base64_service_account", "") != "": | ||
try: | ||
base64_service_account = storage_secret_json.get( | ||
"base64_service_account", "" | ||
) | ||
service_account_str = base64.b64decode( | ||
base64_service_account | ||
).decode("utf-8") | ||
service_account_dict = json.loads(service_account_str) | ||
os.environ["GOOGLE_SERVICE_ACCOUNT"] = json.dumps( | ||
service_account_dict | ||
) | ||
except binascii.Error: | ||
raise RuntimeError("Error: Invalid base64 encoding.") | ||
except UnicodeDecodeError: | ||
raise RuntimeError("Error: Cannot decode string.") | ||
|
||
@staticmethod | ||
def get_S3_config(): | ||
# default s3 config | ||
|
@@ -222,8 +242,10 @@ def _download_s3(uri, temp_dir: str): | |
# Skip where boto3 lists the directory as an object | ||
if obj.key.endswith("/"): | ||
continue | ||
# In the case where bucket_path points to a single object, set the target key to bucket_path | ||
# Otherwise, remove the bucket_path prefix, strip any extra slashes, then prepend the target_dir | ||
# In the case where bucket_path points to a single object, | ||
# set the target key to bucket_path | ||
# Otherwise, remove the bucket_path prefix, strip any extra slashes, | ||
# then prepend the target_dir | ||
# Example: | ||
# s3://test-bucket | ||
# Objects: /a/b/c/model.bin /a/model.bin /model.bin | ||
|
@@ -258,7 +280,8 @@ def _download_s3(uri, temp_dir: str): | |
logging.info("Downloaded object %s to %s" % (obj.key, target)) | ||
file_count += 1 | ||
|
||
# If the exact object is found, then it is sufficient to download that and break the loop | ||
# If the exact object is found, then it is sufficient to download that | ||
# and break the loop | ||
if exact_obj_found: | ||
break | ||
if file_count == 0: | ||
|
@@ -275,9 +298,18 @@ def _download_s3(uri, temp_dir: str): | |
@staticmethod | ||
def _download_gcs(uri, temp_dir: str): | ||
try: | ||
storage_client = storage.Client() | ||
credentials = None | ||
if "GOOGLE_SERVICE_ACCOUNT" in os.environ: | ||
google_service_account = json.loads( | ||
os.environ["GOOGLE_SERVICE_ACCOUNT"] | ||
) | ||
credentials = service_account.Credentials.from_service_account_info( | ||
google_service_account | ||
) | ||
storage_client = storage.Client(credentials=credentials) | ||
except exceptions.DefaultCredentialsError: | ||
storage_client = storage.Client.create_anonymous_client() | ||
|
||
bucket_args = uri.replace(_GCS_PREFIX, "", 1).split("/", 1) | ||
bucket_name = bucket_args[0] | ||
bucket_path = bucket_args[1] if len(bucket_args) > 1 else "" | ||
|
@@ -359,9 +391,9 @@ def _download_hdfs(uri, out_dir: str): | |
# Remove hdfs:// or webhdfs:// from the uri to get just the path | ||
# e.g. hdfs://user/me/model -> user/me/model | ||
if uri.startswith(_HDFS_PREFIX): | ||
path = uri[len(_HDFS_PREFIX) :] | ||
path = uri[len(_HDFS_PREFIX) :] # noqa: E203 | ||
else: | ||
path = uri[len(_WEBHDFS_PREFIX) :] | ||
path = uri[len(_WEBHDFS_PREFIX) :] # noqa: E203 | ||
|
||
if not config["HDFS_ROOTPATH"]: | ||
path = "/" + path | ||
|
@@ -443,7 +475,8 @@ def _download_azure_blob(uri, out_dir: str): # pylint: disable=too-many-locals | |
) | ||
if token is None: | ||
logging.warning( | ||
"Azure credentials or shared access signature token not found, retrying anonymous access" | ||
"Azure credentials or shared access signature token not found, \ | ||
retrying anonymous access" | ||
) | ||
|
||
blob_service_client = BlobServiceClient(account_url, credential=token) | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? Why wasn't this failing before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @terrytangyuan , thanks for your feedback.
Yes, this change is intentional. Before this PR, the 'gs' type wasn't supported, the test would fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite related to Dan comment, but I don't think it makes sense to force this to fail, by doing this change. The test should be reworked to properly validate the GS support (or fully remove it, and create a new one from stracth).