Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage] Fix sky storage delete for externally deleted buckets #1875

Merged
merged 7 commits into from
Apr 29, 2023

Conversation

romilbhardwaj
Copy link
Collaborator

Closes #1322.

If a bucket is deleted externally, then store.delete() now does a no-op (instead of creating and deleting like before). The logging has been updated.

Now:

(base) ➜  repro git:(storage_deletionfix) ✗ SKYPILOT_DEBUG=0 sky storage delete -a
Deleting all storage objects.
# Shows spinner Deleting S3 bucket romilmybucket.
I 04-16 22:47:19 storage.py:1029] S3 bucket romilmybucket may have been deleted externally. Removing from local state.

Tested (run the relevant ones):

@romilbhardwaj
Copy link
Collaborator Author

Thanks for the quick review @landscapepainter! Ready for another look.

landscapepainter

This comment was marked as outdated.

@landscapepainter landscapepainter dismissed their stale review April 21, 2023 06:11

More reviews needed

Copy link
Collaborator

@landscapepainter landscapepainter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested more complicated scenarios and confirmed the log as expected:

  1. create 3 of s3 buckets and 2 of gcs buckets. Remove them all externally and remove them with sky.
  2. create 3 of r2 buckets, remove two of them externally, and remove them all with sky.


# Delete bucket externally
cmd = self.cli_delete_cmd(store_type, tmp_scratch_storage_obj.name)
subprocess.check_output(cmd)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if the bucket was externally removed successfully with assert should be more complete. Perhaps something like,

Suggested change
subprocess.check_output(cmd)
out = subprocess.check_output(cmd)
if isinstance(store_type, storage_lib.StoreType.S3) or isinstance(store_type, storage_lib.StoreType.R2):
assert 'remove_bucket' not in out.decode('utf-8').lower()
elif isinstance(store_type, storage_lib.StoreType.GCS):
assert 'removing' not in out.decode('utf-8').lower()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the removal fails, the awscli and gsutil return a non-zero error code (In which case we'd get a CalledProcessError, which would catch it).

f'Failed to delete S3 bucket {bucket_name}.')
if 'NoSuchBucket' in e.output.decode('utf-8'):
logger.debug(
_BUCKET_EXTERNALLY_DELETED_DEBUG_MESSAGE.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps leaving the same comment as L1635

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answered in the other comment

f'Failed to delete R2 bucket {bucket_name}.')
if 'NoSuchBucket' in e.output.decode('utf-8'):
logger.debug(
_BUCKET_EXTERNALLY_DELETED_DEBUG_MESSAGE.format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding Same comment as L1635

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answered in the other comment

raise exceptions.StorageBucketDeleteError(
f'Failed to delete S3 bucket {bucket_name}.')
if 'NoSuchBucket' in e.output.decode('utf-8'):
logger.debug(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we have L1026, can we remove this message, applying the same to gcs and r2?
Screenshot 2023-04-22 at 7 56 37 PM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this was intentionally left in as logger.debug (which is not shown to user, unlike logger.info). It should be fine to have verbose logging in debug.

Copy link
Collaborator

@landscapepainter landscapepainter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to go!

@romilbhardwaj romilbhardwaj merged commit 1be6d22 into master Apr 29, 2023
15 checks passed
@romilbhardwaj romilbhardwaj deleted the storage_deletionfix branch April 29, 2023 01:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Storage] Confusing logging in sky storage delete
2 participants