Blob.rewrite() does not work with batches. #32

Closed
bits01 opened this issue Mar 19, 2018 · 11 comments
Labels: api: storage (issues related to the googleapis/python-storage API); type: feature request ("nice-to-have" improvement, new feature, or different behavior or design).

Comments

@bits01

bits01 commented Mar 19, 2018

google-cloud-storage v1.8.0

Not sure whether rewrites are supposed to work when batched, but it would be nice and useful if they did, otherwise there's no efficient way to copy lots of blobs across buckets in different locations or with different encryption keys.

Example:

with gcs_client.batch():
    dest_blob.rewrite(src_blob)

Traceback:

Traceback (most recent call last):
  File "batch_test.py", line 10, in <module>
    dest_blob.rewrite(src_blob)
  File ".../pyvirtenv/python-common/lib/python2.7/site-packages/google/cloud/storage/blob.py", line 1359, in rewrite
    rewritten = int(api_response['totalBytesRewritten'])
  File ".../pyvirtenv/python-common/lib/python2.7/site-packages/google/cloud/storage/batch.py", line 105, in __getitem__
    raise KeyError('Cannot get item %r from a future' % (key,))
KeyError: "Cannot get item 'totalBytesRewritten' from a future"
@tseaver
Contributor

tseaver commented Mar 19, 2018

I'm afraid that the current batch implementation does not deal well with any methods whose return values depend on the response payload (as opposed to just being able to update the relevant blobs / buckets from those responses when the batch completes).

Adding support for such methods would be a large undertaking.
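
To illustrate the limitation described above, here is a minimal toy model, not the library's actual code. `DeferredResponse`, `read_progress`, and `try_read` are hypothetical names; only the error message is taken from the traceback earlier in this thread. The sketch assumes that, inside a batch, a call returns a placeholder rather than a parsed response.

```python
class DeferredResponse:
    """Toy stand-in (hypothetical) for the placeholder returned inside a batch."""

    def __getitem__(self, key):
        # The batch has not been sent yet, so there is no payload to read.
        raise KeyError("Cannot get item %r from a future" % (key,))


def read_progress(api_response):
    """Mimic a method whose return value depends on the response payload."""
    return int(api_response["totalBytesRewritten"])


def try_read(api_response):
    """Return the parsed value, or a short failure message for a placeholder."""
    try:
        return read_progress(api_response)
    except KeyError as exc:
        return "failed: %s" % exc.args[0]
```

With a real payload such as `{"totalBytesRewritten": "42"}` the lookup succeeds; with the placeholder it fails in the same way the traceback shows.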

@tseaver tseaver closed this as completed Mar 19, 2018
@bits01
Author

bits01 commented Mar 19, 2018

Could the issue be left open and marked as a future enhancement request?

@chemelnucfin
Contributor

@bits01 I have added this issue to our feature request project

@bits01
Author

bits01 commented Mar 19, 2018

Thank you. Should the issue be re-opened then? It's currently marked closed.

@tseaver tseaver reopened this Jun 19, 2018
@IlyaFaer

IlyaFaer commented Jul 29, 2019

I think it can be done in the fashion of googleapis/google-cloud-python#8618, but it's better to deal with 8618 first.

@tseaver tseaver changed the title from "storage: blob.rewrite() does not work with batches" to "Storage: blob.rewrite() does not work with batches." Jul 29, 2019
@tseaver
Contributor

tseaver commented Jul 29, 2019

@IlyaFaer, what is the issue that needs to be dealt with first? googleapis/google-cloud-python#8616 doesn't seem related.

@IlyaFaer

@tseaver, sorry, my mistake: I meant 8618. I'm waiting for a review on it, and if the changes are accepted, we could use something similar for this issue.

@verasativa

Just to be clear: at this moment, is the most reliable way to copy lots of files from one bucket to another in Python (with a retry strategy, batch requests, and object rewrite) to call subprocess.run(["gsutil", "cp",...])?

I was comparing the contributors here with those of the gsutil repo, and they look like totally different, isolated teams. Why?
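
For reference, one alternative to shelling out that avoids batches altogether is to issue the copies concurrently from a thread pool. This is only a sketch under assumptions, not a recommendation from the maintainers: `copy_many` and `copy_one` are hypothetical names, `copy_one(src, dest)` stands for whatever non-batched copy call you use, and whether a single client object can safely be shared across threads should be checked against the library documentation.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_many(pairs, copy_one, max_workers=8):
    """Run copy_one(src, dest) for each pair on a thread pool.

    Returns a list of (pair, error) tuples in input order; error is None
    when the copy succeeded, so failed pairs can be retried by the caller.
    """

    def attempt(pair):
        try:
            copy_one(*pair)
            return pair, None
        except Exception as exc:  # collected per pair instead of aborting the run
            return pair, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(attempt, pairs))
```

Collecting errors per pair, rather than raising on the first failure, leaves the retry strategy to the caller.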

@crwilcox crwilcox transferred this issue from googleapis/google-cloud-python Jan 31, 2020
@product-auto-label product-auto-label bot added the "api: storage" label Jan 31, 2020
@yoshi-automation yoshi-automation added the "🚨 This issue needs some love." and "triage me" labels Feb 3, 2020
@frankyn frankyn added the "priority: p2" and "type: bug" labels and removed the "🚨 This issue needs some love." and "triage me" labels Feb 4, 2020
@yoshi-automation yoshi-automation added the "🚨 This issue needs some love." label Feb 4, 2020
@frankyn frankyn removed the "🚨 This issue needs some love." label Feb 4, 2020
@yoshi-automation yoshi-automation added the "🚨 This issue needs some love." label Feb 4, 2020
@frankyn frankyn added the "type: feature request" label and removed the "🚨 This issue needs some love." and "type: bug" labels Feb 5, 2020
@tseaver tseaver changed the title from "Storage: blob.rewrite() does not work with batches." to "Blob.rewrite() does not work with batches." Aug 17, 2020
@tseaver tseaver removed the priority: p2 Moderately-important priority. Fix may not be included in next release. label Aug 17, 2020
@tseaver
Contributor

tseaver commented Aug 17, 2020

Noted during review: Blob.rewrite returns a 3-tuple, (rewrite_token, bytes_rewritten, object_size), in order to permit tracking progress on the rewrite (it may only complete partially).
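
Outside a batch, that 3-tuple is typically consumed in a loop. The sketch below assumes, per the library documentation, that `rewrite()` accepts a `token` keyword and returns `None` as the token once the rewrite has completed. `rewrite_until_done` and `FakeBlob` are hypothetical names; `FakeBlob` exists only so the loop can be exercised without credentials.

```python
def rewrite_until_done(dest_blob, src_blob):
    """Call dest_blob.rewrite() repeatedly until no rewrite token is returned."""
    token = None
    while True:
        # Each call may copy only part of the object; the token resumes it.
        token, bytes_rewritten, total_bytes = dest_blob.rewrite(src_blob, token=token)
        if token is None:
            return bytes_rewritten, total_bytes


class FakeBlob:
    """Hypothetical stand-in, not part of the library: copies `chunk` bytes per call."""

    def __init__(self, size, chunk):
        self.size, self.chunk = size, chunk
        self.done, self.calls = 0, 0

    def rewrite(self, source, token=None):
        self.calls += 1
        self.done = min(self.size, self.done + self.chunk)
        next_token = None if self.done == self.size else "token-%d" % self.calls
        return next_token, self.done, self.size
```

With real objects, `dest_blob` and `src_blob` would be `Blob` instances, and the call must be made outside `client.batch()`, since the loop reads the response payload immediately.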

@billyvg

billyvg commented Mar 31, 2021

This is also an issue when trying to use batch() to copy blobs while preserving the ACL of the original blob.

@cojenco cojenco self-assigned this Jun 6, 2023
@cojenco
Contributor

cojenco commented Jun 6, 2023

Closing; we've added clarifications on the limited batch support in the Python client. See details in #1045.

@cojenco cojenco closed this as completed Jun 6, 2023