tree: Implement upload_fobj protocol to all trees #5307
dvc/tree/gdrive.py
This is a bit unclear. Internally, pydrive2 uses MediaIoBaseUpload:
https://github.com/iterative/PyDrive2/blob/ee368315132bd31bb643bf3d8af088cc512e39f3/pydrive2/files.py#L776-L778
The initializer of this wrapper says the chunk size should be lower than 5MB:
https://googleapis.github.io/google-api-python-client/docs/epy/googleapiclient.http.MediaIoBaseUpload-class.html#__init__
However, the default chunk size is 100 MB, if I am not missing anything (line #4721):
https://googleapis.github.io/google-api-python-client/docs/epy/googleapiclient.http-pysrc.html#MediaIoBaseUpload.__init__
cc: @shcheklein
Yes, the information is confusing. They probably forgot to update the comment, or it should be interpreted in some other way. I see, for example, that rclone uses 8MB (https://forum.rclone.org/t/google-drive-and-optimal-drive-chunk-size/1186/11), and people mention 256M when they deal with very large files. I think it would be a good option to have, and we might indeed consider making it smaller by default to reduce memory pressure.
We use 64 MiB on all other providers, though I think there is no way (as of now) to pass chunk_size into MediaIoBaseUpload. Maybe we should consider adding it as a feature to pydrive2?
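To make the trade-off in this thread concrete, here is a rough sketch (not pydrive2 or DVC code) of how many chunked requests an upload would need at the sizes mentioned above. `requests_needed` is a hypothetical helper, the 10 GiB file is an arbitrary example, and reading the quoted sizes as binary mebibytes is an assumption. Larger chunks mean fewer requests but more data buffered per chunk.

```python
MiB = 1024 * 1024


def requests_needed(file_size: int, chunk_size: int) -> int:
    """Return how many chunks are needed to send file_size bytes.

    Hypothetical helper for illustration only; uses integer ceiling
    division so that a trailing partial chunk counts as one request.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return -(-file_size // chunk_size)


# Chunk sizes mentioned in the discussion (interpreted as MiB).
CANDIDATES = {
    "5 MiB (documented limit)": 5 * MiB,
    "8 MiB (rclone)": 8 * MiB,
    "64 MiB (other providers)": 64 * MiB,
    "100 MiB (library default)": 100 * MiB,
    "256 MiB (very large files)": 256 * MiB,
}

if __name__ == "__main__":
    file_size = 10 * 1024 * MiB  # arbitrary 10 GiB example
    for label, size in CANDIDATES.items():
        print(f"{label}: {requests_needed(file_size, size)} requests")
```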
e357e85 to 1875155
Whoa, something strange is going on with the tests. I don't think it is related to your PR.
@isidentical Fixed the issue in #5333. Please rebase.
799b013 to ce5579a
ce5579a to 1a43bfc
# Needed for some providers, and http open()
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB
Looks like it is now only used in http, so I guess no need to have it in the base class.
# Needed for some providers, and http open()
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB
It's used in the open() call of _transfer_file()
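For readers following this exchange, a minimal sketch of how a chunk-size constant is typically used when transferring a file object. `copy_fobj` is a hypothetical name and this is not DVC's actual `_transfer_file()` or `upload_fobj` code; it uses only the standard library and in-memory streams.

```python
import io

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB, same value as the constant under review


def copy_fobj(src, dst, chunk_size=CHUNK_SIZE):
    """Copy a readable binary file object into a writable one in chunks.

    Hypothetical helper for illustration: at most chunk_size bytes are
    requested per read, which bounds memory use. Returns bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"0123456789")
    target = io.BytesIO()
    copied = copy_fobj(source, target, chunk_size=4)  # tiny chunks for the demo
    print(copied, target.getvalue())
```

A smaller `chunk_size` lowers peak memory at the cost of more read/write calls, which is the same trade-off raised for the gdrive upload earlier in the thread.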
Co-authored-by: Ruslan Kuprieiev <kupruser@gmail.com>
efiop left a comment
Great stuff! 🔥
All trees
Previously implemented
Implemented in this PR