make gsutil rsync recursive #297

Merged
merged 4 commits into from Mar 16, 2023

@KBodolai KBodolai commented Dec 13, 2022

Small change to allow gsutil to do rsync recursively, syncing all the folders.

Context: We had some issues using it with gcloud, since make sync_data_up would not upload all the data folders. Adding the -r flag fixes this issue.
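For readers following along, here is a minimal sketch of what the `-r` flag changes in the command that `make sync_data_up` ends up running. It is not the project's actual code: the helper name `build_sync_up_command` and the bucket `gs://example-bucket` are hypothetical, and the sketch only builds the argument list rather than calling gsutil.

```python
from typing import List


def build_sync_up_command(
    local_dir: str, bucket_uri: str, recursive: bool = True
) -> List[str]:
    """Return the argv list for syncing a local directory up to a bucket.

    Hypothetical helper for illustration; it does not execute anything.
    """
    command = ["gsutil", "rsync"]
    if recursive:
        # -r makes rsync descend into subdirectories. Without it, gsutil
        # only syncs the top level of local_dir and skips nested folders.
        command.append("-r")
    command += [local_dir, bucket_uri]
    return command


if __name__ == "__main__":
    # Hypothetical bucket name, used only to show the resulting command.
    print(" ".join(build_sync_up_command("data/", "gs://example-bucket/data/")))
```

The resulting list can be passed to `subprocess.run` if you want to run it from Python instead of from the Makefile.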

@KBodolai
Author

Unsure of why that failed. Is there anything I can do further to help with this?

@pjbull
Collaborator

pjbull commented Jan 28, 2023

Thanks @KBodolai. Rebasing onto the latest v2 should let the tests run now.

@KBodolai
Author

ok, I've done that, hopefully it'll be it!

@KBodolai
Author

ok, so these passed (yay!)

There's another flag that can be passed to gsutil for multi-threaded processing (-m), which can provide huge performance boosts, although the gsutil docs mention that in some cases it may slow the syncing down.

I believe it will typically only result in speedups for the setting this is used in, but I'm not quite certain. What do you think?

@pjbull
Collaborator

pjbull commented Mar 16, 2023

@KBodolai I agree that 99% of the use cases here will be cloud <-> local, not local <-> local, so -m likely makes sense. Happy to wait on this until that is added.
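To make the cloud <-> local reasoning concrete, here is a rough sketch that adds `-m` only when at least one side of the sync is a Cloud Storage URL. This is an illustration, not what the PR does (the commit list only says a threaded flag was added). The helper names are hypothetical, and checking for a `gs://` prefix is a simplifying assumption.

```python
from typing import List


def is_cloud_url(path: str) -> bool:
    """True for Cloud Storage URLs such as gs://bucket/prefix.

    Assumption: only gs:// counts as "cloud" in this sketch.
    """
    return path.startswith("gs://")


def build_rsync_command(src: str, dst: str) -> List[str]:
    """Build a gsutil rsync argv list, parallelised for cloud transfers."""
    command = ["gsutil"]
    if is_cloud_url(src) or is_cloud_url(dst):
        # -m is a top-level gsutil option, so it must come before the
        # rsync subcommand rather than after it.
        command.append("-m")
    # -r is an rsync option and therefore follows the subcommand.
    command += ["rsync", "-r", src, dst]
    return command


if __name__ == "__main__":
    # Hypothetical bucket name, for illustration only.
    print(" ".join(build_rsync_command("data/", "gs://example-bucket/data/")))
    print(" ".join(build_rsync_command("data/", "backup/data/")))
```

Note the ordering: `-m` sits between `gsutil` and `rsync`, while `-r` comes after `rsync`. Always passing `-m` is the simpler choice if nearly every sync involves a bucket.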

@KBodolai
Author

right, I'll add it and set up a test project to double check it does well, I'll ping you when it's ready :)

@KBodolai
Author

@pjbull, actually, I stayed a bit late after work and tested it with a bunch of heavy lidar files, all working fine!

@pjbull pjbull merged commit 4d04432 into drivendata:v2 Mar 16, 2023
8 checks passed
milescsmith pushed a commit to milescsmith/cookiecutter-data-science that referenced this pull request Sep 20, 2023
* make gsutil rsync recursive

* reformatted with black

* add threaded flag for gsutil rsync