make gsutil rsync recursive #297

Merged
pjbull merged 4 commits into drivendataorg:v2 on Mar 16, 2023

@KBodolai (Contributor) commented Dec 13, 2022

Small change to allow gsutil to do rsync recursively, syncing all the folders.

Context: We had some issues using it with gcloud, since make sync_data_up would not upload all the data folders. Adding the -r flag fixes this issue.
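
As a rough illustration of the change (not the actual Makefile diff, which is not shown on this page), the sketch below builds the argument list for a recursive sync. The helper name `rsync_args`, the `data/` directory, and the bucket name `my-bucket` are placeholders invented for this example; `-r` is the recursion flag of `gsutil rsync`.

```python
def rsync_args(src, dst, recursive=True):
    """Return the argv list for a gsutil rsync call (illustrative helper)."""
    args = ["gsutil", "rsync"]
    if recursive:
        # Without -r, gsutil rsync only matches the top-level directory
        # and skips subdirectories.
        args.append("-r")
    return args + [src, dst]


# Placeholder paths; a real project would substitute its own bucket.
print(" ".join(rsync_args("data/", "gs://my-bucket/data/")))
# prints: gsutil rsync -r data/ gs://my-bucket/data/
```

The list form can be handed to `subprocess.run` as is, which avoids shell quoting problems with unusual paths.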

@KBodolai (Contributor, Author) commented

Unsure of why that failed. Is there anything I can do further to help with this?

@pjbull (Member) commented Jan 28, 2023

Thanks @KBodolai. Rebasing onto the latest v2 should now let the tests run.

@KBodolai (Contributor, Author) commented

OK, I've done that; hopefully that does it!

@KBodolai (Contributor, Author) commented

ok, so these passed (yay!)

There's another option that can be passed to gsutil for parallel (multi-threaded) processing, -m, which can provide large performance boosts, although the gsutil documentation mentions that in some cases it may slow the syncing down.

I believe it will typically only result in speedups in the setting where this is used, but I'm not quite certain. What do you think?

@pjbull (Member) commented Mar 16, 2023

@KBodolai I agree that 99% of the use cases here will be cloud <-> local, not local <-> local, so -m likely makes sense. Happy to wait on this until that is added.
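
A minimal sketch of that reasoning, assuming a hypothetical helper `sync_command` and placeholder paths: `-m` is a top-level gsutil option, so it goes before the `rsync` subcommand. Here it is added only when one endpoint is a `gs://` URL. The merged change may well pass `-m` unconditionally; the conditional is only there to illustrate the cloud versus local distinction.

```python
def is_remote(url):
    """True if the endpoint is a Cloud Storage URL rather than a local path."""
    return url.startswith("gs://")


def sync_command(src, dst):
    """Build a gsutil rsync argv list (illustrative, not the project's code)."""
    cmd = ["gsutil"]
    if is_remote(src) or is_remote(dst):
        # -m enables parallel transfers; it is a gsutil-level option,
        # so it must come before the rsync subcommand.
        cmd.append("-m")
    cmd += ["rsync", "-r", src, dst]
    return cmd


# Placeholder paths for a cloud sync and a purely local sync.
print(sync_command("data/", "gs://my-bucket/data/"))
print(sync_command("data/", "/mnt/backup/data/"))
```

The first call includes `-m` and the second does not, which matches the cloud <-> local versus local <-> local split described in the comment.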

@KBodolai (Contributor, Author) commented

Right, I'll add it and set up a test project to double-check that it works well. I'll ping you when it's ready :)

@KBodolai (Contributor, Author) commented

@pjbull, actually, I stayed a bit late after work and tested it with a bunch of heavy lidar files; all working fine!

pjbull merged commit 4d04432 into drivendataorg:v2 on Mar 16, 2023
milescsmith pushed a commit to milescsmith/cookiecutter-data-science that referenced this pull request Sep 20, 2023
* make gsutil rsync recursive

* reformatted with black

* add threaded flag for gsutil rsync
pjbull added a commit that referenced this pull request May 22, 2024
* [WIP] New version with cleaner options (#162)

* WIP - New version with cleaner options

* Fix find-replace error (#177)

* Remove unnecessary .gitkeep

* Remove unused tox.ini

* Split reqs into dev/non-dev

* Add basic packages support

* Add tests for testing environment creation and requirements

* Set up CI with Azure Pipelines (#194)

* Change archived asciinema example (#163)

* Change archived asciinema example

* Update README.md

Fix Asciinema powerline error

* Update docs to show updated asciinema example

* Added source and destination to Make data target (#169)

* Fix broken Airflow link (#182)

* Fixed: Typo in Makefile (#184)

Fixed typo in Makefile, section "Set up python interpreter environment": intalled --> installed

* Set up CI with Azure Pipelines

[skip ci]

* Update azure-pipelines.yml for Azure Pipelines

* Update azure-pipelines.yml for Azure Pipelines

* Update azure-pipelines.yml for Azure Pipelines

* str paths for windows support

* handle multiple data providers (#199)

* Add missing env directory bin/activate path

* Remove version from PYTHON_INTERPRETER command

* Search for virtualenvwrapper.sh path if executable not found

* Try chardet for character encoding detection

* Specify python and virtualenv binaries for virtualenvwrapper

* Add shebang to virtualenvwrapper.sh

* Diagnostic

* Try virtualenvwrapper-win

* Set encoding if detected None

* Fixes to Mac and Windows tests on Azure pipelines (#217)

* Temporarily comment out py36

* Update azure-pipelines.yml

* Fix tests on Windows and Mac (#1)

* Temporarily remove py37

* Update virtualenv_harness.sh

* put py37 back in

* Set encoding to utf-8

* Comment out rmvirtualenv

* Update test_creation.py

* Update virtualenv_harness.sh

* Add --show-capture

* Update azure-pipelines.yml

* Update azure-pipelines.yml

* Update test_creation.py

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update Makefile

* Update virtualenv_harness.sh

* Update cookiecutter.json

* Update cookiecutter.json

* Update virtualenv_harness.sh

* Update Makefile

* Update Makefile

* Update Makefile

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update Makefile

* Update Makefile

* Update Makefile

* Update Makefile

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update Makefile

* Update Makefile

* Update virtualenv_harness.sh

* Update Makefile

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update test_creation.py

* Update azure-pipelines.yml

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update virtualenv_harness.sh

* Update cookiecutter.json

* Update conda_harness.sh

* Update conda_harness.sh

* Update conda_harness.sh

Co-authored-by: Eric Jalbert <ericmjalbert@users.noreply.github.com>
Co-authored-by: Jonathan Raviotta <jraviotta@users.noreply.github.com>
Co-authored-by: Wes Roach <wesr000@gmail.com>
Co-authored-by: Christopher Geis <16896724+geisch@users.noreply.github.com>
Co-authored-by: Peter Bull <pjbull@gmail.com>
Co-authored-by: Ian Preston <17241371+ianepreston@users.noreply.github.com>
Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>
Co-authored-by: inchiosa <4316698+inchiosa@users.noreply.github.com>

* More graceful deprecation

* Make tests pass locally

* test version match installed version

* Remove unused imports

* Unremove used import

* Move to GH Actions

* Fix typo

* Test non-windows

* Add netlify configs

* Update suggestion to keep using deprecated cookiecutter template (#231)

* Add mkdocs requirements file to docs directory

* Try setting python version in runtime txt for netlify

* Trigger build

* Python 3.8 netlify

* Python 3.6 netlify

* Do not specify python runtime for netlify

* Use 3.7

This reverts commit 898d7d3.

Co-authored-by: James Myatt <james@jamesmyatt.co.uk>
Co-authored-by: drivendata <info@drivendata.org>
Co-authored-by: Eric Jalbert <ericmjalbert@users.noreply.github.com>
Co-authored-by: Jonathan Raviotta <jraviotta@users.noreply.github.com>
Co-authored-by: Wes Roach <wesr000@gmail.com>
Co-authored-by: Christopher Geis <16896724+geisch@users.noreply.github.com>
Co-authored-by: Ian Preston <17241371+ianepreston@users.noreply.github.com>
Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>
Co-authored-by: inchiosa <4316698+inchiosa@users.noreply.github.com>
Co-authored-by: Robert Gibboni <robert@drivendata.org>

* docs: add dev-requirements, remove tox.ini (#254)

* Formatting

* Docs and test cleanup

* 3.10 specificaiton

* Passing tests and lint

* testing improvements and hooks refactor

* Lint, format, license (#247)

* format config files

* make format

* update readme

* no license, py version, update lint format commands

* revert accidental lic removal

Co-authored-by: Peter Bull <pjbull@gmail.com>

* Updating the Makefile

* Removing deprecation message from v2

* update pip, reqs, test CI commands

* Update docs build

* Revert netlify changes

* Use default netlify build runtime

* make gsutil rsync recursive (#297)

* make gsutil rsync recursive

* reformatted with black

* add threaded flag for gsutil rsync

* V2 bug fixes, modernization, house-keeping, and README update (#313)

* Escape single quotes in sphinx conf. Fixes #142

* Use pyproject.toml package metadata. Fix quote escaping. Fixes #248

* Remove unnecessary pep8 configuration

* Update gitignores

* Use pyproject.toml for package metadata

* Update README for v2

* Add Python 3.11 and 3.12 to CI

* Remove unneeded requirements file reference

* Docs updates

* Update linting and tests

* Fix lint

* Missed one

* Split out code quality job

* Hold on 3.12 for now

* Fix wrong conda environment command

* Fix option name

* Change back and remove python spec since it should already be in environment.yml

* Change hyperlinks to https that support it

* Update joblib link. Supersedes #251

* Remove Python 3.12 for now

* Pin Python version in project package metadata

* Use compatible release operator

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* Add windows to CI (#319)

* Add windows to CI

* Try different shebang per actions/runner-images#7253 (comment)

* Revert "Try different shebang per actions/runner-images#7253 (comment)"

This reverts commit 4a8f093.

* Try setting PATH

* Reduce matrix for testing

* Set default shell to bash

* Fix wrong syntax

* Explicitly set bash path in test

* Fix syntax error

* Fix mistake

* Try installing GNU make instead of using available mingw32-make

* Undo explicit bash executable

* Set conda executable

* Fix black

* Add conda directory to system paths

* Clearer phony declaration

* Add diagnostic printing

* More printing

* Check which make

* Try adding Git bin directory to path

* Explicit bash shell

* Try adding explicit PATH

* Try making path style consistent

* Explicit conda executable

* try using which

* Try Scripts directory for adding to PATH

* Try removing unnecessary conda stuff

* Separate check step

* Install from conda-forge

* Missing colon

* Try setup miniconda to see if it's faster

* Change default shell

* Use conda environment

* Skip which conda

* Add to path anyways

* Try micromamba

* Use mamba as conda executable

* Fix typo

* micromamba shell hook

* Missing quote

* Full micromamba path; add caching

* Try path instead

* Remove backslash

* Use mambaforge instead of micromamba

* Add conda to PATH

* Use conda activate

* Try permissions hack

* Remove unneeded packages

* Use different repo name per config

* Cleanup

* Fallbacks

* Fix backwards check

* Print test durations

* Try conda and defaults again

* Don't change shell

* Normal shell

* Clean up

* More cleanup

* More cleanup

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* Optionally remove boilerplate in initial setup (#286)

* Make skeleton code optional

* Updates to skeleton code

* make lint pass

* add --profile black to isort command for black compatibility

* add --profile black to generated Makefile

* remove setup.py from linting/formatting;

* fix code scaffold tests; revert adding isort to project makefile

---------

Co-authored-by: Peter Bull <pjbull@gmail.com>

* Replace sphinx with mkdocs as default docs option, add none option (#295)

* create alternative mkdocs docs path and add option to use sphinx, mkdocs, or none

* add tests that docs folder looks correct

* add author if provided to mkdocs config

* fix yml typo

* make mkdocs default, do not remove sphinx yet

* reconfigure tests to cycle through all possible values for fields that are nominally independent

* change formatting

* cast docs paths to strings to satisfy shutil move in python 3.8

* remove sphinx

* remove docs folder tree; make docs removal generic

* Create new screencast automatically on docs build (#339)

* Create new screencast automatically on docs build

* dev reqs

* Remove tree from replay

* Update docs/scripts/generate-termynal.py

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

* Update docs/scripts/generate-termynal.py

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

* Apply suggestions from code review

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

* Apply suggestions from code review

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

---------

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

* Update Makefile help command to work on all platforms (#335)

* update python script and variable reference so help printing works on all platforms

* add blank make command testing output of available commands

* add test for makefile help text

* Update docs to material theme (#341)

* update docs to material theme

* update style and logo

* rearrange and update content

* docs/requirements.txt

* update workflow to skip if docs-only change, remove netlify config

* write to tempfile instead of hardcoded dir to prevent unintended persistent files from mkdocs-gen-files

* Update docs to use multiple pages (#344)

* change site structure to use multiple pages

* add navigation.instant

* remove padding override

* remove toc.follow

* revert content change

* Add workflows for Python 3.12 (#329)

* add python 3.12 vm env to test suite and update pyproject.toml classifiers to indicate support

* Bump actions versions

* Swap out fake/abandoned jupyter metapackage for jupyterlab and notebook

* Try conda-forge

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* Update opinions (#345)

* update opinions

* fix dag link

* typos and formatting

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* try rewrite of dag section

* formatting nits and adding subtitles

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Update docs/docs/opinions.md

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* fix link to contributing page

---------

Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>

* Badge + docs updates (#348)

* Badge on docs updates

* default to CCDS template

* Style updates

* darken links a bit

* block quotes

* formatting and bare ccds

* Apply suggestions from code review

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

---------

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

* Minor CSS Tweaks (#349)

* tweak code block formatting and color correctness

* update note infobox styles

* Add commandline options table (#350)

* Add commandline options table

* wack isort ordering

* Make sections

* Format

* Use shutil.copytree instead of distutils (#353)

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* Add documentation about Make on Windows (#355)

* Fix typo

* Add instructions for Make on Windows

* Fix typo

* Copy edits

* Add note about help

* Consistent code formatting

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* V2 Modernize boilerplate (#354)

* WIP

* Remove setuptools, add tqdm

* Lint

* Don't need setuptools or wheel

* Add comment fences

* Can't believe E266 is a thing

* Whitespace

* Add linting to tests

* Remove template from project linting and formatting

* Reorganize boilerplate modules

* Get rid of editable check

* Actually run linting

* Actually test linting

* Update verify_files test

* Update verify_folders

* Remove duplicate file

* Fix lint

---------

Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>

* Updates from review of V2 docs (#362)

* Review comments

* 3.8 minimum

* macos-13

* try setting up conda

* don't use imported variable

* Update docs/docs/opinions.md

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

---------

Co-authored-by: Chris Kucharczyk <chris@drivendata.org>

---------

Co-authored-by: James Myatt <james@jamesmyatt.co.uk>
Co-authored-by: drivendata <info@drivendata.org>
Co-authored-by: Eric Jalbert <ericmjalbert@users.noreply.github.com>
Co-authored-by: Jonathan Raviotta <jraviotta@users.noreply.github.com>
Co-authored-by: Wes Roach <wesr000@gmail.com>
Co-authored-by: Christopher Geis <16896724+geisch@users.noreply.github.com>
Co-authored-by: Ian Preston <17241371+ianepreston@users.noreply.github.com>
Co-authored-by: Jay Qi <jayqi@users.noreply.github.com>
Co-authored-by: inchiosa <4316698+inchiosa@users.noreply.github.com>
Co-authored-by: Robert Gibboni <robert@drivendata.org>
Co-authored-by: Phil <4689707+tgrrr@users.noreply.github.com>
Co-authored-by: Tammy Glazer <47676359+tamara-glazer@users.noreply.github.com>
Co-authored-by: AllenDowney <downey@allendowney.com>
Co-authored-by: Kristian Bodolai <69843715+KBodolai@users.noreply.github.com>
Co-authored-by: Jay Qi <2721979+jayqi@users.noreply.github.com>
Co-authored-by: Chris Kucharczyk <chris@drivendata.org>