review export code #123

cseed · 2016-01-07T20:37:42Z

No description provided.

cseed · 2016-01-18T22:47:06Z

Duplicate of: #108

* update * update * update * updatE * Create README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * update * Update README.md * Update README.md * Update README.md * Update README.md * update * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * added default Spark configs to init_notebook.py * updatE * update * update * add time delay to allow Jupyter to start * add time delay to allow Jupyter to start * revert log change * update * updatE * updated README * set maxResultSize property to unlimited * changed default worker to n1-standard-8 * merged init_default.py functionality into init_notebook.py * merged init_default.py functionality into init_notebook.py * updated README * updated README * updated README * updated README * updated README * updated README * updated README * updated README * updated README * updated README * updated README * fixed order of code in init script * fixed ports argument in start-up script * moved waiting for Jupyter code to init script * updated alias code block * fixed init script filename * Google Chrome check needs to be fixed * added gitignore * highmem worker by default with --vep. 
* added --hash option to start_cluster.py to reference older Hail builds
* Merge pull request #5 from Nealelab/dev
  added --hash option to start_cluster.py to reference older Hail builds
* decoupled default conf in Jupyter notebook Spark from /etc/spark/conf/spark-defaults.conf
* typo in submit_cluster
* modified init_notebook script
* Update stop_cluster.py
* Now passes extra properties to gcloud
* added ability to specifiy custom Hail jar and zip for Jupyter notebook on startup
* Some tightening of options
* Moving into main
* Removed duplicate keyword argument
* remove duplicate argument
* Added diagnose_cluster.py

  Compiles log files for a cluster to a local directory or a google bucket

  ```
  python diagnose_cluster.py -n my-cluster -d my-cluster-diagnose/
  python diagnose_cluster.py -n my-cluster -d gs://my-bucket/my-cluster-diagnose/
  ```

  ```
  usage: diagnose_cluster.py [-h] --name NAME --dest DEST [--hail-log HAIL_LOG]
                             [--overwrite] [--no-diagnose] [--compress]
                             [--workers [WORKERS [WORKERS ...]]] [--take TAKE]

  optional arguments:
    -h, --help            show this help message and exit
    --name NAME, -n NAME  Cluster name
    --dest DEST, -d DEST  Directory for diagnose output -- must be local
    --hail-log HAIL_LOG, -l HAIL_LOG
                          Path for hail.log file
    --overwrite           Delete dest directory before adding new files
    --no-diagnose         Do not run gcloud dataproc clusters diagnose
    --compress, -z        GZIP all files
    --workers [WORKERS [WORKERS ...]]
                          Specific workers to get log files from
    --take TAKE           Only download logs from the first N workers
  ```

  - Runs `gcloud dataproc clusters diagnose`
  - Grabs following log files from master node

  ```
  /var/log/hive/hive-*
  /var/log/google-dataproc-agent.0.log
  /var/log/dataproc-initialization-script-0.log
  /var/log/hadoop-mapreduce/mapred-mapred-historyserver*
  /var/log/hadoop-hdfs/*-m.*
  /var/log/hadoop-yarn/yarn-yarn-resourcemanager-*-m.*
  /home/hail/hail.log # can be modified with command line argument
  ```

  - Grabs following log files from workers

  ```
  /var/log/hadoop-hdfs/hadoop-hdfs-datanode-*.*
  /var/log/dataproc-startup-script.log
  /var/log/hadoop-yarn/yarn-yarn-nodemanager-*.*
  /var/log/hadoop-yarn/userlogs/*
  ```

  Output directory has following structure:

  ```
  diagnostic.tar
  master/my-cluster-m/...
  workers/my-cluster-w-*/...
  hadoop-yarn/userlogs/application*/container*
  ```

* sec worker fix
* Break apart ssh options. Saw failures with some version of gcloud/ssh.
* Exposed --metadata and fixed problem with creating directory
* Recapitulating subprocess fixes of PR #11
* Fix typo in README
* Added executable
* Updates to support multiple Hail versions and new deployment locations.
  - init_notebook is now versioned for compatibility. This commit uses version 2, which I've uploaded to gs://hail-common/init_notebook-2.py.
  - Hail now deploys both 0.1 and devel versions, so I added an argument to allow either to be used. The stable version should of course be used by default.
  - The init arg is now empty by default, because the init_notebook script should always be run (and requires the compatibility version to decide the correct path). It is still possible to use additional init actions.
* small fix in init_notebook; updated submit script to reflect new Hail deployment
* packaged commands under umbrella 'cluster' module
* updated diagnose
* updated readme; added --quiet flag to stop command
* updated readme with optional arguments
* Update LICENSE.txt
* make notebook default for cluster connect
* Overhaul CLI using argparse subparsers; interface change
  - More informative help messages
  - Added default args to --help output
  - Interface change: module comes before name
* Fixed HAIL_VERSION metadata variable.
* updated setup.py to reflect v1.1 * changed some instances of check_call to call to avoid redundant errors * Remove zsh artifacts from README * added --args option to submit script to allow passing arguments to submitted Hail scripts * incremented to v1.1.2 * Update README.md * Remove sleep * removed Anaconda from notebook init; added --pkgs option to cluster start * Fix deployment issues by bumping compatibility version * fixed jar distribution issues * forgot something * Updating spark version to dataproc 1.2 * a few fixes for 2.2.0 * COMPAT version changes * Made the os.mkdir statements safer and free from race conditions * Fix cloudtools to work with Hail devel / 0.2 (#47) * Update README.md (#48) * Unify hail 0.1 and 0.2 again, fix submit (#49) * Unify hail 0.1 and 0.2 again, fix submit * Fixed submit help message * Bump version * Update init_notebook.py (#51) * Add parsimonious (#52) * Parameterize master memory fraction (#53) * Parameterize master memory fraction * Parameterize master memory fraction * Parameterize master memory fraction * add bokeh to imports (#54) * Use specific version of decorator (#56) * Update README.md (#57) * add modify jar and zip (#59) * * Fixed zip copying (#60) * Added gs:// support * rolling back google-cloud version (#62) * moved up package installation in init script (#63) * use beta for max-idle option (#61) * use beta for max-idle option * bug fix * added Intel MKL to init script (#64) * added Intel MKL to init script * fix * another fix * Update default version to devel / spark 2.2.0; update README (#65) * Update default version to devel / spark 2.2.0; update README * fix * Added initialization time-out option. 
(#71) * add async option to stop (#73) * check for errors in start, stop, submit, and list (#74) * update version to 1.14 (#75) * Syntax error (#76) * fix syntax error * bump versino * add a bucket parameter (#78) * add a bucket parameter * also document deployment * use config files to set some default properties (#77) * do... something * set image based on spark version * tweak to run using paths that deploy will spit out * fix * fix rebase * Set up Continuous Integration (#80) * wip hail ci * fix formatting * ignore emacs temp files * add cluster sanity checks * Update setup.py * Update cluster-sanity-check-0.2.py * Fix CI Build (#81) * Update hail-ci-build.sh * Update hail-ci-build.sh * add more necessary things * fix build image and update file * fix build image maybe * use python2 * fix image * Update hail-ci-build.sh * Update hail-ci-build.sh * Continuous Deployment (#82) * add deploy script * document deployment secret creation * fix readme * fix if check * ensure twine is in build image * kick ci * set required property? apparently? 
* bump to 0.2 (#79) * add make to image (#85) * fix deploy (#86) * copy some lessons from hail (#84) copying some ideas from the discussion at #4241 * Update hail-ci-deploy.sh (#87) * fix (#88) * fix cloudtools published check (#89) * add warning, versioned hash lookup (#90) * fix deploy script version checking (#92) * Test python 3.6 and fix python 3.7 incompatibility (#91) * test python3 * also fix async is reserved word * checked in bad build file * unneeded var * shush pip * kick ci * update build hash * Ignore INT and TERM in shutdown_cluster * parse init script list (#94) * parse init script list * Update __init__.py * switched devel vep to use docker init (#96) * bump version for vep init (#98) * deploy python2 and python3 to pypi (#93) * Update start.py (#99) * Update start.py * Update start.py * Update __init__.py (#100) * fix python3 deploy (#101) * Fix pkgs logic (#102) * Adding more options to modify (#67) * Added options to modify clusters * Update modify.py * Add a max-idle 40m to test clusters (#103) * Add a max-idle 40m to test clusters * need gcloud beta components * Pin dependency versions (#105) * pin dependency versions * update the version of cloudtools * install all packages together to ensure dependencies are calculated together * fail when subprocess fails * fix conda invocation * compatibility with python2 * Revert "fail when subprocess fails" This reverts commit 25e7c0a524823d91894b538427f179611e79f271. 
* blah * wtf * if was backwards * restart tests * Improve Error Messages when Subprocesses Fail (#111) * add and use safe_call * fail when subprocess fails * use safe_call * use safe_call extensively * simplify and make correct safe_call * fix splat * fix * foo * update verison (#113) * Added describe to get Hail file info/schema (#112) * Added describe to get Hail file info/schema * f -> format * Update setup.py * Update __init__.py (#115) * Fix cloudtools (#116) * fix * bump version * fix (#117) * bump ver (#118) * fixed describe ordering for python2 (#119) * devel => 0.2 (#121) * add latest (#120) * added --max-age option. (#123) * added --max-age option. * bump version * update to 3.0.0 (#122) * update to 3.0.0 * bump * bump * s/devel/0.2 * Fix packages again (#124) * Fix packages again * fix * Add 'modify' and 'list' command docs (#125) * Update connect.py (#126) * Rollout fix for chrome 72 (#130) * Add python files or folders from environment variable; zip files together (#127) * Add python files or folders from environment variable; zip files together * bumping version * files -> pyfiles * missed one * overloaded variable * updating VEP init script (#129) * updating VEP init script * Update __init__.py * files -> pyfiles once more (#131) * fix for jupyter/tornado incompatibility (#133) * Adding project flag (#134) * Adding project flag * Adding configuration option as well * Adding support for GRCh38 VEP (#135) * Adding support for GRCh38 VEP * version bump * Fixing VEP version for 38 (#136) * Adding support for GRCh38 VEP * version bump * fix for 38 VEP version * Update __init__.py * Disable stackdriver on cloudtools clusters (#138) * Update default spark version (#139) * Update default spark version * Clean up imports * allowing pass-through args for submit (#140) * allowing pass-through args for submit * bump version * moar version * moved cloudtools to subdirectory project for inclusion in monorepo * moved .gitignore * bump * bump
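
For readers who want to see how the `diagnose_cluster.py` usage text quoted in the commit message above maps onto `argparse`, here is a minimal sketch. It is not the script's actual source. The function name `build_parser`, the `int` type for `--take`, and the `/home/hail/hail.log` default for `--hail-log` are assumptions inferred from the help text and the log-file list, and the `--dest` help string drops the "must be local" note.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the quoted diagnose_cluster.py usage text."""
    parser = argparse.ArgumentParser(prog="diagnose_cluster.py")
    parser.add_argument("--name", "-n", required=True, help="Cluster name")
    parser.add_argument("--dest", "-d", required=True,
                        help="Directory for diagnose output")
    # Assumed default: the master-node log list names /home/hail/hail.log.
    parser.add_argument("--hail-log", "-l", default="/home/hail/hail.log",
                        help="Path for hail.log file")
    parser.add_argument("--overwrite", action="store_true",
                        help="Delete dest directory before adding new files")
    parser.add_argument("--no-diagnose", action="store_true",
                        help="Do not run gcloud dataproc clusters diagnose")
    parser.add_argument("--compress", "-z", action="store_true",
                        help="GZIP all files")
    # nargs="*" accepts zero or more worker names after the flag.
    parser.add_argument("--workers", nargs="*", default=[],
                        help="Specific workers to get log files from")
    # Assumed to be an integer count, since the help text says "first N workers".
    parser.add_argument("--take", type=int, default=None,
                        help="Only download logs from the first N workers")
    return parser


if __name__ == "__main__":
    # Same arguments as the first example invocation in the commit message.
    args = build_parser().parse_args(["-n", "my-cluster", "-d", "my-cluster-diagnose/"])
    print(args.name, args.dest)
```

With this sketch, `--name` and `--dest` are required, while the boolean flags default to off, which matches the bracketed (optional) entries in the usage line.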

cseed self-assigned this Jan 7, 2016

cseed added the duplicate label Jan 18, 2016

cseed closed this as completed Jan 18, 2016

tpoterba pushed a commit to tpoterba/hail that referenced this issue Feb 12, 2019

added --max-age option. (hail-is#123)

23f56f3

* added --max-age option. * bump version

daniel-goldstein referenced this issue in daniel-goldstein/hail Feb 3, 2022

Merge pull request #123 from populationgenomics/upstream-jul23

c6f6f09

Merge upstream (includes 0.2.73)
