-
Notifications
You must be signed in to change notification settings - Fork 515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Allow install-conda-env.sh to access conda env create functionality #58
ENH: Allow install-conda-env.sh to access conda env create functionality #58
Conversation
ENH: Add spark-csv as dependency to Jupyter pyspark kernel BUG: Change from tee to > for jupyter log file redirect BUG: jupyter command not found, updating PATH from within launch-jupyter.sh the same way setup-jupyter.sh does ENH: Clean up launch-jupyter-interface.sh ENH: Use python yaml package to parse cluster metadata BUG: Fix PATH in kernel-based setup/launch scripts ENH: Add zone as argument for launch-jupyter-interface.sh
We found a Contributor License Agreement for you (the sender of this pull request) and all commit authors, but as best as we can tell these commits were authored by someone else. If that's the case, please add them to this pull request and have them confirm that they're okay with these commits being contributed to Google. If we're mistaken and you did author these commits, just reply here to confirm. |
@dennishuo: any advice for how to get @jeffkpayne all squared away with the CLA on this PR? Thanks! |
@nehalecky There's still no automated process for the multi-author case, but any "LGTM" or +1 comment from @jeffkpayne on this PR will satisfy the current requirements for a committer to merge the PR despite the googlebot label. |
LGTM |
@@ -13,15 +13,17 @@ sudo apt-get install -y git | |||
# > Command failed: /bin/bash -x -e | |||
# > /home/vagrant/miniconda-3.18.3/conda-bld/work/conda_build.sh | |||
# If directory doesn't exist, attempt to create it. | |||
if [ ! -d "~/.local/share/jupyter/" ]; then | |||
if [[ ! -d "~/.local/share/jupyter/" ]] | |||
then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, do you currently have a particular shell style guide preference? We haven't quite been consistent ourselves in enforcing shell style, though we've been more strict in some subprojects than others (especially if produced initially within Google) and generally follow the Google bash style guide
For now I'm okay with some variation since we haven't really been providing guidance on style guide decisions yet and certainly no need to block this PR on updating style consistency on unrelated parts of the scripts, but as a heads up, in the near future we may go through and do some cleanup to improve style consistency between the init actions (which in this case would change back to "if [[ ! -d ... ]]; then" all on one line).
Let me know if such refactoring would be a potential pain point for you to re-merge into your forks and/or whether you'd prefer to update any pieces accordingly within this PR before I test and merge it.
Hi @dennishuo, thanks for the comments and feedback above. In short, we'd be fine with changes you're proposing, and agree that it'd be great to follow a shell style convention—thanks for the link! We'd like to help, but likely better that you do a first pass, which would help us learn the style as applied to this code. At quick glance, I didn't see a convention for how to include default setting for environment variables, and hope you find our convention useful. Expanding on a few more thoughts:
Indeed, the complexity of our init action scripts continues to grow in a number of ways:
We're of the opinion that, before we try and reinvent a provisioning library entirely in bash, we're at a critical point now where we should consider alternatives. We also agree that python would be a straightforward option, and in particular adoption of a higher level provisioning library (like ansible and salt, which already have playbooks and recipes that support installing many), would be a significant benefit (as mentioned in #15). Other alternatives might be in support of containers, etc. We see this as saving us a considerable amount of effort supporting and expanding capabilities in provisioning a Dataproc cluster. So much so that, we're considering an init action to simply install one of these libraries across the cluster so we can leverage such functionality, however, we'd like to engage with you to ensure our efforts align with whatever roadmap you have established for this. Hope that's a possibility. Thanks again, @dennishuo, appreciate your efforts. :) |
@nehalecky good points; the good news is that even though initially we had required strictly shell scripts for init actions, nowadays any executable can be used as an init action, so it should work today to use Python in place of bash, just making sure to have the shebang specifying python Of course, that wouldn't solve the problems of custom shared init-action libraries, managing init action dependencies, etc. The roadmap for higher-level provisioning libraries is still a bit murky, but we'll be happy to start bouncing ideas back and forth. Ansible might not be a good fit because we'd generally want to make sure the provisioning system plays well with runtime scaling adding/removing VMs safely even in the middle of a job, so running on new nodes and keeping the hosts file in sync could be tricky. On the other hand it may still be possible to reuse Ansible modules to some extent inside of init actions. If you were to create an init action for installing a provisioning framework across the cluster, would it be more for extensibility of what to deploy on the initial deployment, or would it be as an admin tool for continuing to modify the cluster as it's already running? I filed #59 for further discussion of extensibility models. |
@dennishuo. Thanks. Wow, great to learn about support for python scripts, too! Excited to jump in with further discussions on #59 and hope to do so by the end of this week. :) |
Looks good, tested cleanly:
|
This is a substantial refactor of the conda init action:
bootstrap-conda.sh
andinstall-conda-env.sh
, allowing more configuration of both