SLURM: script.job_environment does not support --export=ALL ? #109

Open
michaelkarlcoleman opened this Issue Nov 27, 2018 · 14 comments

michaelkarlcoleman commented Nov 27, 2018

Not sure if this is missing documentation or a missing feature. The job_environment YAML setting does not appear to support the ALL option, nor NONE, nor exporting a variable without specifying its value. The ALL case is particularly important, as it looks like OOD itself hard-codes a default of NONE.

A partial workaround that could be documented is to put --export=ALL in native.
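For reference, a minimal sketch of that workaround, assuming it goes in an app's submit YAML (e.g. submit.yml.erb; the exact file is site- and app-specific):

script:
  native:
    - "--export=ALL"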

From the srun man page:

       --export=<environment variables [ALL] | NONE>
              Identify which environment variables are propagated to the
              launched application, by default all are propagated. Multiple
              environment variable names should be comma separated.
              Environment variable names may be specified to propagate the
              current value (e.g. "--export=EDITOR") or specific values may
              be exported (e.g. "--export=EDITOR=/bin/emacs"). In these two
              examples the environment propagated will only contain the
              variable "EDITOR" and nothing else. If one desires to add to
              the environment instead of replacing it have the argument
              include ALL (e.g. "--export=EDITOR,ALL"). This will propagate
              "EDITOR" along with the current environment. If one desires no
              environment variables be propagated use the argument NONE.
              Regardless of this setting, the appropriate "SLURM_*" task
              environment variables are always exported to the environment.

@MorganRodgers transferred this issue from OSC/ood-documentation Nov 28, 2018

ericfranz commented Nov 28, 2018

@michaelkarlcoleman what does your job_environment yaml look like?

I would expect that you would be able to do this:

script:
  job_environment:
    SBATCH_EXPORT: "ALL"

Are you doing this and it's still not working?

I'm also curious which environment variables you want to propagate from the PUNs. I think SBATCH_EXPORT was set to NONE by default because, when launching the PUNs, sudo strips the environment, so the user's bashrc/bash_profile, for example, will not be sourced and the PUN environment will contain env vars relevant only to the web environment, not the compute environment. Perhaps these assumptions are wrong. Understanding your use case will be helpful when we update the documentation.

As for exporting specific environment variables instead of none: if SBATCH_EXPORT worked with job_environment, I suppose we could specify which env vars to set instead of ALL.

FWIW this commit introduced this behavior, though unfortunately the commit doesn't have a descriptive message, so it isn't clear why.

michaelkarlcoleman commented Nov 28, 2018

Hmm. I have to admit I find that code pretty meta and confusing. (I haven't used Ruby in a decade, which doesn't help.) It looks like the script.job_environment value will be put in a dict (which has a default pair mapping SBATCH_EXPORT to NONE), and then that dict will be turned into an srun --export flag.

So for your example, something like

srun --export=SBATCH_EXPORT=ALL

rather than the

srun --export=ALL

I'd like to use. This doesn't seem like it would work, and in testing it seems not to. (Wiping out cached files and processes is relatively complex, so it's possible that my test was not correct.)

Looking at that commit and considering the above, I'm not sure the manipulation of SBATCH_EXPORT there makes sense.

As for the particulars of our site, I haven't run it all the way down, but our setup depends on exporting the entire environment from the user's login node (where srun is run) to the compute nodes. There are minuses to this, but it does generally make life easier for users. (This is complicated a bit by the fact that SLURM was recently changed from a default like --export=ALL to a default like --export=NONE.) Anyway, without this setting, variables like HOME and PATH are unset, leading to immediate disaster when trying to run an OOD job/desktop.

michaelkarlcoleman commented Nov 28, 2018

I ran into another issue with mate desktop startup, and upon reflection, it appears to be related to this. The symptom is that mate startup fails with a bunch of glib schema errors. It turns out that this can be worked around by simply unsetting XDG_DATA_DIRS in the mate script. That variable is apparently being set with its value from one of the Passenger processes, due to the ALL setting above. Unfortunately, this blocks search of the standard location for compiled schemas.

Rather than unsetting, a plausible alternative would be to add the default values for this variable, which appear to be /usr/local/share/:/usr/share/.
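A minimal sketch of that alternative, assuming it goes in the app's before.sh.erb (or a similar per-app script); the defaults shown are the usual XDG values, not verified on every distribution:

# Append the standard locations instead of unsetting the variable entirely.
export XDG_DATA_DIRS="${XDG_DATA_DIRS:+$XDG_DATA_DIRS:}/usr/local/share/:/usr/share/"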

Even better would be to avoid passing most of the OOD-specific variable changes through to the users' processes. There are quite a few others, but most seem not to matter that much, at least in testing so far.

ericfranz commented Nov 29, 2018

It looks like the script.job_environment value will be put in a dict (which has a default pair mapping SBATCH_EXPORT to NONE), and then that dict will be turned into a srun --export flag.

Close. The dict (a Ruby Hash) is eventually passed as the env argument to Open3.capture3, which under the hood calls the execve system call, which accepts the environment as an argument (the envp of int execve(const char *filename, char *const argv[], char *const envp[])). So the --export flag is never used. But it appears that if the SBATCH_EXPORT env var is set and the --export flag is also set, the --export flag takes precedence.
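As a minimal illustration of that mechanism (the command and script content here are hypothetical, not the actual ood_core code):

require 'open3'

# The env hash passed as the first argument is merged into the child process
# environment by Process.spawn/execve; no --export flag is involved here.
env = { "SBATCH_EXPORT" => "NONE" }
out, err, status = Open3.capture3(env, "sbatch", "--parsable",
                                  stdin_data: "#!/bin/bash\nhostname\n")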

Also, the adapter is actually executing sbatch instead of srun. Does that pose a problem? Are there reasons why you would want to run srun instead?

As for setting the environment to be similar to what users have when logging into a login node, it might actually be good to ask what other Slurm sites do. Because the PUNs are started using sudo, which wipes the environment, by default most of the user's environment, and especially customizations in ~/.bash_profile and ~/.bashrc, is not loaded.

I'm wondering, would it be possible for you to produce a set of commands that could be executed at the start of any job to set up the user's environment properly, while retaining the current SBATCH_EXPORT=NONE or --export=NONE? There is a mechanism in place in OnDemand to support this type of configuration. Otherwise I'll see what other Slurm sites do.

Also, I do apologize for the code confusion; there is a bit of indirection that doesn't help when coming to the code for the first time. I think there is a lot of room for improvement in this area.

michaelkarlcoleman commented Nov 30, 2018

I might be missing what's going on, but this line makes me feel that the environment is indeed being passed via --export. I haven't double-checked this via strace, though.

Regarding sbatch, yes that's preferable. (Confusingly, outside of a batch job, srun basically does the same thing, so many use it as a synonym.)

Not sure whether our site is unusual in defaulting to passing the environment to SLURM jobs. I wasn't around for the initial setup, but we're using Bright Cluster Manager, and perhaps we picked it up from that. Unfortunately, changing the default behavior now is almost unthinkable, as we have hundreds of users with scripts depending on current behavior. As a practical matter, we'll have to modify the OOD scripts to accommodate this.

It's not obvious to me what's really best with respect to the flow of env vars through the processes. It'd be nice if the env of sbatch itself was "sterilized", not including the changes needed to run the other bits of OOD, but I realize that that's not necessarily trivial to implement.

My primary reason for mentioning it is that the current failure mode is pretty obscure, and I suspect a lot of potential OOD sites would just give up. Some mention of the issue in the install docs would be useful I think.

ericfranz commented Dec 1, 2018

Ah, you are right. So it makes sense now: SBATCH_EXPORT=NONE is being used to block miscellaneous env vars in the environment by default. And since ood_core offers, as part of the abstract "Script" object, a way to specify custom env vars for the job, those env vars are set using --export, which overrides the SBATCH_EXPORT=NONE env var, so presumably only the env vars specified get passed to the job.

For your particular problem I see two challenges:

  1. How to load the same environment variables that are loaded in the user's environment on a login node when they go to submit the job. This is challenging because the environment is already stripped by the time nginx_stage is called to launch the user's PUN. The question is: what is the way, with a stripped environment, to restore the environment for a user sufficiently? This would either need to be code run on the web node prior to job submission, or code run at the beginning of the batch job to restore the proper environment (which could be something injected into each job). I think this depends on how similar the web node is to the login node (or whether OnDemand is running on a login node).

  2. How to submit those environment variables as arguments to the job while excluding the other environment variables that are set by NGINX and Passenger (and even before that) so that the job runs as expected (i.e. sterilized, as you put it). I see several possibilities. In Ruby, Open3, which uses spawn underneath (which ultimately uses execve), supports specifying a list of env vars to unset; it also supports excluding any env vars not passed in the first argument to spawn. So if we could solve 1. and run a command or set of commands to get the desired environment, we could pass that to Open3 with the correct options to avoid passing the rest of the environment variables that are web specific. (A sketch of that Open3 usage follows below.)
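Rough sketch of that Open3/spawn usage (illustrative only; the env hash, command, and script content are hypothetical, not the actual ood_core code):

require 'open3'

# Hypothetical job script and minimal environment for the submission.
script_content = "#!/bin/bash\nhostname\n"
env = { "HOME" => Dir.home, "PATH" => "/usr/local/bin:/usr/bin:/bin" }

# unsetenv_others: true drops every variable not listed in env, so nothing
# web-specific (Passenger, NGINX, etc.) leaks into the sbatch call.
out, err, status = Open3.capture3(env, "sbatch", "--parsable",
                                  stdin_data: script_content,
                                  unsetenv_others: true)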

We can continue discussing on this issue or if you were able to provide me an HPC account I could help you find a solution, and then document the general solution here - and apply the appropriate fixes as necessary. If you are interested email me at efranz@osc.edu.

ericfranz commented Dec 1, 2018

Two thoughts:

  1. I'll reach out to a colleague here who has more experience with Slurm than I and see what he says.
  2. One consideration for getting the required environment is to do something like ssh login_host_name bash -lc 'env' or bash -lc 'env' and then parse the output to build a hash of the desired environment. That hash could be passed to Open3/spawn. That makes some big assumptions though, like bash being the user's shell, and the environment the user wants when submitting a job being the environment the user has after first ssh-ing to the login node. (A sketch of the parsing step follows below.)
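A rough sketch of that parsing step in Ruby (assumes bash and that no variable values contain embedded newlines):

login_env = `bash -lc 'env'`.each_line.with_object({}) do |line, h|
  key, sep, value = line.chomp.partition("=")
  h[key] = value unless sep.empty?
end
# login_env could then be passed as the env argument to Open3/spawn.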
ericfranz commented Dec 2, 2018

Do you have it set up so that from OnDemand a user can ssh from the web node to the login node without entering their credentials? If so, it might be possible to write a wrapper script around sbatch that actually executes ssh login_host 'sbatch ...'. If this worked well, maybe we would extend the ood_core adapter to support this with a configuration option (i.e. execute the command via ssh on the login node configured for the cluster in the cluster config).
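A hedged sketch of such a wrapper (login_host is a placeholder, and argument quoting across ssh is glossed over and would need care in practice):

#!/bin/bash
# Forward the sbatch call to the login node; stdin (the job script) passes
# through ssh, so sbatch sees the login node's environment rather than the
# web node's.
exec ssh -o BatchMode=yes login_host sbatch "$@"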

treydock commented Dec 3, 2018

@ericfranz Using --export=ALL isn't uncommon, and according to the man page --export=ALL is the default if --export isn't used. I think the only way to allow a useful environment to get passed to the job is what you described: using a wrapper to SSH to the login node with the full environment and then running sbatch with either no --export flag or with --export=ALL. I think trying to use the environment of the web node won't help, as it's probably not very common for things like the modules environment to exist on web nodes, which is one reason --export=ALL was handy: modules loaded at job submission are also loaded at job startup.

michaelkarlcoleman commented Dec 3, 2018

Regarding the sterile environment, I think @ericfranz has the right concept with the ssh, but it looks like there's a more efficient/portable way, doing something like this

env -i bash -l -c 'sbatch yada yada...'

This basically wipes the environment, and then running a login shell restores the user's vanilla environment. It looks pretty similar to the ssh idea, but without having to worry about whether that would work. For many other shells, you could probably use $SHELL instead of bash. I haven't tried to fold this into the OOD code, but it seems plausible.

In my mind this is close to ideal, as OOD can't really run in the user's vanilla environment (as it might contain all sorts of oddities that would break OOD), but there's no good reason not to start sbatch (and other such commands) in something exceedingly close to the vanilla environment, with perhaps just a very few changes as needed.

Regarding the current SBATCH_EXPORT=NONE, I think that should be removed completely. The only effect of this is that it is passed into the compute node job, where it can only cause trouble. (In SLURM, you can run srun commands within a job to start sub-parts of a job, which this variable would affect, probably to the user's surprise.)

As hinted previously, the sbatch --export flag behavior is a bit odd. @treydock is correct that the default is ALL. However, if you name even a single variable (e.g., --export=PATH), the ALL is undone and only the named variables are sent, unless ALL is explicitly mentioned. This is what the man page says, if you read it very carefully. Unfortunately, until very recently (around SLURM 17.11), that "undone" bit was not implemented to match the man page, and instead left the entire environment in place. So, lots of opportunities for confusion. (See the examples below.)
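To illustrate the interaction (based on the man page text quoted earlier; job.sh is just a placeholder script name, and older Slurm releases may behave differently as noted):

sbatch --export=ALL job.sh        # default: propagate the full submitting environment
sbatch --export=PATH job.sh       # only PATH (plus SLURM_* vars) reaches the job
sbatch --export=PATH,ALL job.sh   # PATH plus the full submitting environment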

As mentioned previously, the $XDG_DATA_DIRS setting being passed wrecks the startup of mate in a puzzling way. But there are lots of other odd things being passed. Here's a diff of the OOD environment (at mate startup) versus a vanilla environment. (Some deltas that don't appear important stripped out.)

Lots of potential issues here. Many of these added *PATH directories don't happen to exist on our compute nodes, but it doesn't seem impossible that some might get added, potentially breaking things. The PYTHON* vars are potential trouble. Likewise the USERNAME/LOGNAME/TERM/LANG vars.

Which of these added/changed environment variables are actually needed within the batch job? I'm guessing only two or three at most, but not sure. These could be added to the --export environment.

(As a suggestion, it'd be nice to standardize on just one of OOD_ and ONDEMAND_ as a prefix for OOD vars.)

(Also, as a detail, we're running the OOD front end on our login nodes. Not sure how this would work for the separate web and login node setup.)

+BUNDLE_BIN_PATH=/opt/rh/rh-ruby22/root/usr/share/gems/gems/bundler-1.7.8/bin/bundle
+BUNDLE_GEMFILE=/var/www/ood/apps/sys/dashboard/Gemfile
+BUNDLE_ORIG_MANPATH=/opt/rh/v8314/root/usr/share/man:/opt/rh/git19/root/usr/share/man:/opt/rh/nodejs010/root/usr/share/man:/opt/rh/rh-ruby22/root/usr/sh\
are/man:/opt/rh/rh-passenger40/root/usr/share/man:/opt/rh/nginx16/root/usr/share/man:

+CPATH=/packages/intel/17/linux/include/intel64:/cm/shared/apps/slurm/17.11/include:/opt/rh/v8314/root/usr/include

+DOCUMENT_ROOT=/var/www/ood/apps/sys/dashboard/public

+GEM_HOME=/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby
+GEM_PATH=

-HISTCONTROL=ignoredups
-HISTSIZE=1000

+IN_PASSENGER=1

-LANG=en_US.UTF-8
+LANG=C

-LD_LIBRARY_PATH=/gpfs/packages/src/gbench-src-2.12.8/local/ncbi-vdb-2.8.2/lib64:/packages/intel/17/linux/tbb/lib/intel64/gcc4.7:/packages/intel/17/linux\
/lib/intel64:/cm/shared/apps/slurm/17.11/lib/slurm:/cm/shared/apps/slurm/17.11/lib
+LD_LIBRARY_PATH=/packages/intel/17/linux/tbb/lib/intel64/gcc4.7:/packages/intel/17/linux/lib/intel64:/cm/shared/apps/slurm/17.11/lib/slurm:/cm/shared/ap\
ps/slurm/17.11/lib:/opt/rh/v8314/root/usr/lib64:/opt/rh/nodejs010/root/usr/lib64:/opt/rh/rh-ruby22/root/usr/lib64:/opt/rh/rh-passenger40/root/usr/lib64

-LESSOPEN='||/usr/bin/lesspipe.sh %s'

-LIBRARY_PATH=/packages/intel/17/linux/lib/intel64:/cm/shared/apps/slurm/17.11/lib/slurm:/cm/shared/apps/slurm/17.11/lib
+LIBRARY_PATH=/packages/intel/17/linux/lib/intel64:/cm/shared/apps/slurm/17.11/lib/slurm:/cm/shared/apps/slurm/17.11/lib:/opt/rh/v8314/root/usr/lib64

-LOGNAME=mcolema5
+LOGNAME=root

-LS_COLORS='rs=0:'...etc

-MAIL=/var/spool/mail/mcolema5
+MAIL=/var/mail/root

-MANPATH=/packages/intel/man/common:/cm/shared/apps/slurm/17.11/share/man:/usr/share/lmod/lmod/share/man:/usr/local/share/man:/usr/share/man/overrides:/u\
sr/share/man:/opt/ibutils/share/man:/cm/local/apps/environment-modules/current/share/man
+MANPATH=/packages/intel/man/common:/cm/shared/apps/slurm/17.11/share/man:/usr/share/lmod/lmod/share/man:/opt/rh/v8314/root/usr/share/man:/opt/rh/git19/r\
oot/usr/share/man:/opt/rh/nodejs010/root/usr/share/man:/opt/rh/rh-ruby22/root/usr/share/man:/opt/rh/rh-passenger40/root/usr/share/man:/opt/rh/nginx16/roo\
t/usr/share/man::

+NODE_ENV=production
+NODE_PATH=/opt/rh/rh-passenger40/root/usr/share/passenger/node

+ONDEMAND_PORTAL=ondemand
+ONDEMAND_TITLE='Open OnDemand'
+ONDEMAND_VERSION=1.3.7
+OOD_DATAROOT=/home/mcolema5/ondemand/data/sys/dashboard

-PATH=/packages/intel/17/linux/bin/intel64:/cm/shared/apps/slurm/17.11/sbin:/cm/shared/apps/slurm/17.11/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:\
/usr/sbin:/opt/ibutils/bin:/opt/dell/srvadmin/bin:/home/mcolema5/.local/bin:/home/mcolema5/bin
+PATH=/packages/intel/17/linux/bin/intel64:/cm/shared/apps/slurm/17.11/sbin:/cm/shared/apps/slurm/17.11/bin:/opt/TurboVNC/bin:/var/www/ood/apps/sys/dashb\
oard/vendor/bundle/ruby/bin:/opt/rh/v8314/root/usr/bin:/opt/rh/git19/root/usr/bin:/opt/rh/nodejs010/root/usr/bin:/opt/rh/rh-ruby22/root/usr/bin:/opt/rh/r\
h-passenger40/root/usr/bin:/opt/rh/rh-passenger40/root/usr/sbin:/opt/rh/nginx16/root/usr/bin:/opt/rh/nginx16/root/usr/sbin:/sbin:/bin:/usr/sbin:/usr/bin

+PASSENGER_APP_ENV=production
+PASSENGER_BASE_URI=/pun/sys/dashboard
+PASSENGER_DEBUG_DIR=/tmp/passenger.spawn-debug.XXXX2A6Qq6

+PKG_CONFIG_PATH=/opt/rh/v8314/root/usr/lib64/pkgconfig:/opt/rh/rh-ruby22/root/usr/lib64/pkgconfig:/opt/rh/rh-passenger40/root/usr/lib64/pkgconfig:/opt/r\
h/nginx16/root/usr/lib64/pkgconfig

+PYTHONPATH=/opt/rh/v8314/root/usr/lib/python2.7/site-packages:/opt/rh/nodejs010/root/usr/lib/python2.7/site-packages
+PYTHONUNBUFFERED=1

-QT_GRAPHICSSYSTEM_CHECKED=1

+QUERY_STRING=_=1543622010354
+RACK_BASE_URI=/pun/sys/dashboard
+RACK_ENV=production
+RAILS_ENV=production
+RAILS_RELATIVE_URL_ROOT=/pun/sys/dashboard
+REMOTE_ADDR=unix:
+REMOTE_PORT=
+REQUEST_METHOD=GET
+REQUEST_URI='/pun/sys/dashboard/batch_connect/sessions.js?_=1543622010354'
+RUBYLIB=/opt/rh/rh-ruby22/root/usr/share/gems/gems/bundler-1.7.8/lib
+RUBYOPT=-rbundler/setup

+SBATCH_EXPORT=NONE

+SCGI=1

+SECRET_KEY_BASE=<omitted>

+SERVER_ADDR=unix:/var/run/nginx/mcolema5/passenger.sock
+SERVER_NAME=localhost
+SERVER_PORT=
+SERVER_PROTOCOL=HTTP/1.1
+SERVER_SOFTWARE=nginx/1.6.2
-SHOST=talapas-ln2
+SHOST=n140

+SUDO_COMMAND='/opt/ood/nginx_stage/sbin/nginx_stage pun -u mcolema5 -a https%3a%2f%2ftalapas-ln2.uoregon.edu%3a443%2fnginx%2finit%3fredir%3d%24http_x_fo\
rwarded_escaped_uri'
+SUDO_GID=48
+SUDO_UID=48
+SUDO_USER=apache

-TERM=screen
+TERM=unknown

+USERNAME=root

+WEBSOCKIFY_CMD=websockify

+WSGI_ENV=production

+XDG_DATA_DIRS=/opt/rh/rh-ruby22/root/usr/share

+X_SCLS='v8314 git19 nodejs010 rh-ruby22 rh-passenger40 nginx16 '

+host=n140
+port=5901

# shell functions, bodies omitted
+create_passwd ()
+port_used ()
+random_number ()
+source_helpers ()
+wait_until_port_used ()
michaelkarlcoleman commented Dec 4, 2018

Regarding that formula above, after testing it out a bit, this might give a better result:

env -i HOME=~ bash -l -c 'sbatch yada yada...'

Specifically, HOME seems not to get set unless passed this way, at least in our environment (RHEL 7).
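A hedged sketch of how this might look as a site-local wrapper around sbatch (the integration point is hypothetical; stdin carrying the job script passes through env and bash unchanged, and this assumes the real sbatch is what the login PATH resolves, i.e. the wrapper is not itself named sbatch on that PATH):

#!/bin/bash
# Wipe the inherited (web/Passenger) environment, keep HOME, and let a login
# shell rebuild the user's vanilla environment before invoking the real sbatch.
exec env -i HOME="$HOME" bash -l -c 'sbatch "$@"' sbatch "$@"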

michaelkarlcoleman commented Dec 4, 2018

Until/unless this is changed, here's the bit I'm using in the meantime (in addition to the --export=ALL). Depending on the app, it might also be worthwhile to set TERM to something.

This could go in the before.sh.erb file.

# Drop web-environment values that break desktop startup or confuse Slurm.
unset XDG_DATA_DIRS
unset SBATCH_EXPORT
unset MAIL
unset PYTHONPATH
unset PYTHONUNBUFFERED

# These arrive set to root's values from the sudo-launched PUN; reset them.
export LOGNAME=$(whoami)
export USERNAME=$LOGNAME
michaelkarlcoleman commented Dec 4, 2018

Regarding my comment above about SBATCH_EXPORT, I'm less sure now about how this actually works. I noticed this line in the nginx log that appears to show that it is indeed making it into the environment in which sbatch is called.

App 125411 stdout: [2018-12-04 13:13:35 -0800 ]  INFO "execve = [{\"SBATCH_EXPORT\"=>\"NONE\", \"SLURM_CONF\"=>\"/cm/shared/apps/slurm/var/etc/slurm.conf\
"}, \"/cm/shared/apps/slurm/current/bin/sbatch\", \"-D\", \"/home/mcolema5/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/b0aa0bf8-316b-4ac8
-9f50-97299c63fca0\", \"--mail-type\", \"BEGIN\", \"-J\", \"sys/dashboard/dev/jupyter\", \"-o\", \"/home/mcolema5/ondemand/data/sys/dashboard/batch_connec
t/dev/jupyter/output/b0aa0bf8-316b-4ac8-9f50-97299c63fca0/output.log\", \"-p\", \"test\", \"-t\", \"01:00:00\", \"--export=ALL\", \"--parsable\"]"
ericfranz commented Dec 5, 2018

Yep.

  1. The OodCore::Job::Adapters::Slurm object, in its OodCore::Job::Adapters::Slurm#submit method, converts the OodCore::Job::Script instance to an array of Slurm-specific arguments
  2. OodCore::Job::Adapters::Slurm#submit calls an internal object's method, OodCore::Job::Adapters::Slurm::Batch#submit_string
  3. submit_string calls Open3.capture3 with the array of arguments and the environment "SBATCH_EXPORT" => "NONE"
  4. Open3.capture3 under the hood calls execve.

The history of this code organization goes back to when we had a single Torque-specific gem, https://github.com/osc/pbs-ruby, which is a Ruby interface to Torque; we then added this generic adapter in front of it as OodCore::Job::Adapters::Torque. Since there was precedent for this organization, and it made things easier to think about, it was the easiest path forward at the time.

Now that we have 5 adapters (SGE, coming soon, will be the 5th), I believe we better understand the common problems between them, and we could refactor this code so that we don't need both OodCore::Job::Adapters::Slurm and OodCore::Job::Adapters::Slurm::Batch objects. Having less indirection would improve both code readability and performance - though performance in this case is only relevant for qstat/squeue/bjobs etc.
