Loading software environment on blues #56

Closed
rljacob opened this issue Feb 4, 2016 · 9 comments

@rljacob
Member

rljacob commented Feb 4, 2016

We were having trouble running on blues. It worked fine if you did "qsub $CASE.run" but a user would get library path errors if they executed $CASE.submit.

@jayeshkrishna traced it to the way CIME loads the machine specific environment in the job submission scripts. This bug is repeated here for further discussion and to check the solution.

@rljacob
Member Author

rljacob commented Feb 4, 2016

Details from @jayeshkrishna:

"In blues we use the PBS job scheduler and the "-V" option is used within the
job submission scripts. With "-V", qsub exports the environment of the calling
process to the job. PBS still sets some variables itself (like PATH), while the
other variables are carried over from the user's environment.

When submitting a job, the submit script ($CASE.submit),
loads the environment (using ModuleLoader) before submitting the run script
($CASE.run) using the qsub command.

The job submission Perl script in CIME loads the environment by running env_mach_specific:
it reads the environment set by env_mach_specific and explicitly sets its own environment to match.
The module loader in CIME has an optimization that avoids reloading the environment if it is already
loaded (see "CIME_MODULES_LOADED" in ModuleLoader.pm).

Since PBS modifies variables like PATH while the run script otherwise retains the
environment of the submit script, the module loader skips loading the machine-specific
environment in the run script because of the optimization mentioned above. Hence PATH
(and other required environment variables) are not set correctly in the run script:
PATH is wrong while CIME_MODULES_LOADED is still set."
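
The sequence described above can be modelled in a few lines. This is only a sketch in Python, not the CIME Perl code: the function names, the paths, and the idea that the scheduler replaces PATH wholesale are assumptions made for illustration. The only detail taken from the thread is the guard on CIME_MODULES_LOADED.

```python
def load_machine_env(env, machine_bin):
    """Hypothetical model of the guarded loader in ModuleLoader.pm."""
    if "CIME_MODULES_LOADED" in env:
        # Mirrors the early return: the environment is assumed to be loaded already.
        return dict(env)
    loaded = dict(env)
    loaded["PATH"] = machine_bin + ":" + env.get("PATH", "")
    loaded["CIME_MODULES_LOADED"] = "1"
    return loaded


def scheduler_handoff(env, scheduler_path):
    """Hypothetical model of qsub -V: forward every variable, but reset PATH."""
    job_env = dict(env)
    job_env["PATH"] = scheduler_path
    return job_env


# $CASE.submit loads the machine environment, then submits the run script.
submit_env = load_machine_env({"PATH": "/usr/bin"}, "/opt/machine/bin")

# The scheduler starts $CASE.run with the forwarded environment and its own PATH.
run_env = scheduler_handoff(submit_env, "/usr/bin:/bin")

# $CASE.run tries to load the machine environment again, but the guard skips it.
run_env = load_machine_env(run_env, "/opt/machine/bin")
```

After the last call, `run_env` still carries the flag but its PATH no longer contains the machine-specific directory, which matches the library path errors reported for $CASE.submit.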

@rljacob
Member Author

rljacob commented Feb 4, 2016

Again from @jayeshkrishna:

"There are two possible fixes to the problem,

  • Remove the "-V" option passed to qsub (see the settings for pbs in config_batch.xml)
<directive default="/bin/bash" > -S {{ shell }} -V </directive>
  • Disable the optimization in CIME that prevents reloading of the user environment (See ModuleLoader.pm)
if(defined $ENV{CIME_MODULES_LOADED}) {return $self};

"

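The two fixes proposed above can be compared with a small model. This is a hedged sketch in Python rather than the actual CIME Perl or PBS behaviour: `forward_env` stands in for passing or omitting "-V", `skip_if_loaded` stands in for the guard in ModuleLoader.pm, and the paths and function names are hypothetical.

```python
MACHINE_BIN = "/opt/machine/bin"  # hypothetical machine-specific directory


def load_machine_env(env, skip_if_loaded=True):
    """Model of the loader; skip_if_loaded=False disables the guard."""
    if skip_if_loaded and "CIME_MODULES_LOADED" in env:
        return dict(env)
    loaded = dict(env)
    loaded["PATH"] = MACHINE_BIN + ":" + env.get("PATH", "")
    loaded["CIME_MODULES_LOADED"] = "1"
    return loaded


def scheduler_handoff(env, forward_env=True):
    """Model of qsub: forward_env=True plays the role of the -V option."""
    job_env = dict(env) if forward_env else {}
    job_env["PATH"] = "/usr/bin:/bin"  # assumed scheduler default
    return job_env


def run_script_path(forward_env, skip_if_loaded):
    """Return the PATH seen by the run script after it loads its environment."""
    submit_env = load_machine_env({"PATH": "/usr/bin"}, skip_if_loaded)
    job_env = scheduler_handoff(submit_env, forward_env)
    return load_machine_env(job_env, skip_if_loaded)["PATH"]


broken = run_script_path(forward_env=True, skip_if_loaded=True)
without_v = run_script_path(forward_env=False, skip_if_loaded=True)
without_guard = run_script_path(forward_env=True, skip_if_loaded=False)
```

In this model either change restores the machine directory in the run script's PATH. The difference is that dropping "-V" also keeps the rest of the user's environment out of the job, while removing the guard only forces a reload on top of whatever was forwarded.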
@rljacob
Member Author

rljacob commented Feb 4, 2016

We've decided to disable the optimization in CIME that prevents reloading of the user environment. Does anyone know a good reason to keep that optimization?

@rljacob rljacob added the ty: Bug label Feb 4, 2016
@jedwards4b
Contributor

I do - the user environment sometimes causes errors that are difficult to
debug and/or testers set up environments and forget to tell other people
that they need them. We want to keep the cime environment self contained
to the extent possible.


Jim Edwards

CESM Software Engineer
National Center for Atmospheric Research
Boulder, CO

@rljacob
Member Author

rljacob commented Feb 5, 2016

So you'd be in favor of taking out the -V?

@jedwards4b
Contributor

Yes - I think that we've seen this problem on several machines and the
solution has been to take out the -V.


@rljacob
Member Author

rljacob commented Feb 5, 2016

Thanks. But is that line to prevent reloading in ModuleLoader.pm necessary for any reason?

@jedwards4b
Contributor

It was intended to avoid repeating tasks that had already been done. But
it's not in the latest CESM-development or ESMCI master.


@rljacob
Member Author

rljacob commented Feb 10, 2016

Fixed in ACME.

@rljacob rljacob closed this as completed Feb 10, 2016
jedwards4b added a commit that referenced this issue May 9, 2016
Bug fixes, documentation improvement, build flags