Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default_qsub_arguments -V not working #781

Merged
merged 1 commit into from Sep 13, 2018

Conversation

sakshamgarg
Copy link
Contributor

@sakshamgarg sakshamgarg commented Aug 16, 2018

Bug/feature Description

  • qsub -V option is not working when it is set using default_qsub_arguments.
  • Bugs: Once the default_qsub_arguments is set for -V option, "qstat -f" doesn't display environment variable except those which are related to PBS or which are set using -v option.

Affected Platform(s)

  • ALL

Cause / Analysis / Design

  • Bugs: Environment variable list is not stored after checking that -V option is set or not in default_qsub_arguments.

Solution Description

  • After the options for default_qsub_arguments are processed, -V option is checked whether it is set or not.
  • If the -V option is set then using the variable 'environ', all the environment variable are stored as a list in a variable.

Testing logs/output

Checklist:

For further information please visit the Developer Guide Home.

@sakshamgarg sakshamgarg force-pushed the default_arg_V branch 2 times, most recently from e2bf2a7 to a7bc129 Compare August 16, 2018 11:40
src/cmds/qsub.c Outdated

/* qsub_envlist might be needed if -V option is set in default flags.
* It will be unset if not needed then. */
qsub_envlist = env_array_to_varlist(envp);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to call the function env_array_to_varlist() only when desired i.e. when V_opt is set either while calling qsub or through default_qsub_arguments.
In other words before calling the function set_job_env() make sure that V_opt(This should be set only when qsub is called through -V option or default_qsub_arguments contains -V option) is set or not. If set then only call env_array_to_varlist to populate qsub_envlist and then pass it to set_job_env()

@suresh-thelkar
Copy link

Also we need to write a few PTL test cases to verify the fix. Following are the test cases that I can think of right now.

  1. -V option is not set in qsub but in default_qsub_arguments it is set.
  2. -V option is not set in qsub but in default_qsub_arguments it is set. Submit a job script with and without -V option.
  3. Submit a test case by only setting default_qsub_arguments through job script i.e. default_qsub_arguments should not contain -V option.

If required all the above test cases can be clubbed into a single test case.

"""
os.environ["SET_IN_SUBMISSION"] = "true"
self.server.manager(MGR_CMD_SET, SERVER, {'default_qsub_arguments': '-V'})
j = Job(ROOT_USER)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do need to submit job as root user?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to submit the job through a single user only, need not be a root user.
This is needed because the sample environment variable which is set will not be accessed by the job running on other users. The sample environment variable is set only for the current user.

"\nexit 0" +
"\nfi" )
jid = self.server.submit(j)
self.server.log_match(jid + ";Exit_status=0", id=jid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing new line.

j = Job(ROOT_USER)
j.create_script("#PBS -N default_qsub_arguments"
"\n#PBS -V" +
"\nif [ -z '$SET_IN_SUBMISSION' ]" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In if [ -z '$SET_IN_SUBMISSION' ], I doubt that this SET_IN_SUBMISSION will get expanded because it is in single quote

Copy link

@suresh-thelkar suresh-thelkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the suggested code changes. It looks much better now. Here are a few more comments on PTL.

This Test case exports an environment variable. It sets
-V option through a job script directive and then changes
the value of the environment variable. It expects another job
script to have the changed value of the environment varible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of saying another job script we can say that "The job under execution should get the updated value of the environment variable" since its(Job Script Directives) precedence is higher than the others"

script to have the changed value of the environment varible.
"""
os.environ["SET_IN_SUBMISSION"] = "true"
j = Job(ROOT_USER)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of ROOT_USER you can actually use one of TEST_USER variables.

os.environ["SET_IN_SUBMISSION"] = "true"
self.server.manager(MGR_CMD_SET, SERVER,
{'default_qsub_arguments': '-V'})
j = Job(ROOT_USER)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of ROOT_USER you can actually use one of TEST_USER variables.

'SET_IN_SUBMISSION=%s' % env_var)},
id=jid)

def test_option_V_with_script(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to rename the test case as test_qsub_opt_V_with_jobscript().

@sakshamgarg
Copy link
Contributor Author

Thanks for all the suggestions and comments. I have updated the PTL test and tested it on the test machine. I have attached the log file for the same.
LOG_default_qsub_argument_V.txt

src/cmds/qsub.c Outdated
@@ -4914,6 +4914,10 @@ do_submit(char *retmsg)
if (rc != 0)
return (rc);
}

/* Get required environment variables */
if (V_opt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curios to know: Can you do little qsub performance test with large environment, before and after this change? I am suspecting that by doing this it will slow down qsub. (you use below script to do test)

#!/bin/bash

qmgr -c "s s default_qsub_arguments='-V'"
for i in $(seq 1 1 100)
do
    export TEST_VAR${i}="$(printf "${i}%.0s" {1..1000})"
done
start=$(date +'%s')
for i in $(seq 1 1 100)
do
    qsub -- /bin/sleep 1000
done
end=$(date +'%s')
echo $((end - start))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said yesterday during our offline discussion it is not good to compare qsub performance with large environment before and after the current code fix. Prior to this code change the function env_array_to_varlist() was not getting called when we set default_qsub_arguments and hence the current bug.
As part of the fix this function call is reintroduced at the right place. So if we compare with and without code fix the performance of qsub after the fix is expected to slow down with -V option. So the ideal test case for performance measure would be to compare qsub with a older version where this bug was not present vs with the current code fix. @sakshamgarg is verifying this test case and will be posting the results and logs to the ticket.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@suresh-thelkar Thanks for update, I was not aware that this issue was regression issue. So now I am fine if we don't want to do performance measure as with -V qsub will be slow is expected, but still up to you and @sakshamgarg for doing perf measurement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with what @suresh-thelkar has suggested in the second part. I did the qsub performance test, between the older version where this bug was not there and the current code fix which has been done, using the above python script. There is no significant change in the performance test.

@sakshamgarg
Copy link
Contributor Author

I have done manual testing for the test case which uses a job script to set -V option as I was facing user permission issue while trying to automate the test.

This test case exports an environment variable. It sets -V option through a job script directive and then changes the value of the environment variable. It expects the job under execution to get the updated value of the environment variable since its (Job Script Directive's) precedence is higher than the others.

Below is the log for manual testing and I have attached the script files.

script.sh.txt
sample.py.txt

[hadoop@centos2 ~]$ export EXAMPLE=true
[hadoop@centos2 ~]$ qsub script.sh
12558.centos2
[hadoop@centos2 ~]$ cat sample.e12558
[hadoop@centos2 ~]$ cat sample.o12558
Value of EXAMPLE variable before changing:
EXAMPLE=true
Value of EXAMPLE variable after changing
EXAMPLE=false
job script returned changed value

@suresh-thelkar
Copy link

@sakshamgarg, Thanks for making all the suggested changes and PTL test cases. I sign off.

Copy link
Contributor

@hirenvadalia hirenvadalia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, I don't think this is right fix as I did quick test as below:

#!/bin/bash

qmgr -c "s s default_qsub_arguments = '-V'"
export EXAMPLE=true
qsub -- /bin/sleep 1000
export EXMAPLE=false
qsub -- /bin/sleep 1000

Means for first job EXAMPLE should be true and for second it should be false
But with this fix, EXAMPLE is always (atleast for 60 sec) true

[root@testdev pbspro]# qstat -f | grep 'EXAMPLE'
	/pbs/bin:/opt/pbs/sbin,PWD=/pbspro,EXAMPLE=true,HOME=/root,SHLVL=2,
	/pbs/bin:/opt/pbs/sbin,PWD=/pbspro,EXAMPLE=true,HOME=/root,SHLVL=2,

I know this happens because of background qsub daemon has old environ when it was forked.
So we need to look for another (right one) fix.

@suresh-thelkar
Copy link

Good catch @hirenvadalia. It looks like qsub's environment is not passed to qsub daemon that we launch internally. Whenever a request is made to qsub daemon we should pass the new environment to it if V_opt is set. If passing this environment turns out to be a costly operation with respect to performance then we can also think of not redirecting the request to qsub daemon when V_opt is set.
Also it would be better to check whether this behaviour was working earlier or not. This might help us in measuring the qsub performance with large environment correctly.

@sakshamgarg
Copy link
Contributor Author

sakshamgarg commented Aug 31, 2018

Below is the log after fixing the issue as reported by @hirenvadalia.

[saksham@centos ~]$ qmgr -c "s s default_qsub_arguments='-V'"
[saksham@centos ~]$ export EXAMPLE=true
[saksham@centos ~]$ qsub -- /bin/sleep 1000
6.centos
[saksham@centos ~]$ export EXAMPLE=false
[saksham@centos ~]$ qsub -- /bin/sleep 1000
7.centos
[saksham@centos ~]$ qstat -f | grep EXAMPLE
/jre,LANG=en_US.UTF-8,GDM_LANG=en_US.UTF-8,EXAMPLE=true,
/jre,LANG=en_US.UTF-8,GDM_LANG=en_US.UTF-8,EXAMPLE=false,
[saksham@centos ~]$

I am also attaching the logs for the manual testing as described in previous comment.
log_manualTesting.txt

Copy link

@suresh-thelkar suresh-thelkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making all the code changes. I sign off.

src/cmds/qsub.c Outdated Show resolved Hide resolved
validate default_qsub_argument option.
"""

def test_option_V(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add test for recently discovered use cases (EXAMPLE=true and false one)


class TestDefaultQsubArgument(TestFunctional):
"""
validate default_qsub_argument option.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default_qsub_argument == default_qsub_arguments

Also I suggest this docstring to something like this: Validate whether qsub sends all environ when -V is enabled via default_qsub_arguments and or on command line.

env_var = os.environ.get("SET_IN_SUBMISSION")
self.server.expect(JOB, {'Variable_List': (MATCH_RE,
'SET_IN_SUBMISSION=%s' % env_var)},
id=jid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add test case for testing passing full environ via -V on qsub command line

test/tests/functional/pbs_default_qsub_argument.py Outdated Show resolved Hide resolved
os.environ["SET_IN_SUBMISSION"] = "true"
self.server.manager(MGR_CMD_SET, SERVER,
{'default_qsub_arguments': '-V'})
j = Job(PBSROOT_USER)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are setting SET_IN_SUBMISSION in current user's session and submitting job as pbsroot user, which internally PTL will do sudo qsub, and SET_IN_SUBMISSION might not get passed by sudo.
I suggest to submit job as current user's only, you can use self.du.get_current_user() to get current user name

test/tests/functional/pbs_default_qsub_argument.py Outdated Show resolved Hide resolved
test/tests/functional/pbs_default_qsub_argument.py Outdated Show resolved Hide resolved
from tests.functional import *


class TestDefaultQsubArgument(TestFunctional):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest to put these test cases in already existing test suite under https://github.com/PBSPro/pbspro/blob/master/test/tests/functional/pbs_passing_environment_variable.py

src/cmds/qsub.c Outdated
if (V_opt)
qsub_envlist = env_array_to_varlist(envp);

qsub_envlist = env_array_to_varlist(envp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling env_array_to_varlist in every condition with definitely decrease performance of qsub even with out -V because env_array_to_varlist is little heavy function which traverse whole environ char by char.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your statement that it will affect performance of qsub.
The other alternative solution to this problem was to connect to the server at the very starting, before it was loading value to qsub_envlist in the "main" function, and fetch the default qsub arguments from the server. But this renders the use of a daemon pointless.
Although this result in performance degradation but it is the only (valid) solution for this bug.
If needed, we can enforce that -V option never be set in default arguments to ensure performance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sakshamgarg Another solution might be possible which is let qsub daemon ask foreground qsub to send envp is -V is enabled in default_qsub_arguments on server.
Flow will be something like this:

  1. foreground qsub send all data to daemon qsub except envp if no -V on command line, if -V is passed on qsub command line then pass envp along with another data
  2. daemon qsub will accept data
  3. if -V is enabled in default_qsub_arguments then ask foreground to send envp if not already send
  4. daemon qsub will accept envp from foreground qsub
  5. daemon qsub will submit job

This solution will only qsub performance only in case of -V only which is expected but it doesn't affect performance normal qsub and still we can have daemon qsub

@subhasisb
Copy link
Collaborator

As discussed, the easiest (and right) solution would be to drop backgrounding in case a -V is detected in the qsub default arguments (or qsub command line). Passing on large environments from froreground to background qsubs would be performing bad enough that we might as well do a implied "-f" or direct connect to server (a.k.a no backgrounding)

src/cmds/qsub.c Outdated
@@ -164,6 +164,7 @@
#define QSUB_DMN_TIMEOUT_LONG 60 /* timeout for qsub background process */
#define QSUB_DMN_TIMEOUT_SHORT 5

#define VOPT_ERROR 4321 /* Error code if -V option is detected in background qsub*/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having option 'V' is not an error. May be we can rename it like VOPT_SET.

src/cmds/qsub.c Outdated
@@ -4877,6 +4878,28 @@ do_connect(char *server_out, char *retmsg)
return 0;
}

static int
parse_dfltqsubargs(char *retmsg)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write the doxygen header for this function.

src/cmds/qsub.c Outdated
*/
refresh_dfltqsubargs();
rc = parse_dfltqsubargs(retmsg);
if(rc || V_opt){

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can make it more readable by saying rc == 0 && V_opt.

src/cmds/qsub.c Outdated
*/
*do_regular_submit = 0;
if(rc != VOPT_ERROR)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of checking whether VOPT_ERROR is not set, do a check for rc == VOPT_ERROR, if yes bypass the following code. Basically you need to optimise the code here.

@subhasisb
Copy link
Collaborator

@sakshamgarg the server forces the background daemons to update the default qsub args (When it changes via qmgr), this code is already there. We need to check it after do_dir() in do_submit() and return with a "Unique" error code. This should percolate all the way back to the foreground qsub as a submission return code (only that it is not), and we need to check that in the foreground and set do_forground_qsub=1

src/cmds/qsub.c Outdated
@@ -5982,6 +5961,8 @@ fork_and_stay(void)
/* set single threaded mode */
pbs_client_thread_set_single_threaded_mode();

/* set when background qsub is running */
is_background = 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

src/cmds/qsub.c Outdated
@@ -236,6 +236,7 @@ static char *pbs_hostvar = NULL; /* buffer containing ",PBS_O_HOST=" and host na
static int pbs_o_hostsize = sizeof(",PBS_O_HOST=") + 1; /* size of prefix for hostvar */
static char *display; /* environment variable DISPLAY */

static int is_background = 0; /* flag to check if background qsub is active */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the comment to something like "am i the background process? This variable is set only once and is read-only afterwards"

src/cmds/qsub.c Outdated
/*
* get environment variable if -V option is set. Return the code
* VOPT_SET if -V option is detected in background qsub.
*/
if(V_opt) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space before (

src/cmds/qsub.c Outdated
if(V_opt) {
if(is_background)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space before (

src/cmds/qsub.c Outdated
rc = -1;
sprintf(retmsg, "Failed to recv data from background qsub\n");
/* Error message will be printed in caller */
if(rc != VOPT_SET){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space before (

src/cmds/qsub.c Outdated
@@ -164,7 +164,7 @@
#define QSUB_DMN_TIMEOUT_LONG 60 /* timeout for qsub background process */
#define QSUB_DMN_TIMEOUT_SHORT 5

#define VOPT_ERROR 4321 /* Error code if -V option is detected in background qsub*/
#define VOPT_SET 4 /* return code if -V option is detected in background qsub*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably best to rename this return code to be more generic, so that it can be used later for other similar purposes, if necessary. Rename it to something like DMN_CANT_SERVE (or think something better)

src/cmds/qsub.c Outdated
if(V_opt) {
if(is_background)
return VOPT_SET;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it would be better if we can check this in do_daemon_stuff and exit the daemon in this return case. This way future qsubs will not have to continuously consult with the background qsub just to know that it should not be going to the background

src/cmds/qsub.c Outdated
@@ -6246,9 +6246,8 @@ main(int argc, char **argv, char **envp) /* qsub */
basic_envlist = job_env_basic();
if (basic_envlist == NULL)
exit_qsub(3);
if(V_opt) {
if(V_opt)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still has an indentation problem

src/cmds/qsub.c Outdated

/*
* Something bad happened, either background submitted
* and failed to send us response, or it failed before
* submitting. If background qsub detects -V option, then
* submit the job through foreground.
*/
if(rc != VOPT_SET){
if (rc != DMN_REFUSE_EXIT){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still has a indentation issue between ){

Copy link

@suresh-thelkar suresh-thelkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the code changes. You can implement optional minor comment that I have given if you would like to. However I sign off.

@@ -4915,6 +4922,16 @@ do_submit(char *retmsg)
return (rc);
}

/*
* get environment variable if -V option is set. Return the code

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you repharse it as "get the variable environ which points to an array of pointers to strings of the user environment if -V option is set....".

@sakshamgarg
Copy link
Contributor Author

Attaching the logs of PTL tests.
ptl_output_-V.txt

Copy link
Contributor

@hirenvadalia hirenvadalia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sakshamgarg sakshamgarg force-pushed the default_arg_V branch 2 times, most recently from c27c79c to 28b20e8 Compare September 12, 2018 22:17
Not starting background qsub if -V option is set
@sakshamgarg
Copy link
Contributor Author

sakshamgarg commented Sep 13, 2018

@subhasisb I have squashed all commits and rebased.

@subhasisb subhasisb merged commit 7847e71 into openpbs:master Sep 13, 2018
@sakshamgarg sakshamgarg deleted the default_arg_V branch September 17, 2018 05:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants