Qstat -f fails for more than 48 job ids #1047

Merged
merged 3 commits into from Mar 28, 2019

@Bhagat-Rajput
Contributor

commented Mar 20, 2019

Bug/feature Description

  • qstat -f fails when given more than 48 job IDs: it runs out of server connections and reports the error
    "Too many open connections".

Affected Platform(s)

  • ALL

Cause / Analysis / Design

  • check_max_job_sequence_id() does not close the server connection after using it.

Solution Description

  • Pass the already-open connection instead of opening a new one; qstat closes it itself once its work is done.
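
The leak described above can be illustrated with a small, self-contained sketch. It does not use the PBS API: `mock_connect()`, `mock_disconnect()`, `check_attr_leaky()`, `check_attr_shared()`, `stat_jobs()` and the table size `MAX_CONNS` are all hypothetical stand-ins, and the limit of 4 is illustrative only. It shows how a helper that opens its own connection on every call exhausts a fixed-size connection table, while a helper that borrows the caller's connection does not.

```c
#include <assert.h>

/* Hypothetical stand-in for a client-side connection table.
 * The size is illustrative only, not the real PBS limit. */
#define MAX_CONNS 4

static int conn_in_use[MAX_CONNS];

/* Mock "connect": returns a handle, or -1 when the table is full
 * (the analogue of "Too many open connections"). */
static int mock_connect(void)
{
    for (int i = 0; i < MAX_CONNS; i++) {
        if (!conn_in_use[i]) {
            conn_in_use[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Mock "disconnect": releases a handle returned by mock_connect(). */
static void mock_disconnect(int c)
{
    assert(c >= 0 && c < MAX_CONNS);
    conn_in_use[c] = 0;
}

/* Number of handles currently held. */
static int open_count(void)
{
    int n = 0;
    for (int i = 0; i < MAX_CONNS; i++)
        n += conn_in_use[i];
    return n;
}

/* Release every handle so each scenario starts from a clean table. */
static void reset_conns(void)
{
    for (int i = 0; i < MAX_CONNS; i++)
        conn_in_use[i] = 0;
}

/* Leaky pattern: the helper opens its own connection and never closes it. */
static int check_attr_leaky(void)
{
    int c = mock_connect();
    if (c < 0)
        return -1;
    /* ... query the server attribute over c ... */
    return 0; /* BUG: c is never released */
}

/* Fixed pattern: the helper borrows the caller's open connection. */
static int check_attr_shared(int c)
{
    if (c < 0 || c >= MAX_CONNS || !conn_in_use[c])
        return -1;
    /* ... query the server attribute over c ... */
    return 0;
}

/* Process njobs job IDs and return how many were handled before a failure. */
static int stat_jobs(int njobs, int use_shared)
{
    int ok = 0;
    int c = mock_connect(); /* the caller's own connection */

    if (c < 0)
        return 0;
    for (int i = 0; i < njobs; i++) {
        int rc = use_shared ? check_attr_shared(c) : check_attr_leaky();
        if (rc != 0)
            break;
        ok++;
    }
    mock_disconnect(c); /* the caller closes what it opened */
    return ok;
}
```

In this sketch, `stat_jobs(10, 0)` stops early because each call to the leaky helper consumes a slot that is never returned, whereas `stat_jobs(10, 1)` handles all ten IDs and leaves no handle open afterwards.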

Testing logs/output

Checklist:

For further information please visit the Developer Guide Home.

exit(1);
/* check the server attribute max_job_sequence_id value */
if (p_server != NULL) {
check_seqid_len = check_max_job_sequence_id(p_server);

@subhasisb

subhasisb Mar 20, 2019

Collaborator

This is better, but it will still call check_max_job_sequence_id() for every job ID, even when those jobs are from the same server. We should call it only once per cnt2server() call, much higher up, where we do the pbs_statserver().
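
A minimal sketch of the difference this comment describes, under stated assumptions: it is not the qstat source, `mock_check_server()`, `server_of()`, `stat_per_job()` and `stat_per_server()` are hypothetical names, and job IDs are assumed to have the form `sequence.server`. The per-server variant repeats the check only when the server part changes between consecutive job IDs.

```c
#include <assert.h>
#include <string.h>

/* Counts simulated server round trips. */
static int lookup_calls;

/* Hypothetical stand-in for the per-server attribute check. */
static int mock_check_server(const char *server)
{
    (void)server;
    lookup_calls++;
    return 1;
}

/* Return the server part of a job ID, assuming the form "sequence.server". */
static const char *server_of(const char *jobid)
{
    const char *dot = strchr(jobid, '.');
    return dot ? dot + 1 : "";
}

/* Per-job pattern: one check for every job ID. */
static int stat_per_job(const char **jobids, int n)
{
    assert(jobids != NULL);
    lookup_calls = 0;
    for (int i = 0; i < n; i++)
        mock_check_server(server_of(jobids[i]));
    return lookup_calls;
}

/* Per-server pattern: check only when the server changes between
 * consecutive job IDs, then reuse the result. */
static int stat_per_server(const char **jobids, int n)
{
    const char *last = NULL;

    assert(jobids != NULL);
    lookup_calls = 0;
    for (int i = 0; i < n; i++) {
        const char *srv = server_of(jobids[i]);
        if (last == NULL || strcmp(last, srv) != 0) {
            mock_check_server(srv);
            last = srv;
        }
    }
    return lookup_calls;
}
```

For a list of job IDs that all live on one server, the per-job variant performs one check per ID while the per-server variant performs a single check, which is the saving the comment asks for.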

@RKORavi


commented Mar 25, 2019

@Bhagat-Rajput could you please add the time it takes to qstat 1500 job IDs after the fix?

@Bhagat-Rajput

Contributor Author

commented Mar 25, 2019

Hello @RKORavi, I just submitted 1500 sleep jobs on my VM, and it took about 9 seconds to print the qstat -f output for all 1500 jobs. The time results are pasted below.
real 0m8.757s
user 0m1.265s
sys 0m5.051s
I think it would be faster on an actual performance machine.

@RKORavi


commented Mar 25, 2019

Thank you for testing it.

@subhasisb

Collaborator

commented Mar 28, 2019

@Bhagat-Rajput are you still working on this? If you are done, please remove the [WIP] flag from this. Also, please run your tests under valgrind.

@Bhagat-Rajput changed the title from "[WIP] Qstat -f fails for more than 48 job ids" to "Qstat -f fails for more than 48 job ids" on Mar 28, 2019

@Bhagat-Rajput

Contributor Author

commented Mar 28, 2019

Hello @subhasisb, I've added a new PTL test case and valgrind logs to the ticket.

@subhasisb merged commit aae6c52 into PBSPro:master on Mar 28, 2019

3 of 4 checks passed

  • Codacy/PR Quality Review: Not up to standards. This pull request quality could be better.
  • Travis CI - Pull Request Build: Passed
  • continuous-integration/appveyor/pr: AppVeyor build succeeded
  • license/cla: Contributor License Agreement is signed.