
GridEngine parallel jobs/tasks show up as occupying only one core #289

Open
mightybigcar opened this issue Apr 12, 2017 · 9 comments
mightybigcar commented Apr 12, 2017

A job or task submitted with a parallel environment specification such as
-pe make 16
will occupy 16 cores, but qtop shows it as occupying only a single core. This leads naive users to think the cluster is under-utilized.

PS I might chase this bug myself, but it could take a while. If someone else with actual experience with the qtop code wants to jump on it in the meantime, feel free.

@fgeorgatos added this to the 0.9.201705XX milestone May 8, 2017
@fgeorgatos (Collaborator)

@mightybigcar: thanks for raising this.

We've been seeing this for a while, and it has in fact been discussed before, because it does make it look as though only 1 out of X (X = 12, 16, 20, whatever) cores is allocated.

When we had a first crack at it, it turned out that reliably identifying when a whole node is allocated under SGE is not straightforward. Do you have a reliable means to establish that? We'd like to hear a method. I could be a beta tester for this, because the need here is similar.

sfranky commented Jun 25, 2017

@mightybigcar: also, could you kindly provide an XML file showing how -pe make 16 manifests there?

fgeorgatos commented Jun 26, 2017

Hi @mightybigcar:

In PR #295, using the queue names world or whole on a node has the effect you described. Understandably, this is not quite the same as what you asked for, but we need a way to deterministically identify which jobs should be expanded. Your feedback?

@mightybigcar (Author)

@sfranky,

Here's the fragment I think you're looking for:

  <JB_script_size>0</JB_script_size>
  <JB_pe>make</JB_pe>
  <JB_pe_range>
    <ranges>
      <RN_min>16</RN_min>
      <RN_max>16</RN_max>
      <RN_step>1</RN_step>
    </ranges>
  </JB_pe_range>
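
For what it's worth, here is a minimal sketch of turning that fragment into a slot count (a hypothetical helper, assuming the fragment sits inside a per-job element of qstat -xml output; this is not qtop's actual parsing code):

# Sketch only: pulls the PE name and requested slot count out of a job's
# <JB_pe>/<JB_pe_range> elements in `qstat -xml` output. The element names
# come from the fragment above; the helper itself is a hypothetical example.
import xml.etree.ElementTree as ET

def pe_slots(job_element):
    pe_name = job_element.findtext('JB_pe')                     # e.g. "make"
    rn_max = job_element.findtext('JB_pe_range/ranges/RN_max')  # e.g. "16"
    slots = int(rn_max) if rn_max is not None else 1            # serial job -> 1 slot
    return pe_name, slots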

I've also attached the full qstat output.

Cheers,
Chris
qstat-364174.txt

@mightybigcar (Author)

Fsck! I banana-fingered the touchpad and accidentally closed this. Is it possible to reopen it? Sorry about that...

@sfranky reopened this Jun 27, 2017
sfranky commented Jun 27, 2017

Thanks for that, I'll incorporate it into the system! Issue reopened 👍

sfranky commented Jun 27, 2017

BTW, the queue names that follow this rule are customizable in qtopconf.yaml.

@mightybigcar (Author)

@fgeorgatos Here's a qstat.xml output with the parallel environment info (look for requested_pe).
qstat.txt

mightybigcar commented Aug 9, 2017

Hi @fgeorgatos,

> When we had a first crack at it, it turned out that reliably identifying when a whole node is allocated under SGE is not straightforward. Do you have a reliable means to establish that? We'd like to hear a method.

To determine whether a node is fully allocated, I take a rather simplistic approach and look at the qstat XML output for the node in question. For example:

[sgeadmin@barrel ~]$ qstat -f -q *@doppelbock -xml
<?xml version='1.0'?>
<job_info  xmlns:xsd="http://gridscheduler.svn.sourceforge.net/viewvc/gridscheduler/trunk/source/dist/util/resources/schemas/qstat/qstat.xsd?revision=11">
  <queue_info>
    <Queue-List>
      <name>all.q@doppelbock</name>
      <qtype>BP</qtype>
      <slots_used>64</slots_used>
      <slots_resv>0</slots_resv>
      <slots_total>64</slots_total>
      <load_avg>49.93000</load_avg>
      <arch>linux-x64</arch>
    </Queue-List>
    <Queue-List>
      <name>background.q@doppelbock</name>
      <qtype>BIP</qtype>
      <slots_used>0</slots_used>
      <slots_resv>0</slots_resv>
      <slots_total>64</slots_total>
      <load_avg>49.93000</load_avg>
      <arch>linux-x64</arch>
      <state>S</state>
    </Queue-List>
    <Queue-List>
      <name>mapreduce.q@doppelbock</name>
      <qtype>BIP</qtype>
      <slots_used>0</slots_used>
      <slots_resv>0</slots_resv>
      <slots_total>64</slots_total>
      <load_avg>49.93000</load_avg>
      <arch>linux-x64</arch>
      <state>S</state>
    </Queue-List>
    <Queue-List>
      <name>simulation.q@doppelbock</name>
      <qtype>BP</qtype>
      <slots_used>0</slots_used>
      <slots_resv>0</slots_resv>
      <slots_total>64</slots_total>
      <load_avg>49.93000</load_avg>
      <arch>linux-x64</arch>
      <state>S</state>
    </Queue-List>
  </queue_info>
  <job_info>
  </job_info>
</job_info>
[sgeadmin@barrel ~]$

Since we always map one slot per logical CPU (Opteron core or Xeon hyperthread), I simply sum the slots_used values across the node's queue instances. If the total equals the number of logical CPUs, the node is fully booked; if it's greater than the number of CPUs, I consider the node overbooked.
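
For illustration, a minimal sketch of that check in Python (the helper name, the subprocess call and the hard-coded core count are assumptions for the example, not qtop internals):

# Sketch only: sums <slots_used> across a node's queue instances from
# `qstat -f -q '*@<node>' -xml` output and compares the total against the
# node's logical CPU count. Helper name and usage are illustrative.
import subprocess
import xml.etree.ElementTree as ET

def node_allocation(node, logical_cpus):
    xml_out = subprocess.check_output(['qstat', '-f', '-q', '*@' + node, '-xml'])
    root = ET.fromstring(xml_out)
    used = sum(int(q.findtext('slots_used', '0')) for q in root.iter('Queue-List'))
    if used == logical_cpus:
        return 'fully booked'
    if used > logical_cpus:
        return 'overbooked'
    return '%d free slots' % (logical_cpus - used)

# Example: doppelbock above has 64 logical CPUs and 64 slots in use -> fully booked.
print(node_allocation('doppelbock', 64))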
