This repository has been archived by the owner on Feb 10, 2021. It is now read-only.

Wall time #4

Open
mrocklin opened this issue Oct 13, 2016 · 5 comments

Comments

@mrocklin
Member

My understanding is that job schedulers tend to schedule jobs based on their advertised wall times. Allocations of small short-running jobs can squeeze into the cluster sooner than many long-running jobs.

Dask workers can be fairly flexible here. We can add and remove many single-node jobs frequently, handing data off from about-to-expire workers to workers with a long time to live. It's unclear to me how much value there is here or if this is something we should focus on.

The default setting for drmaa-python seems to be to totally ignore walltime settings. Is this common? Perhaps DRMAA-style clusters are often underutilized so this is not a big issue?

@davidr

davidr commented Oct 17, 2016

In my experience, this, unfortunately, is very implementation-specific. In short, wallclock runtime is almost always required (and very important to the scheduling algorithm as you correctly note), but DRMAAv1 doesn't have a truly universal way to request it.

This is usually true for memory as well. Each DRMAA-compliant scheduler likely has a different way to request it, although always through the "template.nativeSpecification" option. After I initialize the distributed client with the DRMAACluster, if I use this parameter to request memory and wallclock time, it does the right thing, for Grid Engine values of "the right thing":

>>> cluster.worker_template.nativeSpecification = "-l h_vmem=1G,estmem=1G,h_rt=1:0:0"
>>> jobids = cluster.start_workers(10)

Then a qstat shows the correct number of jobs running, each requesting 1 hour of time and 1G of memory. The resource names (h_rt and h_vmem) above can vary from site to site, and memory can even be configured through multiple resources (in my case, we use "estmem" and "h_vmem" both to request memory, for irritating but required reasons).
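Since those resource names are site-specific, one way to keep them configurable is a small helper that assembles the `-l` string from a site-specific list. This is just a hypothetical sketch; `ge_native_spec` and its parameters are not part of dask-drmaa:

```python
def ge_native_spec(memory="1G", walltime="1:0:0",
                   mem_resources=("h_vmem", "estmem"),
                   time_resource="h_rt"):
    """Build a Grid Engine -l resource-request string.

    Some sites require memory to be requested under several resource
    names at once (e.g. both h_vmem and estmem), so mem_resources is a
    sequence rather than a single name.
    """
    parts = ["%s=%s" % (name, memory) for name in mem_resources]
    parts.append("%s=%s" % (time_resource, walltime))
    return "-l " + ",".join(parts)

# cluster.worker_template.nativeSpecification = ge_native_spec()
# → "-l h_vmem=1G,estmem=1G,h_rt=1:0:0"
```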

@mrocklin
Member Author

Out of curiosity, what happens when you schedule something without providing a wall time? Does SGE impose a default limit? Are you scheduled with "infinity" time?

@davidr

davidr commented Oct 17, 2016

All the GE-derivatives I know of either a) refuse to schedule the job, and/or b) impose a default limit. My primary scheduler, for instance, refuses to schedule the job:

In [1]: from dask_drmaa import DRMAACluster
In [2]: from dask.distributed import Client
In [3]: cluster = DRMAACluster()
In [4]: client = Client(cluster)
In [5]: cluster.worker_template.nativeSpecification = "-l h_vmem=1G,estmem=1G"
In [6]: cluster.start_workers(1)
---------------------------------------------------------------------------
DeniedByDrmException                      Traceback (most recent call last)
<ipython-input-6-37720cbb2f2f> in <module>()
----> 1 cluster.start_workers(1)

/home/dressman/anaconda2/envs/dask/lib/python2.7/site-packages/dask_drmaa/core.pyc in start_workers(self, n)
     27
     28     def start_workers(self, n=1):
---> 29         ids = self.session.runBulkJobs(self.worker_template, 1, n, 1)
     30         self.workers.extend(ids)
     31

/home/dressman/anaconda2/envs/dask/lib/python2.7/site-packages/drmaa/session.pyc in runBulkJobs(jobTemplate, beginIndex, endIndex, step)
    338         the tasks submitted through this method.
    339         """
--> 340         return list(run_bulk_job(jobTemplate, beginIndex, endIndex, step))
    341
    342     # takes string and JobControlAction value, no return value

/home/dressman/anaconda2/envs/dask/lib/python2.7/site-packages/drmaa/helpers.pyc in run_bulk_job(jt, start, end, incr)
    280     jids = pointer(POINTER(drmaa_job_ids_t)())
    281     try:
--> 282         c(drmaa_run_bulk_jobs, jids, jt, start, end, incr)
    283         jid = create_string_buffer(_BUFLEN)
    284         while drmaa_get_next_job_id(jids.contents, jid,

/home/dressman/anaconda2/envs/dask/lib/python2.7/site-packages/drmaa/helpers.pyc in c(f, *args)
    297     managing code.
    298     """
--> 299     return f(*(args + (error_buffer, sizeof(error_buffer))))
    300
    301

/home/dressman/anaconda2/envs/dask/lib/python2.7/site-packages/drmaa/errors.pyc in error_check(code)
    149         error_string = "code {0}: {1}".format(code, error_buffer.value.decode())
    150         try:
--> 151             raise _ERRORS[code - 1](error_string)
    152         except IndexError:
    153             raise DrmaaException(error_string)

DeniedByDrmException: code 17: error: no suitable queues

If I add it, it schedules fine:

In [7]: cluster.worker_template.nativeSpecification = "-l h_vmem=1G,estmem=1G -l h_rt=1:0:0"
In [8]: cluster.start_workers(1)
In [9]: 

Incidentally, should start_workers() be returning job ids? From the README I gather it should, but I've spent absolutely no time verifying I haven't done something stupid. If the answer is obvious, ignore me. :)

@davidr

davidr commented Oct 17, 2016

That said, I think you're allowed to set a default wallclock time of "0:0:0", which I think would be infinite, but it's still an explicitly set default time, if that phrase makes any sense at all.

@jakirkham
Member

I think it should be possible to address this more generally as noted in issue ( #66 ). We can check the scheduler implementation through DRMAA and thus figure out what flags are appropriate to pass to nativeSpecification for the particular scheduler implementation. Thoughts welcome.
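One possible shape for that dispatch, as a rough sketch: drmaa-python exposes the DRM system name as `Session.drmsInfo`, and a helper could map it to a wall-time flag. The flag spellings and name matching below are illustrative assumptions, not a tested implementation:

```python
def walltime_flag(drms_info, walltime):
    """Return a nativeSpecification fragment requesting a wall time,
    chosen by the DRM system name reported through DRMAA.

    The flag spellings here are illustrative; sites may differ.
    """
    name = drms_info.lower()
    if "grid engine" in name or "gridengine" in name:
        return "-l h_rt=%s" % walltime      # SGE/UGE style
    if "slurm" in name:
        return "-t %s" % walltime           # Slurm style
    if "pbs" in name or "torque" in name:
        return "-l walltime=%s" % walltime  # PBS/Torque style
    raise ValueError("unknown DRM system: %r" % drms_info)

# Hypothetical usage with drmaa-python:
# with drmaa.Session() as s:
#     cluster.worker_template.nativeSpecification = \
#         walltime_flag(s.drmsInfo, "1:0:0")
```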
