Add support for checking several limits before running task on engine #827

kaazoo opened this Issue Sep 30, 2011 · 9 comments


kaazoo commented Sep 30, 2011

It would be nice to be able to define limits for a task, for example the operating system, a minimum amount of RAM, or a minimum number of CPU cores. If different computers are connected to ipcontroller, tasks would then only be queued to engines that are able to fully execute them.

minrk commented Sep 30, 2011

This is already possible with the current dependency mechanism - you can define perfectly arbitrary python functions, to be run on the engine, that will determine whether the job should run there:
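A minimal sketch of what such check functions could look like (not the original example from this comment). The names `engine_has_os` and `engine_has_minram` are taken from later in this thread, but their bodies here are assumptions, and the RAM check assumes a POSIX system where `os.sysconf` is available. Imports are placed inside each function so that the function does not rely on module-level names being present on the engine.

```python
def engine_has_os(os_name):
    """Return True if this machine reports the given OS name (e.g. 'Linux', 'Darwin')."""
    import platform
    return platform.system() == os_name


def engine_has_minram(min_bytes):
    """Return True if this machine has at least min_bytes of physical RAM.

    Assumption: POSIX-only. Returns False when the amount cannot be determined.
    """
    import os
    try:
        total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return False
    return total >= min_bytes


# Hypothetical usage with IPython.parallel and a running cluster:
#
#   from IPython.parallel import Client, depend
#   rc = Client()
#   lbview = rc.load_balanced_view()
#
#   @depend(engine_has_os, "Linux")
#   def task():
#       return "ran on a Linux engine"
#
#   ar = lbview.apply(task)
```

A check that returns False marks the dependency as unmet on that engine, so the scheduler can try the task elsewhere.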

kaazoo commented Oct 2, 2011

Thanks for the hint.
In the documentation (both stable and dev) the 'dependent' object is mentioned as an alternative to the '@depend' decorator:

You don’t have to use the decorators on your tasks, if for instance you may want 
to run tasks with a single function but varying dependencies, you can directly 
construct the dependent object that the decorators use:

It feels like there should have been some example, because of the ':', but instead the next chapter about 'Graph Dependencies' follows.
Could you please provide an example in the documentation?

I tried to add a dependency in the following way, but it doesn't seem to work:

dependent(DrQueue.run_script_with_env, self.engine_has_os, os_name)
ar = self.lbview.apply(DrQueue.run_script_with_env, render_script, env_dict)

minrk commented Oct 2, 2011

I will definitely mock up some examples, but the basic idea is that if your task raises an UnmetDependency error, it will halt, and try again somewhere else. The @depend/@require/dependent tools are just convenience functions that will raise this error after a simple check.

The dependent object doesn't modify the original function, it creates a new one:

run_script_on_os = dependent(run_script_with_env, engine_has_os, os_name)
ar = lbview.apply(run_script_on_os, render_script, env_dict)

If you want the function itself to always depend on something, you can use @depend:

@depend(engine_has_os, os_name)
def run_script_with_env(...):
    ...

lbview.apply(run_script_with_env, ...)

Or if you want to consider the dependency unmet somewhere in the middle of your task, you can just raise the error yourself:

def my_task():
    from IPython.parallel.error import UnmetDependency
    if condition:
        raise UnmetDependency
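
To illustrate the mechanism described above, here is a simplified stand-in written in plain Python. It is not IPython's implementation: the `Dependent` class and the local `UnmetDependency` exception are hypothetical substitutes that only mimic the behaviour described in this comment (wrap a function, run a check first, raise if the check fails).

```python
class UnmetDependency(Exception):
    """Local stand-in for IPython.parallel.error.UnmetDependency."""


class Dependent(object):
    """Simplified stand-in for the wrapper idea: check first, then call."""

    def __init__(self, f, df, *dargs, **dkwargs):
        self.f = f            # the real task
        self.df = df          # the dependency check
        self.dargs = dargs    # positional arguments for the check
        self.dkwargs = dkwargs

    def __call__(self, *args, **kwargs):
        # The original function is untouched; only this wrapper runs the check.
        if self.df(*self.dargs, **self.dkwargs) is False:
            raise UnmetDependency()
        return self.f(*args, **kwargs)


def double(x):
    return x * 2


def flag_is_set(flag):
    return flag


ok = Dependent(double, flag_is_set, True)
blocked = Dependent(double, flag_is_set, False)
```

Calling `ok(3)` runs `double` normally, while calling `blocked(3)` raises the stand-in `UnmetDependency`. In both cases `double` itself stays unchanged, which is why the wrapped object, not the original function, has to be passed to `apply`.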

kaazoo commented Oct 3, 2011

Thanks for your examples.
Is it also possible to build a 'dependency chain' like this?

dep_os = dependent(DrQueue.run_script_with_env, self.engine_has_os, os_name)
dep_minram = dependent(dep_os, self.engine_has_minram, minram)
dep_mincores = dependent(dep_minram, self.engine_has_mincores, mincores)
dep_pool = dependent(dep_mincores, self.engine_is_in_pool, pool_name)

ar = self.lbview.apply(dep_pool, render_script, env_dict)

Is it also possible to refer to the current engine_id?

kaazoo commented Oct 3, 2011

Now I tried it with only one dependent function which wraps the other functions:

run_script_with_env_and_deps = dependent(DrQueue.run_script_with_env, self.check_deps, os_name, minram, mincores, pool_name)
ar = self.lbview.apply(run_script_with_env_and_deps, render_script, env_dict)

check_deps function:

def check_deps(self, os_name, minram, mincores, pool_name):
    # Fail as soon as any single requirement is not met.
    if not self.engine_has_os(os_name):
        return False
    elif not self.engine_has_minram(minram):
        return False
    elif not self.engine_has_mincores(mincores):
        return False
    elif not self.engine_is_in_pool(pool_name):
        return False
    # All checks passed.
    return True

IPython gives me the following exception:

Traceback (most recent call last):
  File "", line 100, in <module>
  File "", line 71, in main
  File "/Users/kaazoo/Documents/Entwicklung/drqueue-entwicklung/drqueue-ipython/DrQueue/", line 217, in job_run
    ar = self.lbview.apply(run_script_with_env_and_deps, render_script, env_dict)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/client/", line 209, in apply
    return self._really_apply(f, args, kwargs)
  File "<string>", line 2, in _really_apply
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/client/", line 57, in sync_results
    ret = f(self, *args, **kwargs)
  File "<string>", line 2, in _really_apply
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/client/", line 46, in save_ids
    ret = f(self, *args, **kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/client/", line 980, in _really_apply
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/client/", line 992, in send_apply_message
    bufs = util.pack_apply_message(f,args,kwargs)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/parallel/", line 267, in pack_apply_message
    msg = [pickle.dumps(can(f),-1)]
cPickle.PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

minrk commented Oct 3, 2011

instancemethods aren't picklable - this is happening because you are trying to execute instance methods remotely, which doesn't work, as that would require sending the entire instance, which you probably don't want to do. I don't know enough about your code, but it's possible that using @staticmethod will solve that problem.

Do these methods really need to have references to the self instance? If not, they should be static methods, or even functions defined either at the module-level or at runtime.

It is possible to access the engine_id. The best way is to set the engine ids from the client, and then get it out of globals(), but if you can't rely on that, you can get it from the application:

from IPython.config.application import Application
eid = Application.instance().engine.id

I should probably initialize the user_ns with the id, though.

kaazoo commented Oct 3, 2011

Thanks a lot. Moving the functions to the main module instead of using instance methods makes it work.
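
For other readers, a rough sketch of what a module-level combined check might look like. This is an assumption about the general shape, not the actual DrQueue code: it only covers the OS and core-count checks, and it keeps everything inside one function with local imports so that nothing depends on an instance or on names defined elsewhere.

```python
def check_deps(os_name, mincores):
    """Return True only if this machine meets every requirement."""
    # Local imports: the function carries no references to an instance
    # or to module-level globals.
    import multiprocessing
    import platform

    if platform.system() != os_name:
        return False
    if multiprocessing.cpu_count() < mincores:
        return False
    return True


# Hypothetical usage, mirroring the call shown earlier in this thread:
#
#   run_with_deps = dependent(run_script_with_env, check_deps, os_name, mincores)
#   ar = lbview.apply(run_with_deps, render_script, env_dict)
```

Because `check_deps` is a plain function rather than a bound method, it avoids the `instancemethod` pickling error shown above.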

kaazoo commented Oct 3, 2011

Getting the current engine_id also works. If it isn't documented already, it would be worth documenting for other users.

kaazoo closed this Oct 4, 2011

Now I can't obtain the engine_id when I execute my program. How can I obtain it from this method, Application.instance()?
