Add support for creation of parallel task when no engine is running #826

Closed
kaazoo opened this issue Sep 30, 2011 · 9 comments · Fixed by #1391

kaazoo commented Sep 30, 2011

IPython 0.11 complains when I try to create a parallel task while no engines are running. This makes sense for users who need results directly after they create a task, but there are also users who would like to create tasks in advance, before any engines are started. These tasks could be stored in the ipcontroller backend database; when the first engine connects to ipcontroller, it could start processing those tasks.

minrk commented Sep 30, 2011

This is already true in trunk for load-balanced tasks. Obviously, you can't send jobs to particular engines that don't exist.
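For illustration, a minimal sketch (not from this thread) of submitting a load-balanced task before any engines have started. It assumes a running ipcontroller; Client, load_balanced_view, and apply_async are the real IPython.parallel API, while slow_job is a hypothetical function:

from IPython.parallel import Client

rc = Client()                   # connect to a running ipcontroller
view = rc.load_balanced_view()  # tasks go to whichever engine is free

def slow_job(x):
    return x * 2

# With no engines registered, the task is queued rather than rejected.
ar = view.apply_async(slow_job, 21)

# Later, once the first engine connects and runs it:
print(ar.get())  # 42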

kaazoo commented Oct 4, 2011

Thanks. This works with the 0.12 branch.

kaazoo closed this as completed Oct 4, 2011
kaazoo reopened this Feb 6, 2012
kaazoo commented Feb 6, 2012

With IPython 0.12, I just noticed that I can't get information via rc.db_query() about tasks that were created while no engine was running until I start an engine. Starting the first engine seems to trigger ipcontroller to make that information available.

Is this intended?
I would assume that all tasks which haven't yet been queued to an engine would have a 'pending' status.
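For concreteness, a minimal sketch of the kind of query in question, assuming a connected Client rc; hub_history and db_query are the relevant Client methods:

from IPython.parallel import Client

rc = Client()
msg_ids = rc.hub_history()  # every msg_id the Hub has seen

# Fetch selected fields for those tasks from the Hub's database.
records = rc.db_query({'msg_id': {'$in': msg_ids}},
                      keys=['msg_id', 'submitted', 'completed'])
for rec in records:
    print(rec['msg_id'], rec['completed'])  # 'completed' is None while pending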

minrk commented Feb 6, 2012

Hmm.

The TaskScheduler disables the on_recv callback from clients when there are no engines, since it is certain that tasks cannot be dispatched. This means they sit in the ØMQ queue and never trigger the callback that notifies the Hub, which is what puts the request in the database. I imagine I can adjust it so that tasks are pulled into Python instead of being left in the buffer.
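To make the mechanism concrete, a standalone pyzmq sketch of the pause/resume pattern described above; the socket address and callback name are illustrative, not the scheduler's actual code:

import zmq
from zmq.eventloop import ioloop, zmqstream

ctx = zmq.Context()
sock = ctx.socket(zmq.PULL)
sock.bind('tcp://127.0.0.1:5555')
stream = zmqstream.ZMQStream(sock)

def dispatch(msg):
    print('delivered to Python:', msg)

# "Paused": no callback registered, so incoming messages accumulate in
# the ZMQ buffer and never reach Python (compare stop_receiving below).
stream.on_recv(None)

# "Resumed": registering a callback delivers the buffered messages
# (compare resume_receiving below).
stream.on_recv(dispatch)

ioloop.IOLoop.instance().start()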

kaazoo commented Feb 6, 2012

Can you reproduce this behaviour? If you need further details, let me know.

minrk commented Feb 6, 2012

Sorry if I wasn't clear; I was trying to explain why you see what you are seeing. The Hub (the process that maintains the DB) will not get tasks submitted while there are no engines, because the TaskScheduler never calls recv, leaving messages in the upstream ØMQ buffer. Essentially, the queue processing is paused slightly further upstream than you would like. I will look into addressing this.

kaazoo commented Feb 8, 2012

Thank you for your explanation. Is there any workaround for this?

kaazoo commented Feb 8, 2012

Modifying IPython/parallel/controller/scheduler.py made it work, but this is obviously not the solution:

--- ../ipython-git/IPython/parallel/controller/scheduler.py 2011-12-04 21:36:57.000000000 +0100
+++ IPython/parallel/controller/scheduler.py    2012-02-08 20:35:49.000000000 +0100
@@ -181,6 +181,7 @@

     def start(self):
         self.engine_stream.on_recv(self.dispatch_result, copy=False)
+        self.client_stream.on_recv(self.dispatch_submission, copy=False)
         self._notification_handlers = dict(
             registration_notification = self._register_engine,
             unregistration_notification = self._unregister_engine
@@ -192,12 +193,12 @@

     def resume_receiving(self):
         """Resume accepting jobs."""
-        self.client_stream.on_recv(self.dispatch_submission, copy=False)
+        #self.client_stream.on_recv(self.dispatch_submission, copy=False)

     def stop_receiving(self):
         """Stop accepting jobs while there are no engines.
         Leave them in the ZMQ queue."""
-        self.client_stream.on_recv(None)
+        #self.client_stream.on_recv(None)

     #-----------------------------------------------------------------------
     # [Un]Registration Handling

minrk commented Feb 8, 2012

I think that is approximately the solution you are looking for (I'll open a PR shortly). There certainly isn't a workaround with the code as it is, because it is functioning exactly as designed: the Scheduler is entirely halted while no engines are registered. But with the changes to the scheduler supporting dependencies, etc., I think this should work fine with only minor tweaks.

minrk closed this as completed in 821fac2 Feb 9, 2012
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this issue Nov 3, 2014
Improve Hub/Scheduler when no engines are registered

1. Tasks are pulled into the scheduler, rather than left on the ZMQ queue, which means they enter the database.
2. queue_status will not raise NoEngines when there aren't any, instead it will still fetch the available information.

Bug fixed in db_query, where behavior did not match docstring (buffers should be excluded if no keys are specified).

closes ipython#826 (again)
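As a rough illustration of point 2, assuming a connected Client rc (the exact output shape here is only an example):

from IPython.parallel import Client

rc = Client()
# With no engines registered, this used to raise NoEngines; after the
# fix it returns what the Hub knows, including unassigned task counts.
print(rc.queue_status())  # e.g. {'unassigned': 3}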