TaskScheduler.hwm default to 1 instead of 0 #1294

Merged
merged 2 commits into from

2 participants

@minrk
Owner

hwm=1 has more predictable, intuitive behavior, though it is often slower, and is thus a more logical default.

closes #1293

minrk added some commits
@minrk minrk change TaskScheduler.hwm default to 1 from 0
0 should be considered an optimization, and 1 is less likely to confuse new users.
c708a85
@minrk minrk document new default hwm value d2d5e36
@fperez fperez merged commit 0cde7de into from
@fperez
Owner

Great, thanks. Just merged.

Commits on Jan 19, 2012
  1. @minrk

    change TaskScheduler.hwm default to 1 from 0

    minrk authored
    0 should be considered an optimization, and 1 is less likely to confuse new users.
Commits on Jan 20, 2012
  1. @minrk

    document new default hwm value

    minrk authored
16 IPython/parallel/controller/scheduler.py
@@ -131,13 +131,23 @@ class TaskScheduler(SessionFactory):
"""
- hwm = Integer(0, config=True, shortname='hwm',
+ hwm = Integer(1, config=True,
help="""specify the High Water Mark (HWM) for the downstream
socket in the Task scheduler. This is the maximum number
- of allowed outstanding tasks on each engine."""
+ of allowed outstanding tasks on each engine.
+
+ The default (1) means that only one task can be outstanding on each
+ engine. Setting TaskScheduler.hwm=0 means there is no limit, and the
+ engines continue to be assigned tasks while they are working,
+ effectively hiding network latency behind computation, but can result
+ in an imbalance of work when submitting many heterogenous tasks all at
+ once. Any positive value greater than one is a compromise between the
+ two.
+
+ """
)
scheme_name = Enum(('leastload', 'pure', 'lru', 'plainrandom', 'weighted', 'twobin'),
- 'leastload', config=True, shortname='scheme', allow_none=False,
+ 'leastload', config=True, allow_none=False,
help="""select the task scheduler scheme [default: Python LRU]
Options are: 'pure', 'lru', 'plainrandom', 'weighted', 'twobin','leastload'"""
)
11 docs/source/parallel/parallel_task.txt
@@ -415,11 +415,11 @@ assigned to an engine at a given time. This limit is set with the
.. sourcecode:: python
# the most common choices are:
- c.TaskSheduler.hwm = 0 # (minimal latency, default)
+ c.TaskScheduler.hwm = 0 # (minimal latency, default in IPython ≤ 0.12)
# or
- c.TaskScheduler.hwm = 1 # (most-informed balancing)
+ c.TaskScheduler.hwm = 1 # (most-informed balancing, default in > 0.12)
-The default is 0, or no-limit. That is, there is no limit to the number of
+In IPython ≤ 0.12, the default is 0, or no-limit. That is, there is no limit to the number of
tasks that can be outstanding on a given engine. This greatly benefits the
latency of execution, because network traffic can be hidden behind computation.
However, this means that workload is assigned without knowledge of how long
@@ -429,6 +429,11 @@ effect by setting hwm to a positive integer, 1 being maximum load-balancing (a
task will never be waiting if there is an idle engine), and any larger number
being a compromise between load-balance and latency-hiding.
+In practice, some users have been confused by having this optimization on by
+default, and the default value has been changed to 1. This can be slower,
+but has more obvious behavior and won't result in assigning too many tasks to
+some engines in heterogeneous cases.
+
Pure ZMQ Scheduler
------------------
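The trade-off described in the new help text and docs can be illustrated with a toy simulation. This is only a sketch, not IPython's scheduler: the functions `makespan_unlimited` and `makespan_hwm1` are hypothetical names, network latency is ignored (so the latency-hiding benefit of hwm=0 does not show up), and the hwm=0 case assumes each task is handed to the engine with the fewest outstanding tasks at submission time.

```python
import heapq


def makespan_unlimited(durations, n_engines):
    """Toy model of hwm=0: every task is assigned at submission time,
    before any durations are known, to the engine holding the fewest tasks."""
    loads = [0.0] * n_engines   # total work queued on each engine
    counts = [0] * n_engines    # number of tasks queued on each engine
    for d in durations:
        i = counts.index(min(counts))  # ties go to the lowest index
        counts[i] += 1
        loads[i] += d
    return max(loads)


def makespan_hwm1(durations, n_engines):
    """Toy model of hwm=1: an engine holds one task at a time, so the
    next task goes to whichever engine becomes idle first."""
    free_at = [0.0] * n_engines  # time at which each engine is next idle
    heapq.heapify(free_at)
    for d in durations:
        t = heapq.heappop(free_at)      # earliest idle engine
        heapq.heappush(free_at, t + d)  # busy until this task finishes
    return max(free_at)


# Heterogeneous workload (hypothetical numbers): two long tasks among short ones.
tasks = [10, 1, 1, 1, 10, 1, 1, 1]
print(makespan_unlimited(tasks, 2))  # 22.0
print(makespan_hwm1(tasks, 2))       # 13.0
```

With these made-up durations, assigning everything up front puts both long tasks on the same engine (finishing at 22.0), while the one-task-at-a-time model finishes at 13.0, which is the "imbalance of work" the help text warns about. A real run would also pay a per-task round trip under hwm=1, which this model leaves out.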