TaskScheduler.hwm default to 1 instead of 0 #1294

Merged
merged 2 commits into from

2 participants

Min RK Fernando Perez
Min RK
Owner

A default of 1 gives more predictable, intuitive behavior, even if it is often slower, and is thus a more logical default.

closes #1293

minrk added some commits
Min RK minrk change TaskScheduler.hwm default to 1 from 0
0 should be considered an optimization, and 1 is less likely to confuse new users.
c708a85
Min RK minrk document new default hwm value d2d5e36
Fernando Perez fperez merged commit 0cde7de into from
Fernando Perez
Owner

Great, thanks. Just merged.

Commits on Jan 19, 2012
  1. Min RK

    change TaskScheduler.hwm default to 1 from 0

    minrk authored
    0 should be considered an optimization, and 1 is less likely to confuse new users.
Commits on Jan 20, 2012
  1. Min RK

    document new default hwm value

    minrk authored
16 IPython/parallel/controller/scheduler.py
@@ -131,13 +131,23 @@ class TaskScheduler(SessionFactory):
"""
- hwm = Integer(0, config=True, shortname='hwm',
+ hwm = Integer(1, config=True,
help="""specify the High Water Mark (HWM) for the downstream
socket in the Task scheduler. This is the maximum number
- of allowed outstanding tasks on each engine."""
+ of allowed outstanding tasks on each engine.
+
+ The default (1) means that only one task can be outstanding on each
+ engine. Setting TaskScheduler.hwm=0 means there is no limit, and the
+ engines continue to be assigned tasks while they are working,
+ effectively hiding network latency behind computation, but can result
+ in an imbalance of work when submitting many heterogenous tasks all at
+ once. Any positive value greater than one is a compromise between the
+ two.
+
+ """
)
scheme_name = Enum(('leastload', 'pure', 'lru', 'plainrandom', 'weighted', 'twobin'),
- 'leastload', config=True, shortname='scheme', allow_none=False,
+ 'leastload', config=True, allow_none=False,
help="""select the task scheduler scheme [default: Python LRU]
Options are: 'pure', 'lru', 'plainrandom', 'weighted', 'twobin','leastload'"""
)
11 docs/source/parallel/parallel_task.txt
@@ -415,11 +415,11 @@ assigned to an engine at a given time. This limit is set with the
.. sourcecode:: python
# the most common choices are:
- c.TaskScheduler.hwm = 0 # (minimal latency, default)
+ c.TaskScheduler.hwm = 0 # (minimal latency, default in IPython ≤ 0.12)
# or
- c.TaskScheduler.hwm = 1 # (most-informed balancing)
+ c.TaskScheduler.hwm = 1 # (most-informed balancing, default in > 0.12)
-The default is 0, or no-limit. That is, there is no limit to the number of
+In IPython ≤ 0.12, the default is 0, or no-limit. That is, there is no limit to the number of
tasks that can be outstanding on a given engine. This greatly benefits the
latency of execution, because network traffic can be hidden behind computation.
However, this means that workload is assigned without knowledge of how long
@@ -429,6 +429,11 @@ effect by setting hwm to a positive integer, 1 being maximum load-balancing (a
task will never be waiting if there is an idle engine), and any larger number
being a compromise between load-balance and latency-hiding.
+In practice, some users have been confused by having this optimization on by
+default, and the default value has been changed to 1. This can be slower,
+but has more obvious behavior and won't result in assigning too many tasks to
+some engines in heterogeneous cases.
+
Pure ZMQ Scheduler
------------------
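
The load-balancing side of the hwm trade-off described in the diff above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not IPython's scheduler: the `makespan` function and its parameters are hypothetical names, tasks are all submitted at once, each task goes to the engine with the fewest outstanding tasks, engines run their tasks in order, and network latency is ignored entirely (so the benefit that hwm=0 provides is deliberately not modeled).

```python
import heapq
from collections import deque


def makespan(durations, n_engines, hwm):
    """Toy model: time at which the last task finishes for a given hwm.

    hwm=0 means no limit on outstanding tasks per engine; a positive hwm
    caps the number of assigned-but-unfinished tasks on each engine.
    This is an illustrative sketch, not IPython's actual scheduler.
    """
    limit = hwm if hwm > 0 else float("inf")
    pending = deque(durations)       # tasks not yet assigned to any engine
    outstanding = [0] * n_engines    # assigned-but-unfinished tasks per engine
    free_at = [0.0] * n_engines      # time each engine finishes its current queue
    finishes = []                    # min-heap of (finish_time, engine)
    now = 0.0

    while pending or finishes:
        # Hand out tasks to the least-loaded engine until every engine is at the limit.
        while pending:
            engine = min(range(n_engines), key=outstanding.__getitem__)
            if outstanding[engine] >= limit:
                break
            duration = pending.popleft()
            start = max(now, free_at[engine])
            free_at[engine] = start + duration
            outstanding[engine] += 1
            heapq.heappush(finishes, (free_at[engine], engine))
        # Advance the clock to the next task completion.
        if finishes:
            now, engine = heapq.heappop(finishes)
            outstanding[engine] -= 1
    return now


# One long task followed by seven short ones, on two engines.
tasks = [8, 1, 1, 1, 1, 1, 1, 1]
for hwm in (1, 2, 0):
    print("hwm=%d -> makespan %.0f" % (hwm, makespan(tasks, 2, hwm)))
# hwm=1 -> makespan 8
# hwm=2 -> makespan 9
# hwm=0 -> makespan 11
```

With hwm=0 the short tasks are queued behind the long one before anything is known about run times, which is the imbalance the new default avoids. Because this model has no latency term, it shows only the cost of hwm=0 and none of its latency-hiding benefit.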