fix some O(N) and O(N^2) operations in #3108

minrk merged 8 commits into ipython:master from minrk:fastmap Apr 11, 2013



minrk commented Mar 28, 2013

Some code left over from (presumably) when the map partitioning objects
were stateful made the current implementation O(N^2),
and thus crazy slow for much more than a thousand tasks.
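
A minimal sketch of the kind of pattern that turns partitioning into an O(N^2) operation, and the cheaper alternative. This is not the actual IPython code: the function names `partitions_quadratic` and `partitions_linear` are hypothetical, and the sketch assumes the cost came from rebuilding a full list for every partition requested.

```python
def partitions_quadratic(seq, q):
    """Split seq into q chunks, copying the whole input for every chunk.

    Each chunk costs O(N), so with q proportional to N (one task per
    element) the total work grows as O(N^2).
    """
    size = -(-len(seq) // q)  # ceiling division
    parts = []
    for p in range(q):
        items = list(seq)  # full copy on every iteration: the wasted work
        parts.append(items[p * size:(p + 1) * size])
    return parts


def partitions_linear(seq, q):
    """Same chunks, but each one is sliced directly from the input.

    Total work is O(N), since every element is copied exactly once.
    """
    size = -(-len(seq) // q)  # ceiling division
    return [seq[p * size:(p + 1) * size] for p in range(q)]


data = list(range(10))
print(partitions_linear(data, 3))
```

Both functions return the same chunks; only the amount of copying differs.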

A decorator has also been fixed to avoid calling itself more than once in a given context.
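
A rough sketch of how a decorator can notice that it is nested and only do its expensive work after the outermost call. The class `DemoClient`, the flag `_in_sync_results`, and the counter `sync_count` are made up for illustration and may not match the real client.

```python
import functools


def sync_results(method):
    """Run self._sync_results() once, after the outermost decorated call."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if self._in_sync_results:
            # Already inside a decorated call: let the outer call do the sync.
            return method(self, *args, **kwargs)
        self._in_sync_results = True
        try:
            result = method(self, *args, **kwargs)
        finally:
            self._in_sync_results = False
        self._sync_results()
        return result
    return wrapper


class DemoClient:
    """Hypothetical stand-in for an object with decorated methods."""

    def __init__(self):
        self._in_sync_results = False
        self.sync_count = 0

    def _sync_results(self):
        self.sync_count += 1  # stands in for the expensive sync step

    @sync_results
    def get_one(self, i):
        return i

    @sync_results
    def get_many(self, n):
        # Calls another decorated method n times; without the nesting check
        # this would trigger n + 1 syncs instead of 1.
        return [self.get_one(i) for i in range(n)]


client = DemoClient()
client.get_many(100)
print(client.sync_count)
```

The flag is reset in a `finally` block so that an exception inside the wrapped method does not leave the object permanently marked as "inside a call".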

An O(N) operation when TaskScheduler.hwm != 0 has also been fixed.
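
To illustrate the idea only: a toy scheduler that checks whether every engine is already at the high-water mark before touching the queue. `MiniScheduler` and all of its attributes are hypothetical and much simpler than the real `TaskScheduler`; in particular, the real pass over the queue would also check each task's dependencies.

```python
from collections import deque


class MiniScheduler:
    """Toy scheduler: each engine may hold at most hwm tasks (0 = no limit)."""

    def __init__(self, engines, hwm=1):
        self.hwm = hwm
        self.loads = {engine: 0 for engine in engines}
        self.queue = deque()
        self.queue_passes = 0  # how many times update_queue() was entered

    def all_full(self):
        """True when no engine can accept another task."""
        if self.hwm == 0:
            return False
        return all(load >= self.hwm for load in self.loads.values())

    def submit(self, task):
        self.queue.append(task)
        # Cheap guard: skip the queue pass when nothing could run anyway.
        if not self.all_full():
            self.update_queue()

    def update_queue(self):
        self.queue_passes += 1
        while self.queue and not self.all_full():
            engine = min(self.loads, key=self.loads.get)  # least-loaded engine
            self.loads[engine] += 1
            self.queue.popleft()

    def finish(self, engine):
        self.loads[engine] -= 1
        self.update_queue()


sched = MiniScheduler(["a", "b"], hwm=1)
for task_id in range(100):
    sched.submit(task_id)
print(sched.queue_passes, len(sched.queue))
```

With two engines and `hwm=1`, only the first two submissions go through the queue; the remaining 98 are appended without any further work until an engine frees up.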

Illustration of performance differences:

closes #3106

minrk added some commits Mar 28, 2013

@minrk minrk don't create lists
This is crazy, and makes getting all partitions of a sequence an n^2 operation
@minrk minrk notice nesting of `sync_results` decorator
ensures the sync operation only fires once, after the outermost call.

avoids O(n) calls to set.difference

minrk added some commits Mar 29, 2013

@minrk minrk check whether all engines are at HWM in a few places
avoids calling maybe_run on the entire queue when we know that no tasks can be run.
This caused an expensive O(N) operation every time an engine just became not-full,
which is every time a task finishes in the default case of HWM=1.
@minrk minrk use heap-sorted queue
avoids having to sort the queue when updating the graph

also renamed `depending` to `queue_map`, to better describe its role.
@minrk minrk add debug log when a task is added to the queue 6d459fc
@minrk minrk use per-timeout callback, rather than audit for timeouts 6caa08f
@minrk minrk use deque instead of heapq fa8f1e1
@minrk minrk remove accidental debug statement 38272bf

minrk merged commit a4d5e24 into ipython:master Apr 11, 2013

1 check passed

default The Travis build passed

minrk deleted the minrk:fastmap branch Apr 11, 2013
