Fix backpressure handling of queued actor pool tasks #34254
Conversation
Signed-off-by: Eric Liang <ekhliang@gmail.com>
@@ -445,6 +453,17 @@ def incremental_resource_usage(self) -> ExecutionResources:
        """
        return ExecutionResources()

    def notify_resource_usage(
This is kind of gross, but not sure of a better way to still trigger autoscaling.
overall lgtm, but will leave to @c21 for a more thorough review
Args:
    input_queue_size: The number of inputs queued outside this operator.
    under_resource_limits: Whether this operator is under resource limits.
just for my understanding, how come the operator needs to be told if it's under resource limits instead of knowing this information itself?
This is because the limit is across all operators, so only the executor knows the full picture.
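To illustrate the point above, here is a hypothetical minimal sketch (not the actual Ray implementation; all names beyond `notify_resource_usage` and its two arguments are made up) of the executor pushing its global view down into an operator:

```python
# Hypothetical sketch, not the actual Ray code: the executor computes
# backpressure across ALL operators and pushes each operator's slice of
# that picture down via notify_resource_usage().

class Operator:
    def __init__(self) -> None:
        self._inputs_paused = False

    def notify_resource_usage(
        self, input_queue_size: int, under_resource_limits: bool
    ) -> None:
        """Called by the executor; the operator cannot compute this itself.

        Args:
            input_queue_size: The number of inputs queued outside this operator.
            under_resource_limits: Whether this operator is under resource limits.
        """
        # Stop dispatching queued work while over the (globally computed) limits.
        self._inputs_paused = input_queue_size > 0 and not under_resource_limits


op = Operator()
op.notify_resource_usage(input_queue_size=90, under_resource_limits=False)
print(op._inputs_paused)  # True: 90 inputs queued while over limits
```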
Suggested change:
-        under_resource_limits: Whether this operator is under resource limits.
+        under_resource_limits: Whether all operators are under resource limits.
this still indicates whether this particular operator is under resource limits, right? Is Eric saying above that the executor is the one who knows about all operators' limits, but would also be the one to dictate whether this particular operator is under resource limits?
Yeah, I think it's still about this particular operator. It's possible for some operators to be under limits but others to be above. For example, we count all downstream operator memory usage towards the "effective usage" of an operator.
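A toy numeric sketch of that accounting (illustrative only; the real accounting in Ray Data is more involved): an operator's "effective usage" includes all downstream operators' memory, so an upstream operator can be over its limit while downstream operators are under theirs.

```python
# Illustrative only: charge each operator for its own memory plus all
# downstream operators' memory ("effective usage").

def effective_usage_gb(op_usages_gb, op_index):
    """Effective usage of the operator at op_index, given usages listed
    in upstream-to-downstream order."""
    return sum(op_usages_gb[op_index:])


# An upstream map uses only 1 GB itself, but its downstream consumers hold
# 100 GB and 99 GB of outputs, so it is charged 200 GB overall.
usages = [1, 100, 99]
print(effective_usage_gb(usages, 0))  # 200: over a 100 GB per-operator limit
print(effective_usage_gb(usages, 2))  # 99: the last operator is under it
```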
@@ -471,7 +499,7 @@ class _ActorPool:
     actors when the operator is done submitting work to the pool.
     """

-    def __init__(self, max_tasks_in_flight: int = float("inf")):
+    def __init__(self, max_tasks_in_flight: int = 4):
is this the same value as DEFAULT_MAX_TASKS_IN_FLIGHT? If so, any way we can use the same constant?
Done.
Why are these changes needed?
There is a bug in the backpressure implementation with regard to actor pools: once a task is queued for an actor pool, it is no longer subject to backpressure. This is problematic when the output size of a task is much bigger than the input size. In this situation, the actor pool will keep executing tasks (converting small objects into larger objects), even when this would grossly exceed memory limits.
Put another way: this PR fixes the issue where the streaming executor queues tasks on an actor pool operator, but later wants to "take them back" due to unexpectedly high memory usage. It avoids the issue by not queueing tasks that won't be immediately executed (so they never need to be taken back).
Example:
1. Suppose there is an actor pool of size 10, each actor of which can run 1 active task.
2. Each input task is size 1GB. The memory limit is 100GB, so we add 100 of these inputs to the actor pool operator.
3. When the first tasks run, they expand into 100GB of output. Now the overall memory usage is 200GB (2x over our limit!).
4. However, since we already added those 100 inputs to the actor pool, there is no way for the streaming scheduler to pause execution of the 90 remaining queued inputs.
5. The 90 queued inputs execute, and we end up using 1TB, or 10x our intended memory limit.
We need to check the memory limit right before executing a task in the actor pool; one way of doing this is to eliminate the internal queue in the actor pool operator and instead always queue work outside the operator.
TODO:
- [x] Performance testing
- [x] Unit tests
- [x] Perf test final version
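The fix can be sketched with a toy scheduler (all names and numbers are hypothetical, loosely mirroring the example above; this is not the actual Ray implementation): inputs stay queued outside the pool, and a task is admitted only when an actor slot is free and the memory budget still allows its estimated output.

```python
from collections import deque

# Toy model of the fix: hold inputs OUTSIDE the actor pool and check the
# memory budget right before dispatching each task, so queued work stays
# subject to backpressure instead of becoming uncancellable.

class ToyActorPoolScheduler:
    def __init__(self, free_slots: int, memory_limit_gb: float) -> None:
        self.free_slots = free_slots
        self.memory_limit_gb = memory_limit_gb
        self.memory_used_gb = 0.0
        self.queue = deque()  # inputs held outside the pool
        self.running = []

    def submit(self, est_output_gb: float) -> None:
        self.queue.append(est_output_gb)

    def dispatch(self) -> int:
        """Admit queued tasks, checking limits right before execution."""
        started = 0
        while (
            self.queue
            and self.free_slots > 0
            and self.memory_used_gb + self.queue[0] <= self.memory_limit_gb
        ):
            est = self.queue.popleft()
            self.memory_used_gb += est
            self.free_slots -= 1
            self.running.append(est)
            started += 1
        return started


sched = ToyActorPoolScheduler(free_slots=10, memory_limit_gb=100)
for _ in range(100):
    sched.submit(10.0)   # each task's output estimated at 10 GB
print(sched.dispatch())  # 10: only the budget's worth of tasks start
print(len(sched.queue))  # 90: the rest remain queued, still pausable
```

Here the 90 leftover inputs never enter the pool, so the executor can keep (or stop) dispatching them based on live memory usage.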
Related issue number
Closes #34041