
Scheduler optimization #185

Merged
merged 2 commits into from Jul 10, 2023

Conversation


@achoum achoum commented Jul 6, 2023

  • Release unused memory during evaluation
  • Deterministic op evaluation order

@achoum achoum marked this pull request as ready for review July 6, 2023 11:26
@achoum achoum requested a review from ianspektor July 6, 2023 11:26
Comment on lines +91 to +92
self._internal_ordered_id = Operator.next_internal_id
Operator.next_internal_id += 1
Collaborator
Is this necessary? Could we instead ensure that graph.operators is always in the same order, e.g. by using an ordered set instead of a set for the graph's _operators, _nodes, etc. while building the schedule (an ordered set in Python is a dict with no values :) )? Not sure if this would suffice?
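As a sketch of the "ordered set" idea mentioned above (hypothetical operator names; the real containers hold `Operator` objects, not strings):

```python
# An "ordered set" in Python: a dict whose values are unused.
# dict preserves insertion order (guaranteed since Python 3.7).
ordered_ops = {}  # hypothetical stand-in for the graph's _operators

for op_name in ["cast", "filter", "resample", "filter"]:
    ordered_ops[op_name] = None  # the duplicate "filter" is ignored

print(list(ordered_ops))  # → ['cast', 'filter', 'resample']
```

Iteration order is then the insertion order, which would make `graph.operators` deterministic without a separate counter.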

Collaborator Author

The alternative solution would be to replace some of the set inputs in the API and intermediate functions with lists. I don't see that as ideal.

Another benefit of this change is that the "id" is now a small number that is easier for people to read in error messages. Ideally, I would like to update all the "ids" to use this mechanism.

A third benefit is that id() is not guaranteed to be unique for two objects with non-overlapping lifetimes. This is not an issue in our current code, but it is not a great property for what we use it for.

See https://docs.python.org/3/library/functions.html#id
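A minimal sketch of the counter mechanism discussed above (simplified class; the real `Operator` lives in the project and does much more): each instance receives a small, monotonically increasing id that stays unique even across non-overlapping object lifetimes, unlike `id()`.

```python
class Operator:
    """Sketch: assign each instance a small, sequential, process-unique id."""

    next_internal_id = 0

    def __init__(self):
        # Snapshot the class-level counter, then advance it.
        self._internal_ordered_id = Operator.next_internal_id
        Operator.next_internal_id += 1


a, b, c = Operator(), Operator(), Operator()
print(a._internal_ordered_id, b._internal_ordered_id, c._internal_ordered_id)
# → 0 1 2
```

Because the counter only ever increases, deleting `a` and creating a new `Operator` can never produce a clash, whereas CPython may reuse the `id()` of a dead object for a new one.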

Collaborator

@DonBraulio DonBraulio left a comment

LGTM.
Nice work on the memory release.
Left a comment because I don't understand why sorting only the first input ops is relevant.

@@ -208,7 +208,8 @@ def build_schedule(
# Operators ready to be computed (i.e. ready to be added to "planned_ops")
# as all their inputs are already computed by "planned_ops" or specified by
# "inputs".
ready_ops: Set[Operator] = set()
ready_ops: List[Operator] = []
ready_ops_set: Set[Operator] = set()
Collaborator
I think it's safe to remove ready_ops_set now that we have ready_ops. I don't see its purpose.

Collaborator Author

"ready_ops" and "ready_ops_set" contain the same data. Those two containers have different properties and costs (one is a list, the other is a set). The list is great for a FILO, while the set is great to check for presence of an item.

I think it's safe to remove ready_ops_set now we got ready_ops

Do you mean there is a situation where this code would not work? If so, can you give details?
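The dual-container pattern described above can be sketched as follows (hypothetical string values; the real code stores `Operator` objects):

```python
ready_ops = []         # list: cheap ordered append/pop, used as a stack
ready_ops_set = set()  # set: O(1) membership tests


def push(op):
    # Enqueue an op only once, using the set for the membership check.
    if op not in ready_ops_set:
        ready_ops.append(op)
        ready_ops_set.add(op)


for op in ["a", "b", "a", "c"]:
    push(op)

op = ready_ops.pop()   # LIFO: returns "c"
ready_ops_set.remove(op)
```

Checking `op not in ready_ops` on the list alone would be O(n) per lookup; keeping the set alongside the list gives both fast membership tests and a deterministic pop order.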

# Make evaluation order deterministic.
#
# Execute the op with smallest internal ordered id first.
ready_ops.sort(key=lambda op: op._internal_ordered_id, reverse=True)
Collaborator

@DonBraulio DonBraulio Jul 6, 2023

This only affects the ops that depend directly on the inputs (the ops currently in ready_ops) and executes them in instantiation order, right?
I'm not sure why this is better for memory release.

Collaborator Author

This sort ensures that the execution order of the operators is deterministic. It has no impact on the results. However, we have unit tests that check some of the internals that were affected by the non-deterministic evaluation order. With this change, those unit tests are simpler :)

While the results are not affected, the order of execution can change the speed and RAM usage of executing the graph. Making the execution order deterministic reduces the risk of flakiness in resource-constrained environments.
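To illustrate the sort from the diff above (hypothetical minimal `Op` class; the real `Operator` carries the same `_internal_ordered_id` field), sorting in descending order means `list.pop()`, which removes the last element, always returns the op with the smallest internal ordered id:

```python
from dataclasses import dataclass


@dataclass
class Op:
    # Sketch of Operator with only the field the sort needs.
    _internal_ordered_id: int


ready_ops = [Op(2), Op(0), Op(1)]

# Sort descending so that pop() from the end yields the smallest id first.
ready_ops.sort(key=lambda op: op._internal_ordered_id, reverse=True)

first = ready_ops.pop()  # the op with _internal_ordered_id == 0
```

Since `_internal_ordered_id` is assigned at instantiation, this executes the ready ops in instantiation order regardless of the iteration order of any underlying set.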

@achoum
Collaborator Author

achoum commented Jul 10, 2023

Thanks.
Submitting now.

Adding the open questions to the agenda of our next meeting.

@achoum achoum merged commit 1ba8953 into main Jul 10, 2023
7 checks passed
@achoum achoum deleted the gbm_schedule branch July 10, 2023 10:01
3 participants