
Scheduler optimization #185

Merged
merged 2 commits into from Jul 10, 2023

Conversation


@achoum achoum commented Jul 6, 2023

  • Release unused memory during evaluation
  • Deterministic op evaluation order

@achoum achoum marked this pull request as ready for review July 6, 2023 11:26
@achoum achoum requested a review from ianspektor July 6, 2023 11:26
Comment on lines +91 to +92
self._internal_ordered_id = Operator.next_internal_id
Operator.next_internal_id += 1
Collaborator
Is this necessary? Could we instead ensure that graph.operators is always in the same order, e.g. by using an ordered set instead of a set for the graph's _operators, _nodes, etc. while building the schedule (an ordered set in Python is a dict with no values :) )? Not sure if this would suffice?
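As a sketch of the "ordered set" idea mentioned above (hypothetical operator names; the real containers hold `Operator` objects, not strings):

```python
# An "ordered set" in Python: a dict whose values are unused.
# dict preserves insertion order (guaranteed since Python 3.7).
ordered_ops = {}  # hypothetical stand-in for the graph's _operators

for op_name in ["cast", "filter", "resample", "filter"]:
    ordered_ops[op_name] = None  # the duplicate "filter" is ignored

print(list(ordered_ops))  # → ['cast', 'filter', 'resample']
```

Iteration order is then the insertion order, which would make `graph.operators` deterministic without a separate counter.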

Collaborator Author

The alternative solution would be to replace some of the set inputs in the API and intermediate functions with lists. I don't see that as ideal.

Another benefit of this change is that the "id" is now a small number that is easier for people to read in error messages. Ideally, I would like to update all the "ids" to use this mechanism.

A third benefit is that id() is not guaranteed to be unique for two objects with non-overlapping lifetimes. This is not an issue in our current code, but it is not a great property for what we use it for.

See https://docs.python.org/3/library/functions.html#id
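A minimal sketch of the counter mechanism discussed above (simplified class; the real `Operator` lives in the project and does much more): each instance receives a small, monotonically increasing id that stays unique even across non-overlapping object lifetimes, unlike `id()`.

```python
class Operator:
    """Sketch: assign each instance a small, sequential, process-unique id."""

    next_internal_id = 0

    def __init__(self):
        # Snapshot the class-level counter, then advance it.
        self._internal_ordered_id = Operator.next_internal_id
        Operator.next_internal_id += 1


a, b, c = Operator(), Operator(), Operator()
print(a._internal_ordered_id, b._internal_ordered_id, c._internal_ordered_id)
# → 0 1 2
```

Because the counter only ever increases, deleting `a` and creating a new `Operator` can never produce a clash, whereas CPython may reuse the `id()` of a dead object for a new one.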

Collaborator

@DonBraulio DonBraulio left a comment

LGTM.
Nice work on the memory release.
Left a comment because I don't understand why sorting only the first input ops is relevant.

@@ -208,7 +208,8 @@ def build_schedule(
# Operators ready to be computed (i.e. ready to be added to "planned_ops")
# as all their inputs are already computed by "planned_ops" or specified by
# "inputs".
ready_ops: Set[Operator] = set()
ready_ops: List[Operator] = []
ready_ops_set: Set[Operator] = set()
Collaborator
I think it's safe to remove ready_ops_set now that we have ready_ops. I don't see its purpose.

Collaborator Author

"ready_ops" and "ready_ops_set" contain the same data. Those two containers have different properties and costs (one is a list, the other is a set). The list is great for a FILO, while the set is great to check for presence of an item.

I think it's safe to remove ready_ops_set now we got ready_ops

Do you mean there is a situation where this code would not work? If so, can you give details?
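The dual-container pattern described above can be sketched as follows (hypothetical string values; the real code stores `Operator` objects):

```python
ready_ops = []         # list: cheap ordered append/pop, used as a stack
ready_ops_set = set()  # set: O(1) membership tests


def push(op):
    # Enqueue an op only once, using the set for the membership check.
    if op not in ready_ops_set:
        ready_ops.append(op)
        ready_ops_set.add(op)


for op in ["a", "b", "a", "c"]:
    push(op)

op = ready_ops.pop()   # LIFO: returns "c"
ready_ops_set.remove(op)
```

Checking `op not in ready_ops` on the list alone would be O(n) per lookup; keeping the set alongside the list gives both fast membership tests and a deterministic pop order.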

# Make evaluation order deterministic.
#
# Execute the op with smallest internal ordered id first.
ready_ops.sort(key=lambda op: op._internal_ordered_id, reverse=True)
Collaborator

@DonBraulio DonBraulio Jul 6, 2023

This only affects the ops that depend directly on the inputs (the ops currently in ready_ops) and executes them in instantiation order, right?
I'm not sure why this is better for memory release.

Collaborator Author

This sort ensures that the execution order of the operators is deterministic. It has no impact on the results. However, we have unit tests that check some of the internals that were affected by the non-deterministic evaluation order. With this change, those unit tests are simpler :)

While the results are not affected, the order of execution can change the speed and RAM usage of executing the graph. Making the execution order deterministic reduces the risk of flakiness in resource-constrained environments.
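To illustrate the sort from the diff above (hypothetical minimal `Op` class; the real `Operator` carries the same `_internal_ordered_id` field), sorting in descending order means `list.pop()`, which removes the last element, always returns the op with the smallest internal ordered id:

```python
from dataclasses import dataclass


@dataclass
class Op:
    # Sketch of Operator with only the field the sort needs.
    _internal_ordered_id: int


ready_ops = [Op(2), Op(0), Op(1)]

# Sort descending so that pop() from the end yields the smallest id first.
ready_ops.sort(key=lambda op: op._internal_ordered_id, reverse=True)

first = ready_ops.pop()  # the op with _internal_ordered_id == 0
```

Since `_internal_ordered_id` is assigned at instantiation, this executes the ready ops in instantiation order regardless of the iteration order of any underlying set.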

@achoum
Collaborator Author

achoum commented Jul 10, 2023

Thanks.
Submitting now.

Adding the open questions to the agenda of our next meeting.

@achoum achoum merged commit 1ba8953 into main Jul 10, 2023
7 checks passed
@achoum achoum deleted the gbm_schedule branch July 10, 2023 10:01
3 participants