Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialise point-to-point mappings from scheduler output (groundwork) #160

Merged
merged 14 commits into from
Oct 25, 2021

Conversation

Shillaker
Copy link
Collaborator

@Shillaker Shillaker commented Oct 22, 2021

At the moment, scheduling and point-to-point messaging are seen as two separate things, and the function/ executor doing the scheduling has to ensure that the point-to-point mappings are set up independently. It also means that the scheduled functions have to wait for confirmation from the master before performing any point-to-point messaging, as they cannot ensure the mappings have been set up when they start executing (this is equivalent to the barrier we use before executing MPI functions).

Ideally the point-to-point mappings would be set up transparently when a group of functions is scheduled, and the downstream functions wouldn't have to manually synchronise before doing point-to-point messaging.

The full implementation of this transparent point-to-point setup will be completed with the introduction of function groups in #141, but this PR contains a lot of the groundwork for it (to avoid bloating the other one).

Changes:

  • Add a SchedulingDecision object, which captures all the metadata from the scheduling, and is returned by the scheduler after scheduling has completed.
  • Change the PointToPointBroker to only accept a SchedulingDecision object as the input to set up the mapping (as it will only be the scheduler doing this setup in future).
  • Add a check inside sendMessage that checks if the mappings for that app are available on the current host, and does a blocking wait if not.
  • Update scheduler tests to check the new SchedulingDecision object returned from callFunctions.

Note that I've put the SchedulingDecision in faabric::util as it's essentially just a struct used for passing data around, and things consuming it don't need a full dependency on the scheduling module.

@Shillaker Shillaker marked this pull request as draft October 22, 2021 08:20
@Shillaker Shillaker self-assigned this Oct 22, 2021
@Shillaker Shillaker marked this pull request as ready for review October 22, 2021 12:32
@Shillaker Shillaker changed the title Introduce concept of scheduling decisions Initialise point-to-point mappings from scheduler output Oct 22, 2021
@Shillaker Shillaker changed the title Initialise point-to-point mappings from scheduler output Initialise point-to-point mappings from scheduler output (part 1) Oct 22, 2021
@Shillaker Shillaker marked this pull request as draft October 22, 2021 14:10
@Shillaker Shillaker changed the title Initialise point-to-point mappings from scheduler output (part 1) Initialise point-to-point mappings from scheduler output (groundwork) Oct 22, 2021
@Shillaker Shillaker marked this pull request as ready for review October 22, 2021 15:03
@Shillaker Shillaker marked this pull request as draft October 22, 2021 15:03
@Shillaker Shillaker marked this pull request as ready for review October 22, 2021 16:17
@@ -6,6 +6,8 @@
#include <faabric/util/locks.h>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is pretty messed up as I've removed a lot of the interface for this class. Best to look at full version:

https://github.com/faasm/faabric/blob/scheduling-decisions/src/transport/PointToPointBroker.cpp

formatByteArrayToIntString(kickOffData),
formatByteArrayToIntString(expectedKickOffData));
return 1;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sort of point-to-point initialisation is done transparently now.

FAIL();
}
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests were covering the old PointToPointBroker API which has been simplified (and is tested below).

Copy link
Collaborator

@csegarragonz csegarragonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a minor typo

tests/test/util/test_scheduling.cpp Show resolved Hide resolved
@Shillaker Shillaker merged commit 1bb5c13 into master Oct 25, 2021
@Shillaker Shillaker deleted the scheduling-decisions branch October 25, 2021 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants