
Add distributed coordination operations #161

Merged (22 commits) on Oct 29, 2021

Conversation

@Shillaker (Collaborator) commented Oct 25, 2021

Adds generic utilities to support distributed coordination primitives such as locks and barriers.

It works as follows:

  • Since distributed coordination is mostly point-to-point messaging, it happens through the PointToPointBroker class, which deals with PointToPointGroups, within which the functions in a group can coordinate with one another across hosts.
  • There may be zero to many groups per app, and the PointToPointBroker refers to each one by a groupId (integer).
  • Each function belongs to an app (with an appId and an optional appIdx, e.g. an MPI rank or OpenMP thread ID), and potentially to a group as well (with a groupId and a groupIdx). There may be many groups in a single app, so the groupIdx and appIdx are treated separately (although they may be set to the same value if there's only one group in the app).
  • The operations available to each group are: a lock, a barrier, and a no-wait barrier/notify (where one function on the master can wait for all the others to finish without the others being blocked).
  • Point-to-point groups are created by scheduling a batch of functions with groupId and groupIdx set on the underlying Messages. The scheduler uses this information to transparently set up the point-to-point mappings needed for this messaging.
  • The barrier and notify implementations use standard point-to-point messaging.
  • Locking from a remote host is done by sending a request to the PointToPointServer. When the lock has been successfully acquired, a corresponding point-to-point message is sent back to the group index that originally requested the lock. If the lock is requested locally, we do the same thing, just without the request to the remote PointToPointServer (a caller-side sketch of this follows the list).
  • Unlocking is a single async operation (i.e. without a response), implemented as another request to the PointToPointServer if remote.
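To make the lock flow concrete, here is a minimal caller-side sketch of that round trip. The interface below (PtpBroker, PtpClient, groupLock, recvMessage, GROUP_MASTER_IDX) is an assumption made for illustration, not the exact faabric API.

#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the broker/client involved; the real faabric
// classes and signatures may differ.
struct PtpBroker
{
    // Blocks until a message from sendIdx to recvIdx in this group arrives
    virtual std::vector<uint8_t> recvMessage(int groupId,
                                             int sendIdx,
                                             int recvIdx) = 0;
};

struct PtpClient
{
    // Async requests to the remote PointToPointServer that owns the lock
    virtual void groupLock(int groupId, int groupIdx) = 0;
    virtual void groupUnlock(int groupId, int groupIdx) = 0;
};

constexpr int GROUP_MASTER_IDX = 0;

// Caller-side view of acquiring the group lock
void acquireLock(PtpBroker& broker, PtpClient& client, int groupId, int groupIdx)
{
    // Ask the host that owns the lock for it (a purely local request would
    // skip this remote call and go straight to the local lock queue)
    client.groupLock(groupId, groupIdx);

    // Block until the "lock granted" point-to-point message comes back from
    // the group master to the index that requested the lock
    broker.recvMessage(groupId, GROUP_MASTER_IDX, groupIdx);
}

// Caller-side view of releasing the group lock: fire-and-forget, no response
void releaseLock(PtpClient& client, int groupId, int groupIdx)
{
    client.groupUnlock(groupId, groupIdx);
}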

I've added a group lock/unlock around writing snapshot diffs. Without it, one thread could overwrite regions of memory while another thread was in a critical section (see the sketch below).
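As a rough usage illustration of that lock/unlock bracketing, building on the hypothetical acquireLock/releaseLock sketch above (writeSnapshotDiffs is a placeholder, not the real faabric function):

void writeSnapshotDiffs();  // placeholder for the real diff-writing code

void writeDiffsSafely(PtpBroker& broker,
                      PtpClient& client,
                      int groupId,
                      int groupIdx)
{
    acquireLock(broker, client, groupId, groupIdx);  // blocks until granted

    // Critical section: while we hold the group lock, no other group member
    // can write overlapping regions of memory
    writeSnapshotDiffs();

    releaseLock(client, groupId, groupIdx);          // async, no response
}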

For posterity, this PR is a rewrite of #141

tests/dist/server.cpp — review thread resolved (outdated)
@Shillaker marked this pull request as ready for review on October 28, 2021 at 18:03
@csegarragonz (Collaborator) left a comment:

LGTM, very glad this is finally done 🎉

Just some minor comments.

Before merging though, could we do a faabric bump PR in faasm to make sure this does not break anything?

src/scheduler/CMakeLists.txt — review thread resolved
tests/test/transport/test_point_to_point_groups.cpp — review thread resolved (outdated)
nSums = 1000;
}

// Spawn n-1 child threads to add to shared sums over several barriers so
Collaborator commented:
several ?

@Shillaker (Collaborator, Author) replied Oct 29, 2021:

Not sure what you mean by quoting a single word... The test is running the sum operations in a loop, so it's invoking several barriers. Does that make sense?
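For context, the pattern the test follows is roughly the one below: a toy, not the actual faabric test, using std::barrier (C++20) in place of the group barrier; each loop iteration adds to the shared sum and then hits another barrier.

#include <atomic>
#include <barrier>
#include <cstdio>
#include <thread>
#include <vector>

int main()
{
    constexpr int nThreads = 4;
    constexpr int nLoops = 3;  // "several" barriers: one per loop iteration
    std::atomic<int> sharedSum{ 0 };
    std::barrier barrier(nThreads);

    std::vector<std::thread> threads;
    for (int t = 0; t < nThreads; t++) {
        threads.emplace_back([&, t] {
            for (int i = 0; i < nLoops; i++) {
                sharedSum.fetch_add(t + i);
                barrier.arrive_and_wait();  // all threads sync every round
            }
        });
    }

    for (auto& th : threads) {
        th.join();
    }

    std::printf("sharedSum = %d\n", sharedSum.load());
    return 0;
}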

src/transport/PointToPointBroker.cpp — review thread resolved (outdated)
{
std::vector<uint8_t> data(1, 0);

ptpBroker.sendMessage(groupId, 0, groupIdx, data.data(), data.size());
Collaborator commented:
We always assume the master's index to be 0, right? Maybe it would make the code more readable if this value were defined somewhere. It is sometimes hard to understand that the 0 corresponds to the master index.

@Shillaker (Collaborator, Author) replied Oct 29, 2021:

Yes, good point, we frequently hard-code this zero in the MPI code too, i.e. if(rank == 0) { // Do stuff for master }, e.g. https://github.com/faasm/faabric/blob/master/src/scheduler/MpiWorld.cpp#L1216

I'll change it for the ptp stuff, but it would be good to switch the MPI code over at some point too.
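For illustration, the kind of change being suggested would look roughly like this (the constant name is just an example, not necessarily what was merged):

// Hypothetical named constant so the magic 0 is self-documenting
constexpr int POINT_TO_POINT_MASTER_IDX = 0;

// The call from the snippet above would then read:
ptpBroker.sendMessage(
  groupId, POINT_TO_POINT_MASTER_IDX, groupIdx, data.data(), data.size());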

@Shillaker (Collaborator, Author):

Faasm PR for checking this: faasm/faasm#531
