
Adding arbitrary information to execution graph #166

Merged Nov 2, 2021 (15 commits)

Conversation

csegarragonz
Collaborator

@csegarragonz csegarragonz commented Oct 27, 2021

In this PR I introduce a feature to count arbitrary events in a per-message fashion. The motivation for this PR is to be able to shed light on the system's behaviour in a low-intrusion manner. Given that protobuf messages are first-class citizens in the system, it feels natural to include this tracing information in the message objects.

To start the recording, we must set the recordexecgraph flag in the message.

At the moment, the fields populated are not serialised to and from JSON, because serialising an arbitrary map would add a lot of complexity to the already convoluted methods.

After some local testing, I think it is a better option to go for #167 first.

@csegarragonz csegarragonz added the enhancement New feature or request label Oct 27, 2021
@csegarragonz csegarragonz self-assigned this Oct 27, 2021
@csegarragonz csegarragonz marked this pull request as ready for review October 27, 2021 16:32
@csegarragonz csegarragonz force-pushed the log-calls branch 2 times, most recently from bc3f886 to 7290d36 Compare October 27, 2021 16:49
@Shillaker
Collaborator

Shillaker commented Oct 28, 2021

This is good, as any info from the message will automatically be put onto the execution graph result.

However, I have a couple of points/ questions:

  • Enabling this only in debug builds is somewhat against the original aim of this work, which was to trace distributed deployments (which will be using a release build). Our primary use-case will be when there's a big deployment and something is going wrong, and we don't want to have to rebuild and redeploy just to do the tracing.
  • I would suggest adding another boolean parameter to the Message called recordExecGraph, then wrap any exec graph interaction in a conditional based on that (defaulting to false). This way a user can reinvoke their function with this flag set to true, then query the exec graph without redeploying or recompiling.
  • If we have to add a new protobuf object for every type of tracing we are going to bloat the protobuf definitions quite a lot, and it makes it more fiddly to add new information. I would suggest having an arbitrary key/ value approach, e.g.
message ExecGraphDetail {
    string key = 1;
    string value = 2;
}

message Message {
...
    bool recordExecGraph = 38;
    repeated ExecGraphDetail execGraphDetails = 39;
}

Then adding info would be done through some utility functions:

void addExecGraphDetail(faabric::Message &msg, const std::string &key, const std::string &value) {
    // Add an entry to execGraphDetails
}

void incrementExecGraphCounter(faabric::Message &msg, const std::string &key) {
    // Get the current value, parse as an int, increment, write back
}

Then for now, for simple MPI tracing we'd be interested in the following; however, I'm not sure if it's possible:

  • mpi_cross_host_msg_count - counter saying how many messages were sent outside this host
  • mpi_in_host_msg_count - counter saying how many messages were sent locally

The execution graph currently relies on Redis, which we'd like to remove one day, but I think we can switch this to use Faabric state under the hood instead (eventually).

@csegarragonz
Collaborator Author

csegarragonz commented Oct 28, 2021

I agree with most of your comments, some observations.

  • I think there's value in starting and stopping the recording, keeping the records in memory, and then modifying the message once. This way we ensure: never editing the wrong message, no race conditions, and more (?) efficient tracing/no-opping.
  • Given that this won't be disabled in Release builds, I've added a bit of complexity to the code in order to properly no-op functions if the message value is not set.
  • If we keep track of the number of messages sent to each rank in the exec graph, then using other message entries like mpirank and masterhost, it is possible to work out how many messages are cross host and in host. It becomes a matter of post-processing the execution graph.

@Shillaker
Collaborator

Shillaker commented Oct 28, 2021

I think there's value in starting and stopping the recording, keeping the records in memory, and then modifying the message once. This way we ensure: never editing the wrong message, no race conditions, and more (?) efficient tracing/no-opping.

I think this is unnecessary complexity: it means we have to maintain another map of message IDs and values, when it seems like we will always have a reference to the target message when doing the tracing (so could just set it directly). However, perhaps I don't understand the risks/difficulties. When would we modify the wrong message? If we're passing a reference to the message object itself into the functions that add info (and modifying it in place), then by definition it's the right one. The place where I can see this going wrong is if we're passing a copy of the original message at some point, so edits don't persist on the original; however, we should only ever be passing messages around by reference and never copying, so I'm not sure this would be a problem. I don't see race conditions being a problem either, as each message is only ever handled by a single executor (or single scheduler thread) IIRC.

Given that this won't be disabled in Release builds, I've added a bit of complexity to the code in order to properly no-op functions if the message value is not set.

I'm not sure I understand the points about no-opping and efficiency. What I'm saying is that we won't do any execution graph work by default in either Release or Debug builds (i.e. less than we do now, where we record the graph for every request), unless someone sets the recordExecGraph flag (i.e. before adding any info we'd have if (msg.recordExecGraph()) { // do something }). Once this is set, I'm not sure we need to worry too much about performance (as the user has explicitly asked for it).

If we keep track of the number of messages sent to each rank in the exec graph, then using other message entries like mpirank and masterhost, it is possible to work out how many messages are cross host and in host. It becomes a matter of post-processing the execution graph.

Yes, good point, although I'm not sure how straightforward it will be to map ranks to hosts, especially if we ever do migration. I would say recording counts of messages to hosts as well as to ranks would be great if possible (each would just be a different key/value).

@@ -33,6 +34,9 @@ static thread_local std::unordered_map<
std::unique_ptr<faabric::transport::AsyncSendMessageEndpoint>>
ranksSendEndpoints;

// Id of the message that created this thread-local instance
static thread_local int thisMsgId;
Collaborator

Having thread-local state makes me nervous and I'd like to avoid it if at all possible. If we edit the message objects directly then perhaps we could avoid this.


checkMessageNotLinked();

linkedMsg = std::make_shared<faabric::Message>(msg);
Collaborator

Does this not make a copy of the message?

linkedMsg = std::make_shared<faabric::Message>(msg);

// If message flag is not set, no-op the increment functions for minimal
// overhead
Collaborator

@Shillaker Shillaker Oct 28, 2021

This is quite a lot of black magic and potentially premature optimisation. Could we avoid this complexity by editing messages directly, then putting in an if(!msg.recordexecgraph()) { return; } in those methods (or the same check but wrapping the call to those methods)?

@@ -148,6 +153,10 @@ message Message {
string sgxTag = 35;
bytes sgxPolicy = 36;
bytes sgxResult = 37;

// Exec-graph utils
bool recordExecGraph = 38;
Collaborator

@Shillaker Shillaker Oct 28, 2021

This flag needs to be added to the message JSON serialisation/ deserialisation to allow clients to pass it in. I.e. the request JSON would look something like:

{
    "user": "mpi",
    "func": "some_mpi_func",
    ...
    "exec_graph": true
}

Collaborator Author

As mentioned in the PR description, I think we should go for #167. The amount of complexity reduced is vast, and definitely worth the time.

Collaborator

@Shillaker Shillaker Oct 29, 2021

Agreed, although for now we should still add it to what we have. The change in complexity may be good, but it's a bit of code that currently requires very little maintenance, and making the change would inevitably take at least a few hours (especially as it would require testing the operations from the two language clients and all the experiments).

Collaborator

@Shillaker Shillaker Oct 29, 2021

However, I take the point around complexity of a map. Unfortunately this work isn't usable until we pass it back to the client, and using it to debug an issue with an experiment is our top priority. Is it possible to use the protobuf serialisation to serialise just this map to JSON, put that as a string into the JSON sent back to the client, then have the client deserialise it?

@@ -9,9 +9,9 @@ class MpiContext
public:
MpiContext();

int createWorld(const faabric::Message& msg);
int createWorld(faabric::Message& msg);
Collaborator Author

Need to de-constify this so that we can actually edit the message when tracing.

Collaborator

@Shillaker Shillaker left a comment

Nice, this is looking good, just a couple of tweaks and it's good to go.

tests/test/scheduler/test_exec_graph.cpp (resolved)
const std::string& key,
const int valueToIncrement = 1);

static inline std::string const mpiMsgCountPrefix = "mpi-msgcount-torank-";
Collaborator

This constant is MPI-specific and only used by MPI stuff, therefore should live in an MPI header (and I would also use a #define to fit with the style of the other constants we define, but an inline std::string is probably equivalent).

Collaborator Author

@csegarragonz csegarragonz Oct 29, 2021

I know this is not how we define constants elsewhere, and I thought about the #define option, but I liked the idea of having the string constant sit inside a namespace, which is why I went for the inline option.

Orthogonally, the constant is MPI-specific but also exec-graph-specific, i.e. it is only used to record these exec graph details, and is not needed in MPI headers. That's why I placed it here; I can see us keeping some of these "prefix" strings together here, but I'm happy to move it elsewhere.

src/scheduler/MpiWorld.cpp (resolved)
const auto& map = msg.execgraphdetails();
for (const auto& it : map) {
out = fmt::format("{},{}:{}", out, it.first, it.second);
}
Collaborator

Won't this result in a leading ,?

Could instead do this with a sstream, appending a comma and skipping the comma on the last element (a bit like here: https://github.com/faasm/faabric/blob/master/src/util/bytes.cpp#L77)

Collaborator Author

👍

auto& map = *msg.mutable_execgraphdetails();
map["foo"] = "bar";
auto& intMap = *msg.mutable_intexecgraphdetails();
intMap["foo"] = 0;
Collaborator

Is this test checking the serialisation of the maps? I can't see a string that looks like "foo:bar,qux:blah".

Collaborator Author

We don't have to actually hardcode the strings. The way we check for correctness is:

  • Set message fields ( =: msgA)
  • Serialise message
  • De-serialise message ( =: msgB)
  • Check msgA == msgB

I was missing the de-serialise and equality check bits.

@@ -266,6 +307,54 @@ std::string getStringFromJson(Document& doc,
return std::string(valuePtr, valuePtr + it->value.GetStringLength());
}

std::map<std::string, std::string> getStringStringMapFromJson(
Collaborator Author

I actually completely overlooked the string-to-JSON conversion, and updating the checkMessageEquality function.

@Shillaker Shillaker changed the title In-faabric tracing of arbitrary calls Adding arbitrary information to execution graph Nov 1, 2021
@csegarragonz csegarragonz merged commit 832aafe into master Nov 2, 2021
@csegarragonz csegarragonz deleted the log-calls branch November 2, 2021 09:44