Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc: Instrumenting MPE logs #118

Open
azzaea opened this issue Apr 20, 2017 · 4 comments
Open

Doc: Instrumenting MPE logs #118

azzaea opened this issue Apr 20, 2017 · 4 comments

Comments

@azzaea
Copy link
Contributor

azzaea commented Apr 20, 2017

Thank you for providing the supporting MPE functions and documentation. These are very helpful indeed.

I'm not sure if github comments to a closed issue generate notifications, so I'm re-posting some questions relating to issue #115 in this thread.

The questions:

I'm more interested in having the log file include a reference to leaf function calls, as it would be really useful to know where and how each leaf function is being executed and so on. Looking at the code (and the output from list-events.x), I'm concluding only solo events are reported in the log (in addition to other ADLB calls); and this is achieved via the standard library metadata function. This gives an entry named metadata in the log file, but is not really helpful in drawing much insight about pipeline run details. Therefore, my first question is: Is there a reason why only solo events are currently implemented, or am I missing some point here?

My thinking to add more details is to instrument the MPE log, by specifying the apps/leaf functions start and end time as events and betwen them as states, and then proceed. In a typical C-program, this would involve calls to the functions below:

int MPE_Init_log(); // Initialize MPE logging
int MPE_Log_get_event_number(); //Obtain an event’s ID
int MPE_Describe_state(int start, int end, char name, charcolor); //Describe a state
int MPE_Start_log(int event, int intdata, char *chardata); // Start logging 
int MPE_Log_event(int event, int intdata, char *chardata); //Log the given event of the state
int MPE_Stop_log(void); // Stop logging a certain event
int MPE_Finish_log(char *logfilename); // Close the log, and write to the logfilename

But, given the wrapper functions defined in the mpe namespace, I think it is possible to make use of the mapping below between swift/t functions (to the left) and c functions (to the right):

mpe::create_pair <symbol> => [ list start-ID stop-ID ]  ### MPE_Describe_state(event1a, event1b, "Broadcast", "red")
mpe::create_solo <symbol> => ID ### MPE_Log_get_event_number();
mpe::log <event-ID> [<message>] ### MPE_Log_event(event1a, 0, "start broadcast");

Would this be a reasonable way to proceed? I really appreciate any kind of pointer you would be able to provide.

Thank you in advance,
Azza

@azzaea
Copy link
Contributor Author

azzaea commented Apr 24, 2017

I have been trying to work on this, and my code seems to be running smoothly (producing MPE logs corresponding to my app functions, or whatever code segment I like).
Essentially, I needed to define a swift module wrapping the built-in tcl definitions. I called it my_mpe.swift, and placed it inside my SWIFT-TDIR/turbine/export.

$ cat my_mpe.swift

#ifndef MPE_SWIFT
#define MPE_SWIFT

(string event) create_pair (string name) "turbine" "0.0"  [
         "set <<event>> [mpe::create_pair <<name>>] "
 ];

(void signal) mpelog (int eventID, string msg) "turbine" "0.0" [
        "mpe::log <<eventID>> <<msg>>"
];

#endif

Now, in my regular code, I can trace events of interest in my code by calling these functions as desired. Here is a simplistic example:

import my_mpe;

string event = create_pair("mpe logging testing case"); // Ideally, one should specify event as array of integers, considering the output of create_pair is a list of 2 IDs. This didn't work, so the string specification is a workaround that seems to get the job done.
string ids[] = split(event," "); 

mpelog( string2int(ids[0]), "start") =>
int x = 5 =>
mpelog(string2int(ids[1]), "end");

@j-woz
Copy link
Contributor

j-woz commented Apr 24, 2017

This is a great idea. Do you mind if I make this part of the Swift/T standard library?

Did you see Turbine's export/mpe.swift and lib/mpe.tcl features? Those are for adding arbitrary data into the MPE log.

@azzaea
Copy link
Contributor Author

azzaea commented Apr 24, 2017

Thank you, please do. I'm happy this is moving in the right direction.

I tried to mimic the code found in the export/mpe.swift and lib/mpe.tcl, attached. The code started getting hairy though, so I settled for the minimalistic code above especially that it wouldn't involve creating a corresponding tcl package, and also that it seems to work.
I'm curious to know if there are better ways to arrive at the same result.

My take on the metadata function that is supported via the turbing\lib\mpe.tcl is that the final log would not be fully customizable, as the final adlb.clog2 report would only include a reference to the keyword metadata , which is still a limitation in the follow up analysis of how a pipeline run went.

@azzaea
Copy link
Contributor Author

azzaea commented Apr 25, 2017

This is a bummer, but I need to report back.
Following the logic above, and enclosing critical aspects of a more realistic pipeline by the mpelog function defined above generates an adlb.clog2 that contains reference to the critical aspects of interest, in my case, these were: Alignment_stage and Deduplication_stage that occur one after the other, which is great.

However, when I run the profile.x script that was compiled from the mpe documenation, I get the results below, with strange timings corresponding to my stages of interest. They are commented below:

azza:config$ ./profile.x adlb.clog2.txt -1 
MPE_TIME_PREVIOUS: 0.000
Event                  Calls   Total (s)      Avg (s)  
ADLB_all_init                4         0.012   0.003037
ADLB_all_finalize            4         0.012   0.002936
ADLB_svr_busy              636         0.004   0.000007
ADLB_svr_get                56         0.000   0.000004
ADLB_svr_put                53         0.000   0.000000
ADLB_svr_create             18         0.000   0.000003
ADLB_svr_subscribe           0         0.000   0.000000
ADLB_svr_store             135         0.001   0.000007
ADLB_svr_retrieve          152         0.001   0.000004
ADLB_svr_sync                0         0.000   0.000000
ADLB_svr_steal               0         0.000   0.000000
ADLB_svr_shutdown            0         0.000   0.000000
ADLB_dmn_steal               0         0.000   0.000000
ADLB_dmn_sync                0         0.000   0.000000
ADLB_dmn_shutdown            1         0.000   0.000002
ADLB_wkr_put                 6         0.000   0.000009
ADLB_wkr_get                56        37.829   0.675522
ADLB_wkr_create              1         0.000   0.000006
ADLB_wkr_store             135         0.002   0.000014
ADLB_wkr_retrieve          152         0.002   0.000013
ADLB_wkr_subscribe           0         0.000   0.000000
ADLB_wkr_close               0         0.000   0.000000
ADLB_wkr_unique              0         0.000   0.000000
ADLB_svr_info                0         0.000   0.000000
user_state_0                53        19.119   0.360738
metadata                     0         0.000   0.000000
Alignment_stage              0         0.000   0.000000 #Interesting event that in reality lasts for ~6-8 seconds
Deduplication_stage          0         0.000   0.000000 #Interesting event that in reality lasts for ~6-8 seconds; after Alignment
metadata                     0         0.000   0.000000
CLOG_Buffer_write2disk       0         0.000   0.000000

This gets even trickier when I attempt to visualize the file using jumpshot, as either one or both of my desired stages disappear from the key. This behavior has been inconsistent when running the same piece of code several times. The screen-shots below were taken upon running the same pipeline twice consecutively. Each shows only one desired stage:

instumented mpe log

instrumenting mpe 2

In comparison, the metadata function defined in turbine\lib\mpe.tcl, and turbine\export\mpe.swift is very reliable, and seems to indicate the right instances (I use it after wait statements, to know when an event has finished), and I get the pin points where they should happen... This would suggest a need to re-write the codes aboves in tcl or so in a manner similar to the standard library.

I would really appreciate any pointer you could provide to move forward.

Thank you in advance,
Azza

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants