# Understanding Lucata Plotting Tools

### Lesson Objectives

After completing this notebook, you should be able to:

1) Run a simulation with timing that generates statistics for plotting.  
2) Evaluate the outputs from plotting scripts.  
3) Compare two different kinds of spawn primitives using their plots.  

### Environment Setup

In [None]:
# As with the previous notebook, set up the environment for the tools used in this notebook. From the command line, you can source the ../.env script instead.
import os

#Set the path to the latest toolset 
LUCATA_BASE="/tools/emu/pathfinder-sw/22.09-beta" 

os.environ["USER_NOTEBOOK_CODE"]=os.path.dirname(os.getcwd())
os.environ["PATH"]=os.pathsep.join([os.path.join(LUCATA_BASE,"bin"),os.environ["PATH"]])
os.environ["FLAGS"]="-I"+LUCATA_BASE+"/include/"+" -L"+LUCATA_BASE+"/lib -lmemoryweb"

### Running Simulations for Profiling  

First we build all versions of `hello-world*.c`. Then we demonstrate how to run the simulator and how to run each step of the profiling meta-script, `emusim_profile`.

In [None]:
%%bash
set -x
ls -l hello-world*.c
. ../.env
make all
set +x

Next, run the simulator manually and generate each set of plots with its own plotting script.

In [None]:
%%bash
set -x;
mkdir -p manual_plots;
cd manual_plots;
emusim.x --capture_timing_queues -m 24 --total_nodes 2 --output_instruction_count -- ../hello-world.mwx;
make_tqd_plots.py hello-world.tqd;
make_map_plots.py hello-world.mps;
make_uis_plots.py hello-world.uis;
make_hpc_plots.py -f hello-world.hpc;
set +x;

In [None]:
from IPython.display import Image, display

display(Image(filename="manual_plots/hello-world.Thread_Enqueue_Map.png"))
display(Image(filename="manual_plots/hello-world.Memory_Read_Map.png"))
display(Image(filename="manual_plots/hello-world.Memory_Write_Map.png"))
display(Image(filename="manual_plots/hello-world.Atomic_Transaction_Map.png"))
display(Image(filename="manual_plots/hello-world.Remote_Transaction_Map.png"))
display(Image(filename="manual_plots/hello-world_total_instructions.png"))

Use the `emusim_profile` wrapper to generate the same plots with a single command.  
The outputs are written to the directory `profile_hello-world`.
The wrapper takes the following arguments:
```
emusim_profile <profile directory> [<emusim options>] -- mybenchmark.mwx --param 1 --param 2
```
Note: The profiler sets the following simulator flags itself, so do not pass them to the profiler:  
`-o, --capture_timing_queues, --output_instruction_count`.
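
The argument layout above can be sketched as a small Python helper that assembles the command and rejects the reserved flags before anything is run. This is only an illustration: `build_profile_command` and `RESERVED_FLAGS` are hypothetical names introduced here, not part of the Lucata toolset, and the helper only builds the argument list rather than invoking the profiler.

```python
import shlex

# Simulator flags that the note above says the profiler sets on its own.
RESERVED_FLAGS = {"-o", "--capture_timing_queues", "--output_instruction_count"}


def build_profile_command(profile_dir, sim_options, benchmark, benchmark_args=()):
    """Return an emusim_profile argument list (hypothetical helper).

    Layout: emusim_profile <profile directory> [<emusim options>] -- <benchmark> <args>
    """
    # Compare only the flag name so that a "--flag=value" spelling is caught too.
    clashes = [opt for opt in sim_options if opt.split("=")[0] in RESERVED_FLAGS]
    if clashes:
        raise ValueError(f"flags reserved by emusim_profile: {clashes}")
    return ["emusim_profile", profile_dir, *sim_options, "--", benchmark, *benchmark_args]


cmd = build_profile_command(
    "profile_hello-world", ["--total_nodes", "2", "-m", "24"], "hello-world.mwx"
)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run` from a notebook cell if you prefer Python over a `%%bash` cell.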

In [None]:
%%bash
mkdir -p profile_hello-world;
emusim_profile profile_hello-world --total_nodes 2 -m 24 -- hello-world.mwx

In [None]:
display(Image(filename="profile_hello-world/hello-world.Thread_Enqueue_Map.png"))
display(Image(filename="profile_hello-world/hello-world.Memory_Read_Map.png"))
display(Image(filename="profile_hello-world/hello-world.Memory_Write_Map.png"))
display(Image(filename="profile_hello-world/hello-world.Atomic_Transaction_Map.png"))
display(Image(filename="profile_hello-world/hello-world.Remote_Transaction_Map.png"))
display(Image(filename="profile_hello-world/hello-world_total_instructions.png"))

In [None]:
!ls hello-world.*

We now have several different output files. These are detailed in Ch. 7.6 of the Programming Guide and are as follows:
* hello-world.mwx - Lucata executable.
* hello-world.cdc - Configuration data output file; includes system information and wall-clock time.
* hello-world.mps - Memory map output; shows memory operation types and thread enqueuing.
* hello-world.tqd - Timed activity tracing; includes live threads, thread activity counts, and requests.
* hello-world.uis - Instruction count statistics; shows the number of instructions per function in the application and number of migrations.

These files can be used with plotting tools to provide detailed output on the simulation of the application.
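
As a rough sketch of how the output files pair with the plotting tools, the mapping below mirrors the commands from the manual plotting cell earlier in this notebook. The names `PLOT_SCRIPTS` and `plot_commands` are hypothetical and introduced only for illustration. The `.hpc` file is included because that cell passes it to `make_hpc_plots.py -f`, even though it is not described in the list above.

```python
# Plotting script (plus any leading flag) for each simulator output extension,
# copied from the manual plotting cell earlier in this notebook.
PLOT_SCRIPTS = {
    "tqd": ["make_tqd_plots.py"],
    "mps": ["make_map_plots.py"],
    "uis": ["make_uis_plots.py"],
    "hpc": ["make_hpc_plots.py", "-f"],
}


def plot_commands(stem):
    """Return one plotting command (as an argument list) per output file of `stem`."""
    return [script + [f"{stem}.{ext}"] for ext, script in PLOT_SCRIPTS.items()]


for cmd in plot_commands("hello-world"):
    print(" ".join(cmd))
```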

## Hello World Spawn Example

The previous example kept a single thread alive and migrating between nodelets. The next example, `hello-world-spawn.c`, uses Cilk's thread-spawning intrinsic, `cilk_spawn`:

```c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <cilk.h>

#include <memoryweb.h>
#include <timing.h>

const char str[] = "Hello, world!";

static inline void copy_ptr (char *pc, const long *pl) { *pc = (char)*pl; }

replicated long * ptr;
replicated char * str_out;

int main (void)
{
     long n = strlen (str) + 1;

     mw_replicated_init ((long*)&ptr, (long)mw_malloc1dlong (n));
     mw_replicated_init ((long*)&str_out, (long)malloc (n * sizeof (char)));

     /*
      * Start timing here.
      * Profiler settings hidden for simplicity.
      */

     for (long k = 0; k < n; ++k)
          ptr[k] = (long)str[k]; // Remote writes

     for (long k = 0; k < n; ++k)
          cilk_spawn copy_ptr (&str_out[k], &ptr[k]);

     cilk_sync;

     printf("%s\n", str_out);  // Migration back
     
     // Profiler end commands.
     
     return 0;
}
```

In [None]:
%%bash
mkdir -p profile_hello-world-spawn;
emusim_profile profile_hello-world-spawn --total_nodes 2 -m 24 -- hello-world-spawn.mwx
ls profile_hello-world-spawn/hello-world-spawn*

In [None]:
display(Image(filename="profile_hello-world/hello-world.Live_Threads.png"))
display(Image(filename="profile_hello-world-spawn/hello-world-spawn.Live_Threads.png"))
#display(Image(filename="profile_hello-world/hello-world.Thread_Activity.png"))
#display(Image(filename="profile_hello-world-spawn/hello-world-spawn.Thread_Activity.png"))
display(Image(filename="profile_hello-world/hello-world.MSP_Activity.png"))
display(Image(filename="profile_hello-world-spawn/hello-world-spawn.MSP_Activity.png"))
display(Image(filename="profile_hello-world/hello-world_total_instructions.png"))
display(Image(filename="profile_hello-world-spawn/hello-world-spawn_total_instructions.png"))

We can now compare the plots above for the basic Hello World and the spawn Hello World to see which statistics differ.
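
When comparing several runs, it can help to group the plot files by plot type so that matching plots are shown next to each other. The sketch below assumes the directory and file naming used in the cells above. `paired_plots` is a hypothetical helper, and it only prints the paths so that it runs outside a notebook as well.

```python
import os


def paired_plots(runs, suffixes):
    """Group plot paths by plot type, with one path per run.

    runs:     list of (profile_dir, file_stem) tuples
    suffixes: plot file endings appended to each stem
    """
    return [
        [os.path.join(profile_dir, stem + suffix) for profile_dir, stem in runs]
        for suffix in suffixes
    ]


runs = [
    ("profile_hello-world", "hello-world"),
    ("profile_hello-world-spawn", "hello-world-spawn"),
]
suffixes = [".Live_Threads.png", ".MSP_Activity.png", "_total_instructions.png"]

for group in paired_plots(runs, suffixes):
    for path in group:
        # Report missing plots instead of failing, e.g. if a profile run was skipped.
        status = "found" if os.path.exists(path) else "missing"
        print(f"{status}: {path}")
```

Inside the notebook, each path reported as found could be passed to `display(Image(filename=path))` in place of the `print` call.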

## Advanced Implementation - Spawn At

This example shows one more variation: it uses a `cilk_spawn_at` call to spawn each thread at a remote node.

```c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <cilk.h>

#include <memoryweb.h>
#include <timing.h>

static const char str[] = "Hello, world!";

static inline void copy_ptr (char *pc, const long *pl) { *pc = (char)*pl; }

replicated long * ptr;
replicated char * str_out;

int main (void)
{
     long n = strlen (str) + 1;

     mw_replicated_init ((long*)&ptr, (long)mw_malloc1dlong (n));
     mw_replicated_init ((long*)&str_out, (long)malloc (n * sizeof (char)));

     /*
      * Start timing here.
      * Profiler settings hidden for simplicity.
      */

     for (long k = 0; k < n; ++k)
          ptr[k] = (long)str[k]; // Remote writes

     for (long k = 0; k < n; ++k) {
          cilk_spawn_at(&ptr[k]) copy_ptr (&str_out[k], &ptr[k]);
     }

     cilk_sync;
    
     printf("%s\n", str_out);  // Migration back
    
     // Profiler end commands.
    
     return 0;
}
```

In [None]:
%%bash
mkdir -p profile_hello-world-spawn-at;
emusim_profile profile_hello-world-spawn-at --total_nodes 2 -m 24 -- hello-world-spawn-at.mwx
ls profile_hello-world-spawn-at/hello-world-spawn-at*

In [None]:
display(Image(filename="profile_hello-world/hello-world.Live_Threads.png"))
display(Image(filename="profile_hello-world-spawn-at/hello-world-spawn-at.Live_Threads.png"))
#display(Image(filename="profile_hello-world/hello-world.Thread_Activity.png"))
#display(Image(filename="profile_hello-world-spawn-at/hello-world-spawn-at.Thread_Activity.png"))
display(Image(filename="profile_hello-world/hello-world.MSP_Activity.png"))
display(Image(filename="profile_hello-world-spawn-at/hello-world-spawn-at.MSP_Activity.png"))
display(Image(filename="profile_hello-world/hello-world_total_instructions.png"))
display(Image(filename="profile_hello-world-spawn-at/hello-world-spawn-at_total_instructions.png"))

Once we have finished testing, we can clean up the log files generated for these examples.

In [None]:
!make clean