# Notebook #5 - Pathfinder Timing Hooks and Hardware Performance Counters

### Lesson Objectives

After completing this notebook, you should be able to:

- Use timing hooks to measure execution time with the emusim simulator
- Apply the basic concepts of performance measurement and hardware performance counters on the Pathfinder hardware

This notebook accompanies the [Lucata profiling and timing slides]() and the [hardware counter slides](); follow along with the slides as a supplemental resource.

### Environment Setup

We first need to initialize our environment to use the Lucata toolchain.

In [1]:
%load_ext slurm_magic
import os
from IPython.display import Code

# Set the path to the latest toolset
LUCATA_BASE = "/tools/emu/pathfinder-sw/22.09-beta"

# Get the path to where all code samples are
os.environ["USER_NOTEBOOK_CODE"] = os.path.dirname(os.getcwd())
# Put the Lucata tools first on PATH
os.environ["PATH"] = os.pathsep.join([os.path.join(LUCATA_BASE, "bin"), os.environ["PATH"]])
# Include and library flags passed to emu-cc in later cells
os.environ["FLAGS"] = "-I" + LUCATA_BASE + "/include/memoryweb" + " -L" + LUCATA_BASE + "/lib -lemu_c_utils -lmemoryweb"
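
Before moving on, it can help to confirm that the setup cell actually put the toolchain on `PATH`. The following is a minimal sketch, not part of the Lucata toolchain: `missing_tools` is a hypothetical helper name, and the only assumption is that `emu-cc` and `emusim_profile` (both used later in this notebook) live in `LUCATA_BASE/bin`.

```python
import os
import shutil

def missing_tools(tools, path=None):
    """Return the tools that cannot be found as executables on the search path."""
    search_path = os.environ.get("PATH", "") if path is None else path
    return [tool for tool in tools if shutil.which(tool, path=search_path) is None]

# Tool names taken from the cells later in this notebook
missing = missing_tools(["emu-cc", "emusim_profile"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Lucata tools found on PATH")
```

If a tool is reported as missing, the most likely cause is that `LUCATA_BASE` does not match the toolchain location on your system.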



## Lucata Timing Hooks

The Lucata toolchain includes a simulation profiler called `emusim_profile`. Running the profiler on your entire program can take a long time, so the toolchain provides timing hooks to specify regions of interest for performance profiling. Here we annotate SAXPY by placing timing hooks around the main computational kernel.
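
To illustrate the idea of scoping a measurement to a region of interest, here is a host-side Python analogy. It is only a sketch and does not use the Lucata timing-hook API: `timed_region` is a hypothetical helper, and the region name `"example"` is chosen purely for illustration. Setup work stays outside the timed block, and only the SAXPY kernel is measured.

```python
import json
import time
from contextlib import contextmanager

@contextmanager
def timed_region(name, results):
    """Hypothetical stand-in for a timing hook: times only the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1e3
        results.append({"region_name": name, "time_ms": elapsed_ms})

def saxpy(a, x, y):
    """Return a*x + y element-wise."""
    return [a * xi + yi for xi, yi in zip(x, y)]

n = 1 << 10
x = [1.0] * n  # setup stays outside the timed region
y = [2.0] * n
results = []
with timed_region("example", results):
    y = saxpy(5.0, x, y)  # only the kernel is measured

print(json.dumps(results[0]))
```

The same principle applies to the C source shown next: allocation and initialization sit outside the hooks, so the reported time covers the kernel alone.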

In [2]:
Code('timing-hooks-saxpy.c')

### Profiling with Timing Hooks

We can then compile and profile the code. Profiling generates a separate folder, `saxpy_profile`, containing HTML output files that you can investigate in further detail.

Profiling can take a long time on larger programs, so it is important to place timing hooks around only the region of interest you want statistics for. For this example, simulation should take under 3 minutes.

In [3]:
%%bash
emu-cc -o timing-hooks-saxpy.mwx $FLAGS timing-hooks-saxpy.c

+ emu-cc -o timing-hooks-saxpy.mwx -I/tools/emu/pathfinder-sw/22.09-beta/include/memoryweb -L/tools/emu/pathfinder-sw/22.09-beta/lib -lemu_c_utils -lmemoryweb timing-hooks-saxpy.c


In [4]:
%%bash
time emusim_profile saxpy_profile -m 24 --total_nodes 2 -- timing-hooks-saxpy.mwx 8 128 5.0 

Generating profile in saxpy_profile/timing-hooks-saxpy
emusim.x  -m 24 --total_nodes 2
timing-hooks-saxpy.mwx 8 128 5.0
Start untimed simulation with local date and time= Tue Sep 20 23:37:10 2022

{"region_name":"example","core_clk_mhz":175,"use_CORE_CLK_MHZ_envvar":0,"time_ms":0.02,"ticks":3699}
time (ms) = 0.021137
End untimed simulation with local date and time= Tue Sep 20 23:37:10 2022




        SystemC 2.3.3-Accellera --- Sep  7 2022 09:15:59
        Copyright (c) 1996-2018 by all Contributors,
        ALL RIGHTS RESERVED
/tools/emu/pathfinder-sw/22.09-beta/bin/emusim_profile: line 97: saxpy_profile/timing-hooks-saxpy.uis: No such file or directory

real	0m0.244s
user	0m0.145s
sys	0m0.023s


CalledProcessError: Command 'b'time emusim_profile saxpy_profile -m 24 --total_nodes 2 -- timing-hooks-saxpy.mwx 8 128 5.0 \n'' returned non-zero exit status 1.

#### View Simulation Timing Hook and Profile Output

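This should have generated the file `saxpy_profile/timing-hooks-saxpy-report.html`. Use the Jupyter file browser to navigate to `saxpy_profile` and open the report in your browser. In the run captured above, however, `emusim_profile` exited with a non-zero status because `saxpy_profile/timing-hooks-saxpy.uis` could not be found, so check that the report exists before trying to open it.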
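
The timing-hook line in the simulator output is JSON, so it can also be read programmatically. The sketch below assumes that `ticks` counts core clock cycles at `core_clk_mhz`; that assumption is consistent with the `time (ms) = 0.021137` line printed by the run above. `region_time_ms` is a hypothetical helper name.

```python
import json

# Timing-hook line copied from the simulator output above
hook_line = ('{"region_name":"example","core_clk_mhz":175,'
             '"use_CORE_CLK_MHZ_envvar":0,"time_ms":0.02,"ticks":3699}')

def region_time_ms(line):
    """Recompute a region's time in ms from its tick count and core clock (MHz)."""
    record = json.loads(line)
    # 1 MHz is one tick per microsecond, so ticks / MHz is in microseconds
    microseconds = record["ticks"] / record["core_clk_mhz"]
    return record["region_name"], microseconds / 1000.0

name, ms = region_time_ms(hook_line)
print(f"{name}: {ms:.6f} ms")
```

Here 3699 ticks at 175 MHz is about 21.137 microseconds, which gives more precision than the rounded `time_ms` field in the JSON record.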


## Measuring Performance on the Pathfinder Hardware

We can also run the timing-hook version of SAXPY on the Pathfinder hardware by submitting it as a Slurm batch job.

In [None]:
%%bash
#Clean up any older Slurm output files
rm -f *.out

#Submit a single-node job
sbatch sbatch-saxpy-timing.sh

#If the job runs successfully, the output file should contain "SAXPY complete!"
#Give the job a few seconds to finish before reading its output
sleep 5
#Show the content of the latest output file
printf "\nOutput from the run:\n"
cat slurm-*.out

### Postscript

Once we've finished testing, we can clean up the log files used for this example with `make clean`. Uncomment the following line to clean this directory.

In [None]:
#!make clean