# Spatter Benchmark Workflow

This notebook guides you through memory performance analysis using the Spatter benchmark. For more detail, please refer to our [wiki](https://github.com/hpcgarage/spatter/wiki/Spatter-Benchmark-Workflow). As an example, we will use [Quicksilver](https://github.com/LLNL/Quicksilver) with the [All Absorb](https://github.com/LLNL/Quicksilver/blob/master/Examples/AllAbsorb/allAbsorb.inp) input.

### Installations

1) Install [Quicksilver](https://github.com/LLNL/Quicksilver)
    1) Build with OpenMP and debug symbols ON
2) Install [VTune](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler-download.html)
3) Install [gs_patterns](https://github.com/lanl/gs_patterns)
    1) Make sure to also install [PIN](https://www.intel.com/content/www/us/en/developer/articles/tool/pin-a-dynamic-binary-instrumentation-tool.html) following the steps in the [pin_tracing](https://github.com/lanl/gs_patterns/tree/main/pin_tracing) folder

In [50]:
export qs_dir="/nethome/jvalverde6/Quicksilver"
export notebook_dir="$(pwd)"
export gspatterns_dir="/nethome/jvalverde6/gs_patterns"
export pin_dir="/nethome/jvalverde6/gs_patterns/pin_dir"
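
The paths above are specific to the author's machine, so adjust them to your own installation. As a minimal sketch, a small helper (here called `check_dir`, a hypothetical name that is not part of Spatter or gs_patterns) can confirm that each variable points to an existing directory before you continue:

```shell
# check_dir is a hypothetical helper, not part of Spatter or gs_patterns.
# $1: variable name (used only in the message), $2: the directory it holds.
check_dir() {
    if [ -n "$2" ] && [ -d "$2" ]; then
        echo "ok: $1=$2"
    else
        echo "missing: $1=$2" >&2
        return 1
    fi
}

# Report on every path used in this notebook without stopping at the first failure.
check_dir qs_dir "$qs_dir" || true
check_dir notebook_dir "$notebook_dir" || true
check_dir gspatterns_dir "$gspatterns_dir" || true
check_dir pin_dir "$pin_dir" || true
```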

### Identifying Hotspots

To identify memory hotspots, we first need to profile an execution of the application.

In [40]:
module load vtune
vtune -collect hotspots "$qs_dir"/src/qs "$qs_dir"/Examples/AllAbsorb/allAbsorb.inp

vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /nethome/jvalverde6/spatter/notebooks/r002hs -command stop.
Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version     : 2023-Aug-18-15:12:10
Quicksilver Git Hash    : eb68bb8d6fc53de1f65011d4e79ff2ed0dd60f3b
MPI Version             : 3.0
Number of MPI ranks     : 1
Number of OpenMP Threads: 128
Number of OpenMP CPUs   : 128

Simulation:
   dt: 1e-08
   fMax: 0.1
   inputFile: 
   energySpectrum: 
   boundaryCondition: reflect
   loadBalance: 0
   cycleTimers: 0
   debugThreads: 0
   lx: 100
   ly: 100
   lz: 100
   nParticles: 1000000
   batchSize: 0
   nBatches: 10
   nSteps: 10
   nx: 10
   ny: 10
   nz: 10
   seed: 1029384756
   xDom: 0
   yDom: 0
   zDom: 0
   eMax: 20
   eMin: 1e-09
   nGroups: 230
   lowWeightCutoff: 0.001
   bTally: 1
   fTally: 1
   cTally: 1
   coralBenchmark: 0
   crossSectionsOut:

Geometry:
   m

    Operating System: 4.18.0-513.18.1.el8_9.x86_64 Red Hat Enterprise Linux release 8.9 (Ootpa)
    Computer Name: flubber8.crnch.gatech.edu
    Result Size: 14.3 MB 
    Collection start time: 19:41:20 20/03/2024 UTC
    Collection stop time: 19:41:25 20/03/2024 UTC
    Collector Type: Driverless Perf per-process counting,User-mode sampling and tracing
    CPU
        Name: Intel(R) Xeon(R) Processor code named Icelake
        Frequency: 2.000 GHz
        Logical CPU Count: 128
        LLC size: 50.3 MB 
        Cache Allocation Technology
            Level 2 capability: not detected
            Level 3 capability: available

If you want to skip descriptions of detected performance issues in the report,
enter: vtune -report summary -report-knob show-issues=false -r <my_result_dir>.
Alternatively, you may view the report in the csv format: vtune -report
<report_name> -format=csv.
vtune: Executing actions 100 % done                                            


Hotspot functions are listed toward the end of the output, under **Top Hotspots**.

VTune might report some hotspot functions with names that look like:

1) func@0x1de24  
2) func@0x1dfd4

These are functions from an external library that cannot be traced, so we will not focus on them.

Functions that are traceable will usually have a friendlier name. We will trace the function **CycleTrackingFunction**.
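
To narrow a hotspot listing down to traceable candidates, one option is to filter out the unresolved `func@0x...` entries. The sketch below uses a hypothetical helper named `filter_traceable` and demonstrates it on sample names taken from the text above; the commented usage line assumes the result directory `r002hs` reported by the collection run.

```shell
# filter_traceable is a hypothetical helper: it drops any line that contains
# an unresolved function name of the form func@0x..., keeping the rest.
filter_traceable() {
    grep -v 'func@0x'
}

# Possible usage against a real result directory (assumed name r002hs):
#   vtune -report hotspots -r r002hs | filter_traceable

# Demonstration on sample function names:
printf '%s\n' 'CycleTrackingFunction' 'func@0x1de24' 'func@0x1dfd4' | filter_traceable
```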

Start by creating a file called **roi_funcs.txt**. This file should be located in the same directory where you call gs_patterns, and will contain the names of the functions you want to trace in your application, each on a separate line. 

You will find a **roi_funcs.txt** file already containing **CycleTrackingFunction** in the directory containing this notebook.
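
If you need to create the file yourself, for example in a different working directory, a minimal sketch is shown below. Note that it overwrites any existing **roi_funcs.txt** in the current directory.

```shell
# Write one function name per line.
# To trace more functions, pass additional names as extra arguments to printf.
printf '%s\n' "CycleTrackingFunction" > roi_funcs.txt

# Show the result so you can confirm the contents.
cat roi_funcs.txt
```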

## Generating Traces

### Running Pin

Be advised that running Pin can take a significant amount of time, depending on the run time of your application and how many functions you trace.

In [53]:
"$pin_dir"/pin -t "$pin_dir"/source/tools/ImemROI/obj-intel64/ImemROIThreads.so -- "$qs_dir"/src/qs "$qs_dir"/Examples/AllAbsorb/allAbsorb.inp

PIN -- ROI_FUNC[ 0]: CycleTrackingFunction

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version     : 2023-Aug-18-15:12:10
Quicksilver Git Hash    : eb68bb8d6fc53de1f65011d4e79ff2ed0dd60f3b
MPI Version             : 3.0
Number of MPI ranks     : 1
Number of OpenMP Threads: 128
Number of OpenMP CPUs   : 128

Simulation:
   dt: 1e-08
   fMax: 0.1
   inputFile: 
   energySpectrum: 
   boundaryCondition: reflect
   loadBalance: 0
   cycleTimers: 0
   debugThreads: 0
   lx: 100
   ly: 100
   lz: 100
   nParticles: 1000000
   batchSize: 0
   nBatches: 10
   nSteps: 10
   nx: 10
   ny: 10
   nz: 10
   seed: 1029384756
   xDom: 0
   yDom: 0
   zDom: 0
   eMax: 20
   eMin: 1e-09
   nGroups: 230
   lowWeightCutoff: 0.001
   bTally: 1
   fTally: 1
   cTally: 1
   coralBenchmark: 0
   crossSectionsOut:

Geometry:
   material: sourceMaterial
   shape: brick
   xMax: 100
   xMin: 0
   yMax: 100
   yMin: 0
   zMax: 100
   zMin: 0

Material:
   name: so

There should be an output file called **roitrace.00.CycleTrackingFunction.bin**. For the next step, we must **gzip** the file. Again, this step might take a significant amount of time depending on the size of the file. 

In [55]:
gzip roitrace.00.CycleTrackingFunction.bin
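
As an optional variant of the step above, the sketch below wraps the compression in a hypothetical helper, `compress_trace`, that first checks that the trace exists and prints its size, which gives a rough idea of how long the compression may take. Like plain `gzip`, it replaces the file with its `.gz` version.

```shell
# compress_trace is a hypothetical helper, not part of gs_patterns.
# It reports the size of the trace, then compresses it in place
# (gzip replaces FILE with FILE.gz).
compress_trace() {
    trace="$1"
    if [ ! -f "$trace" ]; then
        echo "trace not found: $trace" >&2
        return 1
    fi
    ls -lh "$trace"
    gzip "$trace"
}

# Usage, assuming the trace file name produced by the Pin run above:
#   compress_trace roitrace.00.CycleTrackingFunction.bin
```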

### Running gs_patterns

This command will likely take the longest of all the steps.

In [56]:
"$gspatterns_dir"/build/gs_patterns roitrace.00.CycleTrackingFunction.bin.gz "$qs_dir"/src/qs

First pass to find top gather / scatter iaddresses
..................................................
 RESULTS 
DRTRACE STATS
DRTRACE LINES:              1755297292
OPCODES:                    1252846494
MEMOPCODES:                  501650350
LOAD/STORES:                 502450790
OTHER:                               8

GATHER/SCATTER STATS: 
LOADS per GATHER:                6.280
STORES per SCATTER:              6.865
GATHER COUNT:                   23.774 (log2)
SCATTER COUNT:                  21.629 (log2)
OTHER  COUNT:                   28.539 (log2)

Symbol table lookup for gathers...done.
Symbol table lookup for scatters...done.

Second pass to fill gather / scatter subtraces
..................................................

***************************************************************************************
roitrace.00.CycleTrackingFunction.bin.sbin
GIADDR   -- 0x41818c
SRCLINE  -- /nethome/jvalverde6/Quicksilver/src/NuclearData.cc:193
GATHER % --  7.192% (512-bit chunks)
N

        14         14         14         14         14         14         14         14         16         16         16         16         16 
        16         16         16         16         18         18         18         18         18         18         18         18         18 

DIST HISTOGRAM --
   -18: 19273
   -16: 17033
   -14: 17076
   -12: 16964
   -10: 17208
    -8: 17291
    -6: 17079
    -4: 17262
    -2: 17217
     0: 6863258
     2: 789228
***************************************************************************************

***************************************************************************************
roitrace.00.CycleTrackingFunction.bin.sbin
GIADDR   -- 0x4128f4
SRCLINE  -- /nethome/jvalverde6/Quicksilver/src/MacroscopicCrossSection.cc:17
GATHER % --  5.158% (512-bit chunks)
NDISTS  -- 7808890
         0          4          8         12         16         20         24         28         32         36          0          0          0 
         0       

     31778      31782      30538      30542      30546      30550      30554      30558      30562      30566      30570      30574      30578 
     30582      30586      30590      30594      30598      30602      30606      30610      30614      30618      30622      30626      30630 

DIST HISTOGRAM --
< -512: 22689
  -476: 353
  -380: 622
  -284: 730
  -188: 4671
   -92: 143148
     4: 4629697
   100: 750
   196: 638
   292: 348
   388: 383
   484: 425
>  512: 21777
***************************************************************************************


***************************************************************************************
roitrace.00.CycleTrackingFunction.bin.sbin
SIADDR    -- 0x40ad30
SRCLINE   -- /nethome/jvalverde6/Quicksilver/src/MC_Distance_To_Facet.hh:14
SCATTER % -- 18.597% (512-bit chunks)
NDISTS  -- 4826232
         0          2          4          6          8         10         12         14         16         18         20         22         24 
     


DIST HISTOGRAM --
   -46: 8033
     0: 84692
    46: 8033
***************************************************************************************

***************************************************************************************
roitrace.00.CycleTrackingFunction.bin.sbin
SIADDR    -- 0x40506b
SRCLINE   -- /nethome/jvalverde6/Quicksilver/src/MC_RNG_State.hh:26
SCATTER % --  0.248% (512-bit chunks)
NDISTS  -- 100759
        46         46         46         46         46         46         46         46         46         46         46         46         46 
        46         46         46         46         46          0          0         46         46         46         46         46         46 
        46          0          0         46          0         46         46         46         46          0         46         46         46 
...
        46         46         46         46         46         46         46          0          0         46         46         46        

Once the above command is complete, you can run Spatter with the new **roitrace.00.CycleTrackingFunction.bin.json** file. Please refer to the [Getting Started](http://localhost:8080/notebooks/spatter/notebooks/GettingStarted.ipynb) notebook for more instructions on how to run Spatter. 

If you'd like to take a closer look at the pattern generated, please refer to the [Graphing Patterns]() notebook. 
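
A note on reading the gs_patterns summary: the GATHER COUNT, SCATTER COUNT, and OTHER COUNT values are labeled `(log2)`, so they are base-2 logarithms rather than raw counts. As a rough sketch, the hypothetical helper `log2_to_count` below converts such a value back to an approximate count using only `awk`. For the run shown above, this gives on the order of 14 million gathers and 3 million scatters.

```shell
# log2_to_count is a hypothetical helper: it converts a log2 value, as printed
# by gs_patterns, to an approximate raw count by computing 2^x.
log2_to_count() {
    awk -v x="$1" 'BEGIN { printf "%.0f\n", exp(x * log(2)) }'
}

log2_to_count 23.774   # GATHER COUNT from the output above
log2_to_count 21.629   # SCATTER COUNT from the output above
```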