# Basic Thicket Tutorial: Thicket 101

Thicket is a Python-based toolkit for Exploratory Data Analysis (EDA) of parallel performance data. It supports performance optimization and helps users understand application performance on supercomputers. It bridges the gap between tools that consider only a single instance of a simulation run (e.g., single platform, single measurement tool, or single scale) and the need to find actionable insights in multi-dimensional, multi-scale, multi-architecture, and multi-tool performance datasets.

#### NOTE: An interactive version of this notebook is available in the Binder environment.

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/llnl/thicket-tutorial/develop)

***

## 1. Import Necessary Packages

To explore the structure and capabilities of thicket's components, we begin by importing the necessary packages. These include Python extensions and thicket's statistical functions.

In [None]:
import re

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display
from IPython.display import HTML
import hatchet as ht

import thicket as tt

display(HTML("<style>.container { width:80% !important; }</style>"))

## 2. Read in Performance Profiles

For this notebook, we select profiles generated on lassen, a Lawrence Livermore National Laboratory (LLNL) machine. We create two thicket objects: one from profiles that all share a problem size of 1048576, and one from profiles with two different problem sizes (1048576 and 4194304).

In [None]:
lassen1 = [f"../data/lassen/XL_BaseCuda_01048576_0{x}.cali" for x in range(1, 4)]
lassen2 = ["../data/lassen/XL_BaseCuda_04194304_01.cali"]

# generate thicket(s)
th_lassen = tt.Thicket.from_caliperreader(lassen1)
th_obj = tt.Thicket.from_caliperreader(lassen1+lassen2)
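
The file lists above follow a fixed naming pattern: a zero-padded problem size followed by a zero-padded repetition number. As a minimal sketch (the `profile_paths` helper is our own illustration, not part of Thicket), the same paths can be built and inspected before any file is read:

```python
def profile_paths(problem_size, repetitions):
    """Build Caliper file paths following the naming pattern used above."""
    return [
        f"../data/lassen/XL_BaseCuda_{problem_size:08d}_{rep:02d}.cali"
        for rep in repetitions
    ]


# Three repetitions at the 1048576 problem size, as in `lassen1`.
paths_1m = profile_paths(1048576, range(1, 4))
# One repetition at the 4194304 problem size, as in `lassen2`.
paths_4m = profile_paths(4194304, [1])

for path in paths_1m + paths_4m:
    print(path)
```

Printing the list first is a cheap way to catch a typo in the pattern before handing the paths to a reader.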

## 3. More Information on a Function
***
You can use Python's built-in `help()` function to see the documentation for a given object by calling `help(object)`.
For a function, this shows the arguments it accepts and what it returns. An example is below.

In [None]:
help(tt.median)
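
The information that `help()` prints can also be retrieved programmatically with the standard `inspect` module. The sketch below uses a made-up function, `scaled_median`, purely for illustration; it is not a Thicket function.

```python
import inspect


def scaled_median(values, scale=1.0):
    """Return the median of ``values`` multiplied by ``scale``."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        middle = ordered[mid]
    else:
        middle = (ordered[mid - 1] + ordered[mid]) / 2
    return middle * scale


# help(scaled_median) prints the signature and docstring interactively;
# inspect exposes the same pieces as ordinary Python objects.
signature = inspect.signature(scaled_median)
summary = inspect.getdoc(scaled_median)
print(f"scaled_median{signature}")
print(summary)
```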

## 4. Thicket Components

### 4.1 Performance Data

The performance data table is a multi-dimensional, multi-indexed component of thicket. Each row is identified by a call tree node and a profile index, so it holds the measurements for that node from one execution (i.e., one profile).


#### View performance data table:

In [None]:
th_lassen.dataframe
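
To make the row layout concrete, here is a plain-Python sketch of a table indexed by (node, profile) pairs. The profile identifiers and timings are made up for illustration, and a dictionary stands in for the actual multi-indexed dataframe.

```python
# Each key is a (node name, profile id) pair; each value holds that row's metrics.
perf_rows = {
    ("Base_CUDA", "profile_a"): {"Total time (exc)": 0.5},
    ("Base_CUDA", "profile_b"): {"Total time (exc)": 0.7},
    ("Stream_MUL", "profile_a"): {"Total time (exc)": 2.0},
    ("Stream_MUL", "profile_b"): {"Total time (exc)": 2.4},
}


def rows_for_node(rows, node):
    """Select every profile's measurements for one call tree node."""
    return {
        profile: metrics
        for (name, profile), metrics in rows.items()
        if name == node
    }


stream_mul = rows_for_node(perf_rows, "Stream_MUL")
print(stream_mul)
```

Fixing the node and varying the profile, as `rows_for_node` does, is the access pattern behind the per-node statistics shown later in this notebook.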

### 4.2 Metadata

The metadata table stores HPC simulation information such as an application’s build settings and execution context. A row corresponds to a single execution of the application and is identified by a unique profile index.

#### View metadata table:

In [None]:
th_lassen.metadata

#### Composing multiple Thickets:
We can compose thickets in a hierarchical, horizontal ordering using thicket's `columnar_join` function. In this example, we compose profiles of two different problem sizes and four different block sizes seamlessly.

In [None]:
problem_sizes = ["1M", "4M"]
data = {
    "block_128": [f"../data/lassen/new-cali/Base_CUDA-block_128-{i}.cali" for i in problem_sizes],
    "block_256": [f"../data/lassen/new-cali/Base_CUDA-block_256-{i}.cali" for i in problem_sizes],
    "block_512": [f"../data/lassen/new-cali/Base_CUDA-block_512-{i}.cali" for i in problem_sizes],
    "block_1024": [f"../data/lassen/new-cali/Base_CUDA-block_1024-{i}.cali" for i in problem_sizes],
}

In [None]:
block_128 = tt.Thicket.from_caliperreader(data["block_128"])
block_256 = tt.Thicket.from_caliperreader(data["block_256"])
block_512 = tt.Thicket.from_caliperreader(data["block_512"])
block_1024 = tt.Thicket.from_caliperreader(data["block_1024"])

In [None]:
th_cj = tt.Thicket.columnar_join(
    thicket_list=[block_128, block_256, block_512, block_1024],
    header_list=["Block 128", "Block 256", "Block 512", "Block 1024"],
    column_name="ProblemSizeRunParam"
)

In [None]:
print(th_cj.tree())

In [None]:
th_cj.dataframe
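
Conceptually, a horizontal composition places each input table's metrics side by side under its own header, aligned on a shared row index. The sketch below illustrates that idea in plain Python. It is not Thicket's implementation: `join_columns` is a hypothetical helper, the timings are invented, and keeping only rows present in every table is an assumption of this sketch.

```python
def join_columns(tables, headers):
    """Place each table's metrics side by side under its header.

    Only rows whose index appears in every table are kept (an assumption
    made for this sketch).
    """
    shared = set(tables[0])
    for table in tables[1:]:
        shared &= set(table)
    joined = {}
    for index in sorted(shared):
        joined[index] = {
            (header, metric): value
            for header, table in zip(headers, tables)
            for metric, value in table[index].items()
        }
    return joined


# Rows are indexed by (node, problem size); values are hypothetical timings.
block_128_rows = {
    ("Stream_MUL", "1M"): {"time": 1.0},
    ("Stream_MUL", "4M"): {"time": 4.1},
}
block_256_rows = {
    ("Stream_MUL", "1M"): {"time": 0.9},
    ("Stream_MUL", "4M"): {"time": 3.8},
}

joined = join_columns([block_128_rows, block_256_rows], ["Block 128", "Block 256"])
for index, row in joined.items():
    print(index, row)
```

Each resulting column is keyed by a (header, metric) pair, mirroring the two-level column header that a hierarchical, horizontal ordering produces.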

#### Filter with respect to metadata

The metadata table of a thicket can be used to select a subset of profiles based on specific metadata values, for example, the profiles built with a certain compiler. In this example, we filter the metadata to select the profiles generated with a block size of `128`.

In [None]:
# select profiles generated with a block size of 128
filter_metadata_func = lambda x: x['gpu_targets_block_sizes'] == "128"
th_example = th_obj.filter_metadata(filter_metadata_func)
th_example.metadata
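
The filtering idea can be sketched without Thicket: apply a predicate to each metadata row, then keep only the performance rows that belong to the surviving profiles. Everything below is illustrative. `filter_by_metadata`, the profile identifiers, and the values are hypothetical, and the sketch does not reproduce what `filter_metadata` does internally.

```python
metadata = {
    "profile_a": {"gpu_targets_block_sizes": "128", "problem_size": 1048576},
    "profile_b": {"gpu_targets_block_sizes": "256", "problem_size": 1048576},
    "profile_c": {"gpu_targets_block_sizes": "128", "problem_size": 4194304},
}
perf_rows = {
    ("Stream_MUL", "profile_a"): 2.0,
    ("Stream_MUL", "profile_b"): 1.8,
    ("Stream_MUL", "profile_c"): 7.9,
}


def filter_by_metadata(metadata, perf_rows, predicate):
    """Keep profiles whose metadata row satisfies `predicate`."""
    kept_meta = {p: row for p, row in metadata.items() if predicate(row)}
    # Performance rows are keyed by (node, profile); drop rows of removed profiles.
    kept_perf = {
        key: value for key, value in perf_rows.items() if key[1] in kept_meta
    }
    return kept_meta, kept_perf


kept_meta, kept_perf = filter_by_metadata(
    metadata, perf_rows, lambda row: row["gpu_targets_block_sizes"] == "128"
)
print(sorted(kept_meta))
print(kept_perf)
```

Note that the predicate compares against the string `"128"`; comparing against the integer `128` would match nothing if the metadata stores block sizes as text.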

#### Group with the metadata 

The metadata table also supports the grouping of the thicket based on the unique values present in the provided column(s). The following example groups the thicket according to any unique combination of values in the `launchdate` and `gpu_targets_block_sizes` columns.

In [None]:
# create sub-thickets from unique combinations of launchdate and block size
grouping_metadata_cols = ['launchdate', 'gpu_targets_block_sizes']
sub_thickets = th_lassen.groupby(grouping_metadata_cols)
for th in sub_thickets:
    display(th.metadata)
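
Grouping by several columns means that each distinct combination of values forms its own group. A minimal plain-Python sketch of that rule follows; `group_profiles` is a hypothetical helper and the metadata values are invented.

```python
from collections import defaultdict


def group_profiles(metadata, columns):
    """Group profile ids by the tuple of values found in `columns`."""
    groups = defaultdict(list)
    for profile, row in metadata.items():
        key = tuple(row[col] for col in columns)
        groups[key].append(profile)
    return dict(groups)


# Hypothetical metadata rows keyed by profile id.
metadata = {
    "profile_a": {"launchdate": 100, "gpu_targets_block_sizes": "128"},
    "profile_b": {"launchdate": 100, "gpu_targets_block_sizes": "128"},
    "profile_c": {"launchdate": 200, "gpu_targets_block_sizes": "128"},
}

groups = group_profiles(metadata, ["launchdate", "gpu_targets_block_sizes"])
for key, profiles in groups.items():
    print(key, profiles)
```

Because timestamp-like columns such as `launchdate` tend to differ between runs, including one in the grouping key can easily produce one group per profile.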

### 4.3 Aggregated Statistics

The aggregated statistics component of a thicket is a GraphFrame, so it contains a graph and a corresponding dataframe. The table supports an order-reduction mechanism and stores processed application performance data. Each row of the aggregated statistics table holds data aggregated across all profiles associated with a particular call tree node. Below is an example of an empty aggregated statistics table.


#### View aggregated statistics table:

In [None]:
th_lassen.statsframe.dataframe

#### Filter with respect to aggregated statistics

The aggregated statistics table also supports a filter function. In the example below, we filter the table to select the nodes with the names `Base_CUDA`, `Algorithm`, and `Stream_MUL`.

In [None]:
stats_nodes = ["Base_CUDA", "Algorithm", "Stream_MUL"]
th_stats_name = th_obj.filter_stats(lambda x: x['name'] in stats_nodes)
th_stats_name.statsframe.dataframe

#### Append the average (mean and median) of performance data

The aggregated statistics table lets users select a column from the performance data and average its values. After the `median()` and `mean()` operations are applied to that column, two new columns holding the median and mean values are appended to the statistics table.

Below is an example where we calculate the mean and median of the values in the `Total time (exc)` column, which holds the total exclusive time for each node.

In [None]:
metrics = ['Total time (exc)']
tt.median(th_lassen,columns=metrics)
th_lassen.statsframe.dataframe

In [None]:
tt.mean(th_lassen,columns=metrics)
th_lassen.statsframe.dataframe
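
The reduction behind these calls can be sketched in plain Python: collect each node's values across profiles, reduce them to a single number, and store the result under a column named after the source column plus a suffix. `append_stat` and the timings are hypothetical; the `_mean` and `_median` suffixes follow the column naming used in this notebook.

```python
import statistics


def append_stat(perf_rows, stats, column, func, suffix):
    """Reduce `column` across profiles per node and append it to `stats`."""
    per_node = {}
    for (node, _profile), metrics in perf_rows.items():
        per_node.setdefault(node, []).append(metrics[column])
    for node, values in per_node.items():
        stats.setdefault(node, {})[f"{column}_{suffix}"] = func(values)
    return stats


# Hypothetical exclusive times for two nodes across three profiles.
perf_rows = {
    ("Base_CUDA", "p1"): {"Total time (exc)": 0.5},
    ("Base_CUDA", "p2"): {"Total time (exc)": 0.5},
    ("Base_CUDA", "p3"): {"Total time (exc)": 2.0},
    ("Stream_MUL", "p1"): {"Total time (exc)": 1.0},
    ("Stream_MUL", "p2"): {"Total time (exc)": 2.0},
    ("Stream_MUL", "p3"): {"Total time (exc)": 6.0},
}

stats = {}
append_stat(perf_rows, stats, "Total time (exc)", statistics.median, "median")
append_stat(perf_rows, stats, "Total time (exc)", statistics.mean, "mean")
for node, row in stats.items():
    print(node, row)
```

The `Stream_MUL` row shows why both statistics are worth keeping: one slow profile pulls the mean to 3.0 while the median stays at 2.0.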

#### Append the percentile of exclusive time on performance data

The aggregated statistics table allows users to select a column from the performance data to perform the `percentiles()` operation. This results in a new column appended to the statistics table containing the 25th, 50th, and 75th percentiles of the values in the provided column.

Below is an example where we calculate the percentiles of the values in the same `Total time (exc)` column.

In [None]:
tt.percentiles(th_lassen,columns=metrics)
th_lassen.statsframe.dataframe
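
For reference, the 25th, 50th, and 75th percentiles of a small sample can be computed with the standard library. This sketch assumes linear interpolation between data points (the `inclusive` method of `statistics.quantiles`); whether Thicket applies the same interpolation rule is an assumption here, and the sample values are made up.

```python
import statistics


def quartiles(values):
    """Return the 25th, 50th, and 75th percentiles using linear interpolation."""
    return statistics.quantiles(values, n=4, method="inclusive")


# Hypothetical exclusive times for one node across five profiles.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
q25, q50, q75 = quartiles(times)
print(q25, q50, q75)
```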

#### View aggregated statistics call tree:

* The `tree()` method of the aggregated statistics can annotate the call tree with any column of the statistics table. Here we use the `Total time (exc)_median` column, which `tt.median()` appends.

In [None]:
th_lassen.dataframe.columns

In [None]:
tt.median(th_lassen, columns=['Total time (exc)'])
print(th_lassen.statsframe.tree(metric_column='Total time (exc)_median'))

#### Use the Query Language

Thicket's query language provides users the capability to select or `query` specific nodes based on the call tree of the thicket. The performance data is then updated as part of the operation. 

**Initial call tree:** 

In [None]:
print(th_lassen.statsframe.tree('Total time (exc)_median'))

**Example 1**

**Queried call tree:**

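In the example below, we use a thicket query that selects only the parent and child nodes of `Algorithm`, maintaining the structure of the call tree. The pattern matches any chain of nodes (`"*"`) that ends in a node whose name begins with `Algorithm`.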

NOTE: A DeprecationWarning is generated when using “old-style” queries (i.e., queries with QueryMatcher) if you have the newest version of Hatchet installed.

In [None]:
alg_query_ex1 = (
    ht.QueryMatcher()
    .match("*")
    .rel(
        ".",
        lambda row: row["name"]
        .apply(lambda x: re.match(r"Algorithm.*", x) is not None)
        .all(),
    )
)
    
# applying the query on the lassen thicket
th_algorithm_ex1 = th_lassen.query(alg_query_ex1)
tt.median(th_algorithm_ex1, columns=['Total time (exc)'])
print(th_algorithm_ex1.statsframe.tree('Total time (exc)_median'))

**Example 2**

**Queried call tree:**

In the example below, we use a thicket query that selects only the `Algorithm` node and its child nodes, maintaining the structure of the call tree.

NOTE: A DeprecationWarning is generated when using “old-style” queries (i.e., queries with QueryMatcher) if you have the newest version of Hatchet installed.

In [None]:
alg_query_ex2 = (
    ht.QueryMatcher()
    .match(
        ".",
        lambda row: row["name"]
        .apply(lambda x: re.match(r"Algorithm.*", x) is not None)
        .all(),
    )
    .rel("*")
)

# applying the second query on the lassen thicket
th_algorithm_ex2 = th_lassen.query(alg_query_ex2)
tt.median(th_algorithm_ex2, columns=['Total time (exc)'])
print(th_algorithm_ex2.statsframe.tree('Total time (exc)_median'))
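
The two queries differ only in where the wildcard sits relative to the name predicate. The toy sketch below mimics which nodes each pattern selects on a small hand-written tree. It is not Hatchet's matching algorithm, and node names other than `Base_CUDA`, `Algorithm`, and `Stream_MUL` are hypothetical.

```python
import re

# Toy call tree stored as child -> parent.
PARENT = {
    "Base_CUDA": None,
    "Algorithm": "Base_CUDA",
    "Algorithm_MEMCPY": "Algorithm",
    "Stream": "Base_CUDA",
    "Stream_MUL": "Stream",
}


def matches(name):
    """Same name predicate as the queries above."""
    return re.match(r"Algorithm.*", name) is not None


def ancestors(node):
    """Return all nodes on the path from `node` up to the root, excluding `node`."""
    chain = []
    parent = PARENT[node]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def select_example_1():
    """Wildcard, then a matching node: matching nodes plus their ancestors."""
    selected = set()
    for node in PARENT:
        if matches(node):
            selected.add(node)
            selected.update(ancestors(node))
    return selected


def select_example_2():
    """A matching node, then wildcard: matching nodes plus their descendants."""
    return {
        node
        for node in PARENT
        if matches(node) or any(matches(a) for a in ancestors(node))
    }


print(sorted(select_example_1()))
print(sorted(select_example_2()))
```

In this toy tree the first pattern keeps `Base_CUDA` because it lies on the path leading to a match, while the second drops it because it sits above the first matching node.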

#### Display histogram

The `display_histogram()` function allows users to select a node and metric value (a column in the performance data table) for which a histogram is generated.

Some available keyword arguments are the following:

* height: height (in inches) of each facet.
* aspect: aspect ratio of each facet; aspect * height gives the width of each facet.
* bins: generic bin parameter; can be used to set the number of bins.
* binwidth: width of each bin; overrides bins but can be used with binrange.
* binrange: lowest and highest value for bin edges; can be used with either bins or binwidth. Defaults to the data extremes.
* color: color of the bars.

An exhaustive list of available arguments can be found [here](https://seaborn.pydata.org/generated/seaborn.displot.html).   

In [None]:
n = pd.unique(th_algorithm_ex1.dataframe.reset_index()["node"])[0]

In [None]:
tt.display_histogram(th_algorithm_ex1,node=n,column="Total time (exc)")
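
To see how `bins` and `binrange` interact, here is a minimal plain-Python sketch of equal-width binning. It is not seaborn's implementation: `histogram_counts` is a hypothetical helper, the values are made up, and the sketch assumes the upper edge is greater than the lower edge.

```python
def histogram_counts(values, bins, binrange=None):
    """Count values in `bins` equal-width bins spanning `binrange`.

    When `binrange` is None, the data extremes are used. Values outside the
    range are ignored. Assumes the upper edge exceeds the lower edge.
    """
    lo, hi = binrange if binrange is not None else (min(values), max(values))
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        if lo <= value <= hi:
            # The last bin is closed on the right so that `hi` is counted.
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
    return counts


times = [1.0, 1.5, 2.0, 2.5, 4.0]
default_range = histogram_counts(times, bins=3)
narrow_range = histogram_counts(times, bins=3, binrange=(0.0, 3.0))
print(default_range)
print(narrow_range)
```

Narrowing `binrange` changes the picture in two ways: the bin edges move, and the value 4.0 falls outside the range and is no longer counted.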

#### Display heatmap

The `display_heatmap()` function allows users to select column(s) from the aggregated statistics table, for which a heatmap is generated based on the values of the column(s).

Some available keyword arguments are the following: 
* vmax: maximum value to anchor the color map.
* vmin: minimum value to anchor the color map.
* linecolor: color of the lines that will divide each cell.
* linewidths: width of the lines that will divide each cell.

An exhaustive list of available arguments can be found [here](https://seaborn.pydata.org/generated/seaborn.heatmap.html).  

In [None]:
th_algorithm_ex1.dataframe.columns

In [None]:
plt.figure(figsize = (30,30))
metrics = ["Total time (exc)_median"]
tt.display_heatmap(th_algorithm_ex1, columns=metrics)
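
The effect of `vmin` and `vmax` can be illustrated by the normalization step that precedes the color lookup: each value is mapped to a position between 0 and 1, and anything outside the anchors lands on the nearest end of the color map. The sketch below is a simplified stand-in for what the plotting library does; `normalize` and the median values are hypothetical.

```python
def normalize(value, vmin, vmax):
    """Map `value` to [0, 1], clipping anything outside [vmin, vmax]."""
    clipped = min(max(value, vmin), vmax)
    return (clipped - vmin) / (vmax - vmin)


# Hypothetical per-node medians; node names are for illustration only.
medians = {"Algorithm": 0.5, "Algorithm_MEMCPY": 2.0, "Algorithm_MEMSET": 8.0}
positions = {
    node: normalize(value, vmin=0.0, vmax=4.0) for node, value in medians.items()
}
for node, position in positions.items():
    print(node, position)
```

Setting `vmax` below the largest value, as here, saturates the outlier at the top of the color map and leaves more of the color range for the remaining nodes.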