# Kernel Function Profiling

This notebook shows how to do some simple function profiling using LISA.

We'll be using the Ftrace function profiler (see "Function profiling" in https://lwn.net/Articles/370423/) and will present the relevant Python APIs from devlib and LISA that make it easier to use.

In [None]:
import logging
from lisa.utils import setup_logging
setup_logging()

In [None]:
import os
from lisa.target import Target, TargetConf

## Target configuration

The only target requirement here is a kernel with Ftrace function profiling support, i.e. one built with **CONFIG_FUNCTION_PROFILER** and the options it depends on.

In [None]:
target = Target(
    kind='linux',
    name='myhikey960',
    host='192.168.0.1',
    username='root',
    password='root',
)

## Experiment setup

We can run whatever we want here, so let's just build a simple workload with as many 20% duty-cycle periodic tasks as there are CPUs.

In [None]:
from lisa.wlgen.rta import RTA, Periodic

In [None]:
rtapp_profile = {}

for cpu in range(target.number_of_cpus):
    rtapp_profile["task{}".format(cpu)] = Periodic(duty_cycle_pct=20)

In [None]:
wload = RTA.by_profile(target, "profiling_wload", rtapp_profile)

Now, let's define the functions we want to profile. Keep in mind that not all functions can be profiled, for instance those that the compiler inlined.

In [None]:
functions = [
    "scheduler_tick",
    "run_rebalance_domains"
]
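
One rough way to check the caveat about non-profilable functions is to compare the list against ftrace's `available_filter_functions` file, usually found under `/sys/kernel/debug/tracing/` (its content could be fetched with something like `target.execute("cat ...")`, assuming the target exposes it there). The helper below is a hypothetical sketch, not part of LISA or devlib, and runs on a made-up sample instead of a real target:

```python
def split_profilable(requested, available_text):
    """
    Split ``requested`` into (found, missing) according to the content of
    ftrace's ``available_filter_functions`` file.
    """
    available = set()
    for line in available_text.splitlines():
        fields = line.split()
        if fields:
            # Lines look like "func_name" or "func_name [module]"
            available.add(fields[0])

    found = [f for f in requested if f in available]
    missing = [f for f in requested if f not in available]
    return found, missing


# Made-up sample standing in for the real file content
sample = "scheduler_tick\nrun_rebalance_domains\nsome_driver_func [some_module]\n"
found, missing = split_profilable(
    ["scheduler_tick", "run_rebalance_domains", "some_inlined_helper"],
    sample,
)
print("found:", found)
print("missing:", missing)
```

Functions that end up in `missing` are likely inlined, or named differently in the running kernel.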

We're using an FtraceCollector, so we might as well record some basic events to get a meaningful trace.

In [None]:
events = [
    "sched_switch",
    "sched_wakeup",
    "sched_wakeup_new"
]

## Running the experiment

In [None]:
from lisa.trace import FtraceCollector

In [None]:
ftrace_coll = FtraceCollector(target, events=events, functions=functions, buffer_size=10240)
trace_path = os.path.join(wload.res_dir, "trace.dat")
with ftrace_coll:
    wload.run()
ftrace_coll.get_trace(trace_path)

# Save the profiling stats
ftrace_coll.get_stats(os.path.join(wload.res_dir, "stats.json"))

In [None]:
!tree {wload.res_dir}

## Loading the trace

In [None]:
from lisa.trace import Trace

In [None]:
trace = Trace(trace_path, target.plat_info, events=events)

We can have a look at the trace of the workload we just ran

In [None]:
from trappy.plotter import plot_trace

In [None]:
plot_trace(trace.ftrace)

## Loading the function profiling

The profiling stats are stored as JSON, so let's load them into a dict.

In [None]:
import json

In [None]:
stats_path = os.path.join(wload.res_dir, "stats.json")

with open(stats_path, "r") as fh:
    stats = json.load(fh)

The data in the file is arranged like so:

- For each CPU
    - For each function
       - time (µs)
       - hits (#)
       - s_2, AKA variance - apply sqrt() to get standard deviation
       - avg (µs)
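
As an illustration of that layout, here is a minimal sketch using made-up numbers (they are not taken from a real run). It walks the nested dict and derives the standard deviation from `s_2`:

```python
import math

# Made-up numbers, only meant to show the nesting: CPU -> function -> stats.
# CPU ids are strings because JSON object keys are always strings.
sample_stats = {
    "0": {
        "scheduler_tick": {"hits": 100, "time": 250.0, "avg": 2.5, "s_2": 0.25},
    },
    "1": {
        "scheduler_tick": {"hits": 50, "time": 200.0, "avg": 4.0, "s_2": 1.0},
    },
}

for cpu, per_function in sorted(sample_stats.items(), key=lambda item: int(item[0])):
    for name, fstats in per_function.items():
        # s_2 is the variance, so its square root is the standard deviation
        stddev = math.sqrt(fstats["s_2"])
        print("CPU{} {}: avg={}us stddev={}us over {} hits".format(
            cpu, name, fstats["avg"], stddev, fstats["hits"]))
```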
       
To make it a bit simpler to manipulate, we're going to turn this data into a pandas DataFrame.

In [None]:
import pandas as pd

def stats_to_df(stats_dict):
    """
    Turn Ftrace function profiling stats into a pandas DataFrame
    
    :param stats_dict: The stats dictionary generated by FtraceCollector
    :type stats_dict: dict
    """
    data = []
    index = []
    
    for cpu, functions in stats_dict.items():
        index.append(int(cpu))
        columns = []
        line = []
        
        for function, stats in functions.items():
            
            for name, stat in stats.items():
                columns.append((function, name))
                line.append(stat)

        data.append(line)
        
    df = pd.DataFrame(data, index=index, columns=columns)
    df.columns = pd.MultiIndex.from_tuples(df.columns, names=["function", "stat"])
    df = df.sort_index()
    return df

In [None]:
df = stats_to_df(stats)

Here's what the DataFrame looks like:

In [None]:
df

We can easily have a look at a specific function:

In [None]:
df.run_rebalance_domains

It's also easy to get overall stats for one function. For instance, if we want the total number of hits for a function (summing up the number of hits over all CPUs), that can be done like so:

In [None]:
df.run_rebalance_domains.hits.sum()
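
The same idea extends to timings. The sketch below assumes `time` is the total time spent in the function on each CPU, and uses a small made-up DataFrame rather than real measurements. It computes an overall average by dividing total time by total hits, which is not the same as averaging the per-CPU `avg` values:

```python
import pandas as pd

# Made-up per-CPU stats for a single function (index is the CPU id)
per_cpu = pd.DataFrame(
    {"hits": [100, 50], "time": [250.0, 200.0], "avg": [2.5, 4.0]},
    index=[0, 1],
)

total_hits = per_cpu["hits"].sum()
total_time = per_cpu["time"].sum()

# Dividing total time by total hits weights each CPU by how often it ran
# the function, unlike a plain mean of the per-CPU averages.
overall_avg = total_time / total_hits
naive_avg = per_cpu["avg"].mean()

print(overall_avg, naive_avg)
```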

You can also get stats recorded on a single CPU like so:

In [None]:
df.run_rebalance_domains.loc[2]
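
Going the other way, a single stat can be pulled out for every function at once by taking a cross-section of the second column level. This is a sketch on a made-up DataFrame with the same two-level column layout, using plain pandas:

```python
import pandas as pd

# Two-level columns: (function, stat), one row per CPU. Values are made up.
columns = pd.MultiIndex.from_tuples([
    ("scheduler_tick", "hits"), ("scheduler_tick", "avg"),
    ("run_rebalance_domains", "hits"), ("run_rebalance_domains", "avg"),
])
sample_df = pd.DataFrame(
    [[100, 2.5, 10, 8.0], [50, 4.0, 20, 6.0]],
    index=[0, 1],
    columns=columns,
)

# Select the "hits" entry of the second column level for all functions
hits = sample_df.xs("hits", axis=1, level=1)
print(hits)
```

The result has one column per function and one row per CPU, which is convenient for comparing functions side by side.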

## Visual profiling

Now that we have all of the relevant data in a DataFrame, it's very easy to make plots out of it.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
def plot_hits(df, function):
    fig, ax = plt.subplots(figsize=(16, 5))
    
    df[function].hits.plot.bar(ax=ax)    
    ax.set_title("Per-CPU hits of \"{}\"".format(function))
    ax.set_xlabel("CPU")
    ax.set_ylabel("# of hits")
    ax.grid(True)

In [None]:
def plot_time_avg(df, function):
    fig, ax = plt.subplots(figsize=(16, 5))
    
    # Let's compute the standard deviation to plot error bars
    stddev = df[function].s_2.apply(np.sqrt)
    
    df[function].avg.plot.bar(ax=ax, yerr=stddev, capsize=10)    
    ax.set_title("Per-CPU average time of \"{}\"".format(function))
    ax.set_xlabel("CPU")
    ax.set_ylabel("Average time (µs)")
    ax.grid(True)

In [None]:
for function in functions:
    plot_hits(df, function)
    plot_time_avg(df, function)