# Advanced Operating Systems: Lab 3 - TCP/IP - DTrace functions

This notebook provides sample code that collects data from DTrace probes. It handles not only aggregations but also output produced by the `trace()` and `printf()` D functions.

## Import the DTrace module

As in previous labs, first import the `python-dtrace` module:

In [None]:
from dtrace import DTraceConsumerThread
import subprocess

## (UPDATED) Define a DTrace convenience function

Next, define the `dtrace_synchronous()` function with an additional argument `out`. The `out` argument is a function that is called whenever DTrace prints output, e.g. with `trace()` or `printf()`. The `walker` argument is a function that is called to collect aggregations. The `walker` and `out` arguments can be used together.
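
As a rough illustration of the two callback shapes described above, the sketch below defines stand-in callbacks and calls them by hand, without involving DTrace at all. The names `example_walker`, `example_out`, `aggregated` and `printed` are hypothetical and used only here; the argument lists mirror the `walker` and `out` functions defined later in this notebook, and the sample data is made up.

```python
# Stand-in callbacks with the same argument lists as the lab's
# walker and out functions. All names here are hypothetical.
aggregated = {}
printed = []

def example_walker(action, identifier, keys, value):
    # Receives one aggregation record; keys is a list of aggregation keys.
    aggregated[tuple(keys)] = value

def example_out(value):
    # Receives printed output as bytes, so decode it before use.
    printed.append(value.decode("ascii"))

# Simulated invocations with made-up data (no DTrace involved).
example_walker(None, 0, ["read"], 3)
example_out(b"hello from printf()")

print(aggregated)
print(printed)
```

In the real function below, callbacks of this shape are passed as the `walker` and `out` arguments.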

In [None]:
def dtrace_synchronous(script, walker, out, cmdline):
    """
    script - a D script
    walker - a routine to receive data from aggregations
    out - a routine to receive data from output
    cmdline - a command to run
    """
    
    # Create a separate thread to run the DTrace instrumentation
    dtrace_thread = DTraceConsumerThread(script,
                                     walk_func=walker,
                                     out_func=out,
                                     chew_func=lambda v: None,
                                     chewrec_func=lambda v: None,
                                     sleep=1)
    
    # Start the DTrace instrumentation
    dtrace_thread.start()

    # Display header to indicate that the command has started
    print("## Starting ", cmdline)

    output_dtrace = subprocess.run(cmdline.split(" "))
        
    # The benchmark has completed - stop the DTrace instrumentation
    dtrace_thread.stop()
    dtrace_thread.join()

    # Display footer to indicate that the benchmarking has finished
    if output_dtrace.returncode == 0:
        print("## Finished ", cmdline)
    elif output_dtrace.returncode == 64: # EX_USAGE
        print("## Invalid command", cmdline)
    else:
        print("## Failed with the exit code {}".format(output_dtrace.returncode))
        
    # Explicitly free DTrace resources.
    # Otherwise, Python's garbage collector would free them only when
    # dtrace_thread is reassigned, e.g. when the cell is re-executed.
    # That could be confusing when analysing the kernel from a terminal
    # and the notebook at the same time.
    del dtrace_thread

## Collect TCP segment details and system-call counts

To collect both information on TCP segments (as described in Advanced Operating Systems: Lab 3 – TCP, General Information) and system-call counts (as in Advanced Operating Systems: Lab 1 - Getting Started with Kernel Tracing), we define two actions: one that prints the details of each TCP segment to the output, and one that aggregates system-call counts.

Our `out` function, `tcp_out`, receives the bytes of one output line at a time and must decode and parse the printed text itself. The `walker` function, `tcp_walker`, by contrast receives the aggregation keys as a list.
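
Because the `out` function has to parse raw bytes, a slightly more defensive parser can be handy while experimenting. The following is a minimal sketch, not part of the lab code: the helper name `parse_segment_line` is hypothetical, and it assumes the same `"<seq> <ack> <state>"` layout that the `printf()` call in `tcp_script` produces. Lines that do not match are returned as `None` instead of raising an exception.

```python
def parse_segment_line(raw):
    """Decode one b"<seq> <ack> <state>" line into a dict, or None if malformed.

    Hypothetical helper; assumes the three-field layout printed by tcp_script.
    """
    fields = raw.decode("ascii", errors="replace").split()
    if len(fields) != 3:
        return None
    try:
        return {"seq": int(fields[0]), "ack": int(fields[1]), "state": fields[2]}
    except ValueError:
        # seq or ack was not a decimal integer
        return None

# Made-up sample lines for illustration only.
good = parse_segment_line(b"1000 2000 state-established")
bad = parse_segment_line(b"garbage")
print(good)
print(bad)
```

A variant of `tcp_out` could call such a helper and append only the results that are not `None`.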

Note that `tcp_script` is only an example D script. It should be extended with appropriate predicates so that it traces only information relevant to our benchmark.
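
One possible shape for such a predicate is to match on the TCP port used by the benchmark. The sketch below builds the TCP clause with a port filter. It assumes the benchmark's port is known; the value `10141` is a placeholder, not taken from this notebook. The helper name `tcp_clause_for_port` and the template are likewise hypothetical. The port fields of the TCP header are compared against `htons()` on the assumption that they are held in network byte order at this probe.

```python
# Hypothetical template: the TCP clause from tcp_script plus a port predicate.
TCP_CLAUSE_TEMPLATE = """
fbt::tcp_do_segment:entry
/args[1]->th_sport == htons(PORT) || args[1]->th_dport == htons(PORT)/
{
    printf("%u %u %s",
        (unsigned int)args[1]->th_seq,
        (unsigned int)args[1]->th_ack,
        tcp_state_string[args[3]->t_state]);
}
"""

def tcp_clause_for_port(port):
    """Return the D clause with PORT replaced by the given port number."""
    return TCP_CLAUSE_TEMPLATE.replace("PORT", str(int(port)))

# 10141 is an assumed placeholder; use the port your benchmark actually uses.
filtered_clause = tcp_clause_for_port(10141)
print(filtered_clause)
```

The resulting string could replace the first clause of `tcp_script`, leaving the `syscall:::entry` clause unchanged.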

In [None]:
tcp_script = """
fbt::tcp_do_segment:entry
{
    printf("%u %u %s",
        (unsigned int)args[1]->th_seq,
        (unsigned int)args[1]->th_ack,
        tcp_state_string[args[3]->t_state]);
}

syscall:::entry
/execname == "ipc-benchmark"/
{
    @syscalls[probefunc] = count();
}
"""

from collections import defaultdict
syscall_count_values = defaultdict(int)
tcp_segments = []

def tcp_walker(action, identifier, keys, value):
    """
    action -- a type of action (sum, avg, ...)
    identifier -- the id
    keys -- list of keys
    value -- the value
    """
    syscall_count_values[keys[0]] += value

def tcp_out(value):
    """
    value -- one line of DTrace output, as bytes
    """
    value = value.decode('ascii').split(' ')
    tcp_segments.append({'seq': int(value[0]), 'ack': int(value[1]), 'state': value[2]})

dtrace_synchronous(tcp_script, tcp_walker, tcp_out, "ipc/ipc-benchmark -j -v -i tcp -b 64 -t 64 2thread")

for name, count in syscall_count_values.items():
    print("Number of {} calls: {}".format(name, count))

for segment in tcp_segments:
    print("seq={} ack={} state={}".format(segment['seq'], segment['ack'], segment['state']))
