# Lab 1: Configuring & Running Experiments in SimBricks

## 1. Minimal 2-Host Experiment
To get started, we will set up a simple experiment with two hosts that have one NIC each and are connected through a basic switch. The hosts run Linux with the regular network stack, and run the standard `iperf` network benchmark, with one host acting as the server, and the other as the client.

To keep simulation times low for testing here, we use unsynchronized and inaccurate simulator configurations: QEMU with KVM acceleration (if available), the behavioral Intel x710 (`i40e`) NIC model, and our simple behavioral switch.

In [1]:
import re
import asyncio
from simbricks.orchestration import system as sys_mod
from simbricks.orchestration import simulation
from simbricks.orchestration import instantiation
from simbricks.utils import base as utils_base
from simbricks.client.opus import base as opus_base

### 1.1. System Specification
We start by specifying the system that we want to create a virtual prototype of. This is the usual way to begin a SimBricks script.

Writing this system specification is about describing **what** we want to simulate, not **how** to simulate it (i.e. which simulator to use).

The first step is to create a `System` object. This object holds references to all relevant components of the system we want to simulate. Later on, we decide for each of those components which simulator to use.

In [2]:
syst = sys_mod.System()

Now we create a host specification for our client and add it to our system object. In this case we create a Linux host that has the driver for the Intel i40e NIC available. Then we add two disk images. The `DistroDiskImage` is one of the Linux images distributed alongside SimBricks and contains the required driver. The `LinuxConfigDiskImage` will later store the commands that we want to execute on this host during the simulation.

In [3]:
# create client
host0 = sys_mod.I40ELinuxHost(syst)
host0.add_disk(sys_mod.DistroDiskImage(h=host0, name="base"))
host0.add_disk(sys_mod.LinuxConfigDiskImage(h=host0))

After configuring the client host, we create a specification for an i40e NIC model and connect it to the host through a PCIe interface. Under the hood, SimBricks system specifications use a notion of device interfaces that are connected through channels. As in the real world, we also assign an IP address to the NIC, which becomes available to the host once the NIC's interface is connected to it.

In [4]:
# create client NIC
nic0 = sys_mod.IntelI40eNIC(syst)
nic0.add_ipv4("10.0.0.1")
host0.connect_pcie_dev(nic0)

<simbricks.orchestration.system.pcie.PCIeChannel at 0x7e35181798a0>

As for the client, we create a server host and attach a NIC to it.

In [5]:
# create server
host1 = sys_mod.I40ELinuxHost(syst)
host1.add_disk(sys_mod.DistroDiskImage(h=host1, name="base"))
host1.add_disk(sys_mod.LinuxConfigDiskImage(h=host1))
# create server NIC
nic1 = sys_mod.IntelI40eNIC(syst)
nic1.add_ipv4("10.0.0.2")
host1.connect_pcie_dev(nic1)

<simbricks.orchestration.system.pcie.PCIeChannel at 0x7e351817b100>

After creating the NIC and connecting it to our client host, we specify the application to run during the simulation. For the client we choose an iperf TCP client and pass it the server IP address to connect to. We also set the wait flag on that application. The wait flag tells SimBricks to wait until this application has finished before the simulation is stopped and cleaned up. This matters because many simulators, for example a network simulator, typically run until they are stopped manually, so the wait flag is how we tell SimBricks which applications and components to wait for.

In [6]:
# set client application
client_app = sys_mod.IperfTCPClient(h=host0, server_ip=nic1._ip)
client_app.wait = True
host0.add_app(client_app)
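One way to picture the wait flag: the run counts as finished once every application marked with `wait = True` has exited, and everything else is then stopped. The sketch below is a stand-alone illustration under that assumption. `App` and `run_is_done` are hypothetical names, not SimBricks classes or functions.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical stand-in for an application running on a simulated host."""
    name: str
    wait: bool = False      # should the run wait for this app to finish?
    finished: bool = False  # has the app exited?


def run_is_done(apps):
    """Return True once every app we wait for has finished (sketch only)."""
    return all(app.finished for app in apps if app.wait)


client = App("iperf-client", wait=True)
server = App("iperf-server")  # never exits on its own, so we do not wait for it

before = run_is_done([client, server])  # client still running
client.finished = True
after = run_is_done([client, server])   # client done, server ignored
```

The server never has to finish for the run to complete, which is why only the client carries the flag in this lab.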

Again, similar to the client case, we create an iperf server application and assign it to the server host we created before. Note that we do not need to set the wait flag in this case, as we are interested in the client application finishing, not the server.

In [7]:
# set server application
server_app = sys_mod.IperfTCPServer(h=host1)
host1.add_app(server_app)

Once we have specified the client and server hosts and NICs that we want to simulate, we create a specification of a simple switch and connect it to the Ethernet interfaces of the previously created NICs, linking them with each other as in a real network.

In [8]:
# create switch and connect NICs to switch
switch = sys_mod.EthSwitch(syst)
switch.connect_eth_peer_if(nic0._eth_if)
switch.connect_eth_peer_if(nic1._eth_if)

<simbricks.orchestration.system.eth.EthChannel at 0x7e3518178ee0>

And that's it! We have assembled our first SimBricks system specification. We continue by making a simulation choice.
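The specification assembled in this section is essentially a small graph: components own interfaces, and a channel joins exactly two interfaces. The following stand-alone sketch rebuilds that topology with hypothetical classes (`Component`, `Interface`, `Channel`, `connect`, `neighbors`). None of these are the SimBricks classes; they only illustrate the structure.

```python
class Component:
    """Hypothetical system component (host, NIC or switch)."""
    def __init__(self, name):
        self.name = name
        self.interfaces = []


class Interface:
    """Hypothetical device interface owned by exactly one component."""
    def __init__(self, owner):
        self.owner = owner
        self.channel = None
        owner.interfaces.append(self)


class Channel:
    """Hypothetical channel joining exactly two interfaces."""
    def __init__(self, kind, a, b):
        if a.channel is not None or b.channel is not None:
            raise ValueError("interface is already connected")
        self.kind, self.a, self.b = kind, a, b
        a.channel = b.channel = self

    def peer_of(self, intf):
        return self.b if intf is self.a else self.a


def connect(kind, x, y):
    """Create one fresh interface on each component and join them."""
    return Channel(kind, Interface(x), Interface(y))


def neighbors(comp):
    """Names of the components reachable over one channel."""
    return sorted(
        i.channel.peer_of(i).owner.name for i in comp.interfaces if i.channel
    )


host0, nic0, host1, nic1, switch = (
    Component(n) for n in ("host0", "nic0", "host1", "nic1", "switch")
)
connect("pcie", host0, nic0)
connect("pcie", host1, nic1)
connect("eth", switch, nic0)
connect("eth", switch, nic1)

nic0_neighbors = neighbors(nic0)
switch_neighbors = neighbors(switch)
```

Each NIC ends up with two neighbors (its host over PCIe, the switch over Ethernet), and the switch sees both NICs.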

### 1.2. Simulation Specification
In the previous step we specified the system that we want to simulate. Now we have to choose **which simulators** to use to simulate this system.

For example, a Linux host might be simulated by either QEMU or gem5, whereas a NIC might be simulated by a behavioral model or by simulating the actual RTL. The next step is all about making this choice.

The first step is to create a Simulation object:

In [9]:
"""
Simulator Choice
"""
sim = simulation.Simulation(name="My-simple-simulation", system=syst)

In the next step, we go over the system components that we created before (i.e. the two hosts, the two NICs and the switch) and create a simulator instance for each. Each system component is then added to the simulator instance that is supposed to simulate it.

Depending on whether a simulator supports this, you can also add multiple components to the same simulator instance, so that a single simulator simulates multiple components at once. This can be very useful. An example can be seen in the __[networking-case-study](https://github.com/simbricks/simbricks-examples/tree/orchestration-framework-rework/networking-case-study)__ example within this repo, where we use a single ns-3 instance to simulate multiple components (e.g. switches).

In [10]:
host_inst0 = simulation.QemuSim(sim)
host_inst0.add(host0)

host_inst1 = simulation.QemuSim(sim)
host_inst1.add(host1)

nic_inst0 = simulation.I40eNicSim(sim)
nic_inst0.add(nic0)

nic_inst1 = simulation.I40eNicSim(sim)
nic_inst1.add(nic1)

net_inst = simulation.SwitchNet(sim)
net_inst.add(switch)

If you have read the SimBricks paper, you know that SimBricks connects simulator instances via shared memory queues, which we refer to as `Channel` in the orchestration framework. Those channels can be used in synchronized or unsynchronized (default) mode. To run an experiment synchronized (this is required for accurate time measurements), we need to enable synchronization for those channels:

In [11]:
# if synchronized set, enable synchronization for all SimBricks channels
synchronized = False
if synchronized:
    sim.enable_synchronization(amount=500, ratio=utils_base.Time.Nanoseconds)

### 1.3. Instantiation

The last thing we need to take care of in order to simulate our virtual prototype is to create an instantiation of it. In our example we create a very simple instantiation and assign the previously created simulation to it. That is already all we need here, and we are ready for execution.

Even though it might not seem useful at this point, in more sophisticated experiment setups the instantiation is used to specify whether artifacts of the simulation should be created, whether checkpointing should be used to reduce simulation times, or to configure distributed simulations. For **simplicity** we ignore these more advanced use cases in this example.

In [12]:
"""
Create an instantiation of your virtual prototype
"""
instantiations = []
instance = instantiation.Instantiation(sim)
instantiations.append(instance)

## 2. Running the Simulation

**TODO: talk about the general execution model of SimBricks, i.e. client <--> backend <--> runner**

### 2.1. Running the simulation via CLI

### 2.2. Running the simulation through the Python API

SimBricks also offers a programmatic way to create virtual prototypes and submit them to the SimBricks backend, which schedules their execution on a runner. TODO

In [15]:
# create and send simulation run to the SimBricks backend
run_id = await opus_base.create_run(instance)

ClientResponseError: 404, message='Not Found', url=URL('https://app.simbricks.io/api/ns/Demo/-/systems')
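The bare `await` in the cell above works because the notebook already runs an event loop. In a plain Python script the same call would have to live inside a coroutine that is started with `asyncio.run`. The sketch below shows only that pattern. `fake_create_run` is a hypothetical stand-in, since the real `create_run` call needs a reachable backend.

```python
import asyncio


async def fake_create_run(instance_name: str) -> int:
    """Hypothetical stand-in for a backend request; pretends run id 1 was assigned."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return 1


async def main() -> int:
    # In a script, all awaits live inside a coroutine like this one.
    run_id = await fake_create_run("my-instance")
    return run_id


# Start an event loop, run main() to completion, and close the loop again.
script_run_id = asyncio.run(main())
```

Note that `asyncio.run` cannot be called from a cell of an already running notebook loop; there, the top-level `await` form is the one to use.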

**TODO: parse output etc, very easy `ConsoleLineGenerator` **

In [None]:
# helper function to create and parse the experiment output
async def iperf_throughput() -> None:
    # Regex to match output lines from iperf client
    tp_pat = re.compile(
        r"\[ *\d*\] *([0-9\.]*)- *([0-9\.]*) sec.*Bytes *([0-9\.]*) ([GM])bits.*"
    )
    throughputs = []
    # iterate through host output
    line_gen = opus_base.ConsoleLineGenerator(run_id=run_id, follow=True)
    async for _, line in line_gen.generate_lines():
        m = tp_pat.match(line)
        if not m:
            continue
        if m.group(4) == "G":
            throughputs.append(float(m.group(3)) * 1000)
        elif m.group(4) == "M":
            throughputs.append(float(m.group(3)))

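    # avoid a division by zero when no iperf report line was seen
    if not throughputs:
        print("No iperf throughput lines found in the output")
        return
    avg_throughput = sum(throughputs) / len(throughputs)
    print(f"Iperf Throughput : {avg_throughput} Mbps")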

await iperf_throughput()
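The parsing logic can be checked without a running backend by applying the same regular expression to a few lines. The sample lines below are hand-written in the usual iperf 2 report format and are not output from the run above. `throughput_mbps` is a helper introduced only for this sketch.

```python
import re

# Same pattern as in the helper above.
TP_PAT = re.compile(
    r"\[ *\d*\] *([0-9\.]*)- *([0-9\.]*) sec.*Bytes *([0-9\.]*) ([GM])bits.*"
)


def throughput_mbps(line):
    """Return the throughput of one iperf report line in Mbps, or None."""
    m = TP_PAT.match(line)
    if not m:
        return None
    value = float(m.group(3))
    # Gbit/s values are scaled to Mbit/s so all results share one unit.
    return value * 1000 if m.group(4) == "G" else value


# Hand-written sample lines (assumed format, not real measurements).
sample = [
    "[  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec",
    "[  3]  0.0- 1.0 sec  1.17 GBytes  10.0 Gbits/sec",
    "Client connecting to 10.0.0.2, TCP port 5001",
]
values = [throughput_mbps(line) for line in sample]
```

Lines that are not interval reports, such as the connection banner, simply do not match and are skipped.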

### 1.2. Running the Simulation

Now that we have assembled our simulation, the next step is running it. Typically we would save the experiment to a `.py` file and then run it from the terminal with the `run.py` script (details on how to do that below). Here we will directly use our Python orchestration mechanisms to run a simple local experiment.

For this we first create a `LocalExecutor`, which runs our simulation locally by starting the commands directly from the Python process. (Other executors run component simulators on remote hosts, which is particularly useful for distributed simulations spanning multiple hosts.)

In [None]:
import simbricks.orchestration.experiments as exp
import simbricks.orchestration.experiment.experiment_environment as expenv
import simbricks.orchestration.experiment.experiment_output as expout
import simbricks.orchestration.exectools as exectools
import simbricks.orchestration.runtime as runtime
import simbricks.orchestration.runners as runners

executor = exectools.LocalExecutor()

Next, we set up an experiment environment, which specifies paths for simulator executables, Unix sockets, shared memory regions, working copies of simulator disk images, and other working files of the different component simulators. Here we just pass in the path to our SimBricks repo and a new working directory for this experiment. For everything else, `ExpEnv` has sensible defaults that can be customized if necessary.

In [None]:
workdir = './out/lab1_test1'
env = expenv.ExpEnv(repo_path='/simbricks', workdir=workdir, cpdir=workdir)

Now we create a `Run` object, which represents a single execution of an experiment and stores the experiment and the environment. For the run we then initialize the output and working directories.

In [None]:
run = runtime.Run(experiment=e, index=0, env=env, outpath=workdir, prereq=None)
await run.prep_dirs(executor)

Finally, we create a runner responsible for orchestrating the simulation run. The execution results in an output object that contains all simulator outputs, the commands executed, timestamps, and other metadata for the run. We create the runner with the verbose flag set, so we also see the output on the console as the simulation executes. This is helpful for debugging, but all of this information is also contained in the output object.

In [None]:
runner = runners.ExperimentSimpleRunner(executor, exp=e, env=env, verbose=True)
await runner.prepare()
output = await runner.run()

#### Running Simulations from the Terminal
To run the same simulation from the terminal, we would save the experiment configuration to a `.py` file, which has to contain a list `experiments` of all experiment configurations in the file (here `experiments = [e]` at the bottom of the file would suffice):

In [None]:
# This code just displays nicely formatted contents of my-simple-experiment.py

import IPython


def display_source(code):

    def _jupyterlab_repr_html_(self):
        from pygments import highlight
        from pygments.formatters import HtmlFormatter

        fmt = HtmlFormatter()
        style = '<style>{}\n{}</style>'.format(
            fmt.get_style_defs('.output_html'),
            fmt.get_style_defs('.jp-RenderedHTML')
        )
        return style + highlight(self.data, self._get_lexer(), fmt)

    # Replace _repr_html_ with our own version that adds the 'jp-RenderedHTML' class
    # in addition to 'output_html'.
    IPython.display.Code._repr_html_ = _jupyterlab_repr_html_
    return IPython.display.Code(data=code, language='python3')


with open('my-simple-experiment.py', 'r') as f:
    test2_src = f.read()

display_source(test2_src)

This file can then be passed to `simbricks-run`. Assuming we save it as `my-simple-experiment.py`, we would run this simulation with `simbricks-run --verbose my-simple-experiment.py`. By default, `simbricks-run` uses `out/EXPERIMENT-NAME/N` as the working directory for the N-th run of an experiment and stores the output in JSON format as `out/EXPERIMENT-NAME-N.json`. If the JSON file for a run already exists, execution of that run is skipped unless `--force` is also specified.

In [None]:
import os

os.system('simbricks-run --verbose --force my-simple-experiment.py')
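The naming scheme and the skip rule described above can be mirrored in a few lines. This is only a sketch of the described behaviour, not code from `simbricks-run`. `run_paths` and `should_run` are hypothetical helpers, and a temporary directory stands in for `out/`.

```python
import tempfile
from pathlib import Path


def run_paths(out_dir, exp_name, n):
    """Working directory and JSON output file for the n-th run (sketch)."""
    out = Path(out_dir)
    return out / exp_name / str(n), out / f"{exp_name}-{n}.json"


def should_run(json_path, force=False):
    """A run is skipped when its JSON file exists, unless forced (sketch)."""
    return force or not Path(json_path).exists()


with tempfile.TemporaryDirectory() as tmp:
    workdir, json_path = run_paths(tmp, "my-simple-experiment", 1)
    first = should_run(json_path)               # no JSON file yet
    json_path.write_text("{}")                  # pretend run 1 has finished
    second = should_run(json_path)              # output exists, so skip
    forced = should_run(json_path, force=True)  # like passing --force
```

This is why the cell above passes `--force`: re-executing the notebook would otherwise skip the run once its JSON file exists.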

The experiment output is now stored as JSON in `out/my-simple-experiment-1.json`, from where we later load it for processing:

In [None]:
output_2 = expout.ExpOutput(e) # TODO: refactor so this does not take e as a parameter!
output_2.load('out/my-simple-experiment-1.json')

### 1.3. Processing the Output
After running a simulation, the `ExpOutput` objects contain timestamps for when the simulation started and stopped, the commands executed for each simulator, and per-simulator output, including the console output of host simulators, as well as additional metadata. For long-running simulations in particular, we recommend first running the simulations and storing the output in JSON files (through `run.py` or by calling `ExpOutput.dumps()` from Python), and post-processing the obtained data in a separate step. This avoids having to re-run potentially long-running simulations when tweaking the post-processing.

Below is a simple post-processing example parsing the `iperf` throughput from the output of the two simulations run above:

In [None]:
import re


# Parse iperf throughput for specified host from output
def parse_iperf_throughput(out: expout.ExpOutput, host: sim.HostSim) -> float:
    # Regex to match output lines from iperf client
    tp_pat = re.compile(
        r'\[ *\d*\] *([0-9\.]*)- *([0-9\.]*) sec.*Bytes *([0-9\.]*) ([GM])bits.*'
    )
    throughputs = []
    # iterate through host output
    for l in out.sims[host.full_name()]['stdout']:
        m = tp_pat.match(l)
        if not m:
            continue
        if m.group(4) == 'G':
            throughputs.append(float(m.group(3)) * 1000)
        elif m.group(4) == 'M':
            throughputs.append(float(m.group(3)))
    return sum(throughputs) / len(throughputs)


print(
    'Throughput test 1: %.2f Mbps\nSimulation time: %.2fs\n' % (
        parse_iperf_throughput(output, client),
        output.end_time - output.start_time
    )
)
print(
    'Throughput test 2: %.2f Mbps\nSimulation time: %.2fs' % (
        parse_iperf_throughput(output_2, client),
        output_2.end_time - output_2.start_time
    )
)
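The recommended two-step workflow (store first, post-process later) can be sketched with the standard `json` module alone. The record layout below is an assumption modeled on the fields used above (`start_time`, `end_time`, and per-simulator `stdout` under `sims`); the real file may contain more fields or differ in shape. The simulator name `host.client` and the helper `summarize` are made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Assumed layout with made-up values, not output from a real run.
record = {
    "start_time": 100.0,
    "end_time": 112.5,
    "sims": {"host.client": {"stdout": ["line one", "line two"]}},
}


def summarize(path):
    """Load a stored record and return (duration, stdout line counts)."""
    data = json.loads(Path(path).read_text())
    duration = data["end_time"] - data["start_time"]
    lines = {name: len(sim["stdout"]) for name, sim in data["sims"].items()}
    return duration, lines


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example-1.json"
    path.write_text(json.dumps(record))  # step 1: store the output once
    duration, lines = summarize(path)    # step 2: post-process, repeatable
```

Because step 2 only reads the file, the analysis can be changed and re-run as often as needed without touching the simulation.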
