# Functional Test 4.2.2 - MTU Test

This Jupyter notebook creates VMs on different sites and worker nodes, as required by test 4.2.2, to test jumbo frames with a 9k MTU. MTU limits apply per link, and on some links a full 9k MTU does not work. Switches in the FABRIC dataplane normally have an MTU of 9100, which leaves enough headroom for MPLS that users can send frames even larger than 9k. On several links, however, provider limitations put the MTU very close to the 9k limit:

- UTAH-GPN: 9078 (9060)
- DALL-TACC: 9080 (9062)
- LBNL-RENC: 9022 (9004)
- UKY-RENC: 9022 (9004)
- UTAH-UCSD: 9078 (9060)
- WASH-MASS: 9018 (9000)

The value in parentheses is the link MTU minus 18 octets, the minimum overhead needed for IPv4.
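Each figure in parentheses in the list above equals the link MTU minus 18 octets. A minimal sketch of that arithmetic follows; the names `LINK_MTU`, `OVERHEAD`, and `usable_mtu` are illustrative and not part of FABlib.

```python
# Link MTUs copied from the list above (octets).
LINK_MTU = {
    "UTAH-GPN": 9078,
    "DALL-TACC": 9080,
    "LBNL-RENC": 9022,
    "UKY-RENC": 9022,
    "UTAH-UCSD": 9078,
    "WASH-MASS": 9018,
}

# Overhead stated in the text for IPv4 (octets).
OVERHEAD = 18


def usable_mtu(link: str) -> int:
    """Return the link MTU minus the stated overhead."""
    return LINK_MTU[link] - OVERHEAD


for link, mtu in LINK_MTU.items():
    print(f"{link}: {mtu} ({usable_mtu(link)})")
```

Only WASH-MASS lands exactly on 9000, which is why a 9000-octet probe is the tightest case for that link.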

## Step 1: Configure the Environment

Before running this notebook, you will need to configure your environment using the [Configure Environment](../../fablib_api/configure_environment/configure_environment.ipynb) notebook. Please stop here, open and run that notebook, then return to this notebook.

**This only needs to be done once.**

## Step 2: Import the FABlib Library

This acceptance test is adapted from an experimenter-supplied example: https://learn.fabric-testbed.net/forums/topic/ipv6-on-fabric-a-hop-with-a-low-mtu/ 

In [None]:
import re
import shlex
import threading

from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager
from fabrictestbed_extensions.fablib.slice import Slice

fablib = fablib_manager()
                     
conf = fablib.show_config()

## Step 3: Check your existing slices

Check which slices you already have before you start. The cell prints nothing if you have no active slices.

In [None]:
try:
    for slice in fablib.get_slices():
        print(f"{slice}")
except Exception as e:
    print(f"Exception: {e}")

## Step 4: Create the Test Slices

Unlike most other tests, this test runs against multiple sites at once and builds a table of the MTU sizes that pass.

In [None]:
# slice name prefix
SLICE_PREFIX = 'mtu@'

# if non-empty, create slices on these sites only
SITES_ONLY = ['GATECH', 'CLEM', 'GPN']

# NIC model, 'NIC_Basic' or 'NIC_ConnectX_5' or 'NIC_ConnectX_6'
NIC_MODEL = 'NIC_Basic'

# MTUs to test
#PROBE_MTUS = [256, 1280, 1420, 1500, 8900, 8948, 9000]
PROBE_MTUS = [8900, 8948, 9000]

# raw strings keep \d from being read as an invalid string escape
re_loss = re.compile(r'([\d]+)% packet loss')
re_rtt = re.compile(r'rtt.*([.\d]+)/([.\d]+)/([.\d]+)/[.\d]+ ms')
width_mtu = 4
width_rtt = 4
width_td = width_mtu + width_rtt

# some helper functions
def list_slices() -> dict[str, Slice]:
    slices = {}
    for slice in fablib.get_slices():
        if slice.get_name().startswith(SLICE_PREFIX):
            site = slice.get_name()[len(SLICE_PREFIX):]
            slices[site] = slice
    return slices

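def process_ping_result(thread: threading.Thread) -> str:
    """Return the largest passing MTU and the worst average RTT as one table cell."""
    try:
        stdout, stderr = thread.result()
    except Exception:
        return 'ERR-CMD'.ljust(width_td)
    # expect exactly one packet-loss summary per probed MTU
    matches_loss = list(re_loss.finditer(stdout))
    if len(matches_loss) != len(PROBE_MTUS):
        return 'ERR-RE'.ljust(width_td)
    # the last MTU with 0% loss wins, so PROBE_MTUS should be in ascending order
    pass_mtu = 0
    for mtu, m_loss in zip(PROBE_MTUS, matches_loss):
        if m_loss[1] == '0':
            pass_mtu = mtu
    # group 2 of re_rtt is the average RTT in ms; -1 means no RTT line was found
    max_avg_rtt = -1
    for m_rtt in re_rtt.finditer(stdout):
        max_avg_rtt = max(max_avg_rtt, float(m_rtt[2]))
    return str(pass_mtu).ljust(width_mtu) + str(int(max_avg_rtt)).rjust(width_rtt)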
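To see what the two regular expressions extract, the sketch below applies them to a hand-written sample in the summary format printed by Linux `ping`. The sample text is synthetic, not a measurement, and the `summarize` helper is a hypothetical stand-in for the parsing part of `process_ping_result`. The patterns are the same as in the helper cell, written here as raw strings.

```python
import re

re_loss = re.compile(r'([\d]+)% packet loss')
re_rtt = re.compile(r'rtt.*([.\d]+)/([.\d]+)/([.\d]+)/[.\d]+ ms')

# Hand-written sample: the first probe passes, the second loses every packet.
SAMPLE = """\
4 packets transmitted, 4 received, 0% packet loss, time 603ms
rtt min/avg/max/mdev = 20.101/20.250/20.400/0.110 ms
4 packets transmitted, 0 received, 100% packet loss, time 620ms
"""


def summarize(stdout: str, probe_mtus: list[int]) -> tuple[int, float]:
    """Return (largest MTU with 0% loss, largest average RTT in ms)."""
    pass_mtu = 0
    for mtu, m_loss in zip(probe_mtus, re_loss.finditer(stdout)):
        if m_loss[1] == '0':
            pass_mtu = mtu
    max_avg_rtt = -1.0
    for m_rtt in re_rtt.finditer(stdout):
        # group 2 is the avg field of min/avg/max/mdev
        max_avg_rtt = max(max_avg_rtt, float(m_rtt[2]))
    return pass_mtu, max_avg_rtt


print(summarize(SAMPLE, [8948, 9000]))
```

A probe that loses every packet prints no `rtt` line, so only passing probes contribute to the RTT figure.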

Check for existing slices, then create one slice per site, named `mtu@SITE`.

In [None]:
slices = list_slices()
if len(slices) > 0:
    print(f"Found slices for sites {' '.join(slices)}")
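The `mtu@SITE` naming convention can be checked without contacting the testbed. The sketch below mirrors the prefix test in `list_slices()` on plain strings; `site_from_slice_name` and the sample names are hypothetical.

```python
from typing import Optional

SLICE_PREFIX = 'mtu@'


def site_from_slice_name(name: str) -> Optional[str]:
    """Return the site encoded in a slice name, or None for unrelated slices."""
    if name.startswith(SLICE_PREFIX):
        return name[len(SLICE_PREFIX):]
    return None


# Hypothetical slice names: two belong to this test, one does not.
for name in ['mtu@GPN', 'mtu@CLEM', 'my-other-slice']:
    print(name, '->', site_from_slice_name(name))
```

Slices whose names lack the prefix are ignored, so unrelated slices in the same project are not touched by the cleanup cell.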

If any slices were found, delete them before creating new ones.

In [None]:
for site, slice in slices.items():
    print(f'Deleting slice for site {site}')
    slice.delete()
# forget the deleted slices so that later cells only see newly created ones
slices.clear()

Now create the slices.

In [None]:
# submit slices
for site in SITES_ONLY:
    slice = fablib.new_slice(name=SLICE_PREFIX+site)
    node = slice.add_node(name='node', site=site, cores=1, ram=2, disk=10, image='default_ubuntu_22')
    intfs = node.add_component(
        model=NIC_MODEL, name='nic0').get_interfaces()
    if len(intfs) < 2:
        intfs += node.add_component(model=NIC_MODEL, name='nic1').get_interfaces()
    intf4, intf6 = intfs[:2]
    slice.add_l3network(name='net4', interfaces=[intf4], type='IPv4')
    slice.add_l3network(name='net6', interfaces=[intf6], type='IPv6')
    print(f'Creating slice for site {site}')
    try:
        slice.submit(wait=False)
        slices[site] = slice
    except Exception as e:
        print(e)

Wait for slices to come up.

In [None]:
failed_sites = []
addrs: dict[str, dict[int, str]] = {}
for site, slice in slices.items():
    print(f'Waiting for site {site}')
    if slice.get_state() not in ['StableOK', 'ModifyOK']:
        try:
            slice.wait()
            slice.update()
        except Exception as e:
            print(f'Error in slice for site {site}')
            print(e)
            failed_sites.append(site)
            continue

print('The following sites failed:')
for site in failed_sites:
    print(f'{site}')
    del slices[site]

print('The following sites succeeded:')
# complete configuration for successful slices
for site, slice in slices.items():
    print(f'Site {site} slice {slice.get_name()} in state {slice.get_state()}')
    slice.wait_ssh(progress=False)
    slice.post_boot_config()

Start the network configuration by allocating one IPv4 and one IPv6 address for each site.

In [None]:
for site, slice in slices.items():
    node = slice.get_node('node')
    [ip4addr] = slice.get_l3network('net4').get_available_ips(count=1)
    [ip6addr] = slice.get_l3network('net6').get_available_ips(count=1)
    addrs[site] = {4: str(ip4addr), 6: str(ip6addr)}
    print(f'{site} is ready, mgmt {node.get_management_ip()}, IPv4 {ip4addr}, IPv6 {ip6addr}')

## Step 5: Create interface configurations

In [None]:
print('Applying IP configs for IPv4 and IPv6')
execute_threads = {}
for site, slice in slices.items():
    print(f'Site {site}')
    node = slice.get_node('node')
    cmds: list[str] = []
    for af in [4, 6]:
        intf = node.get_interface(network_name=f'net{af}')
        devname = intf.get_os_interface()
        addr = addrs[site][af]
        net = intf.get_network()
        cmds += [
            f'sudo ip link set {shlex.quote(devname)} up',
            f'sudo ip link set {shlex.quote(devname)} mtu 9000',
            f'sudo ip -{af} addr flush dev {shlex.quote(devname)}',
            f'sudo ip -{af} addr add {shlex.quote(addr)}/{net.get_subnet().prefixlen} dev {shlex.quote(devname)}'
        ]
        for dst in slices:
            if dst != site:
                cmds.append(
                    f'sudo ip -{af} route replace {addrs[dst][af]} via {net.get_gateway()}')
    execute_threads[site] = node.execute_thread('\n'.join(cmds))

for site, thread in execute_threads.items():
    stdout, stderr = thread.result()
    if stderr != '':
        print(f'IP config for {site} error:\n{stderr}')
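To inspect the per-interface commands before running them on a node, the same strings can be built offline. This is a sketch only: `build_ip_cmds` is a hypothetical helper, and the device name and address are placeholders (the address comes from the documentation range 192.0.2.0/24), not values from a real slice.

```python
import shlex


def build_ip_cmds(devname: str, addr: str, prefixlen: int, af: int,
                  mtu: int = 9000) -> list[str]:
    """Build the link and address commands for one interface and address family."""
    dev = shlex.quote(devname)
    return [
        f'sudo ip link set {dev} up',
        f'sudo ip link set {dev} mtu {mtu}',
        f'sudo ip -{af} addr flush dev {dev}',
        f'sudo ip -{af} addr add {shlex.quote(addr)}/{prefixlen} dev {dev}',
    ]


# Placeholder device name and address, for illustration only.
for cmd in build_ip_cmds('enp7s0', '192.0.2.10', 24, af=4):
    print(cmd)
```

Quoting each value with `shlex.quote` keeps an unexpected device name or address from being interpreted by the remote shell.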

## Step 6: Run ping tests with different MTUs and print results

In [None]:
for af, overhead in {4: 28, 6: 48}.items():
    print('')
    print(f'IPv{af} ping max MTU and max RTT')
    print('src\\dst'.ljust(width_td), end='')
    for dst in slices:
        print(' | ' + dst.center(width_td), end='')
    print('')
    print('-'*(width_td+1) + ('|'+'-'*(width_td+2))*len(slices))
    for src in slices:
        node = slices[src].get_node('node')
        execute_threads = {}
        for dst in slices:
            execute_threads[dst] = node.execute_thread('\n'.join([
                f'ping -I {shlex.quote(addrs[src][af])} -c 4 -i 0.2 -W 0.8 -M do -s {mtu-overhead} {shlex.quote(addrs[dst][af])}'
                for mtu in PROBE_MTUS
            ]))
        print(src.ljust(width_td), end='')
        for dst in slices:
            print(' | ' + process_ping_result(execute_threads[dst]), end='')
        print('')
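The `-s` option of `ping` sets the ICMP payload size, not the packet size, so the cell above subtracts the header overhead from each MTU: 20 octets of IPv4 header plus 8 of ICMP echo header (28), or 40 octets of IPv6 header plus 8 of ICMPv6 echo header (48). A minimal sketch of that arithmetic, assuming no IPv4 options or IPv6 extension headers; the helper name `ping_payload_size` is illustrative.

```python
# Header sizes in octets, assuming no IPv4 options or IPv6 extension headers.
IP_HEADER = {4: 20, 6: 40}
ICMP_ECHO_HEADER = 8

PROBE_MTUS = [8900, 8948, 9000]


def ping_payload_size(mtu: int, af: int) -> int:
    """Return the value to pass to `ping -s` so the IP packet is exactly `mtu` octets."""
    return mtu - IP_HEADER[af] - ICMP_ECHO_HEADER


for af in (4, 6):
    sizes = [ping_payload_size(mtu, af) for mtu in PROBE_MTUS]
    print(f'IPv{af}: {sizes}')
```

Combined with `-M do`, which forbids fragmentation, a probe only succeeds if every hop on the path can carry a packet of the full probed size.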


## Step 7: Delete the Slices

Please delete your slices when you are done.

In [None]:
try:
    for site, slice in slices.items():
        print(f'Deleting slice in site {site}')
        slice.delete()
except Exception as e:
    print(f"Exception: {e}")