# EJFAT LB Control Plane Tester

This notebook stands up a slice of 3 nodes - sender, receiver and cpnode. The control plane daemon is deployed on the `cpnode` node in a 'mock' configuration (with no FPGA). DAQ and worker node code can be deployed on `sender` and `receiver` nodes for testing. The slice uses 'shared' and is created within a single FABRIC site for simplicity. It uses a single L2 bridge connection with RFC1918 IPv4 addressing, allowing all nodes to talk to each other. It is possible to run the dataplane assuming a single worker can keep up with a single sender since no actual load balancer is present in this configuration.

This notebook uses the E2SAR release debian pacakges found at [Github releases](https://github.com/JeffersonLab/E2SAR/releases). We need this release to create our shared JNI native library

Slice example:

<div>
    <img src="figs/UDP LB Control Plane Testing slice.png" width=500>
</div>

## Preparation and overview

- Be sure to [generate a keypair for Jupyter Hub](GitHubSSH.ipynb) and register it with GitHub - the keys will be used to check out the code from private repositories, like [UDPLBd](https://github.com/esnet/udplbd) and [E2SAR](https://github.com/JeffersonLab/E2SAR).
- Note that for E2SAR development and testing sender and receiver node compile/build environments will be setup via post-boot scripts ([sender](post-boot/sender.sh) and [receiver](post-boot/recver.sh))
- The E2SAR [debian/rpm package]((https://github.com/JeffersonLab/E2SAR/releases)) comes with all necessary dependencies.
- This does not setup the control plane node for anything, but testing a specific version - you can set which branch of UDPLBd to check out and a containerized version is built and stood up.

# EJFAT LB Control Plane Tester

This notebook stands up a slice of 3 nodes - sender, receiver and cpnode. The control plane daemon is deployed on the `cpnode` node in a 'mock' configuration (with no FPGA). DAQ and worker node code can be deployed on `sender` and `receiver` nodes for testing. The slice uses 'shared' and is created within a single FABRIC site for simplicity. It uses a single L2 bridge connection with RFC1918 IPv4 addressing, allowing all nodes to talk to each other. It is possible to run the dataplane assuming a single worker can keep up with a single sender since no actual load balancer is present in this configuration.

This notebook uses the E2SAR release debian pacakges found at [Github releases](https://github.com/JeffersonLab/E2SAR/releases). If you want to build E2SAR from scratch, you can use the [E2SAR Dev tester](E2SAR-development-tester.ipynb) notebook.

Slice example:

<div>
    <img src="figs/UDP LB Control Plane Testing slice.png" width=500>
</div>

## Preparation and overview

- Be sure to [generate a keypair for Jupyter Hub](https://github.com/JeffersonLab/E2SAR/blob/main/scripts/notebooks/EJFAT/GitHubSSH.ipynb) and register it with GitHub - the keys will be used to check out the code from private repositories, like [UDPLBd](https://github.com/esnet/udplbd) and [E2SAR](https://github.com/JeffersonLab/E2SAR).
- Note that for E2SAR development and testing sender and receiver node compile/build environments will be setup via post-boot scripts ([sender](post-boot/sender.sh) and [receiver](post-boot/recver.sh))
- The E2SAR [debian package]((https://github.com/JeffersonLab/E2SAR/releases)) comes with all necessary dependencies.
- This does not setup the control plane node for anything, but testing a specific version - you can set which branch of UDPLBd to check out and a containerized version is built and stood up.

## Preamble

This cell must be executed whether you are creating a new slice or continuing work on the old one. If you are continuing work, you then skip the slice create section and proceed to wherever you left off.

In [16]:
#
# EDIT THIS
#
# if you want to force a site instead of using random
# Pick 'UCSD', 'SRI', 'FIU' or 'TOKY' - these sites have
# IPv4. Other sites use IPv6 management and have trouble
# retrieving git-lfs artifacts.

 ## NEED TO USE IPV4 HERE AS MAVEN CENTRAL REPO DOES NOT WORK WITH IPV6. THERE IS ALSO AN ISSUE WHERE GRADLE CANNOT BE DOWNLOADED ON IPV6

site_override = 'SRI'
#site_override = None

# GitHub SSH key file (private) registered using the GitHubSSH.ipynb notebook referenced above
github_key = '/home/fabric/work/fabric_config/github_ecdsa'
signing_key = '/home/fabric/work/fabric_config/signing'

# branches for UDPLBd and E2SAR that we want checked out on the VMs
udplbd_branch = 'main'
e2sar_branch = 'e2sar-java' #there is no branch called release, this just keeps this notebooks slice separate.

# which of the available config files to use with UDPLBd
udplbd_config = 'lb_mock-tls.yml'

#base distro type - either default or docker
distro_types = ['default','docker']
distro_type = distro_types[0]

# base distro 'ubuntu' or 'rocky'
distro_name = 'ubuntu'

#base distro version, currently only for ubuntu 20,22,24. E2SAR dependencies will be 
#downloaded for the appropriate versions.
distro_version = '22'

# note that the below is distribution specific ('ubuntu' for ubuntu and so on)
home_location = {
    'ubuntu': '/home/ubuntu',
    'rocky' : '/home/rocky'
}[distro_name]

vm_key_location = f'{home_location}/.ssh/github_ecdsa'
sign_key_location = f'{home_location}/.ssh/signing'

# which test suites in E2SAR to run (leave empty to run all)
# you can set 'unit' or 'live' to run unit or live tests only
e2sar_test_suite = ''

# name of the network connecting the nodes
net_name = 'site_bridge_net'

# url of e2sar deb. Find the appropriate version for the OS at https://github.com/JeffersonLab/E2SAR/releases
static_release_url = 'https://github.com/JeffersonLab/E2SAR/releases/download/' # don't need to change this
e2sar_release_ver = 'E2SAR-main-0.1.5'
e2sar_release_artifact = "e2sar_0.1.5_amd64.deb"
e2sar_release_url = static_release_url + e2sar_release_ver + "-" + distro_name + "-" + distro_version + ".04/" + e2sar_release_artifact

#
# SHOULDN'T NEED TO EDIT BELOW
#
# Preamble
import json
from datetime import datetime
from datetime import timezone
from datetime import timedelta

from fabrictestbed_extensions.fablib.fablib import FablibManager as fablib_manager

from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network
import ipaddress

fablib = fablib_manager()             
fablib.show_config();

# Using docker image for cpnode by default
distro_image = distro_type + '_' + distro_name + '_' + distro_version
cp_distro_image = distro_types[1] + '_' + distro_name + '_' + distro_version

# variable settings
slice_name = f'UDP LB Control Plane Testing with udplbd[{udplbd_branch}], e2sar[{e2sar_branch}] on {distro_name}'
slice_name = "UDP LB Control Plane Testing with udplbd[develop], e2sar[e2sar-java] on ubuntu"

# for each node specify IP address (assuming /24), OS image
# note that most of the keys in these dictionaries map directly
# onto parameters to add_node()
node_config = {
    'sender': {
        'ip':'192.168.0.1', 
        'image': distro_image,
        'cores': 8,
        'ram': 24,
        'disk': 100 },
    'recver': {
        'ip':'192.168.0.2', 
        'image':distro_image,
        'cores':8,
        'ram': 24,
        'disk': 100 },
    'cpnode': {
        'ip':'192.168.0.3', 
        'image':distro_image,
        'cores':8,
        'ram': 8,
        'disk': 100 },
}
# skip these keys as they are not part of add_node params
skip_keys = ['ip']
# this is the NIC to use
nic_model = 'NIC_Basic'
# the subnet should match IPs
subnet = IPv4Network("192.168.1.0/24")

def execute_single_node(node, commands):
    for command in commands:
        print(f'\tExecuting "{command}" on node {node.get_name()}')
        #stdout, stderr = node.execute(command, quiet=True, output_file=node.get_name() + '_install.log')
        stdout, stderr = node.execute(command)
    if not stderr and len(stderr) > 0:
        print(f'Error encountered with "{command}": {stderr}')
        
def execute_commands(node, commands):
    if isinstance(node, list):
        for n in node:
            execute_single_node(n, commands)
    else:
        execute_single_node(node, commands)

def get_management_os_interface(node) -> str or None:
        """
        Gets the name of the management interface used by the node's
        operating system. 

        :return: interface name
        :rtype: String
        """
        stdout, stderr = node.execute("sudo ip -j route list", quiet=True)
        stdout_json = json.loads(stdout)

        for i in stdout_json:
            if i["dst"] == "default":
                return i["dev"]

        stdout, stderr = node.execute("sudo ip -6 -j route list", quiet=True)
        stdout_json = json.loads(stdout)

        for i in stdout_json:
            if i["dst"] == "default":
                return i["dev"]

        return None

0,1
Orchestrator,orchestrator.fabric-testbed.net
Credential Manager,cm.fabric-testbed.net
Core API,uis.fabric-testbed.net
Token File,/home/fabric/.tokens.json
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
Bastion Host,bastion.fabric-testbed.net
Bastion Username,srinivas_0000202712
Bastion Private Key File,/home/fabric/work/fabric_config/fabric-bastion-key
Slice Public Key File,/home/fabric/work/fabric_config/slice_key.pub
Slice Private Key File,/home/fabric/work/fabric_config/slice_key


## Create the slice

In [2]:
# list all slices I have running
output_dataframe = fablib.list_slices(output='pandas')
if output_dataframe:
    print(output_dataframe)
else:
    print('No active slices under this project')

ID,Name,Lease Expiration (UTC),Lease Start (UTC),Project ID,State
5210f023-b755-498a-ba3a-956056527da8,4-node U280 LB Tester Slice using ubuntu22 1,2025-02-24 01:30:08 +0000,2025-01-16 21:08:36 +0000,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca,StableOK


<pandas.io.formats.style.Styler object at 0x7a3b84f11d10>


If your slice is already active you can skip to the 'Get Slice Details' section.

In [3]:
# List available images (this step is optional)
available_images = fablib.get_image_names()

print(f'Available images are: {available_images}')

Available images are: ['default_centos8_stream', 'default_centos9_stream', 'default_centos_7', 'default_centos_8', 'default_debian_11', 'default_debian_12', 'default_fedora_39', 'default_fedora_40', 'default_freebsd_13_zfs', 'default_freebsd_14_zfs', 'default_kali', 'default_openbsd_7', 'default_rocky_8', 'default_rocky_9', 'default_ubuntu_20', 'default_ubuntu_22', 'default_ubuntu_24', 'docker_rocky_8', 'docker_rocky_9', 'docker_ubuntu_20', 'docker_ubuntu_22']


In [4]:
# find an available site in continental US
lon_west=-124.3993243
lon_east=-69.9721573

# getting a random site make take a bit of time
if not site_override:
    selected_site = fablib.get_random_site(filter_function=lambda x: x['location'][1] < lon_east
                                              and x['location'][1] > lon_west) 
else:
    selected_site = site_override

if selected_site:
    print(f'Selected site is {selected_site}')
else:
    print('Unable to find a site matching the requirements')

# write selected site into node attributes
for n in node_config:
    node_config[n]['site'] = selected_site
    

Selected site is SRI


In [5]:
# build a slice
slice = fablib.new_slice(name=slice_name)

# create a network
net1 = slice.add_l2network(name=net_name, subnet=subnet)

nodes = dict()
# create  nodes for sending and receiving with a selected network card
# use subnet address assignment
for node_name, node_attribs in node_config.items():
    print(f"{node_name=} {node_attribs['ip']}")
    nodes[node_name] = slice.add_node(name=node_name, **{x: node_attribs[x] for x in node_attribs if x not in skip_keys})
    nic_interface = nodes[node_name].add_component(model=nic_model, name='_'.join([node_name, nic_model, 'nic'])).get_interfaces()[0]
    net1.add_interface(nic_interface)
    nic_interface.set_mode('config')
    nic_interface.set_ip_addr(node_attribs['ip'])
    # postboot configuration is under 'post-boot' directory
    nodes[node_name].add_post_boot_upload_directory('post-boot','.')
    nodes[node_name].add_post_boot_execute(f'chmod +x post-boot/{node_name}.sh && ./post-boot/{node_name}.sh')

print(f'Creating a {distro_name} based slice named "{slice_name}" with nodes in {selected_site}')

# Submit the slice
slice.submit();


Retry: 8, Time: 305 sec


0,1
ID,8a1bc512-463e-4e7e-963b-c7469c6b5dd8
Name,"UDP LB Control Plane Testing with udplbd[develop], e2sar[e2sar-java] on ubuntu"
Lease Expiration (UTC),2025-02-13 20:22:31 +0000
Lease Start (UTC),2025-02-12 20:22:31 +0000
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
State,StableOK


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
1390ba42-d8d1-4f00-95b7-4a2d7e494389,cpnode,8,8,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.229,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.229,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
24f8eaf7-5144-41b9-ad8c-62fba5e1216b,recver,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.55,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.55,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
0f803535-e18b-4d6c-be91-df2d92edeba0,sender,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.24,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.24,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
3a274596-f078-49f8-ae23-13c94fb26480,site_bridge_net,L2,L2Bridge,SRI,192.168.1.0/24,,Active,


Name,Short Name,Node,Network,Bandwidth,Mode,VLAN,MAC,Physical Device,Device,IP Address,Numa Node,Switch Port
sender-sender_NIC_Basic_nic-p1,p1,sender,site_bridge_net,100,config,,06:C9:AC:19:B1:E0,enp7s0,enp7s0,192.168.0.1,6,HundredGigE0/0/0/5
recver-recver_NIC_Basic_nic-p1,p1,recver,site_bridge_net,100,config,,06:F5:D5:C8:1C:BE,enp7s0,enp7s0,192.168.0.2,6,HundredGigE0/0/0/5
cpnode-cpnode_NIC_Basic_nic-p1,p1,cpnode,site_bridge_net,100,config,,12:BA:FF:01:AE:FE,enp7s0,enp7s0,192.168.0.3,6,HundredGigE0/0/0/5



Time to print interfaces 305 seconds


## Get Slice Details

If not creating a new slice, and just continuing work on an existing one, execute this cell (in addition to the preamble) and then any of the cells below will work.

In [3]:
# get slice details (if not creating new)
slice = fablib.get_slice(name=slice_name)
a = slice.show()
nets = slice.list_networks()
nodes = slice.list_nodes()

cpnode = slice.get_node(name="cpnode")    
sender = slice.get_node(name="sender")
recver = slice.get_node(name="recver")


# get node dataplane addresses
cpnode_addr = cpnode.get_interface(network_name=net_name).get_ip_addr()
sender_addr = sender.get_interface(network_name=net_name).get_ip_addr()
recver_addr = recver.get_interface(network_name=net_name).get_ip_addr()

sender_iface = sender.get_interface(network_name=net_name)
recver_iface = recver.get_interface(network_name=net_name)

0,1
ID,8a1bc512-463e-4e7e-963b-c7469c6b5dd8
Name,"UDP LB Control Plane Testing with udplbd[develop], e2sar[e2sar-java] on ubuntu"
Lease Expiration (UTC),2025-02-25 22:37:15 +0000
Lease Start (UTC),2025-02-12 20:22:31 +0000
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
State,StableOK


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
3a274596-f078-49f8-ae23-13c94fb26480,site_bridge_net,L2,L2Bridge,SRI,192.168.1.0/24,,Active,


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
1390ba42-d8d1-4f00-95b7-4a2d7e494389,cpnode,8,8,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.229,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.229,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
24f8eaf7-5144-41b9-ad8c-62fba5e1216b,recver,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.55,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.55,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
0f803535-e18b-4d6c-be91-df2d92edeba0,sender,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.24,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.24,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


## Start the UDPLBd container

In [21]:
# check if any dockers are running already and that we have compose and buildx installed by post-boot script
commands = [
    'docker container ls',
    'docker compose version',
    'docker buildx version'
]
execute_commands(cpnode, commands)

	Executing "docker container ls" on node cpnode
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
	Executing "docker compose version" on node cpnode
Docker Compose version v2.27.0
	Executing "docker buildx version" on node cpnode
github.com/docker/buildx v0.14.0 171fcbeb69d67c90ba7f44f41a9e418f6a6ec1da


In [58]:
# upload the mock config file for UDPLBd 
result = cpnode.upload_file(f'config/{udplbd_config}','lb_mock.yml')

# upload the GitHub SSH key onto the VM
result = cpnode.upload_file(github_key, vm_key_location)

# checkout UDPLBd (including the right branch) using that key
commands = [
    f"chmod go-rwx {vm_key_location}",
    f"GIT_SSH_COMMAND='ssh -i {vm_key_location} -o IdentitiesOnly=yes -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' git clone -b {udplbd_branch} git@github.com:esnet/udplbd.git",
]

execute_commands(cpnode, commands)

	Executing "chmod go-rwx /home/ubuntu/.ssh/github_ecdsa" on node cpnode
	Executing "GIT_SSH_COMMAND='ssh -i /home/ubuntu/.ssh/github_ecdsa -o IdentitiesOnly=yes -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' git clone -b main git@github.com:esnet/udplbd.git" on node cpnode
[31m fatal: destination path 'udplbd' already exists and is not an empty directory.
 [0m

In [11]:
# copy configuration file into place, generate self-signed cert and start the UDPLBd container
commands = [
    f'cp lb_mock.yml ./udplbd/etc/config.yml',
    f'openssl req -x509 -newkey rsa:4096 -keyout udplbd/etc/server_key.pem -out udplbd/etc/server_cert.pem -sha256 -days 365 -nodes -subj "/CN=cpnode/subjectAltName=IP:{cpnode_addr}" -nodes',
    f'cd udplbd; docker compose up -d'
]

execute_commands(cpnode, commands)

	Executing "cp lb_mock.yml ./udplbd/etc/config.yml" on node cpnode
	Executing "openssl req -x509 -newkey rsa:4096 -keyout udplbd/etc/server_key.pem -out udplbd/etc/server_cert.pem -sha256 -days 365 -nodes -subj "/CN=cpnode/subjectAltName=IP:192.168.0.3" -nodes" on node cpnode
[31m .............................+...+.....+...+.+......+...+......+......+.....+.......+..+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*.......+....+...........+.......+...+...+.....+...............+.........+...+.......+..+.+...+.........+..+....+...........+....+.....+.+...+.....+......+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*....+....+........+.......+..+......+....+...+........+.........+.+...+........................+..+.........+.......+...+.................+.......+.....+.+..+..........+..+.+...........+...+......+...............+......+.+........+.+.....+.....................+..................+.+.......................+.+............+...+.....+............+.

In [None]:
# check the logs
commands = [
    'docker compose ls',
    'cd udplbd; docker compose logs'
]

execute_commands(cpnode, commands)

In [10]:
# if you need to restart it, this is the stop part
commands = [
    'cd udplbd; docker compose stop; docker compose rm -f; docker image rm udplbd'
]

execute_commands(cpnode, commands)

	Executing "cd udplbd; docker compose stop; docker compose rm -f; docker image rm udplbd" on node cpnode
Going to remove udplbd-udplbd-1
 Container udplbd-udplbd-1  Stopping
 Container udplbd-udplbd-1  Stopped
 Container udplbd-udplbd-1  Removing
 [0mUntagged: udplbd:latest
Deleted: sha256:bc7f1561cecbf2122fb35eee09b9b96e0c84b6d22231662458fa55ce7fef3ba7
[31m  Container udplbd-udplbd-1  Removed
 [0m

## Download and install E2SAR deb

In [20]:
# install github ssh key and set up build environment variables for interactive logins
commands = [
    f"chmod go-rwx {vm_key_location}",
    f"echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.profile",
    f"echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.bashrc",
]

for node in [sender, recver]:    
    # upload the GitHub SSH key onto the VM
    result = node.upload_file(github_key, vm_key_location)
    execute_commands(node, commands)

	Executing "chmod go-rwx /home/ubuntu/.ssh/github_ecdsa" on node sender
	Executing "echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.profile" on node sender
	Executing "echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.bashrc" on node sender
	Executing "chmod go-rwx /home/ubuntu/.ssh/github_ecdsa" on node recver
	Executing "echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.profile" on node recver
	Executing "echo 'export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64' >> ~/.bashrc" on node recver


In [21]:
##This block is to upload a commit sign key. Not needed 

# commands = [
#     f"chmod go-rwx {sign_key_location}",
#     f"git config --global gpg.format ssh",
#     f"git config --global user.signingkey {sign_key_location}",
# ]
# for node in [sender, recver]:    
#     # upload the GitHub Signing SSH key onto the VM
#     result = node.upload_file(signing_key, sign_key_location)
#     print(result)
#     execute_commands(node, commands)

-rw-------   1 1000     1000          419 14 Feb 01:47 ?
	Executing "chmod go-rwx /home/ubuntu/.ssh/signing" on node sender
	Executing "git config --global gpg.format ssh" on node sender
	Executing "git config --global user.signingkey /home/ubuntu/.ssh/signing" on node sender
-rw-------   1 1000     1000          419 14 Feb 01:47 ?
	Executing "chmod go-rwx /home/ubuntu/.ssh/signing" on node recver
	Executing "git config --global gpg.format ssh" on node recver
	Executing "git config --global user.signingkey /home/ubuntu/.ssh/signing" on node recver


In [None]:
#download boost and grpc dependencies from releases
commands = [
    f"wget -q -O e2sar-release.deb {e2sar_release_url}",
    f"sudo apt -yq install ./e2sar-release.deb",
]
 
execute_commands([sender, recver], commands)

## Download and build E2SAR-JAVA

In [11]:
#Clone E2SAR-java repo and set enviroment variables
commands = [
    f"GIT_SSH_COMMAND='ssh -i {vm_key_location} -o IdentitiesOnly=yes -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' git clone git@github.com:JeffersonLab/e2sar-java.git",
]
 
execute_commands([sender, recver], commands)

	Executing "GIT_SSH_COMMAND='ssh -i /home/ubuntu/.ssh/github_ecdsa -o IdentitiesOnly=yes -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' git clone git@github.com:JeffersonLab/e2sar-java.git" on node sender
[31m Cloning into 'e2sar-java'...
 [0m	Executing "GIT_SSH_COMMAND='ssh -i /home/ubuntu/.ssh/github_ecdsa -o IdentitiesOnly=yes -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' git clone git@github.com:JeffersonLab/e2sar-java.git" on node recver
[31m Cloning into 'e2sar-java'...
 [0m

In [12]:
#set JAVA_HOME enviroment variables
java_home = '/usr/lib/jvm/java-1.17.0-openjdk-amd64' ## probably have to change this depending on OS. Tested on Ubuntu 22.04
commands = [
    f"echo 'export JAVA_HOME={java_home}' >> ~/.profile",
    f"echo 'export JAVA_HOME={java_home}' >> ~/.bashrc",
]
 
execute_commands([sender, recver], commands)

	Executing "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.17.0-openjdk-amd64' >> ~/.profile" on node sender
	Executing "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.17.0-openjdk-amd64' >> ~/.bashrc" on node sender
	Executing "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.17.0-openjdk-amd64' >> ~/.profile" on node recver
	Executing "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.17.0-openjdk-amd64' >> ~/.bashrc" on node recver


In [None]:
#Building the JNI Shared library
#Might have to set PKG_CONFIG_PATH for cmake build to work. Works without pkg_config_path on 22.04
commands = [
    f"cd e2sar-java; JAVA_HOME={java_home} cmake -S . -B build",
    f"cd e2sar-java; JAVA_HOME={java_home} cmake --build build",
]

execute_commands([sender, recver], commands)

In [19]:
#Compiling Java code
commands = [
    f"cd e2sar-java; mvn clean compile",
]

execute_commands([sender, recver], commands)

	Executing "cd e2sar-java; mvn clean compile" on node sender
[[1;34mINFO[m] Scanning for projects...
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m----------------------< [0;36morg.jlab.hpdf:e2sar-java[0;1m >----------------------[m
[[1;34mINFO[m] [1mBuilding e2sar-java 0.0.1[m
[[1;34mINFO[m] [1m--------------------------------[ jar ]---------------------------------[m
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-clean-plugin:2.5:clean[m [1m(default-clean)[m @ [36me2sar-java[0;1m ---[m
[[1;34mINFO[m] Deleting /home/ubuntu/e2sar-java/target
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-resources-plugin:2.6:resources[m [1m(default-resources)[m @ [36me2sar-java[0;1m ---[m
[[1;34mINFO[m] Using 'UTF-8' encoding to copy filtered resources.
[[1;34mINFO[m] skip non existing resourceDirectory /home/ubuntu/e2sar-java/src/main/resources
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-compiler-plugin:3.8.1:compile[m [1m(default-compile)[m @ 

## Running Unit tests

In [20]:
##IMPORTANT MAVEN ARGUMENTS NEEDED FOR TESTS
# Need to specify -Djava.library.path for linking with the jnie2sar.so built in the step above. Maven surefire tests do not pick this up directly so it has to be encapsulated as -DargLine='-Djava.library.path=build/'
# -Dtest="" is needed to specify class/package of tests with pattern. If this is not specified then all classes with *TEST* will be run by maven surefire

commands = [
    f"cd e2sar-java; LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64 mvn -DargLine='-Djava.library.path=build/' test -Dtest='org.jlab.hpdf.unit.**'",
]

execute_commands([sender, recver], commands)

	Executing "cd e2sar-java; LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64 mvn -DargLine='-Djava.library.path=build/' test -Dtest='org.jlab.hpdf.unit.**'" on node sender
[[1;34mINFO[m] Scanning for projects...
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m----------------------< [0;36morg.jlab.hpdf:e2sar-java[0;1m >----------------------[m
[[1;34mINFO[m] [1mBuilding e2sar-java 0.0.1[m
[[1;34mINFO[m] [1m--------------------------------[ jar ]---------------------------------[m
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-resources-plugin:2.6:resources[m [1m(default-resources)[m @ [36me2sar-java[0;1m ---[m
[[1;34mINFO[m] Using 'UTF-8' encoding to copy filtered resources.
[[1;34mINFO[m] skip non existing resourceDirectory /home/ubuntu/e2sar-java/src/main/resources
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-compiler-plugin:3.8.1:compile[m [1m(default-compile)[m @ [36me2sar-java[0;1m ---[m
[[1;34mINFO[m] Nothing to compile - all classes are up

## Running Live test

In [13]:
##IMPORTANT MAVEN ARGUMENTS NEEDED FOR TESTS
# Need to specify -Djava.library.path for linking with the jnie2sar.so built in the step above. Maven surefire tests do not pick this up directly so it has to be encapsulated as -DargLine='-Djava.library.path=build/'
# -Dtest="" is needed to specify class/package of tests with pattern. If this is not specified then all classes with *TEST* will be run by maven surefire

commands = [
    f"cd e2sar-java; EJFAT_URI='ejfats://udplbd@{cpnode_addr}:18347/lb/1?data=127.0.0.1&sync=192.168.88.199:1234' LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64 mvn -DargLine='-Djava.library.path=build/' test -Dtest='org.jlab.hpdf.live.**'",
]

execute_commands(sender, commands)

	Executing "cd e2sar-java; EJFAT_URI='ejfats://udplbd@192.168.0.3:18347/lb/1?data=127.0.0.1&sync=192.168.88.199:1234' LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64 mvn -DargLine='-Djava.library.path=build/' test -Dtest='org.jlab.hpdf.live.**'" on node sender
[[1;34mINFO[m] Scanning for projects...
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m----------------------< [0;36morg.jlab.hpdf:e2sar-java[0;1m >----------------------[m
[[1;34mINFO[m] [1mBuilding e2sar-java 0.0.1[m
[[1;34mINFO[m] [1m--------------------------------[ jar ]---------------------------------[m
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-resources-plugin:2.6:resources[m [1m(default-resources)[m @ [36me2sar-java[0;1m ---[m
[[1;34mINFO[m] Using 'UTF-8' encoding to copy filtered resources.
[[1;34mINFO[m] skip non existing resourceDirectory /home/ubuntu/e2sar-java/src/main/resources
[[1;34mINFO[m] 
[[1;34mINFO[m] [1m--- [0;32mmaven-compiler-plugin:3.8.1:compile[m [1m(default-compile)

## Testing LBmon
The following blocks are to install lbadm and lbmon in the default location ($HOME/e2sar-install/) and a simple test to reserve a LB from the sender and verify that the Lbmon works. (If this block is run later the LBid might be different, you would have to change the parameter in LBmon)

If the Lbid is specified, the status of that reserved LB will be obtained, otherwise the Lb overview will be obtained. The appropriate instance/admin token must be given. For the status you need to copy the EJFAT_URI given by the Lbadm reserve operation to Lbmon. 

In [None]:
commands = [
    f"EJFAT_URI='ejfats://udplbd@{cpnode_addr}:18347?sync=192.168.100.10:19020&data=192.168.101.10:18020' LD_LIBRARY_PATH=/usr/local/lib lbadm --reserve -e -v -6 -l myLib -d 01 -a 192.168.0.3",
]
execute_commands(sender,commands)

In [None]:
#using option e to suppress messages
commands = [
    f"EJFAT_URI='ejfats://udplbd@{cpnode_addr}:18347?sync=192.168.100.10:19020&data=192.168.101.10:18020' BOOST_ROOT=/usr/local/ LD_LIBRARY_PATH=/usr/local/lib lbadm --reserve -e -v -6 -l myLib -d 01 -a 192.168.0.3",
]
execute_commands(sender,commands)

In [None]:
#need to replace the EJFAT_URI created in the last step
ejfat_uri = "ejfats://999e421bf36382c8cb07f1ac3a355afccf9ed0e9e0e0d1e91947bc21d57170e6@192.168.0.3:18347/lb/1?sync=192.168.0.3:19531&data=192.0.2.1&data=[2001:db8::1]"
commands = [
    f"EJFAT_URI='{ejfat_uri}' BOOST_ROOT=/usr/local/ LD_LIBRARY_PATH=/usr/local/lib lbmon -v -6"
]
execute_commands(sender,commands)

In [None]:
#No need to replace uri, because we are using admin token and getting the overview of the LB
commands = [
    f"EJFAT_URI='ejfats://udplbd@192.168.0.3:18347' BOOST_ROOT=/usr/local/ LD_LIBRARY_PATH=/usr/local/lib lbmon -v -6"
]
execute_commands(sender,commands)

# Performance Testing

We use `e2sar_perf` program located under bin/ to test performance of segmenting and reassembly code. To reach higher rates we must update some system parameters first.

Set up large socket buffers for receive and send (512M), 9k MTU on both sender and receiver and test that it worked. For rates over 1Gbps this is a must.

In [None]:
# set system-wide send and receive socket buffer limits to 512MB. e2sar_perf then will set SO_RCVBUF and SO_SNDBUF options on sending and receiving sockets
# this is system specific, so we don't do it through a file, but on command line. Normally this goes into /etc/sysctl.conf or /etc/sysctl.d/90-local.conf 
# or similar
commands = [
    f"sudo sysctl net.core.rmem_max=536870912",
    f"sudo sysctl net.core.wmem_max=536870912",
    f"sysctl net.core.wmem_max net.core.rmem_max"
]
execute_commands([sender, recver], commands);

In [None]:
# note  that in this slice we are guaranteed to have path MTU to be at least 9k, because FABRIC
# switches are configured for jumbo frames. In real life you need to consult your network administrator
# as simply setting MTU on sender and receiver may be insufficient.
mtu = '9000'
sender.execute(f"sudo ip link set dev {sender_iface.get_os_interface()} mtu {mtu}")
recver.execute(f"sudo ip link set dev {recver_iface.get_os_interface()} mtu {mtu}")

# test with no-defragment (DF=1) ping packets that path indeed supports MTU of 9000 
# (ping  packet  of 8972 payload length)
# send 10 packets and expect all of them to make it
stdout, stderr = sender.execute(f"sudo ping -f -s 8972 -c 10 -M do {recver_addr}")

In [None]:
# We need to setup the firewall to allow traffic to pass to the receiver

mgmt_iface_name = get_management_os_interface(recver)
data_iface = recver.get_interface(network_name=net_name)
data_iface_name = data_iface.get_os_interface()

print(f'Adding {mgmt_iface_name} and lo and data interface to trusted zone')
commands = [
    f'sudo firewall-cmd --permanent --zone=trusted --add-interface={data_iface_name}',
    f'sudo firewall-cmd --permanent --zone=trusted --add-interface=lo',
    f'sudo firewall-cmd --permanent --zone=trusted --add-interface={mgmt_iface_name}',
    f'for i in $(sudo firewall-cmd --zone=public --list-services); do sudo firewall-cmd --zone=public --permanent --remove-service=$i; done',
]
commands.append(f'sudo firewall-cmd --reload')
commands.append(f'sudo firewall-cmd --list-all --zone=public')

execute_commands([recver], commands)

In [None]:
import time

# for e2sar_perf only the data= part of the query is meaningful. sync= must be present but is ignored
# same for gRPC token, address and port (and lb id)
e2sarPerfURI = f"ejfat://useless@10.10.10.10:1234/lb/1?data={recver_addr}&sync=192.168.77.7:1234"
recverDuration = 20
mtu = 9000
rate = 15 # Gbps
length = 1000000 # event length in bytes
numEvents = 10000 # number of events to send
bufSize = 300 * 1024 * 1024 # 100MB send and receive buffers

recv_command = f"LD_LIBRARY_PATH=/usr/local/lib  e2sar_perf -r -u '{e2sarPerfURI}' -d {recverDuration} -b {bufSize} --ip {recver_addr} --port 19522"
send_command = f"LD_LIBRARY_PATH=/usr/local/lib  e2sar_perf -s -u '{e2sarPerfURI}' --mtu {mtu} --rate {rate} --length {length} -n {numEvents} -b {bufSize}"

# start the receiver for 10 seconds and log its output
print(f'Executing command {recv_command} on receiver')
recver.execute_thread(recv_command, output_file=f"{recver.get_name()}.perf.log")

# sleep 2 seconds to let receiver get going
time.sleep(2)

# start the sender in the foreground
print(f'Executing command {send_command} on sender')
stdout_send, stderr_send = sender.execute(send_command, output_file=f"{sender.get_name()}.perf.log")

print(f"Inspect {recver.get_name()}.perf.log file in your Jupyter container to see the results")

## You can SSH into the sender/receiver nodes and run lbadm, lbmon, or compile, link and test your application against E2SAR

## Manage the slice

### Extend

In [71]:
# Set end host to now plus 14 days
end_date = (datetime.now(timezone.utc) + timedelta(days=14)).strftime("%Y-%m-%d %H:%M:%S %z")

try:
    slice = fablib.get_slice(name=slice_name)

    slice.renew(end_date)
except Exception as e:
    print(f"Exception: {e}")


Retry: 0, Time: 27 sec


0,1
ID,8a1bc512-463e-4e7e-963b-c7469c6b5dd8
Name,"UDP LB Control Plane Testing with udplbd[develop], e2sar[e2sar-java] on ubuntu"
Lease Expiration (UTC),2025-02-25 22:37:15 +0000
Lease Start (UTC),2025-02-12 20:22:31 +0000
Project ID,bbe0d94c-736b-477a-a2e6-fef9fe7ac9ca
State,StableOK


ID,Name,Cores,RAM,Disk,Image,Image Type,Host,Site,Username,Management IP,State,Error,SSH Command,Public SSH Key File,Private SSH Key File
1390ba42-d8d1-4f00-95b7-4a2d7e494389,cpnode,8,8,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.229,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.229,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
24f8eaf7-5144-41b9-ad8c-62fba5e1216b,recver,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.55,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.55,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key
0f803535-e18b-4d6c-be91-df2d92edeba0,sender,8,32,100,default_ubuntu_22,qcow2,sri-w1.fabric-testbed.net,SRI,ubuntu,192.5.67.24,Active,,ssh -i /home/fabric/work/fabric_config/slice_key -F /home/fabric/work/fabric_config/ssh_config ubuntu@192.5.67.24,/home/fabric/work/fabric_config/slice_key.pub,/home/fabric/work/fabric_config/slice_key


ID,Name,Layer,Type,Site,Subnet,Gateway,State,Error
3a274596-f078-49f8-ae23-13c94fb26480,site_bridge_net,L2,L2Bridge,SRI,192.168.1.0/24,,Active,


Name,Short Name,Node,Network,Bandwidth,Mode,VLAN,MAC,Physical Device,Device,IP Address,Numa Node,Switch Port
sender-sender_NIC_Basic_nic-p1,p1,sender,site_bridge_net,100,config,,06:C9:AC:19:B1:E0,enp7s0,enp7s0,192.168.0.1,6,HundredGigE0/0/0/5
recver-recver_NIC_Basic_nic-p1,p1,recver,site_bridge_net,100,config,,06:F5:D5:C8:1C:BE,enp7s0,enp7s0,192.168.0.2,6,HundredGigE0/0/0/5
cpnode-cpnode_NIC_Basic_nic-p1,p1,cpnode,site_bridge_net,100,config,,12:BA:FF:01:AE:FE,enp7s0,enp7s0,192.168.0.3,6,HundredGigE0/0/0/5



Time to print interfaces 27 seconds


### Delete

In [None]:
slice = fablib.get_slice(slice_name)
slice.delete()